What Will My Transcript Look Like? Using Transcription Services

By Thomas Carter on March 28, 2019

When first investigating transcription services, most people realise that they aren’t sure exactly what their investment in a transcript will deliver.

After a little investigation, people often become even more confused by the wide variation in pricing between transcription companies and between different transcription services.

You are likely to wonder: do these price points reflect serious differences in outcome? Will a £3-a-minute transcript look different from one that costs 10p a minute?

The quick answer is yes: these price points reflect differences in how your transcript will be presented, along with its actual content. Fundamentally, your transcript will be a text-based reflection of whatever recording you are having transcribed. But what that really means covers a lot of variety: anything from a document that resembles the script of a play, replete with timestamps and notes on tone, laughter and pauses, all the way to a pile of barely legible gibberish.

The reality is that your transcript can look like anything you want; it just comes down to cost. The best transcription for you depends on your needs.

Let’s explain.

Human Transcription Services and Automatic Speech Recognition (ASR) Software

The real divide in audio transcription services is between those completed by a human transcriptionist and automated, software-based transcription services. Speech-to-text software has come a long way in recent years, but it is still a work in progress, currently unable to deliver the same level of quality, consistency or reliability that human-based transcription services can offer.

What you get when using ASR programs is highly dependent on the quality of your audio recording. A high-quality recording of someone speaking slowly and clearly without background noise is likely to produce relatively clean results. But even under ideal conditions, you can generally expect error rates close to 20% when using ASR. Although most ASR services offer speaker identification, this is something the software generally struggles with. However, you will be able to get accurate timestamping, which is advisable because it will help you go back through and clean up the transcript.
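To put a figure like "20% errors" in context, one widely used measure is word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the ASR output into a correct reference transcript, divided by the number of words in the reference. This article does not say how its figure was measured, so treat the Python sketch below as an illustration under that assumption. The function name is made up for the example, and the two sentences are adapted from the sample transcripts later in this article.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: this is the usual WER definition, not necessarily the
    measure behind the error rate quoted in the article.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "it would be easier if I could just see this in action"
asr_output = "it would be easy it I cold just see thesis in action"
print(f"WER: {word_error_rate(reference, asr_output):.0%}")
```

In this sample, four of the twelve reference words come out wrong, so the WER is 33%. Running the same check on a short, hand-corrected excerpt of your own recording is a quick way to judge whether ASR output is worth cleaning up.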

Fundamentally, you have few options when it comes to ASR. You will simply get what you get: a best effort at a full-verbatim transcript of your recording. However, ASR services are cheap, and some are even free. Paid services generally deliver slightly better results and security features like encrypted storage, but you should not expect to pay more than 10p per minute of recorded audio. It is, therefore, worth investigating the outcomes firsthand. That way, you can get a good sense of ASR outcomes across different levels of audio quality.

Customised Options with Human Transcription Services

Human transcription services are considerably more expensive, starting at around £0.70 per minute. Realistically, however, complexities in pricing mean that you should expect to pay closer to £2 per minute. You might be able to get that rate down closer to £1 if you are willing to wait for long turnarounds and don’t require things like timestamping or the identification of speakers. The lowest-priced services charge extra for these features and operate variable pricing models based on audio quality and other complicating factors.
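To see what these per-minute rates mean for a whole recording, the short Python sketch below multiplies them out for a one-hour file. The rates are the ones quoted in this article (10p as the ASR ceiling, £0.70 as the human starting rate, £2 as the more realistic human rate). The function names and the 60-minute length are illustrative assumptions, and real quotes will vary with turnaround, extras and audio quality.

```python
# Per-minute rates in pence, taken from the figures quoted in this article.
RATES_PENCE_PER_MINUTE = {
    "ASR (upper bound)": 10,
    "Human (starting rate)": 70,
    "Human (realistic rate)": 200,
}


def estimate_cost_pence(minutes: int, rate_pence_per_minute: int) -> int:
    """Cost of a recording in pence, kept as an integer to avoid rounding issues."""
    if minutes < 0 or rate_pence_per_minute < 0:
        raise ValueError("minutes and rate must be non-negative")
    return minutes * rate_pence_per_minute


def format_pounds(pence: int) -> str:
    """Format an amount in pence as pounds, e.g. 4200 -> '£42.00'."""
    return f"£{pence // 100}.{pence % 100:02d}"


def compare_costs(minutes: int) -> dict[str, str]:
    """Estimated cost of one recording at each of the quoted rates."""
    return {
        service: format_pounds(estimate_cost_pence(minutes, rate))
        for service, rate in RATES_PENCE_PER_MINUTE.items()
    }


# Hypothetical example: a 60-minute interview.
for service, cost in compare_costs(60).items():
    print(f"{service}: {cost}")
```

For a one-hour recording this works out at £6.00 for ASR at its ceiling, £42.00 at the human starting rate and £120.00 at the realistic human rate, which is the gap you are weighing against the time you would spend correcting ASR output.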

If you are looking for a low-cost human transcription service, audio quality matters almost as much as it does with ASR. The difference, however, is that with a human transcription service, poor audio quality will simply drive up costs rather than result in poor outcomes. Fundamentally, your human-created transcript can look like anything you want. It is simply a matter of cost.

Although you will often be able to include specific notes about formatting, important segments of the recording, how you would like speakers identified (or anything else you can think of!), there are three broad categories of formatting and level of detail delivered by human transcription services: verbatim, full-verbatim and detailed notes.

Full-Verbatim Transcripts

Full-verbatim transcripts include every detail. This is the accurate version of what software-based services attempt to deliver: a word-for-word, exact account of an audio file. For an extra fee, you can get an annotated transcript that includes notes on tone, laughter, pauses and more. For some, this is the only solution fit for the job. But under most circumstances, this type of transcript will be overkill, costing more than you need to spend and delivering a transcript that is challenging to read.

Verbatim Transcripts

Verbatim transcripts (also known as intelligent verbatim) are an abridged account of the recording that still delivers details. Standard verbatim transcripts edit convoluted sentence structures and remove redundancies, ‘ums’, ‘errs’ and other quirks of speech that are difficult to read — creating a smooth reading experience that still retains a high level of detail and authenticity. Verbatim transcripts are also easier to produce and cost less per minute than full-verbatim transcripts. This is the most versatile and popular type of transcript.   
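As a rough illustration of the difference, the Python sketch below strips two kinds of full-verbatim detail from a single line of speech: bracketed annotations and pause markers such as "(asked incredulously)" or "(..)", and a small set of filler words. It is only a sketch under stated assumptions. The function name, the filler list and the bracket notation are assumptions based on the sample transcripts in this article, and a human transcriptionist does far more than this, including untangling sentences and removing repeated phrases.

```python
import re

# Bracketed annotations and pause markers, e.g. "(asked incredulously)" or "(..)".
# Note: this would also strip bracketed timestamps such as "(00:01)", so it is
# meant to be applied to a speaker's line, not to a whole transcript.
ANNOTATION = re.compile(r"\s*\([^)]*\)")

# A small, assumed list of fillers: "um"/"umm", "er"/"err", "uh".
# Matches whole words only, plus an optional trailing comma and spaces.
FILLER = re.compile(r"\b(?:u+m+|er+|uh+)\b,?\s*", re.IGNORECASE)


def strip_full_verbatim_markup(line: str) -> str:
    """Remove annotations, pause markers and simple fillers from one line."""
    line = ANNOTATION.sub("", line)
    line = FILLER.sub("", line)
    # Collapse any runs of whitespace left behind by the removals.
    return re.sub(r"\s{2,}", " ", line).strip()


sample = (
    "This is great (asked incredulously), but, umm (...) "
    "what I really need is (..) an example."
)
print(strip_full_verbatim_markup(sample))
```

The result reads "This is great, but, what I really need is an example." The leftover comma after "but" shows why this kind of clean-up still needs a human pass before it counts as a verbatim transcript.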

Detailed Notes Transcripts

Detailed notes transcripts take the editing of a verbatim transcript one step further. For interviews, questions are summarised and all off-topic chit-chat is removed. You are often able to provide notes on the type of information that is most important, or opt to have segments transcribed in more detail, highlighting specific quotes or subject matter. Per minute of audio, these are generally the cheapest form of human-based transcription service.

Examples of What Your Transcript Might Look Like

Full-Verbatim Transcript

(00:01)

Reader: This is great (asked incredulously), but, umm (...) what I really need is (..) what, an example would be great. What I need is to be able to look at this as an example (.) I am a visual learner, you know. Abstracted (.) abstract, err, explanations are just not nearly as good at conveying meaning as, as simply being (...) It would be easier if I could just see this in action.   

(00:03)

Thomas: No problem! (stated with serious enthusiasm) I can, I can perfectly understand. Full-verbatim transcripts deliver everything that is said, in exact detail.

(00:04)

Reader: I do (..) and speaker identification will cost more?

(00:04)

Thomas: Yeah. And so do the annotations (said with understanding)  — but, what matters is what is important to the end result.

(00:05)

Reader: It would, I understand, it would be more time-consuming to produce. (slight laugh)

(00:05)

Thomas: As you can (.) as you can see, with full-verbatim, full-verbatim transcripts include the complexities of speech and annotations on pauses represented (.)

(00:06)

Reader: Clearly!

(00:06)

Thomas: … by full stops. With simple verbatim, that wouldn’t be there with verbatim. And a detailed notes transcript would be quite different (..) The, the frequency of timestamps and the presence of speaker identification are optional.

Verbatim Transcript

(00:01)

Reader: That is great. But, what I really need is an example. I am a visual learner. Abstract explanations are just not nearly as good at conveying meaning as simply being able to see something in action.

Thomas: No problem, I can perfectly understand. Full-verbatim transcripts deliver everything that is said, in exact detail.

Reader: And speaker identification will cost more?

Thomas: Yeah. And so do the annotations  — but, what matters is what is important to the end result.

(00:05)

Reader: I understand, it would be more time-consuming to produce.  

Thomas: As you can see, full-verbatim transcripts include the complexities of speech and annotations on pauses represented by full stops. With simple verbatim, that wouldn’t be there. And a detailed notes transcript would be quite different. The frequency of timestamps and the presence of speaker identification are optional.    

 

Detailed Notes Transcript

(00:01)

Reader: Requests to see examples to better understand

Thomas: No problem, I can perfectly understand. Full-verbatim transcripts deliver everything that is said, in exact detail.

Reader: Speaker identification costs more?

Thomas: Yeah. And so do the annotations  — but, what matters is what is important to the end result. As you can see, full-verbatim transcripts include the complexities of speech and annotations on pauses represented by full stops. With simple verbatim, that wouldn’t be there. And a detailed notes transcript would be quite different. The frequency of timestamps and the presence of speaker identification are optional.

 

ASR (medium audio quality)

(00:01)

A: This grate butter I rally need is what an example wood be great. What I needle to be able to look at this as an example I am a visual learner you know. Abstracted abstracter explanations art just not nearly as good at conveying means aws, as simple being It would be easy it I cold just see thesis in action.No problem I can can perfect stand. Fuel verbatim transcript deliver anything that is saiding exact detail I do and speaker identification will cost more.

B: yeah and so do the annotations but it just matters on what is important to the end result it wood,

(00:05)

A: It understand it would be more time-consuming to produce.  

B: As you can as you can see, with full-verbatim, full-verbatim transcripts include the complexities of speech and annotations on pauses represented cleary by full stops. With smile verbatim, that wouldn’t be there with verbatim. And a detailed notes transcript would be quite different. Tree present of times and speaking identical are operational.  

Summary: Your Transcript Can Look Like Anything You Need

Fundamentally, there is a lot of variability in what quality transcription services deliver.

The first decision you need to make is whether or not you can tolerate the errors and issues with speaker identification present in ASR services. There is a huge cost differential, and if you are willing to spend time editing, you can save a lot of money while still ending up with accurate transcriptions. But you lose the other options available with human transcription services. The better your audio quality, the more success you will have with ASR.

Splashing out on human transcription services delivers not only quality but also options. You can choose levels of detail, getting anything from an annotated full-verbatim transcript to an edited transcript that focuses solely on the important information. The industry standard, however, is verbatim, which delivers a slightly edited but much easier-to-read account of the recording. Except in instances where every single hiccup and turn of phrase is crucial, verbatim probably delivers the best value, providing a transcript that is accurate and actually easy to read.

The next choices are extras and special requests. Things like timestamps and speaker identification aren't always included in the base rate. They may be completely unimportant to you or critical to the outcome you need. Then there are more choices like the interval length between timestamps. Equally, with speaker identification, some services will allow you to provide names, while others simply demarcate differences using generic letters. When it comes to special formatting requests and human transcription services, the sky is really the limit, but you should expect to pay more.


This article has covered the basics of quality transcription services and what your transcript might look like. If you want to learn more and get to grips with the variables in pricing, security options and industry-specific services (medical transcription, legal transcription and market research focus groups), we have written the Ultimate Guide to Transcription Services just for you!

 
