Question 1

What is the difference between transcription and translation?

Accepted Answer

Transcription converts spoken words into written text in the same language — a spoken English interview becomes a written English transcript. Translation converts content from one language to another — a written English document becomes written Spanish. Some services combine both: transcription + translation produces a written transcript in a different language than the original speech.

Question 2

What is verbatim transcription?

Accepted Answer

Verbatim transcription captures every spoken word exactly as said, including filler words (um, uh, like), false starts, repetitions, laughter, pauses, and non-verbal sounds. It’s used in legal proceedings, psychological research, and linguistic studies where the exact manner of speech is important. Most business transcription uses ‘clean verbatim’ instead, which removes fillers while keeping all meaningful content.
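As a rough illustration of how a full-verbatim line turns into clean verbatim, here is a minimal Python sketch. The filler list, the function name `clean_verbatim`, and the rule for collapsing repeated words are assumptions made for this example, not a description of any particular service's method. "Like" is left out of the filler list on purpose, because it is often a meaningful word.

```python
import re

# Assumed filler list; real style guides differ on what counts as a filler.
FILLERS = ("um", "uh", "er", "hmm")


def clean_verbatim(text: str) -> str:
    """Rough clean-verbatim pass over one full-verbatim line."""
    # Drop bracketed non-verbal tags such as [laughs] or [pause].
    text = re.sub(r"\[[^\]]*\]", " ", text)
    # Drop standalone filler words plus a comma that directly follows them.
    filler_pattern = r"\b(?:" + "|".join(FILLERS) + r")\b,?"
    text = re.sub(filler_pattern, " ", text, flags=re.IGNORECASE)
    # Collapse an immediately repeated word ("the, the" or "I- I") into one.
    text = re.sub(
        r"\b(\w+)(?:\s*[,\u2014-]\s*|\s+)\1\b",
        r"\1",
        text,
        flags=re.IGNORECASE,
    )
    # Normalise whitespace and restore the leading capital letter.
    text = re.sub(r"\s+", " ", text).strip()
    return text[:1].upper() + text[1:]


raw = "[laughs] Um, I\u2014 I think the, the contract covers this."
print(clean_verbatim(raw))  # I think the contract covers this.
```

A rule-based pass like this also collapses legitimate repetitions such as "had had" and can leave stray punctuation behind, so the result still needs a human review.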

Question 3

What is the difference between transcription, captions, and subtitles?

Accepted Answer

A transcript is a text document for reading and searching, usually without timing data. Captions are timed text synchronized with the audio/video and shown in the same language as the spoken content; they are designed for accessibility (deaf and hard-of-hearing audiences). Subtitles are also timed and synchronized, but are typically in a different language than the audio and serve as a translation. NovaScribe exports all three: TXT/DOCX for transcripts, SRT/VTT for captions and subtitles.
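The practical difference between a plain transcript and a caption file is the timing data. The Python sketch below builds both from the same list of timed segments. The segment layout `(start_seconds, end_seconds, text)` and the function names are assumptions made for illustration, not NovaScribe's API; the cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) follows the usual SRT convention.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style SRT uses."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


def to_transcript(segments) -> str:
    """Drop the timing and join the text into a plain transcript."""
    return " ".join(text for _, _, text in segments)


segments = [
    (0.0, 2.5, "I think the contract covers this."),
    (2.5, 4.0, "Let me check."),
]
print(to_srt(segments))
print(to_transcript(segments))
```

Subtitles would use the same timed structure, with the text of each cue translated into the target language.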

Question 4

How long does transcription take?

Accepted Answer

AI transcription processes 1 hour of audio in roughly 2–5 minutes. A human transcriber typically needs 4–6 hours of work per hour of audio, so delivery takes longer than that: rush services usually quote 12–24 hours, and standard turnaround is slower still. The actual time depends on audio quality, number of speakers, and subject complexity. AI services like NovaScribe return results within minutes, even for long files.
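For a back-of-the-envelope estimate, the per-hour figures above can be scaled to any recording length. This small Python sketch assumes that effort grows linearly with audio duration, which is a simplification; the function name and constants are illustrative only, and the human figure is working time rather than delivery time.

```python
# Ranges taken from the answer above, expressed per hour of audio.
AI_MINUTES_PER_AUDIO_HOUR = (2, 5)
HUMAN_HOURS_PER_AUDIO_HOUR = (4, 6)


def estimate_effort(audio_minutes: float) -> dict:
    """Return (low, high) estimates, assuming linear scaling with length."""
    audio_hours = audio_minutes / 60
    ai = tuple(audio_hours * m for m in AI_MINUTES_PER_AUDIO_HOUR)
    human = tuple(audio_hours * h for h in HUMAN_HOURS_PER_AUDIO_HOUR)
    return {"ai_minutes": ai, "human_hours": human}


# A 90-minute interview:
print(estimate_effort(90))
# {'ai_minutes': (3.0, 7.5), 'human_hours': (6.0, 9.0)}
```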

Question 5

What file formats can I get my transcript in?

Accepted Answer

Common transcript formats include: TXT (plain text, universally compatible), DOCX (Microsoft Word, most common for editing), PDF (read-only sharing), SRT (SubRip — timed subtitles for video platforms), VTT (WebVTT — web captions for HTML5 video), and JSON (structured data for developers). NovaScribe exports TXT, DOCX, PDF, SRT, and VTT.
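SRT and VTT are close relatives. The main differences are that a WebVTT file starts with a `WEBVTT` header line and uses a dot instead of a comma before the milliseconds. The Python sketch below converts one to the other under those two rules; the function name `srt_to_vtt` is hypothetical, and the sketch ignores styling and positioning features that only WebVTT supports.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING_LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})",
    re.MULTILINE,
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to a minimal WebVTT document."""
    # Normalise Windows line endings so the line-anchored regex works.
    body = srt_text.replace("\r\n", "\n").strip()
    # Only timing lines are rewritten, so commas in caption text are kept.
    body = TIMING_LINE.sub(r"\1.\2 --> \3.\4", body)
    # The numeric SRT indices remain; WebVTT accepts them as cue identifiers.
    return "WEBVTT\n\n" + body + "\n"


srt = "1\n00:00:00,000 --> 00:00:02,500\nHello, world.\n\n2\n00:00:02,500 --> 00:00:04,000\nBye.\n"
print(srt_to_vtt(srt))
```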

Question 6

Is transcription important for accessibility?

Accepted Answer

Yes. Transcription is a key accessibility tool. The Web Content Accessibility Guidelines (WCAG 2.1 AA) call for captions on video and a text alternative for audio-only content, so that deaf and hard-of-hearing users can access it. WCAG is a technical standard rather than a law, but the Americans with Disabilities Act (ADA) and Section 508 are commonly applied with WCAG as the benchmark. Universities, government agencies, and companies subject to Section 508 compliance should therefore expect to provide transcripts or captions for their recorded audio/video content.

Question 7

How accurate is AI transcription?

Accepted Answer

AI transcription reaches 95–98% accuracy in ideal conditions — clear audio, single speaker, standard accent, general vocabulary. In challenging conditions (multiple speakers, background noise, heavy accents, technical jargon), accuracy typically falls to 70–90%. For most business use cases like meeting notes, podcast show notes, and YouTube captions, AI accuracy is more than sufficient.
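The answer above does not say how these accuracy figures are measured. A common convention in speech recognition is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the machine output into a reference transcript, divided by the number of words in the reference. Accuracy is then often quoted as 1 minus WER. The Python sketch below assumes that convention; the function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, keeping one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing in output
                current[j - 1] + 1,      # extra word in output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


reference = "the contract covers this scenario"
hypothesis = "the contract covers the scenario"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.0%}, accuracy: {1 - wer:.0%}")  # WER: 20%, accuracy: 80%
```

This version compares whitespace-separated tokens, so punctuation attached to a word counts as part of it. Evaluations usually normalise punctuation and number formatting before scoring.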

Question 8

What is the difference between transcription and dictation?

Accepted Answer

Dictation is the real-time process of speaking for immediate capture — like speaking to a voice assistant or dictating a letter. Transcription is the conversion of pre-recorded audio into text after the fact. The key difference is timing: dictation happens live, transcription happens later. Many AI transcription tools can also handle dictation (real-time speech-to-text), but the primary use case is post-recording conversion.

Type	What It Includes	Used For	Example
Full Verbatim	Every word, filler (um, uh), pauses, laughter, non-verbal sounds	Legal, psychological research, linguistics	[laughs] Um, I— I think the, the contract...
Clean Verbatim	All meaningful words, fillers removed, grammar intact	Business meetings, journalism, most professional use	I think the contract covers this.
Edited	Grammar corrected, restructured for readability	Publishing, marketing, blog posts	The contract covers this scenario.
Phonetic	Sound-based notation of speech sounds	Linguistic analysis, dialect research	Not used in commercial transcription

Feature	Transcription	Captions	Subtitles
Has timestamps	Optional	Yes	Yes
Language	Same as audio	Same as audio	Different language
Purpose	Reading / searching	Accessibility	Translation
File format	TXT, DOCX, PDF	SRT, VTT	SRT, VTT
Used on	Documents	Video players	Video players

What Is Transcription? Audio & Video Explained

