Berea

How Transcription Works

The three stages that turn your sermon audio into a polished transcript.

Transcription in Berea happens in three stages, each building on the last to produce the most accurate, readable result possible.

Stage 1: Live transcription

During a live recording, Berea streams audio to Deepgram's WebSocket API in real time, and words appear on screen within 300 milliseconds of being spoken. You get an immediate, usable transcript even before the recording ends.

When a Wi-Fi or cellular connection isn't available, Berea falls back to Apple's on-device Speech framework, which provides offline transcription with slightly lower accuracy for specialized vocabulary.
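
The fallback described above comes down to a choice between two engines based on connectivity. A minimal sketch in Python, assuming hypothetical names (`LiveEngine`, `choose_live_engine`) that are illustrative only and not Berea's actual code:

```python
from enum import Enum


class LiveEngine(Enum):
    # Hypothetical labels for the two live engines described above.
    DEEPGRAM_STREAMING = "deepgram-streaming"
    APPLE_ON_DEVICE = "apple-on-device"


def choose_live_engine(has_wifi: bool, has_cellular: bool) -> LiveEngine:
    """Pick the live transcription engine from the current connectivity."""
    if has_wifi or has_cellular:
        # Any network connection allows streaming to the cloud service.
        return LiveEngine.DEEPGRAM_STREAMING
    # No connection at all: fall back to offline, on-device recognition.
    return LiveEngine.APPLE_ON_DEVICE
```

For example, `choose_live_engine(False, False)` selects the on-device engine, while either kind of connection selects streaming.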

Stage 2: Deepgram batch upgrade

Once the recording ends, Berea sends the full audio file to Deepgram's batch API. This produces a significantly more accurate transcript than the streaming version — especially for proper nouns, scripture references, and overlapping speech.

The batch upgrade also adds a timestamp to every word in the transcript. These timestamps power the synchronized playback feature: tap any word and the audio jumps to that exact moment.
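
To illustrate how per-word timestamps can drive synchronized playback, here is a minimal sketch in Python. The `TimedWord` type, its field names, and both helper functions are assumptions made for this example and do not describe Berea's internal data model.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


def seek_time_for_tap(words: list[TimedWord], index: int) -> float:
    """Return the playback position to jump to when a word is tapped."""
    return words[index].start


def word_index_at(words: list[TimedWord], position: float) -> int:
    """Return the index of the word to highlight at a playback position.

    Assumes words are sorted by start time. In a pause between words the
    most recently started word is returned; before the first word, index 0.
    """
    starts = [w.start for w in words]
    return max(bisect_right(starts, position) - 1, 0)


# Illustrative data only; the timings are made up.
words = [
    TimedWord("For", 0.00, 0.18),
    TimedWord("God", 0.22, 0.51),
    TimedWord("so", 0.55, 0.70),
    TimedWord("loved", 0.74, 1.10),
]
```

Tapping the fourth word would seek to `seek_time_for_tap(words, 3)`, and during playback `word_index_at(words, position)` finds the word to highlight with a binary search, which stays fast even for a long sermon.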

Stage 3: Transcript polishing

The final stage sends the transcript to GPT for editing. This step:

  • Fixes punctuation, capitalization, and paragraph breaks.
  • Corrects common transcription errors ("eye" → "I", "pray fur" → "pray for").
  • Standardizes scripture references ("Romans eight" → "Romans 8").
  • Applies context from your Faith Profile (denominational vocabulary, pastor's name).
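
To make the scripture-reference step in the list above concrete, here is a small rule-based stand-in written in Python. Berea performs this editing with GPT, not with rules like these; the book list, number table, and function name are all assumptions chosen for illustration, and the sketch only handles a book name followed by a spelled-out chapter.

```python
import re

# Small illustrative vocabularies; real coverage would need to be far larger.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
    "sixteen": 16,
}
BOOKS = ["Genesis", "Psalm", "Matthew", "John", "Romans", "Hebrews"]

# Longest number words first so "sixteen" is tried before "six".
_NUMBERS = sorted(NUMBER_WORDS, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(BOOKS) + r")\s+(" + "|".join(_NUMBERS) + r")\b",
    re.IGNORECASE,
)


def normalize_references(text: str) -> str:
    """Rewrite '<Book> <number word>' as '<Book> <digits>'."""

    def replace(match: re.Match) -> str:
        book = match.group(1).capitalize()
        chapter = NUMBER_WORDS[match.group(2).lower()]
        return f"{book} {chapter}"

    return _PATTERN.sub(replace, text)
```

Calling `normalize_references("Turn to Romans eight with me.")` returns `"Turn to Romans 8 with me."`. A fixed rule like this would also rewrite "John one day said" as "John 1 day said", which is the kind of ambiguity that editing with surrounding context is better placed to resolve.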

Tip

If the polished transcript looks worse than the original (for example, the meaning changed or something was lost), tap the Re-polish button in the transcript toolbar to run the polishing stage again. If the audio quality was very poor, Berea will re-run the Deepgram transcription from scratch.

Transcript accuracy factors

  • Microphone proximity — closer is almost always better.
  • Room acoustics — reverberant churches can introduce echo artifacts.
  • Speaker clarity — accents, fast speech, or heavy background music all reduce accuracy.
  • Connection quality — a strong Wi-Fi signal improves the live streaming stage.
