New: STT orchestration for speaker-attributed transcription

STT orchestration is now available, aligning diarization and transcription in a single, unified workflow.

What it does:

STT orchestration coordinates pyannoteAI diarization with transcription services. Instead of running diarization and transcription separately and then reconciling the outputs by hand, you make a single API call and receive speaker-attributed transcripts.
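
To make the single-call workflow concrete, here is a minimal sketch of what such a request could look like from Python. The endpoint URL, payload fields, environment variable name, and response shape are illustrative assumptions, not the documented pyannoteAI API; consult the technical documentation linked below for the actual interface.

```python
# Minimal sketch of a single STT-orchestration call.
# Endpoint, payload fields, and response shape are hypothetical.
import os
import requests

API_KEY = os.environ["PYANNOTEAI_API_KEY"]  # hypothetical environment variable

response = requests.post(
    "https://api.pyannote.ai/v1/stt-orchestration",   # hypothetical endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "audio_url": "https://example.com/meeting.wav",  # publicly reachable audio file
        "diarization_model": "precision-2",              # pyannoteAI diarization model
        "transcription_model": "nemo-parakeet",          # OSS transcription backend
    },
    timeout=300,
)
response.raise_for_status()

# Hypothetical field: a list of speaker-attributed segments with
# start/end timestamps, speaker IDs, and transcribed text.
segments = response.json()["segments"]
```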

Technical details:

  • Supports Precision-2 pyannoteAI diarization models

  • Connects to OSS transcription models: NVIDIA NeMo Parakeet

  • Returns structured output: start/end timestamps, speaker IDs, and transcribed text (see the sketch after this list)

  • Reduces timestamp reconciliation errors and ambiguous segments
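
As an illustration of the structured output above, the sketch below shows what speaker-attributed segments might look like and how they can be flattened into a plain transcript ready for downstream summarization. The field names and speaker labels are assumptions for illustration, not a documented schema.

```python
# Hypothetical example of the structured output returned by STT orchestration,
# flattened into a speaker-attributed transcript.
segments = [
    {"start": 0.00, "end": 3.42, "speaker": "SPEAKER_00",
     "text": "Welcome everyone, let's get started."},
    {"start": 3.58, "end": 6.10, "speaker": "SPEAKER_01",
     "text": "Thanks. First item is the release plan."},
]

transcript = "\n".join(
    f"[{seg['start']:.2f}-{seg['end']:.2f}] {seg['speaker']}: {seg['text']}"
    for seg in segments
)
print(transcript)
```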

Use case:

  • Clean speaker-attributed transcripts for downstream summarization or analytics.

Benchmarks show improved tcpWER (time-constrained minimum-permutation word error rate) compared with typical STT providers.

Read our blog post for more details about the feature and its usage.
Explore the technical documentation and tutorials.
The tutorial notebooks shown in the video are now available on our GitHub.
