Streaming Diarization - Beta access

What it does:

You stream live audio over WebSocket, and we return timestamped speaker labels in real time, with the same accuracy as Precision-2.

  • ~300 ms diarization latency

  • Supports a maximum of 8 speakers and up to 10 parallel streams

  • Input: 16 kHz mono audio over WebSocket, one 100 ms chunk per message

  • Output: start/end time, speaker label

  • Built to handle noisy environments, overlapping speakers, and variable audio quality

  • Free for the full beta period
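
The input format above (16 kHz mono, 100 ms per message) fixes the size of each WebSocket message once a sample format is chosen. The sketch below shows that arithmetic and one way to split a buffer into message-sized chunks. It assumes 16-bit PCM samples, which this page does not state, and the helper name iter_chunks is hypothetical. It leaves out the connection itself, since the endpoint and authentication are not described here.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # from the input spec above
CHUNK_MS = 100            # from the input spec above
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit PCM; the sample format is not stated on this page

SAMPLES_PER_CHUNK = SAMPLE_RATE_HZ * CHUNK_MS // 1000   # 1600 samples per message
CHUNK_BYTES = SAMPLES_PER_CHUNK * BYTES_PER_SAMPLE      # 3200 bytes per message


def iter_chunks(pcm: bytes, pad_last: bool = True) -> Iterator[bytes]:
    """Yield consecutive 100 ms chunks of mono PCM audio.

    Hypothetical helper. If pad_last is True, a short final chunk is
    padded with zero bytes (silence) so every message has the same size.
    """
    for offset in range(0, len(pcm), CHUNK_BYTES):
        chunk = pcm[offset:offset + CHUNK_BYTES]
        if pad_last and len(chunk) < CHUNK_BYTES:
            chunk += b"\x00" * (CHUNK_BYTES - len(chunk))
        yield chunk


if __name__ == "__main__":
    # One second of silence splits into ten 3200-byte messages.
    one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    chunks = list(iter_chunks(one_second))
    print(len(chunks), len(chunks[0]))
```

Each yielded chunk would then be sent as one binary WebSocket message. Whether the service expects a short final chunk to be padded is not specified here, so pad_last is only a convenience.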

👉 Request access to the Streaming Diarization beta

Speaker Intelligence Platform for developers

Detect, segment, label and separate speakers in any language.

Make the most of conversational speech with AI