Pinecall

STT Providers

Speech-to-text providers, models, and tuning parameters.

Quick reference

// Shortcuts
{ stt: "deepgram-flux" }
{ stt: "deepgram" }
{ stt: "deepgram:nova-3:fr" }   // provider:model:language
{ stt: "gladia" }
{ stt: "transcribe" }
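
The shortcut strings above follow a provider:model:language pattern, with the last two segments optional. As a rough sketch of how such a string could be split into its parts, the helper below is hypothetical (parseSttShortcut and the SttShortcut type are illustration names, not part of the Pinecall SDK), and the SDK's own parsing may differ:

```typescript
// Hypothetical helper for illustration; not part of the Pinecall SDK.
interface SttShortcut {
  provider: string;
  model?: string;
  language?: string;
}

// Splits "provider", "provider:model", or "provider:model:language".
function parseSttShortcut(shortcut: string): SttShortcut {
  const [provider, model, language, ...extra] = shortcut.split(":");
  if (!provider || extra.length > 0) {
    throw new Error(`Invalid STT shortcut: "${shortcut}"`);
  }
  const parsed: SttShortcut = { provider };
  if (model) parsed.model = model;          // omitted when the segment is absent
  if (language) parsed.language = language; // omitted when the segment is absent
  return parsed;
}
```

For example, parseSttShortcut("deepgram:nova-3:fr") yields provider "deepgram", model "nova-3", and language "fr", while parseSttShortcut("gladia") yields only a provider.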

Naming convention

Configuration objects that pass through to providers keep snake_case to mirror what the receiving side expects (endpointing_ms, interim_results, etc.). This avoids an unnecessary translation layer and lets you copy-paste from provider docs directly.

Deepgram Flux

Best for real-time voice agents. Turn detection and VAD are auto-derived, so no configuration is needed.

stt: {
  provider: "deepgram-flux",
  keyterms: ["pinecall"],      // boost recognition for specific terms
  eot_threshold: 0.5,          // end-of-turn sensitivity (0-1)
  eager_eot_threshold: 0.7,    // eager turn threshold
  eot_timeout_ms: 2000,
}

Shortcut: "deepgram-flux"

Auto-derived: Flux → native turn detection + native VAD. No need to specify turnDetection.
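
Since the Flux tuning values are plain numbers, it can help to sanity-check them before passing the object along. The sketch below is a hypothetical pre-flight check (FluxSttConfig and validateFluxConfig are illustration names, not SDK exports). It relies on the 0-1 range stated above for eot_threshold. It assumes the same range for eager_eot_threshold and a positive eot_timeout_ms, neither of which this page states, and the provider may enforce narrower limits:

```typescript
// Hypothetical type mirroring the Flux example on this page; not an SDK export.
interface FluxSttConfig {
  provider: "deepgram-flux";
  keyterms?: string[];
  eot_threshold?: number;
  eager_eot_threshold?: number;
  eot_timeout_ms?: number;
}

// Returns a list of human-readable problems; an empty list means no issue found.
function validateFluxConfig(cfg: FluxSttConfig): string[] {
  const problems: string[] = [];
  const outsideUnitRange = (v: number | undefined): boolean =>
    v !== undefined && !(v >= 0 && v <= 1);

  if (outsideUnitRange(cfg.eot_threshold)) {
    problems.push("eot_threshold must be between 0 and 1");
  }
  // Assumption: the eager threshold shares the 0-1 range.
  if (outsideUnitRange(cfg.eager_eot_threshold)) {
    problems.push("eager_eot_threshold must be between 0 and 1");
  }
  // Assumption: a timeout only makes sense as a positive duration.
  if (cfg.eot_timeout_ms !== undefined && !(cfg.eot_timeout_ms > 0)) {
    problems.push("eot_timeout_ms must be a positive number of milliseconds");
  }
  return problems;
}
```

Unset fields are skipped, so a config that only sets provider passes the check.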

Deepgram Nova

Classic STT. Turn detection and VAD are auto-derived (smart_turn for turn detection, silero for VAD).

stt: {
  provider: "deepgram",
  model: "nova-3",
  language: "en",
  interim_results: true,
  smart_format: true,
  punctuate: true,
  profanity_filter: false,
  endpointing_ms: 300,
  utterance_end_ms: 1000,
  keywords: ["pinecall"],
}

Shortcut: "deepgram", "deepgram:nova-3", or "deepgram:nova-3:es"

Gladia

stt: {
  provider: "gladia",
  model: "accurate",
  language: "en",
  endpointing: 300,
  speech_threshold: 0.8,
  code_switching: false,
  audio_enhancer: true,
}

Shortcut: "gladia"

AWS Transcribe

stt: {
  provider: "transcribe",
  language: "en-US",
}

Shortcut: "transcribe"

Which to choose

| Provider | Best for | Trade-off |
| --- | --- | --- |
| deepgram-flux | Real-time voice agents | Lowest latency, fewer languages |
| deepgram (nova-3) | Wide language support | Slightly higher latency than Flux |
| gladia | Code-switching, multilingual | Higher latency than Deepgram |
| transcribe | AWS-native deployments | AWS pricing model |

For most agents, start with deepgram-flux. Switch only if you need a language Flux doesn't support, or if you have specific accuracy requirements.
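
The guidance above can be read as a small decision procedure. The sketch below is one possible encoding of it, not an official selection algorithm: chooseSttProvider and SttNeeds are hypothetical names, the order of the checks is a judgment call, and the set of languages Flux supports must be supplied by the caller because this page does not list them.

```typescript
type SttProvider = "deepgram-flux" | "deepgram" | "gladia" | "transcribe";

// Hypothetical description of what an agent needs from STT.
interface SttNeeds {
  language: string;                   // e.g. "en" or "fr"
  fluxLanguages: ReadonlySet<string>; // caller-supplied; not listed on this page
  codeSwitching?: boolean;            // speakers switch languages mid-call
  awsNative?: boolean;                // deployment should stay within AWS
}

function chooseSttProvider(needs: SttNeeds): SttProvider {
  if (needs.awsNative) return "transcribe";
  if (needs.codeSwitching) return "gladia";
  // Default to Flux whenever it covers the language.
  if (needs.fluxLanguages.has(needs.language)) return "deepgram-flux";
  // Otherwise fall back to Nova for its wider language support.
  return "deepgram";
}
```

The returned string doubles as a shortcut, so it can be passed directly as the stt value.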

Hot-reloading STT

You can swap STT providers at runtime:

// Agent-wide (all future calls)
agent.configure({ stt: "gladia" });

// One call only
call.configure({ stt: "deepgram" });
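
One way to use the per-call form is to switch a single call to Nova once its language is known. The sketch below assumes only the configure({ stt }) shape shown above. SttConfigurable, useNovaForLanguage, and RecordingTarget are hypothetical names introduced for illustration, and the recording stub stands in for a real call object so the example runs without the SDK:

```typescript
// Minimal shape shared by the agent and call objects in the snippet above.
interface SttConfigurable {
  configure(options: { stt: string }): void;
}

// Builds a provider:model:language shortcut and applies it to one target.
function useNovaForLanguage(target: SttConfigurable, language: string): string {
  const shortcut = `deepgram:nova-3:${language}`;
  target.configure({ stt: shortcut });
  return shortcut;
}

// Stand-in for a real call: records what would have been applied.
class RecordingTarget implements SttConfigurable {
  readonly applied: string[] = [];
  configure(options: { stt: string }): void {
    this.applied.push(options.stt);
  }
}

const fakeCall = new RecordingTarget();
useNovaForLanguage(fakeCall, "es");
console.log(fakeCall.applied); // the shortcuts applied so far
```

Passing the agent object instead of a call would apply the same shortcut to all future calls.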
