fix(elevenlabs/stt): allow specifying scribe_v2 non-realtime model #4515
davidzhao merged 16 commits into livekit:main from
Conversation
chenghao-mou
left a comment
Some minor issues but looks good otherwise! Thanks for contributing!
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
/test-stt
STT Test Results
Status: ✗ Some tests failed
Triggered by workflow run #233
/test-stt
❌
I have made the suggested changes ⚡
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
    if is_given(use_realtime):
        if use_realtime is True:
            logger.warning(
                "`use_realtime` parameter is deprecated. "
                "Specify a realtime model_id to enable streaming. "
                "Defaulting model_id to 'scribe_v2_realtime' "
            )
            model_id = "scribe_v2_realtime"
        else:
            logger.warning(
                "`use_realtime` parameter is deprecated. Instead set model_id to determine if streaming is enabled."
            )
            if is_given(model_id) and "realtime" in model_id:
                raise ValueError(
                    "The currently selected model is a realtime model but use_realtime is False"
                )
    else:
        use_realtime = True if (is_given(model_id) and "realtime" in model_id) else False

    # Handle model_id defaults
    if not is_given(model_id):
        if use_realtime:
            logger.warning("model_id is not provided. Defaulting to 'scribe_v2_realtime'.")
            model_id = "scribe_v2_realtime"
        else:
            logger.warning("model_id is not provided. Defaulting to 'scribe_v1'.")
            model_id = "scribe_v1"
maybe simplify the validation
    if is_given(realtime_model):
        if is_given(model_id):
            logger.warning(
                "both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored."
            )
        else:
            logger.warning(
                "`use_realtime` parameter is deprecated. "
                "Specify a realtime model_id to enable streaming. "
                "Defaulting model_id to 'scribe_v2_realtime' "
            )
            model_id = "scribe_v2_realtime" if realtime_model else "scribe_v1"
    model_id = model_id if is_given(model_id) else "scribe_v1"
    realtime_model = model_id == "scribe_v2_realtime"
then use realtime_model for streaming and

    if not realtime_model and is_given(server_vad):
        logger.warning("Server-side VAD is only supported for Scribe v2 realtime model")

Ok I have made that change
/test-stt
@bml1g12 just need to fix CI and we'll get this merged.
@davidzhao Sorry but I do not understand this CI failure - as when I run All looks healthy, and @chenghao-mou suggests the failure is unrelated to this PR here
The latest failure came from ruff in 3.9: You can run
📝 Walkthrough
Type annotations added to an ElevenLabs example script. The ElevenLabs STT component now supports configurable model selection through a new model_id parameter, replacing hardcoded model identifiers and deprecating the use_realtime flag with appropriate warnings.
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Oh my apologies for missing that, I have pushed the ruff formatted changes |
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py`:
- Around line 109-125: The logger.warning call in the branch that handles both
use_realtime and model_id ("both `use_realtime` and `model_id` parameters are
provided. `use_realtime` will be ignored.") exceeds the 100-character limit;
update the warning in the if is_given(use_realtime) and is_given(model_id)
branch by splitting the long string across multiple shorter literal parts (e.g.,
implicit string concatenation with parentheses or separate +-joined strings) in
the logger.warning call so it stays under 100 chars per line, keeping the same
message and leaving the surrounding logic (use_realtime, model_id, and
subsequent model_id defaulting) unchanged.
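As a minimal sketch of the split described above (not the code that was merged): adjacent string literals inside parentheses are joined at compile time, so the message stays identical while each source line stays short. The logger name and the `warn_both_given` helper are hypothetical and exist only for illustration.

```python
import logging

# Hypothetical logger name; the plugin uses its own module-level logger.
logger = logging.getLogger("elevenlabs-stt-sketch")


def warn_both_given() -> str:
    """Log the 'both parameters provided' warning and return the message text."""
    # Adjacent literals are concatenated into one string, so the logged
    # message is unchanged while every source line stays under 100 chars.
    message = (
        "both `use_realtime` and `model_id` parameters are provided. "
        "`use_realtime` will be ignored."
    )
    logger.warning(message)
    return message
```

The trailing space at the end of the first literal matters; without it the two sentences would run together.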
🧹 Nitpick comments (1)
examples/other/elevenlab_scribe_v2.py (1)
27-36: Remove deprecated `use_realtime` parameter from example.
The example uses both `use_realtime=True` (deprecated) and `model_id="scribe_v2_realtime"` (new). Per the implementation in `stt.py`, this combination logs a warning: "both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored." Examples should demonstrate the recommended usage pattern.
♻️ Suggested fix

     stt = elevenlabs.STT(
    -    use_realtime=True,
         server_vad={
             "vad_silence_threshold_secs": 0.5,
             "vad_threshold": 0.5,
             "min_speech_duration_ms": 100,
             "min_silence_duration_ms": 300,
         },
         model_id="scribe_v2_realtime",
     )

📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
examples/other/elevenlab_scribe_v2.py
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
examples/other/elevenlab_scribe_v2.py
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Run all checks (format-check, lint, type-check) using 'make check' before committing
Applied to files:
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Applies to **/*.py : Run ruff linter and auto-fix issues
Applied to files:
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧬 Code graph analysis (2)
examples/other/elevenlab_scribe_v2.py (7)
- examples/bank-ivr/ivr_navigator_agent.py (1): prewarm(79-80)
- examples/bank-ivr/ivr_system_agent.py (1): prewarm(625-626)
- examples/other/kokoro_tts.py (1): prewarm(30-31)
- examples/voice_agents/basic_agent.py (1): prewarm(71-72)
- examples/other/transcription/multi-user-transcriber.py (1): prewarm(139-140)
- examples/voice_agents/nvidia_test.py (1): prewarm(21-22)
- examples/voice_agents/zapier_mcp_integration.py (1): prewarm(48-49)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (2)
- livekit-agents/livekit/agents/utils/misc.py (1): is_given(25-26)
- livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.py (1): model(193-194)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (9)
examples/other/elevenlab_scribe_v2.py (2)
17-17: LGTM on type annotations!
The added return type annotations (`-> None`) improve code clarity and align with stricter typing patterns in the codebase (e.g., `examples/bank-ivr/ivr_navigator_agent.py`).
Also applies to: 25-25
38-38: Explicit type annotation is acceptable but optional.
The type annotation on `session` is valid. However, since `AgentSession(...)` already returns `AgentSession`, the annotation is redundant. This is a style preference; keeping it is fine for explicitness.
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (7)
23-23: LGTM on import addition.
The `Literal` import is correctly added to support the `ElevenLabsSTTModels` type alias.
60-66: LGTM on type definitions.
The `ElevenLabsSTTModels` type alias with documentation link and the `model_id` field in `STTOptions` provide good type safety while allowing custom model strings for flexibility. This aligns with the Model Interface Pattern.
84-89: LGTM on parameter additions.
The deprecation annotation for `use_realtime` and addition of `model_id` with `NOT_GIVEN` default follows the pattern established in past review discussions. Documentation in the docstring properly explains the deprecation.
155-157: LGTM on model property.
Returning `self._opts.model_id` aligns with the TTS pattern (`livekit-plugins-elevenlabs/tts.py` returns `self._opts.model`) and addresses the past review comment.
182-182: LGTM on API integration.
The `model_id` is correctly passed to the ElevenLabs API form data, replacing previously hardcoded values.
432-439: LGTM on WebSocket integration.
The `model_id` is correctly added to the WebSocket URL query parameters for the realtime streaming endpoint.
142-151: LGTM on options initialization.
The `model_id` is correctly passed to `STTOptions`, ensuring the resolved model is propagated through the options dataclass.
    if is_given(use_realtime):
        if is_given(model_id):
            logger.warning(
                "both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored."
            )
        else:
            logger.warning(
                "`use_realtime` parameter is deprecated. "
                "Specify a realtime model_id to enable streaming. "
                "Defaulting model_id to one based on use_realtime parameter. "
            )
            model_id = "scribe_v2_realtime" if use_realtime else "scribe_v1"
    model_id = model_id if is_given(model_id) else "scribe_v1"
    use_realtime = model_id == "scribe_v2_realtime"

    if not use_realtime and is_given(server_vad):
        logger.warning("Server-side VAD is only supported for Scribe v2 realtime model")

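To make the precedence rules in this snippet easier to check, here is a self-contained sketch that lifts the same branching into a standalone function. It is an illustration, not the plugin's code: `_NotGiven`, `NOT_GIVEN`, and `is_given` are simplified stand-ins for the livekit utilities, and `resolve_model` and the logger name are hypothetical.

```python
from __future__ import annotations

import logging
from typing import Union

logger = logging.getLogger("stt-sketch")  # hypothetical logger name


class _NotGiven:
    """Simplified stand-in for livekit's NOT_GIVEN sentinel type."""


NOT_GIVEN = _NotGiven()


def is_given(value: object) -> bool:
    """Stand-in for livekit's is_given: True unless value is the sentinel."""
    return not isinstance(value, _NotGiven)


def resolve_model(
    use_realtime: Union[bool, _NotGiven] = NOT_GIVEN,
    model_id: Union[str, _NotGiven] = NOT_GIVEN,
) -> tuple[str, bool]:
    """Return (model_id, streaming_enabled), following the branching shown above."""
    if is_given(use_realtime):
        if is_given(model_id):
            # An explicit model_id wins; the deprecated flag is ignored.
            logger.warning("both parameters given; `use_realtime` will be ignored.")
        else:
            logger.warning("`use_realtime` is deprecated; deriving model_id from it.")
            model_id = "scribe_v2_realtime" if use_realtime else "scribe_v1"
    # Neither parameter given: fall back to the non-realtime default.
    resolved = model_id if isinstance(model_id, str) else "scribe_v1"
    return resolved, resolved == "scribe_v2_realtime"
```

For example, `resolve_model(use_realtime=True, model_id="scribe_v2")` resolves to the non-realtime `scribe_v2` model with streaming disabled, because the explicit `model_id` takes precedence over the deprecated flag.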
🧩 Analysis chain
🏁 Script executed:

    # First, locate and examine the actual file
    fd -t f "stt.py" | grep elevenlabs

Repository: livekit/agents
Length of output: 134
🏁 Script executed:

    # Check the exact code at lines 109-125
    if [ -f "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py" ]; then
      sed -n '100,130p' "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py"
    fi

Repository: livekit/agents
Length of output: 1797
🌐 Web query:
ElevenLabs Scribe realtime models API 2025 streaming STT
💡 Result:
Short summary — ElevenLabs Scribe (2025/2026 Realtime):
- Model: Scribe v2 Realtime — low‑latency live STT (≈150 ms median latency), 90+ languages, word-level timestamps, speaker diarization, entity detection and advanced vocab handling. [1][2][5]
- Streaming API: WebSocket-based streaming (server- and client-side guides). Returns partial (interim) transcripts and committed segments; supports PCM 8–48 kHz and μ‑law. [3][4]
- Client vs Server: Client connections use a single-use temporary token (for mic/browser use); server implementations use a standard API key and can stream from URLs or files. [4][3]
- Developer controls: Voice Activity Detection (VAD), manual commit control, ability to send prior text context, and support for multi-language / auto language detection. [3][1]
- Where to start / docs: ElevenLabs realtime STT streaming docs and cookbooks (includes quickstarts and WebSocket examples). [3][4]
If you want, I can: (A) paste the WebSocket quickstart example for client or server, or (B) walk through obtaining temporary tokens and a minimal streaming demo in Node/Python. Which do you prefer?
🏁 Script executed:

    # Check file structure and look for model constants/definitions
    head -50 "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py"

Repository: livekit/agents
Length of output: 1506
🏁 Script executed:

    # Check for any other realtime model references
    rg "realtime|scribe" "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/" -i

Repository: livekit/agents
Length of output: 4141
🏁 Script executed:

    # Check line lengths in the problematic section
    sed -n '109,125p' "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py" | while IFS= read -r line; do
      len=${#line}
      if [ $len -gt 100 ]; then
        echo "Line exceeds 100 chars (length: $len): $line"
      fi
    done

Repository: livekit/agents
Length of output: 209
🏁 Script executed:

    # Look at the full method/context including docstrings
    sed -n '70,130p' "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py"

Repository: livekit/agents
Length of output: 3107
Fix line length violation on line 113.
Line 113 exceeds the 100-character maximum:
"both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored."
Split this warning message across multiple lines to comply with the Python style requirement.
The deprecation logic is sound. The exact match on line 122 (model_id == "scribe_v2_realtime") is appropriate—ElevenLabs currently provides only one realtime model, and the Literal type restricts the model_id to known values ("scribe_v1", "scribe_v2", "scribe_v2_realtime").
This PR allows users to select scribe_v2 as a non-realtime STT engine and, more generally, to select the STT model by model_id. It deprecates the use_realtime parameter in favour of enabling streaming automatically based on the model_id.
Context: I wanted to run the new scribe_v2 model (https://elevenlabs.io/docs/overview/models#models-overview) but saw that the STT plugin has no model parameter and even appears to be hardcoded to the old scribe_v1 model. This PR fixes that by introducing a model_id parameter, similar to many other plugins.
Disclaimer: I am not an expert on the ElevenLabs code; I just spotted what looks like a missing feature or oversight and made this PR accordingly.
Note: When I tried use_realtime=True with the latest livekit version, I regularly got "ElevenLabs STT connection closed unexpectedly" when using manual turn detection and ending the user turn. Searching the livekit agents forum, I see many people reporting similar issues. For this reason I wanted to disable use_realtime and work with the latest model.
Warning: I notice that when use_realtime is set, the plugin used a hardcoded scribe_v2_realtime model. This PR tries to avoid a breaking change by preserving the existing behaviour, i.e. use_realtime=True will always use that model. If we want to be more future-proof here, we could make a breaking change and make it fully dynamic.
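One possible shape for the "fully dynamic" option, as a rough sketch only: derive streaming support from the model name with the same substring check an earlier revision of this PR used (`"realtime" in model_id`). The helper name `is_realtime_model` is hypothetical, and the approach assumes future realtime models keep "realtime" in their IDs, which is not guaranteed.

```python
def is_realtime_model(model_id: str) -> bool:
    """Hypothetical helper: guess streaming support from the model name.

    Assumes realtime models contain "realtime" in their ID, as
    "scribe_v2_realtime" does; this naming convention is an assumption.
    """
    return "realtime" in model_id


# Known model IDs mentioned in this PR.
for name in ("scribe_v1", "scribe_v2", "scribe_v2_realtime"):
    print(name, is_realtime_model(name))
```

The trade-off is that a substring check would silently accept unknown IDs, whereas the exact match against `"scribe_v2_realtime"` used in this PR only enables streaming for the one model known to support it.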
Summary by CodeRabbit
New Features
Deprecations