
S2S: Speech-to-Speech with File Search in a Single API Call#643

Open
Prajna1999 wants to merge 22 commits into main from feature/speech-to-speech

Conversation


@Prajna1999 Prajna1999 commented Mar 5, 2026

Summary

Target issue is #642
Explain the motivation for making this change. What existing problem does the pull request solve?

Checklist

Before submitting a pull request, please ensure that you have completed these tasks:

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, make sure it is covered by test cases.

Notes

Please add here if any other information is required for the reviewer.

Summary by CodeRabbit

  • New Features

    • Added Speech-to-Speech API endpoint: STT (auto-detect) → RAG (KB) → TTS, with model selection and detected-language propagation
    • Asynchronous webhook callbacks for STT/LLM/TTS and per-stage error handling; latency tracking per stage
  • Documentation

    • Comprehensive STS with RAG workflow guide: supported languages, models, examples, edge cases, and best practices
  • Tests

    • New unit/integration tests covering language detection, block building, marker replacement, and endpoint flows

@Prajna1999 Prajna1999 changed the title from "Feature/speech to speech" to "S2S: Speech-to-Speech in a single API Call" Mar 5, 2026

coderabbitai bot commented Mar 5, 2026

📝 Walkthrough

Walkthrough

Introduces Speech-to-Speech with RAG workflow via new POST /llm/sts endpoint, request/enum models, chain-building utilities, detected-language propagation, provider adjustments, tests, and docs to orchestrate STT → RAG → TTS with asynchronous callbacks and per-stage error handling.

Changes

Cohort / File(s) Summary
Documentation
backend/app/api/docs/llm/speech_to_speech.md
New comprehensive STS with RAG docs: endpoint spec, request/response schemas, callbacks, supported languages, examples, edge cases, latency tracking, and best practices.
API Router & Registration
backend/app/api/routes/llm_sts.py, backend/app/api/main.py
Adds /llm/sts endpoint handler and registers its router with the main API. Validates languages/callbacks and starts asynchronous chain jobs.
Request Models
backend/app/models/llm/request.py
Adds STTModel, TTSModel, LLMModel enums and SpeechToSpeechRequest model; extends TextContent with optional language_code and language normalization.
Chain Utilities
backend/app/services/llm/chain/utils.py
Adds SUPPORTED_LANGUAGE_CODES and builders: build_stt_block, build_rag_block, build_tts_block mapping models to providers and configuring provider-specific params.
Chain Execution Logic
backend/app/services/llm/chain/chain.py
Adds ChainContext.detected_language, updates result_to_query(result, context) to extract/propagate language_code, and uses detected language in subsequent queries.
Job Execution & Providers
backend/app/services/llm/jobs.py, backend/app/services/llm/providers/sai.py
execute_llm_call adds detected_language param and replaces {{detected}} in TTS configs; Sarvam provider now returns TextContent(language_code=...) for STT and passes speech_sample_rate=16000 for TTS.
Tests & Debugging
backend/app/tests/services/llm/test_sts.py, backend/test_sts_debug.py
New comprehensive STS tests (language propagation, block builds, endpoint validation) and a debug script to exercise chain job creation and Celery connectivity.
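The `{{detected}}` marker replacement described for the job execution stage can be sketched as follows (a minimal illustration; `replace_detected_marker` is a hypothetical helper, not the PR's actual code):

```python
def replace_detected_marker(config: dict, detected_language: str) -> dict:
    """Return a copy of config with every {{detected}} placeholder
    replaced by the language code detected at the STT stage."""
    out: dict = {}
    for key, value in config.items():
        if isinstance(value, str):
            out[key] = value.replace("{{detected}}", detected_language)
        elif isinstance(value, dict):
            out[key] = replace_detected_marker(value, detected_language)
        else:
            out[key] = value
    return out

tts_config = {"target_language_code": "{{detected}}", "speech_sample_rate": 16000}
print(replace_detected_marker(tts_config, "hi-IN"))
# → {'target_language_code': 'hi-IN', 'speech_sample_rate': 16000}
```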

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as "/llm/sts"
    participant Chain as "LLM Chain\nOrchestrator"
    participant STT as "STT Provider\n(Sarvam/Google)"
    participant RAG as "RAG/LLM\nProvider (OpenAI)"
    participant TTS as "TTS Provider\n(Sarvam/Google)"
    participant Callback as "Webhook\n(Callbacks)"

    Client->>API: POST /llm/sts (audio, languages, models, callback)
    activate API
    API->>Chain: create_chain_request (STT → RAG → TTS)
    deactivate API

    activate Chain
    Chain->>STT: execute STT block (audio, language=auto/specified)
    activate STT
    STT-->>Chain: TextOutput (transcript, language_code)
    deactivate STT

    Chain->>Callback: STT Intermediate callback (transcript, latency)

    Chain->>RAG: execute RAG block (query, knowledge_bases, language_code)
    activate RAG
    RAG-->>Chain: TextOutput (LLM response)
    deactivate RAG

    Chain->>Callback: LLM Intermediate callback (response, latency)

    Chain->>TTS: execute TTS block (response, language={{detected}} → replaced)
    activate TTS
    TTS-->>Chain: AudioOutput (audio, latency)
    deactivate TTS

    deactivate Chain

    Chain->>Callback: TTS Final callback (audio, usage, latency)
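In terms of the flow above, a request body might look like this (field names are inferred from the walkthrough and docs and are illustrative only, not the PR's exact schema):

```python
import json

# Hypothetical POST /llm/sts payload; names inferred from the walkthrough.
payload = {
    "audio_url": "https://example.com/question.wav",
    "input_language": "auto",    # let the STT stage auto-detect
    "output_language": "hi-IN",  # or reuse the detected language downstream
    "knowledge_base_ids": [1],
    "callback_url": "https://example.com/webhook/sts",
}
print(json.dumps(payload, indent=2))
```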

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

enhancement, ready-for-review

Suggested reviewers

  • AkhileshNegi
  • vprashrex
  • nishika26

Poem

🐰 I hopped into code with a curious nose,
I heard a voice turn text where conversation grows.
STT peeks words, RAG finds the clue,
TTS hums back in the language true —
Hooray for chains that make voices new! 🎶

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 41.44%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title clearly and specifically describes the main feature being introduced: a Speech-to-Speech API endpoint with RAG/File Search capabilities in a single API call, which aligns with the primary changes across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.




@Prajna1999 Prajna1999 marked this pull request as ready for review March 5, 2026 18:34

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/crud/llm.py (1)

46-55: ⚠️ Potential issue | 🔴 Critical

Fix the parameter ordering in create_llm_call.

Line 51 introduces a defaulted parameter (chain_id) before required keyword-only parameters (project_id, organization_id, resolved_config, original_provider). Because these parameters follow `*`, Python does accept this ordering (the "non-default argument follows default argument" error applies only to positional parameters), but placing the defaulted parameter last is clearer and matches convention.

🛠️ Proposed fix
 def create_llm_call(
     session: Session,
     *,
     request: LLMCallRequest,
     job_id: UUID,
-    chain_id: UUID | None = None,
     project_id: int,
     organization_id: int,
     resolved_config: ConfigBlob,
     original_provider: str,
+    chain_id: UUID | None = None,
 ) -> LlmCall:
#!/bin/bash
# Verify the file parses after parameter reordering (read-only check)
python - <<'PY'
import ast, pathlib
p = pathlib.Path("backend/app/crud/llm.py")
try:
    ast.parse(p.read_text())
    print("OK: backend/app/crud/llm.py parses successfully")
except SyntaxError as e:
    print(f"SyntaxError: {e}")
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/crud/llm.py` around lines 46 - 55, The function signature for
create_llm_call has a defaulted parameter (chain_id) placed before required
keyword-only params; fix it by reordering parameters so
all required keyword-only arguments (project_id, organization_id,
resolved_config, original_provider) appear before any parameters with defaults,
or alternatively give defaults to those subsequent parameters—update the
signature in the create_llm_call definition accordingly so Python accepts it
(keep references to create_llm_call, chain_id, project_id, organization_id,
resolved_config, original_provider).
🧹 Nitpick comments (12)
backend/app/services/llm/chain/types.py (1)

15-15: Consider tightening metadata type.

Using dict[str, Any] | None instead of dict | None would make downstream usage and static checks safer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/chain/types.py` at line 15, Replace the broad
"metadata: dict | None = None" annotation with a tightened generic type
"dict[str, Any] | None" in types.py and add the required "Any" import from
typing; update the annotation on the metadata field (in the dataclass/type that
declares metadata) to use dict[str, Any] | None so static checkers and
downstream code get precise typing. Ensure you import Any (e.g., from typing
import Any) at the top of the file.
backend/app/tests/services/llm/test_chain_executor.py (2)

161-173: Consider using the result variable or removing assignment.

The result variable is assigned but not used. While the test validates callback behavior, adding an assertion on result would make the test more explicit, or you can use _ to indicate intentionally unused.

♻️ Proposed fix
-            result = executor.run()
+            _ = executor.run()  # Result validated via callback mock

             mock_callback.assert_called_once()

Or add an assertion:

             result = executor.run()

             mock_callback.assert_called_once()
+            assert result["success"] is False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/tests/services/llm/test_chain_executor.py` around lines 161 -
173, In test_run_failure_sends_callback, the local variable result is assigned
from executor.run() but never used; either add an assertion that inspects result
(e.g., assert expected failure structure or status) to make the test explicit,
or rename the variable to _ to indicate it is intentionally unused—update the
test_run_failure_sends_callback method where executor.run() is called to
implement one of these fixes.

192-215: Consider removing or using mock_job_crud.

The mock_job_crud is patched but never asserted. If it's only needed to prevent real DB calls, this is acceptable. However, you could either add an assertion or use the _ prefix to indicate it's intentionally unused.

♻️ Proposed fix
-        ) as mock_job_crud,
+        ) as _mock_job_crud,  # Prevents real DB calls
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/tests/services/llm/test_chain_executor.py` around lines 192 -
215, The patch for JobCrud in test_setup_updates_job_and_chain_status creates
mock_job_crud but never uses or asserts it; either add an assertion that it was
invoked (e.g., ensure JobCrud was constructed or specific CRUD method called
after executor.run) referencing mock_job_crud, or rename the patch target
variable to _mock_job_crud to indicate it is intentionally unused; update the
test around executor.run and the with(...) patch tuple to implement one of these
fixes.
backend/app/api/routes/llm_speech.py (1)

67-80: Move HTTPException import to top-level.

Importing HTTPException inside conditional blocks works but is unconventional. Since it's already used in FastAPI routes throughout the codebase, moving it to the top-level imports improves readability and consistency.

♻️ Proposed fix

Add to the top-level imports (line 5):

-from fastapi import APIRouter, Depends
+from fastapi import APIRouter, Depends, HTTPException

Then remove the inline imports at lines 67 and 75:

     if request.input_language and request.input_language != "auto":
         if request.input_language not in LANGUAGE_CODES:
-            from fastapi import HTTPException
-
             raise HTTPException(
                 status_code=400,
                 detail=f"Unsupported input language: {request.input_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
             )

     if request.output_language and request.output_language not in LANGUAGE_CODES:
-        from fastapi import HTTPException
-
         raise HTTPException(
             status_code=400,
             detail=f"Unsupported output language: {request.output_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech.py` around lines 67 - 80, The HTTPException
is being imported inside conditional blocks in llm_speech route; move the import
to the module top-level alongside other FastAPI imports and remove the inline
imports in the validation branches that raise HTTPException (the blocks checking
request.input_language and request.output_language against LANGUAGE_CODES).
Locate the inline "from fastapi import HTTPException" lines near the checks and
delete them, and add a single top-level "from fastapi import HTTPException"
import so the raises in those branches use the top-level symbol.
backend/app/tests/services/llm/test_jobs.py (1)

1358-1366: Remove unused variable or add assertion.

The result variable is assigned but never used, as flagged by static analysis. Either remove the assignment or add an assertion to validate the expected behavior.

♻️ Proposed fix - Option 1: Remove assignment
-            result = self._execute_chain_job(chain_request_data)
+            self._execute_chain_job(chain_request_data)
♻️ Proposed fix - Option 2: Add assertion
             result = self._execute_chain_job(chain_request_data)

             mock_update_status.assert_called_once()
             _, kwargs = mock_update_status.call_args
             assert kwargs["chain_id"] == chain_id
             assert kwargs["status"].value == "failed"
+            assert result["success"] is False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/tests/services/llm/test_jobs.py` around lines 1358 - 1366, The
test assigns result = self._execute_chain_job(chain_request_data) but never uses
it; either remove the unused assignment or add an assertion about the execution
outcome. Locate the call to self._execute_chain_job in the test (near the
mock_update_status assertions) and either delete the "result =" prefix so the
call stands alone, or add an assertion referencing result (e.g., assert result
is not None or assert result.status == ...) that expresses the expected behavior
while keeping the existing mock_update_status assertions intact.
backend/app/api/docs/llm/speech_to_speech.md (1)

7-15: Add language specifiers to fenced code blocks.

The code blocks on lines 7-9 and 13-15 are missing language specifiers. While these are simple text displays, adding a language identifier (e.g., text or plaintext) satisfies linting rules and improves accessibility.

📝 Proposed fix
-```
+```text
 POST /llm/sts

-```
+```text
 Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 7 - 15, Add
language specifiers to the two fenced code blocks that currently contain "POST
/llm/sts" and "Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS →
Voice Output": update the opening backticks of each code fence to include a
language identifier such as text or plaintext (e.g., change ``` to ```text)
so linting rules and accessibility are satisfied while leaving the block
contents unchanged.
backend/app/services/llm/chain/chain.py (2)

3-3: Import Callable from collections.abc instead of typing.

For Python 3.9+, importing Callable from collections.abc is preferred; typing.Callable is a deprecated alias.

♻️ Proposed fix
-from typing import Any, Callable
+from collections.abc import Callable
+from typing import Any

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/chain/chain.py` at line 3, The import currently
brings Callable from typing; update the import in this module so Callable is
imported from collections.abc instead (replace the typing import for Callable
with "from collections.abc import Callable") to follow Python 3.9+ best
practices; ensure any other uses of Callable in this file (the import line at
the top that includes "from typing import Any, Callable") still keep Any
imported from typing or move Any as needed.

121-135: Consider exposing block index via a property instead of accessing private _index.

The callback accesses block._index directly. While this works since it's internal code, exposing a read-only property would be cleaner.

♻️ Proposed change in ChainBlock class
 class ChainBlock:
     """A single block in the chain. Only responsible for executing itself."""
+
+    @property
+    def index(self) -> int:
+        """Block index in the chain."""
+        return self._index

Then in LLMChain.execute:
-            on_block_completed(block._index, result)
+            on_block_completed(block.index, result)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/chain/chain.py` around lines 121 - 135, Add a
read-only property on ChainBlock (e.g., def index(self) -> int: return
self._index) and update LLMChain.execute to use that property instead of
accessing the private attribute directly: replace block._index with block.index
in the on_block_completed call and the error log (and anywhere else in
LLMChain.execute that reads the private field) so callers use the public
read-only accessor while preserving current behavior.
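For context, a small sketch of the preferred import style for such callbacks (illustrative; `OnBlockCompleted` is a made-up alias, not from the PR):

```python
from collections.abc import Callable  # preferred over typing.Callable on 3.9+

# Type alias for a block-completion callback: (block_index, result) -> None
OnBlockCompleted = Callable[[int, dict], None]

def run_with_callback(cb: OnBlockCompleted) -> None:
    """Invoke the callback as a chain executor would after a block finishes."""
    cb(0, {"success": True})

events: list[tuple[int, dict]] = []
run_with_callback(lambda idx, result: events.append((idx, result)))
print(events)  # → [(0, {'success': True})]
```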
backend/test_sts_debug.py (3)

145-147: Remove unused variable assignment.

The func variable is assigned but never used. If this is intentional verification that the attribute exists, consider using hasattr() or simply accessing it without assignment.

♻️ Proposed fix
-        func = getattr(module, "execute_chain_job")
-        print(f"✅ Dynamic import successful (same as Celery)")
+        # Verify the attribute exists (same as Celery's dynamic import)
+        _ = module.execute_chain_job
+        print("✅ Dynamic import successful (same as Celery)")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` around lines 145 - 147, The assignment to func
using getattr(module, "execute_chain_job") is unused; either remove the
assignment and just verify the attribute (e.g., use hasattr(module,
"execute_chain_job") or assert getattr(module, "execute_chain_job") is not None)
or use func after assignment. Update the code around
importlib.import_module("app.services.llm.jobs") and the getattr call so you
don't leave an unused variable named func.

79-82: Remove extraneous f-string prefixes from strings without placeholders.

Static analysis correctly identifies multiple f-strings that contain no interpolation. These should be regular strings.

♻️ Example fixes
-            print(f"✅ Chain job created successfully!")
-            print(f"   Job ID: {job_id}")
-            print(f"   Check your Celery worker logs for task execution")
+            print("✅ Chain job created successfully!")
+            print(f"   Job ID: {job_id}")
+            print("   Check your Celery worker logs for task execution")

-            print(f"✅ Celery workers are running:")
+            print("✅ Celery workers are running:")

Also applies to: 105-105, 116-116, 139-139, 147-147, 170-191

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` around lines 79 - 82, Several print statements use
f-strings with no interpolation; change them to normal string literals (remove
the leading f from prints that contain no {} placeholders). The "Job ID:
{job_id}" print is fine to keep as an f-string, but the surrounding lines and
the other flagged lines (105, 116, 139, 147, and 170-191) must drop the f so
that only prints that actually interpolate variables use f-strings.

14-14: Add return type hints to functions.

Per coding guidelines, all functions should have type hints for parameters and return values. These debug functions are missing return type annotations.

♻️ Proposed fixes
-def test_chain_job_creation():
+def test_chain_job_creation() -> str | None:

-def check_celery_connection():
+def check_celery_connection() -> None:

-def check_function_import():
+def check_function_import() -> None:

Also applies to: 91-91, 130-130

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` at line 14, Add explicit return type annotations
to the top-level debug functions missing them (test_chain_job_creation,
check_celery_connection, check_function_import), annotating any parameters if
present, so all functions have proper type hints.
backend/app/models/llm/request.py (1)

15-34: Consider using enum.StrEnum for string enums (Python 3.11+).

Per coding guidelines, this codebase uses Python 3.11+. StrEnum is cleaner than inheriting from both str and Enum.

♻️ Proposed fix
+from enum import StrEnum
-from enum import Enum
...

-class STTModel(str, Enum):
+class STTModel(StrEnum):
     """Supported STT models for speech-to-speech."""
     GEMINI_PRO = "gemini-2.5-pro"
     SARVAM = "saaras:v3"


-class TTSModel(str, Enum):
+class TTSModel(StrEnum):
     """Supported TTS models for speech-to-speech."""
     GEMINI_PRO = "gemini-2.5-pro-preview-tts"
     SARVAM = "bulbul:v3"


-class LLMModel(str, Enum):
+class LLMModel(StrEnum):
     """Supported LLM models for RAG in speech-to-speech."""
     GPT4O = "gpt-4o"
     GPT4O_MINI = "gpt-4o-mini"


-class ChainStatus(str, Enum):
+class ChainStatus(StrEnum):
     """Status of an LLM chain execution."""
     PENDING = "pending"
     ...

Also applies to: 634-640

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/models/llm/request.py` around lines 15 - 34, The enums STTModel,
TTSModel, and LLMModel currently inherit from str and Enum; replace them to
inherit from enum.StrEnum (import StrEnum from enum) so they become true string
enums per Python 3.11+; update the class bases for STTModel, TTSModel, and
LLMModel and ensure the import is added (and replicate the same change for the
other occurrence noted at lines 634-640).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @backend/app/api/routes/llm_chain.py:

  • Around line 39-41: The llm_chain endpoint function lacks a return type hint;
    update the function signature for llm_chain(_current_user: AuthContextDep,
    _session: SessionDep, request: LLMChainRequest) to include the correct return
    type (e.g., -> LLMChainResponse or the actual response type used by this route),
    and add any missing import for that response type at the top of the file; ensure
    the declared return type matches what the function actually returns.

In @backend/app/api/routes/llm_speech_examples.md:

  • Around line 316-323: The JSON example in the llm_speech_examples.md fenced
    block contains inline // comments which make it invalid JSON; update the example
    by either removing the comments from the JSON object (remove // Auto-detect and
    // Odia, Bengali, Punjabi, etc. next to "input_language" and "output_language")
    or change the fence from json to jsonc so the comments are allowed—edit
    the fenced example containing "query", "knowledge_base_ids", "input_language",
    and "output_language" accordingly.
  • Around line 5-7: Several fenced code blocks in llm_speech_examples.md (for
    example the block showing "POST /llm/sts") lack language specifiers; update each
such fence to include the appropriate language tag (e.g., change ``` to ```http
    for HTTP request examples like the "POST /llm/sts" block and to ```text for
    plain text snippets) so the markdown linter (MD040) is satisfied and readability
    is improved. Locate the fences that contain HTTP request examples (e.g., the
    block with "POST /llm/sts" and the other similar request examples) and the
    plain-text response or sample blocks, and add the correct language identifiers
    to those triple-backtick fences.

In @backend/app/api/routes/llm_speech.py:

  • Around line 36-40: The function speech_to_speech is missing a return type
    annotation; update its signature to include an explicit return type that matches
    the actual return value used in the implementation (for example, if it returns a
    FastAPI Response or StreamingResponse annotate with that type, or annotate with
    dict/Any if it returns JSON-like data). Modify the def speech_to_speech(...)
    signature to add the correct return type hint while keeping existing parameter
    types AuthContextDep, SessionDep, and SpeechToSpeechRequest unchanged.

In @backend/app/models/llm/response.py:

  • Around line 83-88: Update the docstring for the IntermediateChainResponse
    class by correcting the typo "Flattend" to "Flattened" in the descriptive text
    so the docstring reads "Flattened structure matching LLMCallResponse keys for
    consistency"; locate the docstring in the IntermediateChainResponse definition
    and make the single-word correction.

In @backend/app/services/llm/chain/utils.py:

  • Around line 94-96: The assignment to params["language_code"] uses a redundant
    conditional; replace the conditional expression with a direct assignment so
    params["language_code"] is set to language_code (i.e., remove the ternary that
    checks for "unknown"), locating the change at the line where
    params["language_code"] and language_code are used in this module.
  • Around line 149-184: The build_tts_block function currently hardcodes "en-IN"
    for Sarvam provider instead of using the passed language_code; update the Sarvam
    branch in build_tts_block so params["target_language_code"] = language_code (use
    the function arg), keeping the existing params["speaker"] and
    params["output_audio_codec"] assignments intact; verify model_configs and the
    provider detection logic for "sarvamai-native" remain unchanged so the correct
    branch still runs for TTSModel.SARVAM.

In @backend/app/tests/services/llm/test_chain.py:

  • Around line 33-219: The tests use untyped, manually-constructed fixtures
    (context, text_response, audio_response, make_config) and unannotated test
    functions; replace these with the project's factory-pattern fixtures (use the
    existing chain/context/llm factories instead of manual construction inside
    context, text_response, audio_response, and make_config) and add explicit type
    hints to each fixture and test function signature and return value (e.g.,
    annotate context: ChainContext, text_response/audio_response: LLMCallResponse,
    make_config: LLMCallConfig, and test functions' parameters/return types),
    importing any needed types from the production modules; update tests referencing
    ChainBlock, LLMChain, result_to_query, BlockResult, QueryParams to accept the
    typed fixtures so the tests remain functionally identical but follow the
    factory + typing guidelines.

Outside diff comments:
In @backend/app/crud/llm.py:

  • Around line 46-55: The function signature for create_llm_call has a defaulted
    parameter (chain_id) placed before required keyword-only params; fix it by
    reordering parameters so all required keyword-only
    arguments (project_id, organization_id, resolved_config, original_provider)
    appear before any parameters with defaults, or alternatively give defaults to
    those subsequent parameters—update the signature in the create_llm_call
    definition accordingly so Python accepts it (keep references to create_llm_call,
    chain_id, project_id, organization_id, resolved_config, original_provider).

Nitpick comments:
In @backend/app/api/docs/llm/speech_to_speech.md:

  • Around line 7-15: Add language specifiers to the two fenced code blocks that
    currently contain "POST /llm/sts" and "Voice Input → STT (auto language) → RAG
    (Knowledge Base) → TTS → Voice Output": update the opening backticks of each
    code fence to include a language identifier such as text or plaintext (e.g.,
    change ``` to ```text) so linting rules and accessibility are satisfied while
    leaving the block contents unchanged.

In @backend/app/api/routes/llm_speech.py:

  • Around line 67-80: The HTTPException is being imported inside conditional
    blocks in llm_speech route; move the import to the module top-level alongside
    other FastAPI imports and remove the inline imports in the validation branches
    that raise HTTPException (the blocks checking request.input_language and
    request.output_language against LANGUAGE_CODES). Locate the inline "from fastapi
    import HTTPException" lines near the checks and delete them, and add a single
    top-level "from fastapi import HTTPException" import so the raises in those
    branches use the top-level symbol.

In @backend/app/models/llm/request.py:

  • Around line 15-34: The enums STTModel, TTSModel, and LLMModel currently
    inherit from str and Enum; replace them to inherit from enum.StrEnum (import
    StrEnum from enum) so they become true string enums per Python 3.11+; update the
    class bases for STTModel, TTSModel, and LLMModel and ensure the import is added
    (and replicate the same change for the other occurrence noted at lines 634-640).

In @backend/app/services/llm/chain/chain.py:

  • Line 3: The import currently brings Callable from typing; update the import in
    this module so Callable is imported from collections.abc instead (replace the
    typing import for Callable with "from collections.abc import Callable") to
    follow Python 3.9+ best practices; ensure any other uses of Callable in this
    file (the import line at the top that includes "from typing import Any,
    Callable") still keep Any imported from typing or move Any as needed.
  • Around line 121-135: Add a read-only property on ChainBlock (e.g., def
    index(self) -> int: return self._index) and update LLMChain.execute to use that
    property instead of accessing the private attribute directly: replace
    block._index with block.index in the on_block_completed call and the error log
    (and anywhere else in LLMChain.execute that reads the private field) so callers
    use the public read-only accessor while preserving current behavior.
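A minimal sketch of the read-only accessor (the constructor here is simplified; the real ChainBlock takes config and context arguments as well):

```python
class ChainBlock:
    def __init__(self, index: int) -> None:
        self._index = index  # private storage

    @property
    def index(self) -> int:
        """Read-only position of this block in the chain."""
        return self._index


block = ChainBlock(index=2)
assert block.index == 2  # public accessor; no private attribute access needed
```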

In @backend/app/services/llm/chain/types.py:

  • Line 15: Replace the broad "metadata: dict | None = None" annotation with a
    tightened generic type "dict[str, Any] | None" in types.py and add the required
    "Any" import from typing; update the annotation on the metadata field (in the
    dataclass/type that declares metadata) to use dict[str, Any] | None so static
    checkers and downstream code get precise typing. Ensure you import Any (e.g.,
    from typing import Any) at the top of the file.

In @backend/app/tests/services/llm/test_chain_executor.py:

  • Around line 161-173: In test_run_failure_sends_callback, the local variable
    result is assigned from executor.run() but never used; either add an assertion
    that inspects result (e.g., assert expected failure structure or status) to make
    the test explicit, or rename the variable to _ to indicate it is intentionally
    unused—update the test_run_failure_sends_callback method where executor.run() is
    called to implement one of these fixes.
  • Around line 192-215: The patch for JobCrud in
    test_setup_updates_job_and_chain_status creates mock_job_crud but never uses or
    asserts it; either add an assertion that it was invoked (e.g., ensure JobCrud
    was constructed or specific CRUD method called after executor.run) referencing
    mock_job_crud, or rename the patch target variable to _mock_job_crud to indicate
    it is intentionally unused; update the test around executor.run and the
    with(...) patch tuple to implement one of these fixes.

In @backend/app/tests/services/llm/test_jobs.py:

  • Around line 1358-1366: The test assigns result =
    self._execute_chain_job(chain_request_data) but never uses it; either remove the
    unused assignment or add an assertion about the execution outcome. Locate the
    call to self._execute_chain_job in the test (near the mock_update_status
    assertions) and either delete the "result =" prefix so the call stands alone, or
    add an assertion referencing result (e.g., assert result is not None or assert
    result.status == ...) that expresses the expected behavior while keeping the
    existing mock_update_status assertions intact.

In @backend/test_sts_debug.py:

  • Around line 145-147: The assignment to func using getattr(module,
    "execute_chain_job") is unused; either remove the assignment and just
    access/verify the attribute (e.g., use hasattr(module, "execute_chain_job") or
    assert getattr(module, "execute_chain_job") is not None) or use func after
    assignment. Update the code around
    importlib.import_module("app.services.llm.jobs") and the getattr call so you
    don't leave an unused variable named func—choose removal of the assignment or
    replace it with a presence check (hasattr or assert) to express the intent.
  • Around line 79-82: Several print statements use f-strings with no
    interpolation; change them to normal string literals to satisfy static analysis
    (remove the leading f from prints that contain no {} placeholders). Locate the
    print calls around the function that prints and returns job_id (the three lines
    printing "✅ Chain job created successfully!", " Job ID: {job_id}" is fine to
    keep as an f-string but others without placeholders must drop the f) and also
    update the other print lines flagged (around lines indicated: 105, 116, 139,
    147, and the block 170-191) by removing the unnecessary f prefix so only prints
    that actually interpolate job_id or other variables use f-strings.
  • Line 14: Add explicit return type annotations to the test functions (e.g.,
    test_chain_job_creation and the other top-level test_* functions missing
    annotations) by adding -> None to their definitions and annotate any parameters
    if present; update each def statement (such as def test_chain_job_creation():)
    to include the return type (def test_chain_job_creation() -> None:) so all test
    functions have proper type hints.
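The dynamic-import presence check suggested above can be sketched like this (using the stdlib `json` module as a stand-in for `app.services.llm.jobs`, which is not importable outside the app):

```python
import importlib

# Import the module the same way Celery resolves task modules
module = importlib.import_module("json")  # stand-in module name

# Verify the expected attribute exists without binding an unused variable
assert hasattr(module, "dumps")
```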

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `10106e27-d2cf-4baa-b7a8-1b10af5b048b`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between e28079c70ea98dac4ed4886541565eac1bf3f6b5 and a530796e7639ecc7f6fe12a92b25a711925ca21c.

</details>

<details>
<summary>📒 Files selected for processing (25)</summary>

* `backend/app/alembic/versions/048_create_llm_chain_table.py`
* `backend/app/api/docs/llm/llm_chain.md`
* `backend/app/api/docs/llm/speech_to_speech.md`
* `backend/app/api/main.py`
* `backend/app/api/routes/llm_chain.py`
* `backend/app/api/routes/llm_speech.py`
* `backend/app/api/routes/llm_speech_examples.md`
* `backend/app/crud/llm.py`
* `backend/app/crud/llm_chain.py`
* `backend/app/models/__init__.py`
* `backend/app/models/job.py`
* `backend/app/models/llm/__init__.py`
* `backend/app/models/llm/request.py`
* `backend/app/models/llm/response.py`
* `backend/app/services/llm/chain/__init__.py`
* `backend/app/services/llm/chain/chain.py`
* `backend/app/services/llm/chain/executor.py`
* `backend/app/services/llm/chain/types.py`
* `backend/app/services/llm/chain/utils.py`
* `backend/app/services/llm/jobs.py`
* `backend/app/tests/crud/test_llm_chain.py`
* `backend/app/tests/services/llm/test_chain.py`
* `backend/app/tests/services/llm/test_chain_executor.py`
* `backend/app/tests/services/llm/test_jobs.py`
* `backend/test_sts_debug.py`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines +39 to +41
def llm_chain(
_current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest
):

🛠️ Refactor suggestion | 🟠 Major

Add return type hint to function signature.

The function is missing a return type hint. As per coding guidelines, all function parameters and return values should have type hints.

📝 Proposed fix

```diff
 def llm_chain(
     _current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest
-):
+) -> APIResponse[Message]:
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
-def llm_chain(
-    _current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest
-):
+def llm_chain(
+    _current_user: AuthContextDep, _session: SessionDep, request: LLMChainRequest
+) -> APIResponse[Message]:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_chain.py` around lines 39 - 41, The llm_chain
endpoint function lacks a return type hint; update the function signature for
llm_chain(_current_user: AuthContextDep, _session: SessionDep, request:
LLMChainRequest) to include the correct return type (e.g., -> LLMChainResponse
or the actual response type used by this route), and add any missing import for
that response type at the top of the file; ensure the declared return type
matches what the function actually returns.

Comment on lines +5 to +7
```
POST /llm/sts
```

⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced code blocks.

These fences are missing language specifiers (MD040). Please label them (e.g., http, text) to satisfy lint and improve readability.

Also applies to: 124-126, 167-169, 211-214

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 5-5: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech_examples.md` around lines 5 - 7, Several
fenced code blocks in llm_speech_examples.md (for example the block showing
"POST /llm/sts") lack language specifiers; update each such fence to include the
appropriate language tag (e.g., change ``` to ```http for HTTP request examples
like the "POST /llm/sts" block and to ```text for plain text snippets) so the
markdown linter (MD040) is satisfied and readability is improved. Locate the
fences that contain HTTP request examples (e.g., the block with "POST /llm/sts"
and the other similar request examples) and the plain-text response or sample
blocks, and add the correct language identifiers to those triple-backtick
fences.

Comment on lines +316 to +323
```json
{
"query": {...},
"knowledge_base_ids": ["kb_123"],
"input_language": "auto", // Auto-detect
"output_language": "odia", // Odia, Bengali, Punjabi, etc.
"callback_url": "..."
}
```

⚠️ Potential issue | 🟡 Minor

json example is not valid JSON.

Lines 319-320 include // comments inside a json block. Either remove inline comments or change the fence to jsonc.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech_examples.md` around lines 316 - 323, The
JSON example in the llm_speech_examples.md fenced block contains inline //
comments which make it invalid JSON; update the example by either removing the
comments from the JSON object (remove // Auto-detect and // Odia, Bengali,
Punjabi, etc. next to "input_language" and "output_language") or change the
fence from ```json to ```jsonc so the comments are allowed—edit the fenced
example containing "query", "knowledge_base_ids", "input_language", and
"output_language" accordingly.

Comment on lines +36 to +40
def speech_to_speech(
_current_user: AuthContextDep,
_session: SessionDep,
request: SpeechToSpeechRequest,
):

🛠️ Refactor suggestion | 🟠 Major

Add return type hint to function signature.

The function is missing a return type hint. As per coding guidelines, all function parameters and return values should have type hints.

📝 Proposed fix

```diff
 def speech_to_speech(
     _current_user: AuthContextDep,
     _session: SessionDep,
     request: SpeechToSpeechRequest,
-):
+) -> APIResponse[Message]:
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
-def speech_to_speech(
-    _current_user: AuthContextDep,
-    _session: SessionDep,
-    request: SpeechToSpeechRequest,
-):
+def speech_to_speech(
+    _current_user: AuthContextDep,
+    _session: SessionDep,
+    request: SpeechToSpeechRequest,
+) -> APIResponse[Message]:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech.py` around lines 36 - 40, The function
speech_to_speech is missing a return type annotation; update its signature to
include an explicit return type that matches the actual return value used in the
implementation (for example, if it returns a FastAPI Response or
StreamingResponse annotate with that type, or annotate with dict/Any if it
returns JSON-like data). Modify the def speech_to_speech(...) signature to add
the correct return type hint while keeping existing parameter types
AuthContextDep, SessionDep, and SpeechToSpeechRequest unchanged.

Comment on lines +94 to +96
params["language_code"] = (
language_code if language_code != "unknown" else "unknown"
)

⚠️ Potential issue | 🟡 Minor

Redundant conditional expression.

The expression language_code if language_code != "unknown" else "unknown" always evaluates to language_code regardless of the condition.

Proposed fix

```diff
     if provider == "sarvamai-native":
         # Use "unknown" for automatic language detection, or specific BCP-47 code
-        params["language_code"] = (
-            language_code if language_code != "unknown" else "unknown"
-        )
+        params["language_code"] = language_code
         params["mode"] = "transcribe"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm/chain/utils.py` around lines 94 - 96, The assignment
to params["language_code"] uses a redundant conditional; replace the conditional
expression with a direct assignment so params["language_code"] is set to
language_code (i.e., remove the ternary that checks for "unknown"), locating the
change at the line where params["language_code"] and language_code are used in
this module.

Comment on lines +33 to +219
```python
@pytest.fixture
def context():
    return ChainContext(
        job_id=uuid4(),
        chain_id=uuid4(),
        project_id=1,
        organization_id=1,
        callback_url="https://example.com/callback",
        total_blocks=3,
        intermediate_callback_flags=[True, True, False],
    )


@pytest.fixture
def text_response():
    return LLMCallResponse(
        response=LLMResponse(
            provider_response_id="resp-1",
            conversation_id=None,
            model="gpt-4",
            provider="openai",
            output=TextOutput(content=ResponseTextContent(value="Hello world")),
        ),
        usage=Usage(input_tokens=10, output_tokens=20, total_tokens=30),
        provider_raw_response=None,
    )


@pytest.fixture
def audio_response():
    return LLMCallResponse(
        response=LLMResponse(
            provider_response_id="resp-2",
            conversation_id=None,
            model="gemini",
            provider="google",
            output=AudioOutput(
                content=AudioContent(
                    format="base64",
                    value="audio-data-base64",
                    mime_type="audio/wav",
                )
            ),
        ),
        usage=Usage(input_tokens=5, output_tokens=15, total_tokens=20),
        provider_raw_response=None,
    )


def make_config():
    return LLMCallConfig(
        blob=ConfigBlob(
            completion=NativeCompletionConfig(
                provider="openai-native",
                type="text",
                params={"model": "gpt-4"},
            )
        )
    )


class TestResultToQuery:
    def test_text_output_to_query(self, text_response):
        result = BlockResult(response=text_response, usage=text_response.usage)

        query = result_to_query(result)

        assert isinstance(query.input, TextInput)
        assert query.input.content.value == "Hello world"

    def test_audio_output_to_query(self, audio_response):
        result = BlockResult(response=audio_response, usage=audio_response.usage)

        query = result_to_query(result)

        assert isinstance(query.input, AudioInput)
        assert query.input.content.value == "audio-data-base64"

    def test_unsupported_output_type_raises(self):
        mock_response = MagicMock()
        mock_response.response.output.type = "unknown"
        mock_response.response.output.__class__ = type("Unknown", (), {})
        result = BlockResult(response=mock_response, usage=MagicMock())

        with pytest.raises(ValueError, match="Cannot chain output type"):
            result_to_query(result)


class TestChainBlock:
    def test_execute_single_block(self, context, text_response):
        query = QueryParams(input="test input")
        config = make_config()
        block = ChainBlock(config=config, index=0, context=context)

        with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute:
            mock_execute.return_value = BlockResult(
                response=text_response, usage=text_response.usage
            )

            result = block.execute(query)

        assert result.success
        mock_execute.assert_called_once()

    def test_execute_returns_failure(self, context):
        query = QueryParams(input="test input")
        config = make_config()
        block = ChainBlock(config=config, index=0, context=context)

        with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute:
            mock_execute.return_value = BlockResult(error="Provider error")

            result = block.execute(query)

        assert not result.success
        assert result.error == "Provider error"
        mock_execute.assert_called_once()


class TestLLMChain:
    def test_execute_empty_chain(self, context):
        chain = LLMChain([], context)
        query = QueryParams(input="test")

        result = chain.execute(query)

        assert not result.success
        assert result.error == "Chain has no blocks"

    def test_execute_single_block_chain(self, context, text_response):
        config = make_config()
        block = ChainBlock(config=config, index=0, context=context)
        chain = LLMChain([block], context)

        with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute:
            mock_execute.return_value = BlockResult(
                response=text_response, usage=text_response.usage
            )

            result = chain.execute(QueryParams(input="hello"))

        assert result.success
        mock_execute.assert_called_once()

    def test_execute_multi_block_chain(self, context, text_response):
        config = make_config()
        blocks = [ChainBlock(config=config, index=i, context=context) for i in range(3)]
        chain = LLMChain(blocks, context)

        with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute:
            mock_execute.return_value = BlockResult(
                response=text_response, usage=text_response.usage
            )

            result = chain.execute(QueryParams(input="hello"))

        assert result.success
        assert mock_execute.call_count == 3

    def test_execute_stops_on_failure(self, context, text_response):
        config = make_config()
        blocks = [ChainBlock(config=config, index=i, context=context) for i in range(3)]
        chain = LLMChain(blocks, context)

        with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute:
            mock_execute.return_value = BlockResult(error="Provider error")

            result = chain.execute(QueryParams(input="hello"))

        assert not result.success
        assert result.error == "Provider error"
        mock_execute.assert_called_once()

    def test_execute_calls_on_block_completed(self, context, text_response):
        config = make_config()
        blocks = [ChainBlock(config=config, index=i, context=context) for i in range(2)]
        chain = LLMChain(blocks, context)
        callback = MagicMock()

        with patch("app.services.llm.chain.chain.execute_llm_call") as mock_execute:
            mock_execute.return_value = BlockResult(
                response=text_response, usage=text_response.usage
            )

            chain.execute(QueryParams(input="hello"), on_block_completed=callback)

        assert callback.call_count == 2
```

🛠️ Refactor suggestion | 🟠 Major

Adopt typed factory fixtures in this test module.

Across Lines 33-219, fixture/test helpers are untyped and setup objects are manually assembled. Please switch to the project’s factory-pattern fixtures and add explicit type hints for fixture/test function signatures and returns.

As per coding guidelines backend/app/tests/**/*.py: “Use factory pattern for test fixtures in backend/app/tests/” and **/*.py: “Always add type hints to all function parameters and return values in Python code”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/tests/services/llm/test_chain.py` around lines 33 - 219, The
tests use untyped, manually-constructed fixtures (context, text_response,
audio_response, make_config) and unannotated test functions; replace these with
the project's factory-pattern fixtures (use the existing chain/context/llm
factories instead of manual construction inside context, text_response,
audio_response, and make_config) and add explicit type hints to each fixture and
test function signature and return value (e.g., annotate context: ChainContext,
text_response/audio_response: LLMCallResponse, make_config: LLMCallConfig, and
test functions' parameters/return types), importing any needed types from the
production modules; update tests referencing ChainBlock, LLMChain,
result_to_query, BlockResult, QueryParams to accept the typed fixtures so the
tests remain functionally identical but follow the factory + typing guidelines.

@codecov

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 46.66667% with 48 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
backend/app/services/llm/chain/utils.py 23.33% 23 Missing ⚠️
backend/app/api/routes/llm_speech.py 39.39% 20 Missing ⚠️
backend/app/models/llm/request.py 80.76% 5 Missing ⚠️

📢 Thoughts on this report? Let us know!

@Prajna1999 Prajna1999 changed the title from "S2S: Speech-to-Speech in a single API Call" to "S2S: Speech-to-Speech with File Search in a Single API Call" on Mar 6, 2026
@Prajna1999 Prajna1999 force-pushed the feature/speech-to-speech branch 2 times, most recently from 9e4f58e to c1807df on March 6, 2026 07:16

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (7)
backend/test_sts_debug.py (5)

91-91: Add return type hint to function signature.

📝 Proposed fix

```diff
-def check_celery_connection():
+def check_celery_connection() -> None:
     """Check if Celery is running and can receive tasks."""
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` at line 91, The function check_celery_connection
currently lacks a return type hint; update its signature to include the
appropriate return type (e.g., -> bool or -> None) consistent with its
implementation and callers so type checkers know what it returns; locate the
function named check_celery_connection and add the correct return annotation to
the def line and adjust any typing imports if needed.

130-130: Add return type hint to function signature.

📝 Proposed fix

```diff
-def check_function_import():
+def check_function_import() -> None:
     """Verify execute_chain_job can be imported."""
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` at line 130, The function check_function_import
currently lacks a return type hint; update its signature to include an explicit
return type (e.g., -> bool or -> None as appropriate) so callers and type
checkers know what to expect; locate the def check_function_import() declaration
and add the correct return annotation consistent with its implementation and
tests.

79-82: Remove extraneous f prefixes from strings without placeholders.

Multiple print statements use f-strings without any placeholders. This is flagged by static analysis (F541). Remove the f prefix or use regular strings.

Affected lines: 79, 81, 105, 116, 139, 147, 170-182, 190-191.

📝 Example fix

```diff
-            print(f"✅ Chain job created successfully!")
+            print("✅ Chain job created successfully!")
             print(f"   Job ID: {job_id}")
-            print(f"   Check your Celery worker logs for task execution")
+            print("   Check your Celery worker logs for task execution")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` around lines 79 - 82, Several print statements use
f-strings with no placeholders (e.g., print(f"✅ Chain job created
successfully!"), print(f"   Check your Celery worker logs for task execution"),
and other prints around job_id, status, and debug outputs). Remove the
unnecessary 'f' prefix on these string literals (or add actual placeholders if
intended) so they become regular strings (e.g., print("✅ Chain job created
successfully!")) and leave prints that do interpolate variables (like job_id) as
f-strings; update all occurrences flagged (the prints around job_id, status
messages, and the blocks mentioned) to eliminate F541 warnings while keeping
semantics unchanged.

143-148: Remove unused variable and simplify dynamic import check.

Line 146 assigns to func but never uses it. Also, getattr with a constant attribute can be replaced with direct attribute access. However, since this tests dynamic import behavior (same as Celery), consider using hasattr instead if you only need to verify importability.

📝 Proposed fix

```diff
         # Try dynamic import (same way Celery does it)
         import importlib

         module = importlib.import_module("app.services.llm.jobs")
-        func = getattr(module, "execute_chain_job")
+        _ = module.execute_chain_job  # Verify attribute exists
         print(f"✅ Dynamic import successful (same as Celery)")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` around lines 143 - 148, Remove the unused variable
and simplify the dynamic import check by replacing the unused getattr pattern:
import the module via importlib.import_module("app.services.llm.jobs") and
verify the presence of the constant attribute "execute_chain_job" with
hasattr(module, "execute_chain_job") (instead of assigning func = getattr(...)),
then keep the existing print confirmation; ensure importlib, module, and the
"execute_chain_job" symbol are referenced as described.

14-14: Add return type hint to function signature.

Per coding guidelines, all functions should have type hints for parameters and return values.

📝 Proposed fix

```diff
-def test_chain_job_creation():
+def test_chain_job_creation() -> str | None:
     """Test if chain job can be created and queued."""
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_sts_debug.py` at line 14, The test function
test_chain_job_creation lacks a return type annotation; update its signature to
include an explicit return type (e.g., change def test_chain_job_creation(): to
def test_chain_job_creation() -> None:) to comply with the project's
type-hinting guidelines; ensure any parameters (if added later) also include
type hints and run the test suite/type checker after the change.
backend/app/models/llm/request.py (1)

14-34: Consider using StrEnum instead of (str, Enum) for Python 3.11+.

Per coding guidelines, the codebase uses Python 3.11+. The modern StrEnum from the enum module is the idiomatic replacement for (str, Enum) inheritance, providing cleaner code and better type inference.

📝 Proposed fix

```diff
-from enum import Enum
+from enum import Enum, StrEnum

 # Speech-to-Speech Model Enums
-class STTModel(str, Enum):
+class STTModel(StrEnum):
     """Supported STT models for speech-to-speech."""

     GEMINI_PRO = "gemini-2.5-pro"
     SARVAM = "saaras:v3"


-class TTSModel(str, Enum):
+class TTSModel(StrEnum):
     """Supported TTS models for speech-to-speech."""

     GEMINI_PRO = "gemini-2.5-pro-preview-tts"
     SARVAM = "bulbul:v3"


-class LLMModel(str, Enum):
+class LLMModel(StrEnum):
     """Supported LLM models for RAG in speech-to-speech."""

     GPT4O = "gpt-4o"
     GPT4O_MINI = "gpt-4o-mini"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/models/llm/request.py` around lines 14 - 34, Replace the three
Enum classes to use the modern StrEnum: import StrEnum from enum (instead of
using (str, Enum)) and change class bases for STTModel, TTSModel, and LLMModel
to inherit from StrEnum; keep the same member names/values unchanged so
downstream code and type inference benefit from StrEnum behavior.
backend/app/api/routes/llm_speech_examples.md (1)

1-7: Documentation file location violates coding guidelines.

This file is located at backend/app/api/routes/llm_speech_examples.md, but per coding guidelines, endpoint documentation should be stored in backend/app/api/docs/<domain>/<action>.md. Consider moving this file to backend/app/api/docs/llm/speech_to_speech_examples.md to maintain consistency with the codebase structure.

As per coding guidelines: "Store endpoint documentation files in backend/app/api/docs/<domain>/<action>.md"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech_examples.md` around lines 1 - 7, The
documentation file backend/app/api/routes/llm_speech_examples.md is in the wrong
location per guidelines; move it into the docs tree and rename to follow
domain/action semantics (e.g.,
backend/app/api/docs/llm/speech_to_speech_examples.md), update any references or
imports that point to the old path (search for llm_speech_examples.md or
routes/llm_speech_examples.md), and ensure the new file path and name match
other docs naming conventions so tooling and CI pick it up.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/api/docs/llm/speech_to_speech.md`:
- Around line 183-212: The TTS callback example under "Callback 3: TTS Output
(Final)" contains the wrong model version; update the JSON field "model":
"bulbul:v1" to "model": "bulbul:v3" so it matches the documented default TTS
model used elsewhere (e.g., the default referenced at "bulbul:v3"); ensure the
change is made only in the example JSON "response" object where "provider":
"sarvamai-native" and "provider_response_id": "tts_def456".
- Around line 122-151: The STT callback example under "Callback 1: STT Output
(Intermediate)" uses an incorrect model name; update the JSON "model" field in
that example (the key "model" inside the "response" object) from "saarika:v1" to
the documented default "saaras:v3" so it matches the default STT model elsewhere
in the docs.
- Around line 7-15: The Markdown fenced code blocks in speech_to_speech.md (the
POST /llm/sts endpoint block and the Flow diagram block) lack language
identifiers; update the first fenced block to use the "http" language identifier
and the Flow fenced block to use the "text" identifier so they pass MD040
linting (refer to the fenced blocks containing "POST /llm/sts" and "Voice Input
→ STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output").

In `@backend/app/api/routes/llm_speech_examples.md`:
- Around line 86-121: Update the STT callback JSON example so its "provider" and
"model" match the documented defaults: replace "provider": "google-native" and
"model": "gemini-2.5-pro" in the response block with "provider":
"sarvamai-native" and "model": "saaras:v3" (or alternatively add a short note in
the example clarifying it intentionally uses non-default selection); modify only
the values for the "provider" and "model" fields inside the "response" object to
keep the example consistent with the "Defaults Used" section.
- Around line 336-342: The README example uses "tts_model": "bulbul-v3" which
conflicts with the TTSModel enum in request.py that defines "bulbul:v3"; update
the example in llm_speech_examples.md (the JSON block showing
stt_model/llm_model/tts_model) to use the colon format "bulbul:v3" so it matches
the TTSModel enum, or alternatively change the TTSModel entry in request.py to
"bulbul-v3" if you prefer hyphen formatting—ensure both the example and the
TTSModel constant (named TTSModel) use the exact same string.

In `@backend/app/api/routes/llm_speech.py`:
- Around line 104-115: The code mutates request.request_metadata in-place by
assigning metadata = request.request_metadata and then calling
metadata.update(...); instead create a new dict (e.g., metadata =
dict(request.request_metadata or {} ) or use request.request_metadata.copy()) so
you can safely update it with the STS keys ("speech_to_speech",
"input_language", "output_language", "stt_model", "llm_model", "tts_model")
without altering the original request.request_metadata; update the metadata
variable and leave request.request_metadata immutable.
- Around line 65-80: The two inline imports of HTTPException inside the
validation block should be moved to the module level with the other FastAPI
imports: add "from fastapi import HTTPException" at the top of the file and
remove the inline imports inside the conditional checks that validate
request.input_language and request.output_language against LANGUAGE_CODES;
ensure the behavior of the blocks using request.input_language,
request.output_language, and LANGUAGE_CODES remains unchanged.

---

Nitpick comments:
In `@backend/app/api/routes/llm_speech_examples.md`:
- Around line 1-7: The documentation file
backend/app/api/routes/llm_speech_examples.md is in the wrong location per
guidelines; move it into the docs tree and rename to follow domain/action
semantics (e.g., backend/app/api/docs/llm/speech_to_speech_examples.md), update
any references or imports that point to the old path (search for
llm_speech_examples.md or routes/llm_speech_examples.md), and ensure the new
file path and name match other docs naming conventions so tooling and CI pick it
up.

In `@backend/app/models/llm/request.py`:
- Around line 14-34: Replace the three Enum classes to use the modern StrEnum:
import StrEnum from enum (instead of using (str, Enum)) and change class bases
for STTModel, TTSModel, and LLMModel to inherit from StrEnum; keep the same
member names/values unchanged so downstream code and type inference benefit from
StrEnum behavior.

In `@backend/test_sts_debug.py`:
- Line 91: The function check_celery_connection currently lacks a return type
hint; update its signature to include the appropriate return type (e.g., -> bool
or -> None) consistent with its implementation and callers so type checkers know
what it returns; locate the function named check_celery_connection and add the
correct return annotation to the def line and adjust any typing imports if
needed.
- Line 130: The function check_function_import currently lacks a return type
hint; update its signature to include an explicit return type (e.g., -> bool or
-> None as appropriate) so callers and type checkers know what to expect; locate
the def check_function_import() declaration and add the correct return
annotation consistent with its implementation and tests.
- Around line 79-82: Several print statements use f-strings with no placeholders
(e.g., print(f"✅ Chain job created successfully!"), print(f"   Check your Celery
worker logs for task execution"), and other prints around job_id, status, and
debug outputs). Remove the unnecessary 'f' prefix on these string literals (or
add actual placeholders if intended) so they become regular strings (e.g.,
print("✅ Chain job created successfully!")) and leave prints that do interpolate
variables (like job_id) as f-strings; update all occurrences flagged (the prints
around job_id, status messages, and the blocks mentioned) to eliminate F541
warnings while keeping semantics unchanged.
- Around line 143-148: Remove the unused variable and simplify the dynamic
import check by replacing the unused getattr pattern: import the module via
importlib.import_module("app.services.llm.jobs") and verify the presence of the
constant attribute "execute_chain_job" with hasattr(module, "execute_chain_job")
(instead of assigning func = getattr(...)), then keep the existing print
confirmation; ensure importlib, module, and the "execute_chain_job" symbol are
referenced as described.
- Line 14: The test function test_chain_job_creation lacks a return type
annotation; update its signature to include an explicit return type (e.g.,
change def test_chain_job_creation(): to def test_chain_job_creation() -> None:)
to comply with the project's type-hinting guidelines; ensure any parameters (if
added later) also include type hints and run the test suite/type checker after
the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2ff6a166-6d3b-4157-a6b4-342e4bcb0255

📥 Commits

Reviewing files that changed from the base of the PR and between a530796 and 9eeb999.

📒 Files selected for processing (7)
  • backend/app/api/docs/llm/speech_to_speech.md
  • backend/app/api/main.py
  • backend/app/api/routes/llm_speech.py
  • backend/app/api/routes/llm_speech_examples.md
  • backend/app/models/llm/request.py
  • backend/app/services/llm/chain/utils.py
  • backend/test_sts_debug.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/api/main.py

Comment on lines +7 to +15
```
POST /llm/sts
```

## Flow

```
Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output
```

⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced code blocks.

The fenced code blocks at lines 7-9 and 13-15 are missing language specifiers per MD040. Use http for the endpoint block and text for the flow diagram.

📝 Proposed fix
 ## Endpoint
 
-```
+```http
 POST /llm/sts

 ## Flow
 
-```
+```text
 Voice Input → STT (auto language) → RAG (Knowledge Base) → TTS → Voice Output

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 7-7: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 13-13: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 7 - 15, The
Markdown fenced code blocks in speech_to_speech.md (the POST /llm/sts endpoint
block and the Flow diagram block) lack language identifiers; update the first
fenced block to use the "http" language identifier and the Flow fenced block to
use the "text" identifier so they pass MD040 linting (refer to the fenced blocks
containing "POST /llm/sts" and "Voice Input → STT (auto language) → RAG
(Knowledge Base) → TTS → Voice Output").

Comment on lines +122 to +151
### Callback 1: STT Output (Intermediate)
```json
{
"success": true,
"data": {
"block_index": 1,
"total_blocks": 3,
"response": {
"provider_response_id": "stt_xyz789",
"provider": "sarvamai-native",
"model": "saarika:v1",
"output": {
"type": "text",
"content": {
"value": "नमस्ते, मुझे अपने अकाउंट के बारे में जानकारी चाहिए"
}
}
},
"usage": {
"input_tokens": 0,
"output_tokens": 12,
"total_tokens": 12
}
},
"metadata": {
"speech_to_speech": true,
"input_language": "hi-IN"
}
}
```

⚠️ Potential issue | 🟡 Minor

STT callback example shows incorrect model name.

Line 132 shows "model": "saarika:v1" but the documented default STT model is "saaras:v3" (line 55). Consider updating the example to use the correct model name for consistency.

📝 Proposed fix
     "response": {
       "provider_response_id": "stt_xyz789",
       "provider": "sarvamai-native",
-      "model": "saarika:v1",
+      "model": "saaras:v3",
       "output": {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 122 - 151, The STT
callback example under "Callback 1: STT Output (Intermediate)" uses an incorrect
model name; update the JSON "model" field in that example (the key "model"
inside the "response" object) from "saarika:v1" to the documented default
"saaras:v3" so it matches the default STT model elsewhere in the docs.

Comment on lines +183 to +212
### Callback 3: TTS Output (Final)
```json
{
"success": true,
"data": {
"response": {
"provider_response_id": "tts_def456",
"provider": "sarvamai-native",
"model": "bulbul:v1",
"output": {
"type": "audio",
"content": {
"format": "base64",
"value": "base64_encoded_audio_output",
"mime_type": "audio/ogg"
}
}
},
"usage": {
"input_tokens": 15,
"output_tokens": 0,
"total_tokens": 15
}
},
"metadata": {
"speech_to_speech": true,
"output_language": "hi-IN"
}
}
```

⚠️ Potential issue | 🟡 Minor

TTS callback example shows incorrect model version.

Line 191 shows "model": "bulbul:v1" but the documented default TTS model is "bulbul:v3" (line 65). Update the example for consistency.

📝 Proposed fix
     "response": {
       "provider_response_id": "tts_def456",
       "provider": "sarvamai-native",
-      "model": "bulbul:v1",
+      "model": "bulbul:v3",
       "output": {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/docs/llm/speech_to_speech.md` around lines 183 - 212, The TTS
callback example under "Callback 3: TTS Output (Final)" contains the wrong model
version; update the JSON field "model": "bulbul:v1" to "model": "bulbul:v3" so
it matches the documented default TTS model used elsewhere (e.g., the default
referenced at "bulbul:v3"); ensure the change is made only in the example JSON
"response" object where "provider": "sarvamai-native" and
"provider_response_id": "tts_def456".

Comment on lines +86 to +121
```json
{
"success": true,
"data": {
"block_index": 1,
"total_blocks": 3,
"response": {
"provider_response_id": "stt_xyz789",
"provider": "google-native",
"model": "gemini-2.5-pro",
"output": {
"type": "text",
"content": {
"value": "मेरा अकाउंट बैलेंस क्या है?"
}
}
},
"usage": {
"input_tokens": 0,
"output_tokens": 8,
"total_tokens": 8
}
},
"metadata": {
"speech_to_speech": true,
"input_language": "hi-IN",
"output_language": "en-IN",
"stt_model": "gemini-2.5-pro",
"llm_model": "gpt-4o",
"tts_model": "bulbul-v3",
"user_id": "user_123",
"session_id": "session_456",
"source": "whatsapp"
}
}
```

⚠️ Potential issue | 🟡 Minor

STT callback example shows different provider than documented defaults.

The example shows "provider": "google-native" and "model": "gemini-2.5-pro" at lines 94-95, but the "Defaults Used" section at lines 31-35 states the default STT is "Sarvam Saaras V3". Consider updating the example to use "provider": "sarvamai-native" and "model": "saaras:v3" to match the documented defaults, or clarify that this example uses non-default model selection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech_examples.md` around lines 86 - 121, Update
the STT callback JSON example so its "provider" and "model" match the documented
defaults: replace "provider": "google-native" and "model": "gemini-2.5-pro" in
the response block with "provider": "sarvamai-native" and "model": "saaras:v3"
(or alternatively add a short note in the example clarifying it intentionally
uses non-default selection); modify only the values for the "provider" and
"model" fields inside the "response" object to keep the example consistent with
the "Defaults Used" section.

Comment on lines +336 to +342
```json
{
"stt_model": "saaras:v3",
"llm_model": "gpt-4o",
"tts_model": "bulbul-v3"
}
```

⚠️ Potential issue | 🟡 Minor

Inconsistent TTS model identifier format.

Line 340 shows "tts_model": "bulbul-v3" with a hyphen, but the TTSModel enum in request.py defines it as "bulbul:v3" with a colon. This inconsistency could cause errors if users copy the example directly.

📝 Proposed fix
 {
   "stt_model": "saaras:v3",
   "llm_model": "gpt-4o",
-  "tts_model": "bulbul-v3"
+  "tts_model": "bulbul:v3"
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech_examples.md` around lines 336 - 342, The
README example uses "tts_model": "bulbul-v3" which conflicts with the TTSModel
enum in request.py that defines "bulbul:v3"; update the example in
llm_speech_examples.md (the JSON block showing stt_model/llm_model/tts_model) to
use the colon format "bulbul:v3" so it matches the TTSModel enum, or
alternatively change the TTSModel entry in request.py to "bulbul-v3" if you
prefer hyphen formatting—ensure both the example and the TTSModel constant
(named TTSModel) use the exact same string.

Comment on lines +65 to +80
if request.input_language and request.input_language != "auto":
if request.input_language not in LANGUAGE_CODES:
from fastapi import HTTPException

raise HTTPException(
status_code=400,
detail=f"Unsupported input language: {request.input_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
)

if request.output_language and request.output_language not in LANGUAGE_CODES:
from fastapi import HTTPException

raise HTTPException(
status_code=400,
detail=f"Unsupported output language: {request.output_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
)

🛠️ Refactor suggestion | 🟠 Major

Move HTTPException import to module level.

HTTPException is imported twice inside the function body (lines 67 and 75). Move this import to the top of the file with other FastAPI imports for better performance and maintainability.

📝 Proposed fix
-from fastapi import APIRouter, Depends
+from fastapi import APIRouter, Depends, HTTPException

Then remove the inline imports at lines 67-68 and 75-76:

     if request.input_language and request.input_language != "auto":
         if request.input_language not in LANGUAGE_CODES:
-            from fastapi import HTTPException
-
             raise HTTPException(
                 status_code=400,
                 detail=f"Unsupported input language: {request.input_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
             )

     if request.output_language and request.output_language not in LANGUAGE_CODES:
-        from fastapi import HTTPException
-
         raise HTTPException(
             status_code=400,
             detail=f"Unsupported output language: {request.output_language}. Supported: {', '.join(LANGUAGE_CODES.keys())}",
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech.py` around lines 65 - 80, The two inline
imports of HTTPException inside the validation block should be moved to the
module level with the other FastAPI imports: add "from fastapi import
HTTPException" at the top of the file and remove the inline imports inside the
conditional checks that validate request.input_language and
request.output_language against LANGUAGE_CODES; ensure the behavior of the
blocks using request.input_language, request.output_language, and LANGUAGE_CODES
remains unchanged.

Comment on lines +104 to +115
# Add metadata to track STS-specific info
metadata = request.request_metadata or {}
metadata.update(
{
"speech_to_speech": True,
"input_language": input_lang_code,
"output_language": output_lang_code,
"stt_model": request.stt_model.value,
"llm_model": request.llm_model.value,
"tts_model": request.tts_model.value,
}
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid mutating the request's metadata dictionary in-place.

Line 105 assigns request.request_metadata directly to metadata, then line 106 mutates it with .update(). If request.request_metadata is not None, this modifies the original dictionary in-place, which could cause unintended side effects.

📝 Proposed fix
     # Add metadata to track STS-specific info
-    metadata = request.request_metadata or {}
+    metadata = dict(request.request_metadata) if request.request_metadata else {}
     metadata.update(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_speech.py` around lines 104 - 115, The code
mutates request.request_metadata in-place by assigning metadata =
request.request_metadata and then calling metadata.update(...); instead create a
new dict (e.g., metadata = dict(request.request_metadata or {} ) or use
request.request_metadata.copy()) so you can safely update it with the STS keys
("speech_to_speech", "input_language", "output_language", "stt_model",
"llm_model", "tts_model") without altering the original
request.request_metadata; update the metadata variable and leave
request.request_metadata immutable.

@Prajna1999 Prajna1999 requested a review from nishika26 March 9, 2026 04:49
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
backend/app/tests/services/llm/test_sts.py (2)

525-556: Avoid exact set equality for supported-language coverage.

This assertion will fail on any additive language rollout, even if the STS flow still works correctly. Prefer checking the required subset and validating each entry's format so future language additions don't break an unrelated change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/tests/services/llm/test_sts.py` around lines 525 - 556, The test
test_all_supported_codes_are_valid currently asserts exact equality of
SUPPORTED_LANGUAGE_CODES to a hardcoded set which will break when new languages
are added; change it to verify that the required codes (the current minimal set)
are a subset of SUPPORTED_LANGUAGE_CODES and that every code in
SUPPORTED_LANGUAGE_CODES matches the expected BCP-47 format (e.g., via regex or
a validator) instead of exact set equality, referencing the
SUPPORTED_LANGUAGE_CODES constant and keeping the original list as the required
subset in the test.
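A sketch of the suggested test shape; `SUPPORTED_LANGUAGE_CODES` is stubbed here, whereas the real test would import it from the app:

```python
import re

# Stub; the real test would import SUPPORTED_LANGUAGE_CODES from the app.
SUPPORTED_LANGUAGE_CODES = {"auto", "unknown", "hi-IN", "en-IN", "ta-IN"}

REQUIRED_CODES = {"hi-IN", "en-IN"}  # minimal set the STS flow relies on
SENTINELS = {"auto", "unknown"}      # STT-only values, not real languages
BCP47_RE = re.compile(r"^[a-z]{2,3}-[A-Z]{2}$")

# Subset check survives additive language rollouts...
assert REQUIRED_CODES <= SUPPORTED_LANGUAGE_CODES
# ...while every non-sentinel entry must still look like a BCP-47 tag.
for code in SUPPORTED_LANGUAGE_CODES - SENTINELS:
    assert BCP47_RE.fullmatch(code), code
```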

227-243: Promote the STS payload setup into factories.

The scenarios below keep rebuilding SpeechToSpeechRequest payloads inline, so any schema tweak will force edits across most of the file. A request factory on top of these base fixtures would keep each test focused on the language permutation it cares about. As per coding guidelines backend/app/tests/**/*.py: Use factory pattern for test fixtures in `backend/app/tests/`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/tests/services/llm/test_sts.py` around lines 227 - 243, The tests
repeatedly build SpeechToSpeechRequest payloads inline which is brittle; create
a reusable factory function/class (e.g., SpeechToSpeechRequestFactory) that
accepts overrides (language, voice, model, knowledge_base_ids) and uses the
existing mock_audio_input and knowledge_base_ids fixtures as defaults; update
tests to call SpeechToSpeechRequestFactory(...) instead of reconstructing
SpeechToSpeechRequest manually so schema changes are centralized and tests only
supply the differing fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/api/main.py`:
- Around line 10-15: The import tuple in main.py contains a duplicate symbol
llm_sts which triggers Ruff F811; remove the extra llm_sts entry so llm_sts only
appears once in the import list (the tuple that currently contains llm_sts,
login, languages, llm, llm_chain, llm_sts) to resolve the lint error.

In `@backend/app/api/routes/llm_sts.py`:
- Around line 73-92: The code allows request.output_language values "auto" or
"unknown" (present in SUPPORTED_LANGUAGE_CODES) to flow into output_lang_code
and eventually into build_tts_block(), which produces invalid TTS provider
requests; add an explicit validation before computing output_lang_code that
rejects request.output_language if it equals "auto" or "unknown" (return
APIResponse.failure_response with a clear message and 400 status), referencing
request.output_language and SUPPORTED_LANGUAGE_CODES so build_tts_block() and
execute_llm_call() never receive those sentinel strings.

---

Nitpick comments:
In `@backend/app/tests/services/llm/test_sts.py`:
- Around line 525-556: The test test_all_supported_codes_are_valid currently
asserts exact equality of SUPPORTED_LANGUAGE_CODES to a hardcoded set which will
break when new languages are added; change it to verify that the required codes
(the current minimal set) are a subset of SUPPORTED_LANGUAGE_CODES and that
every code in SUPPORTED_LANGUAGE_CODES matches the expected BCP-47 format (e.g.,
via regex or a validator) instead of exact set equality, referencing the
SUPPORTED_LANGUAGE_CODES constant and keeping the original list as the required
subset in the test.
- Around line 227-243: The tests repeatedly build SpeechToSpeechRequest payloads
inline which is brittle; create a reusable factory function/class (e.g.,
SpeechToSpeechRequestFactory) that accepts overrides (language, voice, model,
knowledge_base_ids) and uses the existing mock_audio_input and
knowledge_base_ids fixtures as defaults; update tests to call
SpeechToSpeechRequestFactory(...) instead of reconstructing
SpeechToSpeechRequest manually so schema changes are centralized and tests only
supply the differing fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fca12c2f-5f19-46d1-a4c0-c7fb1aed83bf

📥 Commits

Reviewing files that changed from the base of the PR and between 9eeb999 and 56920b1.

📒 Files selected for processing (8)
  • backend/app/api/main.py
  • backend/app/api/routes/llm_sts.py
  • backend/app/models/llm/request.py
  • backend/app/services/llm/chain/chain.py
  • backend/app/services/llm/chain/utils.py
  • backend/app/services/llm/jobs.py
  • backend/app/services/llm/providers/sai.py
  • backend/app/tests/services/llm/test_sts.py

Comment on lines +10 to +15
llm_sts,
login,
languages,
llm,
llm_chain,
llm_sts,
@coderabbitai coderabbitai bot Mar 9, 2026

⚠️ Potential issue | 🟠 Major

Remove the duplicate llm_sts import.

llm_sts is imported twice in the same tuple here, which trips Ruff F811 and can block the lint job.

🧰 Tools
🪛 Ruff (0.15.4)

[error] 15-15: Redefinition of unused llm_sts from line 10: llm_sts redefined here

Remove definition: llm_sts

(F811)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/main.py` around lines 10 - 15, The import tuple in main.py
contains a duplicate symbol llm_sts which triggers Ruff F811; remove the extra
llm_sts entry so llm_sts only appears once in the import list (the tuple that
currently contains llm_sts, login, languages, llm, llm_chain, llm_sts) to
resolve the lint error.

Collaborator

@Prajna1999 check this duplicate


Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +73 to +92
if (
request.output_language
and request.output_language not in SUPPORTED_LANGUAGE_CODES
):
return APIResponse.failure_response(
error=f"Unsupported output language code: {request.output_language}. Supported: {', '.join(sorted(SUPPORTED_LANGUAGE_CODES))}",
metadata={"status_code": 400},
)

# Determine language codes (already BCP-47, no conversion needed)
input_lang_code = request.input_language or "auto"

# If output_language not set, default to input_language
# If input is "auto", use "{{detected}}" marker to signal TTS to use detected language
if request.output_language:
output_lang_code = request.output_language
elif input_lang_code == "auto":
output_lang_code = "{{detected}}" # Marker to use detected language from STT
else:
output_lang_code = input_lang_code
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject output_language="auto" / "unknown" before building TTS.

This branch currently reuses SUPPORTED_LANGUAGE_CODES, which includes the STT-only sentinels "auto" and "unknown". If a client explicitly sends either value, build_tts_block() forwards it into the TTS config unchanged, and execute_llm_call() in backend/app/services/llm/jobs.py never rewrites it because only {{detected}} is substituted. That turns into an invalid provider request instead of using the detected language.

Suggested fix
-    if (
-        request.output_language
-        and request.output_language not in SUPPORTED_LANGUAGE_CODES
-    ):
+    if request.output_language in {"auto", "unknown"}:
+        return APIResponse.failure_response(
+            error=(
+                "Unsupported output language code: output_language must be a "
+                "concrete BCP-47 code. Omit it to use the detected language."
+            ),
+            metadata={"status_code": 400},
+        )
+
+    if (
+        request.output_language
+        and request.output_language not in SUPPORTED_LANGUAGE_CODES
+    ):
         return APIResponse.failure_response(
             error=f"Unsupported output language code: {request.output_language}. Supported: {', '.join(sorted(SUPPORTED_LANGUAGE_CODES))}",
             metadata={"status_code": 400},
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_sts.py` around lines 73 - 92, The code allows
request.output_language values "auto" or "unknown" (present in
SUPPORTED_LANGUAGE_CODES) to flow into output_lang_code and eventually into
build_tts_block(), which produces invalid TTS provider requests; add an explicit
validation before computing output_lang_code that rejects
request.output_language if it equals "auto" or "unknown" (return
APIResponse.failure_response with a clear message and 400 status), referencing
request.output_language and SUPPORTED_LANGUAGE_CODES so build_tts_block() and
execute_llm_call() never receive those sentinel strings.

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
backend/app/api/routes/llm_sts.py (2)

35-39: 🛠️ Refactor suggestion | 🟠 Major

Add the missing return type annotation.

speech_to_speech still omits its return type, even though the handler returns an APIResponse[Message] payload. This was already called out earlier and is still unresolved.

As per coding guidelines, "Always add type hints to all function parameters and return values in Python code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_sts.py` around lines 35 - 39, The speech_to_speech
handler is missing its return type annotation; update its signature for
speech_to_speech to declare it returns APIResponse[Message] (keeping the
existing parameter types like _current_user: AuthContextDep, _session:
SessionDep, request: SpeechToSpeechRequest) so the function signature explicitly
types the return value as APIResponse[Message]; ensure any imports or forward
refs for APIResponse and Message are present or added if needed.

70-89: ⚠️ Potential issue | 🟠 Major

Keep STT-only sentinels out of the TTS language path.

This still lets sentinel values leak into TTS config. output_language="auto" / "unknown" remains accepted, and input_language="unknown" with no explicit output_language still falls through to output_lang_code = "unknown". build_tts_block() forwards that raw string, while execute_llm_call() only rewrites the exact "{{detected}}" marker, so these requests can still fail at the TTS provider.

Suggested fix
+    if request.output_language in {"auto", "unknown"}:
+        return APIResponse.failure_response(
+            error=(
+                "Unsupported output language code: output_language must be a "
+                "concrete BCP-47 code. Omit it to use the detected language."
+            ),
+            metadata={"status_code": 400},
+        )
+
     if (
         request.output_language
         and request.output_language not in SUPPORTED_LANGUAGE_CODES
     ):
         return APIResponse.failure_response(
             error=f"Unsupported output language code: {request.output_language}. Supported: {', '.join(sorted(SUPPORTED_LANGUAGE_CODES))}",
             metadata={"status_code": 400},
         )
@@
-    elif input_lang_code == "auto":
+    elif input_lang_code in {"auto", "unknown"}:
         output_lang_code = "{{detected}}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/api/routes/llm_sts.py` around lines 70 - 89, The current logic
lets STT-only sentinel values ("auto", "unknown") leak into TTS; update the
validation and output language selection so build_tts_block never receives raw
sentinel strings: in the request validation around SUPPORTED_LANGUAGE_CODES
treat "auto" and "unknown" as STT sentinels (not valid TTS language codes) and
map any occurrence of these (either request.output_language or derived
input_lang_code when no explicit output) to the "{{detected}}" marker used by
execute_llm_call; ensure output_lang_code is set to "{{detected}}" for sentinels
and only real language codes or "{{detected}}" are forwarded to build_tts_block
and execute_llm_call (reference symbols: request.output_language,
request.input_language, SUPPORTED_LANGUAGE_CODES, output_lang_code,
build_tts_block, execute_llm_call).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@backend/app/api/routes/llm_sts.py`:
- Around line 35-39: The speech_to_speech handler is missing its return type
annotation; update its signature for speech_to_speech to declare it returns
APIResponse[Message] (keeping the existing parameter types like _current_user:
AuthContextDep, _session: SessionDep, request: SpeechToSpeechRequest) so the
function signature explicitly types the return value as APIResponse[Message];
ensure any imports or forward refs for APIResponse and Message are present or
added if needed.
- Around line 70-89: The current logic lets STT-only sentinel values ("auto",
"unknown") leak into TTS; update the validation and output language selection so
build_tts_block never receives raw sentinel strings: in the request validation
around SUPPORTED_LANGUAGE_CODES treat "auto" and "unknown" as STT sentinels (not
valid TTS language codes) and map any occurrence of these (either
request.output_language or derived input_lang_code when no explicit output) to
the "{{detected}}" marker used by execute_llm_call; ensure output_lang_code is
set to "{{detected}}" for sentinels and only real language codes or
"{{detected}}" are forwarded to build_tts_block and execute_llm_call (reference
symbols: request.output_language, request.input_language,
SUPPORTED_LANGUAGE_CODES, output_lang_code, build_tts_block, execute_llm_call).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d5ccd710-6e43-49c9-9d39-43fd06db89c8

📥 Commits

Reviewing files that changed from the base of the PR and between 56920b1 and 96ea78e.

📒 Files selected for processing (1)
  • backend/app/api/routes/llm_sts.py

@Prajna1999 Prajna1999 self-assigned this Mar 10, 2026
@Prajna1999 Prajna1999 requested a review from kartpop March 11, 2026 06:27


@router.post(
"/llm/sts",
Collaborator Author

convert into `llm/chain/sts`


**Note:** Sarvam STT uses automatic language detection. No need to specify input language.

### LLM (RAG)
Collaborator

we support more than just these two models for RAG

EOF
```

**Note:** `stt_model`, `llm_model`, and `tts_model` are optional and will use defaults if not specified.
Collaborator

add "specifying" before "stt_model", etc.

"""Supported LLM models for RAG in speech-to-speech."""

GPT4O = "gpt-4o"
GPT4O_MINI = "gpt-4o-mini"
Collaborator

why are we supporting only these two models?

Comment on lines +10 to +15
llm_sts,
login,
languages,
llm,
llm_chain,
llm_sts,
Collaborator

@Prajna1999 check this duplicate import

Comment on lines +29 to +31
print("\n" + "=" * 80)
print("STEP 1: Creating test chain request")
print("=" * 80)
Collaborator

do we need these debug prints?

Comment on lines +1 to +9
"""Debug script for STS endpoint and chain job execution."""

import logging
import sys
from sqlmodel import Session

# Setup logging
logging.basicConfig(
level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
Collaborator

do we need this file?

@@ -7,10 +7,12 @@
config,
Collaborator

Imports should be sorted alphabetically


import logging

from fastapi import APIRouter, Depends, HTTPException
Collaborator

HTTPException is not needed

response_model=APIResponse[Message],
dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))],
)
def speech_to_speech(
Collaborator

return type annotation missing

)
def speech_to_speech(
_current_user: AuthContextDep,
_session: SessionDep,
Collaborator

should _session start with an underscore if we later pass it to a different function on line 129?

)

# Optional language config (BCP-47 codes)
input_language: str | None = Field(
Collaborator

should this be str | None or just str, given the default is "auto"?

and request.input_language not in SUPPORTED_LANGUAGE_CODES
):
return APIResponse.failure_response(
error=f"Unsupported input language code: {request.input_language}. Supported: {', '.join(sorted(SUPPORTED_LANGUAGE_CODES))}",
Collaborator

do we need to sort this on every request? why not keep SUPPORTED_LANGUAGE_CODES pre-sorted?
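One way to address this: compute the sorted listing once at module load instead of calling `sorted()` inside every failure response. The language codes shown are illustrative:

```python
# Sort once at import time instead of on every failure response.
SUPPORTED_LANGUAGE_CODES = frozenset({"hi-IN", "en-IN", "ta-IN", "bn-IN"})
_SUPPORTED_SORTED = ", ".join(sorted(SUPPORTED_LANGUAGE_CODES))

def unsupported_language_error(code: str) -> str:
    # The supported-codes string is precomputed above.
    return (
        f"Unsupported input language code: {code}. "
        f"Supported: {_SUPPORTED_SORTED}"
    )
```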

config=LLMCallConfig(
blob=ConfigBlob(
completion=KaapiCompletionConfig(
provider="openai",
Collaborator

the provider is hardcoded here; we should abstract this so it is extensible to other providers and models
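A sketch of the abstraction being asked for: resolve the provider from the requested model through a registry instead of hardcoding `provider="openai"`. The model and provider names here are illustrative, not the repo's actual mapping:

```python
# Illustrative model-to-provider registry; supporting a new provider
# means adding an entry, not editing the chain-building code.
MODEL_PROVIDERS: dict[str, str] = {
    "gpt-4o": "openai",
    "gpt-4o-mini": "openai",
    "claude-3-5-sonnet": "anthropic",
}

def resolve_provider(model: str) -> str:
    try:
        return MODEL_PROVIDERS[model]
    except KeyError:
        raise ValueError(f"No provider registered for model: {model}") from None
```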
