Send x-llamastack-provider-data header with MCP auth to llama-stack#1179
max-svistunov wants to merge 1 commit into lightspeed-core:main
Conversation
Walkthrough: Adds MCP provider-data header construction and propagation: a helper builds an `x-llamastack-provider-data` header from MCP tool auth headers and attaches it to the Responses API parameters.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/utils/responses.py (1)
**315-327:** ⚠️ Potential issue | 🔴 Critical

Extract `extra_headers` from `ResponsesApiParams.model_dump()` before passing to `client.responses.create()`.

The regression is not fixed: all three callers (a2a.py line 330, query.py line 33, streaming_query.py line 42) use `**responses_params.model_dump()`, which incorrectly injects `extra_headers` into the JSON body instead of passing it as a transport-level HTTP header kwarg. The corrected pattern must be:

```python
body = responses_params.model_dump(exclude={"extra_headers"}, exclude_none=True)
await client.responses.create(**body, extra_headers=responses_params.extra_headers)
```

Without this fix, the `x-llamastack-provider-data` header carrying MCP tool authentication is silently discarded, leaving the backend without required provider credentials.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/responses.py` around lines 315 - 327, ResponsesApiParams.model_dump() is including extra_headers in the JSON body; change all call sites that do client.responses.create(**responses_params.model_dump(), ...) to instead extract the body via responses_params.model_dump(exclude={"extra_headers"}, exclude_none=True) and pass responses_params.extra_headers as the extra_headers kwarg to client.responses.create(..., extra_headers=responses_params.extra_headers). Update the callers that build ResponsesApiParams (the places invoking ResponsesApiParams and then calling model_dump before client.responses.create) to use this two-step pattern so the x-llamastack-provider-data header is sent as a transport header rather than part of the JSON payload.
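The body/header split the comment asks for can be illustrated with a small self-contained sketch. Plain dicts stand in for the Pydantic model and the async client here (the `model` and `input` fields are assumed values, not the project's real schema):

```python
# Stand-in for ResponsesApiParams.model_dump() output (hypothetical fields).
params = {
    "model": "llama-3",
    "input": "hello",
    "extra_headers": {"x-llamastack-provider-data": '{"mcp_headers": {}}'},
}

# Wrong: unpacking everything puts extra_headers into the JSON body,
# where the HTTP transport layer never sees it.
body_wrong = dict(params)

# Right: exclude extra_headers from the body and carry it separately,
# mirroring model_dump(exclude={"extra_headers"}, exclude_none=True).
body = {k: v for k, v in params.items() if k != "extra_headers" and v is not None}
extra_headers = params["extra_headers"]

print("extra_headers" in body_wrong)  # True  -> header leaks into the body
print("extra_headers" in body)        # False -> body is clean
```

The second dict is what would be splatted into `client.responses.create(**body, extra_headers=extra_headers)`, so the header travels at the HTTP level.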
🧹 Nitpick comments (1)
tests/unit/utils/test_responses.py (1)
**930-938:** `"extra_headers" in dumped` is always `True` — the assertion is vacuous.

Pydantic's `model_dump()` includes every declared field regardless of its value, so the presence check can never fail and adds no test signal. The meaningful assertion is the `is not None` check on line 936.

🧹 Proposed cleanup

```diff
     dumped = result.model_dump()
-    assert (
-        "extra_headers" in dumped
-    ), "extra_headers missing from ResponsesApiParams"
     assert (
         dumped["extra_headers"] is not None
     ), "extra_headers should not be None when MCP tools have headers"
     assert "x-llamastack-provider-data" in dumped["extra_headers"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/utils/test_responses.py` around lines 930 - 938, The assertion `"extra_headers" in dumped` is vacuous because Pydantic's model_dump() always includes declared fields; remove that line and instead ensure meaningful checks: keep the existing `assert dumped["extra_headers"] is not None` and the key presence assertion `assert "x-llamastack-provider-data" in dumped["extra_headers"]` (or strengthen it to `assert dumped["extra_headers"]` to ensure non-empty), referencing the test variable `dumped` produced by `result.model_dump()` and the `extra_headers` field.
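The vacuity argument is easy to reproduce. Stdlib `dataclasses.asdict` shows the same include-every-declared-field behavior that Pydantic's `model_dump()` has; the `Params` class below is a hypothetical stand-in, not the real `ResponsesApiParams`:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Params:
    model: str
    extra_headers: Optional[dict] = None  # declared but never populated

dumped = asdict(Params(model="llama-3"))

# The key is present even though the value was never set, so a bare
# membership check like `"extra_headers" in dumped` can never fail.
print("extra_headers" in dumped)  # True
print(dumped["extra_headers"])    # None -- this is the check that matters
```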
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/utils/responses.py`:
- Around line 220-224: The dict comprehension uses tool["server_url"] which can
raise KeyError if a tool dict lacks that key; update the comprehension in
responses.py (the block that builds mcp_headers) to safely access server_url
(e.g., use tool.get("server_url")) and include an extra guard so only tools with
a non-empty server_url, type == "mcp", and headers are included. Ensure the
resulting mcp_headers still maps server_url -> headers and keep the dict typing
intact.
---
Outside diff comments:
In `@src/utils/responses.py`:
- Around line 315-327: ResponsesApiParams.model_dump() is including
extra_headers in the JSON body; change all call sites that do
client.responses.create(**responses_params.model_dump(), ...) to instead extract
the body via responses_params.model_dump(exclude={"extra_headers"},
exclude_none=True) and pass responses_params.extra_headers as the extra_headers
kwarg to client.responses.create(...,
extra_headers=responses_params.extra_headers). Update the callers that build
ResponsesApiParams (the places invoking ResponsesApiParams and then calling
model_dump before client.responses.create) to use this two-step pattern so the
x-llamastack-provider-data header is sent as a transport header rather than part
of the JSON payload.
---
Nitpick comments:
In `@tests/unit/utils/test_responses.py`:
- Around line 930-938: The assertion `"extra_headers" in dumped` is vacuous
because Pydantic's model_dump() always includes declared fields; remove that
line and instead ensure meaningful checks: keep the existing `assert
dumped["extra_headers"] is not None` and the key presence assertion `assert
"x-llamastack-provider-data" in dumped["extra_headers"]` (or strengthen it to
`assert dumped["extra_headers"]` to ensure non-empty), referencing the test
variable `dumped` produced by `result.model_dump()` and the `extra_headers`
field.
Force-pushed: d917bc7 to 1990ec8
⚠️ Outside diff range comments (1)
src/utils/responses.py (1)
**315-327:** ⚠️ Potential issue | 🔴 Critical

Verify extra_headers are consumed as HTTP-level parameters, not body fields, in all endpoint callers.

The issue identified in the original review is confirmed across all three endpoint callers:

- `src/app/endpoints/query.py` line 301
- `src/app/endpoints/streaming_query.py` lines 287–288
- `src/app/endpoints/a2a.py` line 330

All three use `**responses_params.model_dump()` to unpack the `ResponsesApiParams` object. This forces `extra_headers` into the request body kwargs rather than passing it as the HTTP-level `extra_headers` keyword argument to `client.responses.create()`. The headers are silently lost.

The correct pattern is:

```python
params_dict = responses_params.model_dump(exclude={"extra_headers"})
stream = await client.responses.create(
    **params_dict, extra_headers=responses_params.extra_headers
)
```

Without this fix, MCP provider authentication headers will never reach the Llama Stack backend, reintroducing the exact regression the PR aims to solve.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/responses.py` around lines 315 - 327, The ResponsesApiParams.extra_headers field is being included in the request body because callers use responses_params.model_dump() without excluding it; update each caller that invokes client.responses.create (e.g., where responses_params is used in query/streaming_query/a2a endpoints) to first do params_dict = responses_params.model_dump(exclude={'extra_headers'}) and then call client.responses.create(**params_dict, extra_headers=responses_params.extra_headers, stream=stream if applicable) so extra_headers is passed as the HTTP-level keyword argument to client.responses.create rather than serialized into the request body.
🧹 Nitpick comments (1)
src/utils/responses.py (1)
**201-229:** Dual auth propagation — intentional but worth documenting.

`get_mcp_tools` already embeds `headers` inside each MCP `tool_def` entry (line 420). `_build_provider_data_headers` then extracts those same headers and re-encodes them as `x-llamastack-provider-data`. Auth credentials therefore travel to llama-stack via two channels simultaneously.

Per the official llama-stack documentation, `mcp_headers` should be keyed by server URL, which this implementation correctly does, so the format is sound. The dual propagation appears intentional (restoring the header mechanism as a regression fix while keeping the existing in-body `headers` as a fallback), but a short inline comment explaining the coexistence would prevent future readers from treating one path as a dead-code removal candidate.

Additionally, the llama-stack convention from published examples uses `X-LlamaStack-Provider-Data` (mixed case) as the header name, while the code uses lowercase `x-llamastack-provider-data`. HTTP headers are case-insensitive so there is no functional difference, but aligning casing with the documented convention aids readability.

✏️ Optional: document dual-channel intent + align casing

```diff
-    return {"x-llamastack-provider-data": json.dumps({"mcp_headers": mcp_headers})}
+    # Auth headers are also present in each tool_def["headers"] for llama-stack
+    # implementations that read from the tool definition directly; this header
+    # provides the same data via the standard provider-data mechanism.
+    return {"X-LlamaStack-Provider-Data": json.dumps({"mcp_headers": mcp_headers})}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/responses.py` around lines 201 - 229, Add a short inline comment in _build_provider_data_headers explaining that get_mcp_tools already embeds per-server headers into each tool_def and that this function intentionally re-encodes them into the Llama Stack provider header as a dual-propagation fallback; also change the returned header key from "x-llamastack-provider-data" to the documented casing "X-LlamaStack-Provider-Data" to match examples (HTTP headers are case-insensitive) while preserving the existing json payload structure and behavior.
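With both suggestions applied, the helper would look roughly like this. This is a sketch, not the project's actual code: it assumes tool definitions are plain dicts with `type`, `server_url`, and `headers` keys, and the function name mirrors the one discussed in the review:

```python
import json

def build_provider_data_headers(tools):
    """Sketch: map each MCP server URL to its auth headers and JSON-encode
    the result into the llama-stack provider-data header."""
    # Auth headers may also live in each tool_def["headers"]; this header
    # re-sends the same data via the provider-data mechanism (dual channel).
    mcp_headers = {
        tool["server_url"]: tool["headers"]
        for tool in tools
        if tool.get("type") == "mcp"
        and tool.get("server_url")   # guard against missing/empty URL
        and tool.get("headers")
    }
    if not mcp_headers:
        return None
    # Documented casing; HTTP header names are case-insensitive anyway.
    return {"X-LlamaStack-Provider-Data": json.dumps({"mcp_headers": mcp_headers})}

headers = build_provider_data_headers(
    [{"type": "mcp",
      "server_url": "http://mock-mcp",
      "headers": {"X-Authorization": "my-secret-client-token"}}]
)
print(headers)
```

Tools without a `server_url` or without `headers` are simply skipped, so a malformed entry can no longer raise `KeyError`.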
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/utils/responses.py`:
- Around line 315-327: The ResponsesApiParams.extra_headers field is being
included in the request body because callers use responses_params.model_dump()
without excluding it; update each caller that invokes client.responses.create
(e.g., where responses_params is used in query/streaming_query/a2a endpoints) to
first do params_dict = responses_params.model_dump(exclude={'extra_headers'})
and then call client.responses.create(**params_dict,
extra_headers=responses_params.extra_headers, stream=stream if applicable) so
extra_headers is passed as the HTTP-level keyword argument to
client.responses.create rather than serialized into the request body.
---
Duplicate comments:
In `@src/utils/responses.py`:
- Around line 220-224: The comprehension assigning mcp_headers is already safe
because the filter includes tool.get("server_url"), so no code change is needed;
remove the duplicate review comment and mark the prior review as resolved (or
delete the redundant "[duplicate_comment]" note) to avoid confusion, referencing
the mcp_headers dict comprehension and the tools iteration so reviewers know
which code the resolution applies to.
---
Nitpick comments:
In `@src/utils/responses.py`:
- Around line 201-229: Add a short inline comment in
_build_provider_data_headers explaining that get_mcp_tools already embeds
per-server headers into each tool_def and that this function intentionally
re-encodes them into the Llama Stack provider header as a dual-propagation
fallback; also change the returned header key from "x-llamastack-provider-data"
to the documented casing "X-LlamaStack-Provider-Data" to match examples (HTTP
headers are case-insensitive) while preserving the existing json payload
structure and behavior.
Description
Restore the x-llamastack-provider-data HTTP header that regressed (stopped being sent). Now llama-stack can forward auth credentials per-MCP server again.
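The round trip the description refers to can be sketched as follows; the server URL and token are illustrative values, not ones taken from this PR:

```python
import json

# Client side: per-MCP-server auth headers are wrapped in provider data.
mcp_headers = {"http://mock-mcp": {"X-Authorization": "my-secret-client-token"}}
request_headers = {
    "x-llamastack-provider-data": json.dumps({"mcp_headers": mcp_headers})
}

# llama-stack side (sketch): decode the header and pick the credentials
# for the MCP server it is about to call.
provider_data = json.loads(request_headers["x-llamastack-provider-data"])
forwarded = provider_data["mcp_headers"]["http://mock-mcp"]
print(forwarded)  # {'X-Authorization': 'my-secret-client-token'}
```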
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Unit tests:
```shell
uv run pytest tests/unit/utils/test_responses.py -v
```

On live stack:
Send a POST `/v1/streaming_query` with `MCP-HEADERS: {"mock-mcp": {"X-Authorization": "my-secret-client-token"}}` and confirm the MCP server receives the headers.
Summary by CodeRabbit
New Features
Tests