fix: DeepSeek tool call parsing - nested objects & JSON repair #94
valkryhx wants to merge 4 commits into CJackHwang:main
- Fix: Expand stream sieve keywords to support function.name: and [TOOL_CALL_HISTORY]
- Fix: Add repairInvalidJSONBackslashes to handle unescaped backslashes in Windows paths
- Sync: Update JS stream sieve to match Go implementation
- Test: Add unit tests for backslash repair and DeepSeek format parsing
- Tool: Move repair json test tool to tests/repair_json_tool.go
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7318d1f4a8
        return -1
    }
    lower := strings.ToLower(s)
    keywords := []string{"tool_calls", "function.name:", "[tool_call_history]"}
Restrict capture start to payloads consumeToolCapture can parse
Including function.name: and [tool_call_history] in findToolSegmentStart makes the sieve enter capture mode for text-kv style output, but consumeToolCapture still only looks for tool_calls, so these captures never become ready until stream finalization. In practice, any chunk containing function.name: can cause the rest of the response to be buffered and emitted late as plain text instead of structured tool calls.
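One way to read this suggestion is sketched below: the capture-start search takes only the keywords the capture consumer can actually parse. This is a minimal illustration, not the PR's code. The function name findSegmentStart, its signature, and the choice to return the keyword's byte index are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// findSegmentStart is a hypothetical stand-in for findToolSegmentStart.
// It returns the byte index of the earliest keyword in s (case-insensitive),
// or -1 when none is present. Note: indices refer to the lowercased copy,
// which matches s byte-for-byte for ASCII input.
func findSegmentStart(s string, keywords []string) int {
	lower := strings.ToLower(s)
	start := -1
	for _, kw := range keywords {
		if i := strings.Index(lower, kw); i >= 0 && (start < 0 || i < start) {
			start = i
		}
	}
	return start
}

func main() {
	// Assumption: the consumer only understands "tool_calls" payloads,
	// so only that keyword is allowed to start a capture.
	parsable := []string{"tool_calls"}
	extended := []string{"tool_calls", "function.name:", "[tool_call_history]"}

	text := "function.name: search"
	fmt.Println(findSegmentStart(text, parsable)) // no capture is started
	fmt.Println(findSegmentStart(text, extended)) // capture would start here
}
```

The alternative fix is to keep the extended keyword list and teach the consumer to parse the text-kv formats. Either way, the two lists should stay in sync.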
    if i+1 < len(runes) {
        next := runes[i+1]
        switch next {
        case '"', '\\', '/', 'b', 'f', 'n', 'r', 't':
Escape \n/\t when repairing Windows-style path backslashes
The repair routine treats \n, \t, \r, etc. as already-valid escapes and leaves them unchanged, so inputs like C:\new\tools (where the model intended literal backslashes) are decoded as newline/tab characters after JSON unmarshal. This silently mutates tool arguments and can execute the wrong path or command instead of preserving the original Windows path text.
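The behaviour described in this comment can be reproduced with a small sketch. The repairBackslashes function below is a hypothetical reconstruction that mirrors only the switch shown in the diff (it is not the PR's repairInvalidJSONBackslashes, and it ignores \u escapes for brevity). The decodePath helper and the "path" field are likewise invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// repairBackslashes doubles every backslash that does not start one of the
// escapes listed in the diff; backslashes that do are left untouched.
func repairBackslashes(s string) string {
	runes := []rune(s)
	var b strings.Builder
	for i := 0; i < len(runes); i++ {
		r := runes[i]
		if r != '\\' {
			b.WriteRune(r)
			continue
		}
		if i+1 < len(runes) {
			next := runes[i+1]
			switch next {
			case '"', '\\', '/', 'b', 'f', 'n', 'r', 't':
				// Treated as an already-valid escape: copy both runes as-is.
				b.WriteRune(r)
				b.WriteRune(next)
				i++
				continue
			}
		}
		b.WriteString(`\\`) // invalid escape: emit an escaped backslash
	}
	return b.String()
}

// decodePath repairs raw and decodes its hypothetical "path" field.
func decodePath(raw string) (string, error) {
	var v struct {
		Path string `json:"path"`
	}
	if err := json.Unmarshal([]byte(repairBackslashes(raw)), &v); err != nil {
		return "", err
	}
	return v.Path, nil
}

func main() {
	for _, raw := range []string{
		`{"path":"C:\Users\dev"}`, // \U and \d are invalid escapes and get repaired
		`{"path":"C:\new\tools"}`, // \n and \t look valid and are left alone
	} {
		p, err := decodePath(raw)
		fmt.Printf("%q err=%v\n", p, err)
	}
}
```

The first path survives intact, while the second decodes with a real newline and tab in place of \n and \t. Telling these cases apart needs some extra signal, for example a drive-letter prefix in the value, and any such heuristic is a design decision for the PR author.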
- Upgrade missingArrayBracketsPattern regex to support single-level nested {} objects
- This fixes DeepSeek's list hallucination where tool call JSON objects contain nested fields like {"input": {"q": "value"}}
- Add comprehensive test cases covering 2-5 nested objects, mixed nested/primitive fields, and real DeepSeek 8-queen output patterns
- Add RepairLooseJSON function to repair unquoted keys and missing array brackets
Fixes: DeepSeek tool call parsing with nested JSON objects
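The two repairs named in this commit can be sketched roughly as follows. This is an illustrative approximation, not the PR's RepairLooseJSON or missingArrayBracketsPattern: the regular expressions, the function name repairLooseJSONSketch, and the restriction to the "tool_calls" key are all assumptions. The object pattern accepts one level of nested {} only, and neither pattern is aware of braces, commas, or colons inside string values.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// objectPattern matches a {...} object that may contain one level of nested {...}.
const objectPattern = `\{(?:[^{}]|\{[^{}]*\})*\}`

var (
	// unquotedKey finds bare identifiers used as keys after '{' or ','.
	unquotedKey = regexp.MustCompile(`([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)`)
	// missingBrackets finds "tool_calls": followed by two or more
	// comma-separated objects that are not wrapped in [ ].
	missingBrackets = regexp.MustCompile(
		`("tool_calls"\s*:\s*)(` + objectPattern + `(?:\s*,\s*` + objectPattern + `)+)`)
)

// repairLooseJSONSketch quotes bare keys, then wraps a bracket-less object list.
func repairLooseJSONSketch(s string) string {
	s = unquotedKey.ReplaceAllString(s, `${1}"${2}"${3}`)
	return missingBrackets.ReplaceAllString(s, `${1}[${2}]`)
}

func main() {
	in := `{tool_calls: {name: "search", input: {q: "go"}}, {name: "open", input: {id: 1}}}`
	out := repairLooseJSONSketch(in)
	fmt.Println(out)
	fmt.Println("valid JSON:", json.Valid([]byte(out)))
}
```

Because Go's regexp package has no recursion, each extra nesting level needs another layer in the pattern. That is presumably why the commit limits itself to single-level nesting.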
…ds and safety limits
- Add support for multiple keywords: tool_calls, function.name:, [tool_call_history]
- Add OOM protection with search limits in extractToolCallObjects
- Add max scan length limit in extractJSONObject to prevent OOM on unclosed objects
- Update tool_sieve to handle more tool call patterns
- Add loose JSON repair in parseToolCallPayload for better error recovery
This improves DeepSeek tool call parsing robustness.
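The max-scan-length idea for unclosed objects could look something like the sketch below. It is a simplified illustration, not the PR's extractJSONObject. The function name extractObjectBounded, its signature, and the byte-based limit are assumptions.

```go
package main

import "fmt"

// extractObjectBounded returns the balanced {...} object that begins at
// s[start], scanning at most maxScan bytes. It reports false when s[start]
// is not '{', when the object is unclosed, or when the limit is reached first.
func extractObjectBounded(s string, start, maxScan int) (string, bool) {
	if start < 0 || start >= len(s) || s[start] != '{' {
		return "", false
	}
	depth := 0
	inString, escaped := false, false
	for i := start; i < len(s); i++ {
		if i-start >= maxScan {
			return "", false // give up instead of buffering without bound
		}
		c := s[i]
		switch {
		case escaped:
			escaped = false // this byte is consumed by the preceding backslash
		case inString:
			if c == '\\' {
				escaped = true
			} else if c == '"' {
				inString = false
			}
		case c == '"':
			inString = true
		case c == '{':
			depth++
		case c == '}':
			depth--
			if depth == 0 {
				return s[start : i+1], true
			}
		}
	}
	return "", false // input ended before the object closed
}

func main() {
	obj, ok := extractObjectBounded(`x {"a":{"b":"}"}} y`, 2, 100)
	fmt.Println(obj, ok)

	_, ok = extractObjectBounded(`{"a":"never closed`, 0, 8)
	fmt.Println(ok)
}
```

Braces inside string values are skipped, so a stray "}" in an argument does not end the object early. The limit bounds the work done per attempt when a stream never closes the object.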
- Add targeted test commands to TESTING.md for debugging tool call issues
- Add quick test commands reference in README.md
- Document specific test cases for DeepSeek tool call parsing
Summary
Fixes DeepSeek tool call parsing by adding support for nested JSON objects and automatic repair of missing array brackets.
Problem
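When returning tool calls, DeepSeek sometimes emits malformed JSON.
In these cases the tool calls are returned as plain text, so the client cannot recognize or execute them.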
Solution
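1. Upgrade the regular expression to support single-level nesting
2. Add the RepairLooseJSON function
3. Strengthen keyword detection
Supports multiple tool call syntaxes.
4. Add OOM protection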
Files Changed
Test Cases Added
Verification