feat(wafer/deepseek-v4-flash): add new models [bot]#1591

Merged
architkumar-truefoundry merged 3 commits into main from bot/add-wafer-deepseek-v4-flash-20260624-093519 on Jun 26, 2026

@models-bot (Bot) commented Jun 24, 2026

Contributor

Auto-generated by model-addition-agent for wafer/deepseek-v4-flash.


Note

Low Risk
Declarative provider metadata only; no application or routing logic changes.

Overview
Adds a new Wafer provider catalog entry for deepseek-v4-flash, registering serverless chat with 1M context, token pricing (including cache-read), and capabilities such as function calling, structured/JSON output, prompt caching, and thinking mode.

The model is marked active with chat and responses supported modes; sources point at Wafer docs and the pass API models list.

Reviewed by Cursor Bugbot for commit f481449.

@github-actions

Contributor

/test-models

@harshiv-26

Collaborator

Gateway test setup failed for wafer

The test job aborted before any tests ran. Error:

Failed to apply virtual-account 'gateway-tester-v2-dcf5a550-c': {"statusCode":500,"message":"Error applying the manifest of type virtual-account"}

This is usually a transient infra issue (catalogue build, TrueFoundry API, GitHub archive). Try /test-models again, or check the job logs.
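
Since the setup error is described as transient, a test harness could retry the setup step a few times before reporting failure. The following is only a sketch under that assumption: `retry_transient`, its parameters, and the backoff values are hypothetical and not taken from the gateway test job.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_transient(
    step: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `step`, retrying with exponential backoff whenever it raises."""
    for attempt in range(attempts):
        try:
            return step()
        except Exception:
            # Give up and surface the error once the last attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real delays. A real job would likely retry only on errors known to be transient, such as the 500 response shown above.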

@github-actions

Contributor

/test-models

@harshiv-26

Collaborator

Gateway test results

  • Total: 10
  • Passed: 7
  • Failed: 3
  • Validation failed: 0
  • Errored: 0
  • Skipped: 0
  • Success rate: 70.0%

Provider: wafer
Model: deepseek-v4-flash
Scenarios:
  • success: structured-output:stream, structured-output, json-output:stream, json-output, tool-call, reasoning, params
  • failure: reasoning:stream, params:stream, tool-call:stream
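
The summary figures can be re-derived from the scenario lists. This short sketch recomputes the success rate and splits the results by streaming versus non-streaming scenarios. The scenario names are copied from the report; the helper name `success_rate` is illustrative.

```python
succeeded = [
    "structured-output:stream", "structured-output", "json-output:stream",
    "json-output", "tool-call", "reasoning", "params",
]
failed = ["reasoning:stream", "params:stream", "tool-call:stream"]


def success_rate(passed: int, total: int) -> float:
    """Return the pass percentage, rounded to one decimal place."""
    if total == 0:
        return 0.0
    return round(100.0 * passed / total, 1)


total = len(succeeded) + len(failed)
print(f"Success rate: {success_rate(len(succeeded), total)}%")

# Split by the ':stream' suffix to see where the failures cluster.
stream_ok = [s for s in succeeded if s.endswith(":stream")]
stream_all = [s for s in succeeded + failed if s.endswith(":stream")]
print(f"Streaming: {len(stream_ok)}/{len(stream_all)} passed")
```

All three failures are streaming scenarios, while every non-streaming scenario passed.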
Failures (3)

wafer/deepseek-v4-flash — reasoning:stream (failure)

Error
Traceback (most recent call last):
  File "/tmp/tmpefs4m2d9/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.PermissionDeniedError: Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-wafer/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)
_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")
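
The reasoning check in the snippet above can be pulled into a helper and exercised without a network call. This is a sketch, not the test suite's actual code: `chunk_has_reasoning` is a hypothetical name, and `SimpleNamespace` objects stand in for the SDK's stream chunks. It also guards against `reasoning_tokens` being present but `None`, on the assumption that the field is optional.

```python
from types import SimpleNamespace


def chunk_has_reasoning(chunk) -> bool:
    """Return True if a stream chunk carries reasoning text or reasoning tokens."""
    choices = getattr(chunk, "choices", None) or []
    if choices:
        delta = getattr(choices[0], "delta", None)
        for field in ("reasoning_content", "reasoning"):
            if getattr(delta, field, None) is not None:
                return True
    usage = getattr(chunk, "usage", None)
    details = getattr(usage, "completion_tokens_details", None)
    # Coerce a missing or None token count to 0 before comparing.
    return (getattr(details, "reasoning_tokens", None) or 0) > 0


# Stand-in chunks for illustration only.
text_only = SimpleNamespace(
    choices=[SimpleNamespace(delta=SimpleNamespace(content="27"))], usage=None
)
with_reasoning = SimpleNamespace(
    choices=[SimpleNamespace(delta=SimpleNamespace(content=None, reasoning_content="3^3 = 27"))],
    usage=None,
)
print(chunk_has_reasoning(text_only), chunk_has_reasoning(with_reasoning))
```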

wafer/deepseek-v4-flash — params:stream (failure)

Error
Traceback (most recent call last):
  File "/tmp/tmpz9vkvqby/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.PermissionDeniedError: Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-wafer/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
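
Every failure in this report carries the same 403 payload with `error_origin_level` set to `authorization`, so the requests were rejected at the gateway's authorization step. A report generator could flag such cases separately from model-behavior failures. The sketch below parses the message format seen in the tracebacks; the function names are hypothetical, and parsing the string form is an assumption made to keep the example self-contained.

```python
import ast


def parse_gateway_error(message: str) -> tuple[int, dict]:
    """Split 'Error code: NNN - {...}' into the status code and payload dict."""
    prefix, _, payload_text = message.partition(" - ")
    status = int(prefix.removeprefix("Error code: "))
    # The payload is printed as a Python dict literal, so literal_eval can read it.
    return status, ast.literal_eval(payload_text)


def is_authorization_failure(message: str) -> bool:
    """True when the gateway rejected the call at the authorization step."""
    status, payload = parse_gateway_error(message)
    return status == 403 and payload.get("error_origin_level") == "authorization"


# Message text copied from the traceback in this report.
sample = (
    "Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f "
    "is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', "
    "'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model "
    "test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', "
    "'code': '403'}, 'error_origin_level': 'authorization'}"
)
print(is_authorization_failure(sample))
```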

wafer/deepseek-v4-flash — tool-call:stream (failure)

Error
Traceback (most recent call last):
  File "/tmp/tmpf91smyi7/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.PermissionDeniedError: Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}
Code snippet
from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-wafer/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)
_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")
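
The snippet above prints argument fragments as they arrive. To validate the final call, the fragments can be merged by their `index` and the joined arguments parsed as JSON. This is a sketch with hypothetical names (`collect_tool_calls`, `frag`), using `SimpleNamespace` stand-ins instead of real stream deltas; the fragment boundaries are made up for illustration.

```python
import json
from types import SimpleNamespace


def collect_tool_calls(deltas) -> list[dict]:
    """Merge streamed tool-call fragments into complete calls, ordered by index."""
    calls: dict[int, dict] = {}
    for delta in deltas:
        for tc in getattr(delta, "tool_calls", None) or []:
            call = calls.setdefault(tc.index, {"name": "", "arguments": ""})
            fn = getattr(tc, "function", None)
            if fn is not None:
                # The name usually arrives once; arguments arrive in pieces.
                call["name"] += getattr(fn, "name", None) or ""
                call["arguments"] += getattr(fn, "arguments", None) or ""
    return [calls[i] for i in sorted(calls)]


def frag(index, name=None, arguments=None):
    """Build a stand-in delta holding one tool-call fragment."""
    function = SimpleNamespace(name=name, arguments=arguments)
    return SimpleNamespace(tool_calls=[SimpleNamespace(index=index, function=function)])


deltas = [
    frag(0, name="get_weather", arguments=""),
    frag(0, arguments='{"location": '),
    frag(0, arguments='"London"}'),
]
calls = collect_tool_calls(deltas)
print(calls[0]["name"], json.loads(calls[0]["arguments"]))
```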
Successes (7)

wafer/deepseek-v4-flash — structured-output:stream (success)

Output
{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}

VALIDATION: structured-output stream SUCCESS

wafer/deepseek-v4-flash — structured-output (success)

Output
{"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}
VALIDATION: structured-output SUCCESS

wafer/deepseek-v4-flash — json-output:stream (success)

Output
{
  "colors": [
    {
      "name": "Crimson",
      "hex": "#DC143C"
    },
    {
      "name": "Teal",
      "hex": "#008080"
    },
    {
      "na
... (truncated, 87 chars omitted)

wafer/deepseek-v4-flash — json-output (success)

Output
{
  "colors": [
    {
      "name": "Red",
      "hex": "#FF0000"
    },
    {
      "name": "Green",
      "hex": "#00FF00"
    },
    {
      "name"
... (truncated, 77 chars omitted)

wafer/deepseek-v4-flash — tool-call (success)

Output
Function: get_weather
Arguments: {"location": "London"}
VALIDATION: tool-call SUCCESS

wafer/deepseek-v4-flash — reasoning (success)

Output
The expression \(3^3^3^3\) is interpreted as a power tower with right-associative exponentiation, meaning it is computed from the top down:

\[
3^{3^{
... (truncated, 458 chars omitted)

wafer/deepseek-v4-flash — params (success)

Output
The capital of France is Paris.

@architkumar-truefoundry architkumar-truefoundry merged commit c3fdfad into main Jun 26, 2026
8 checks passed
@architkumar-truefoundry architkumar-truefoundry deleted the bot/add-wafer-deepseek-v4-flash-20260624-093519 branch June 26, 2026 08:02