feat(wafer/deepseek-v4-flash): add new models [bot] by models-bot[bot] · Pull Request #1591 · truefoundry/models

models-bot · 2026-06-24T09:35:21Z

Auto-generated by model-addition-agent for wafer/deepseek-v4-flash.

Note

Low Risk
Declarative provider metadata only; no application or routing logic changes.

Overview
Adds a new Wafer provider catalog entry for deepseek-v4-flash, registering serverless chat with 1M context, token pricing (including cache-read), and capabilities such as function calling, structured/JSON output, prompt caching, and thinking mode.

The model is marked active with chat and responses supported modes; sources point at Wafer docs and the pass API models list.

Reviewed by Cursor Bugbot for commit f481449. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-06-24T10:09:12Z

/test-models

harshiv-26 · 2026-06-24T10:10:01Z

Gateway test setup failed for `wafer`

The test job aborted before any tests ran. Error:

Failed to apply virtual-account 'gateway-tester-v2-dcf5a550-c': {"statusCode":500,"message":"Error applying the manifest of type virtual-account"}

This is usually a transient infra issue (catalogue build, TrueFoundry API, GitHub archive). Try /test-models again, or check the job logs.

github-actions · 2026-06-26T07:40:43Z

/test-models

harshiv-26 · 2026-06-26T07:42:55Z

Gateway test results

Total: 10
Passed: 7
Failed: 3
Validation failed: 0
Errored: 0
Skipped: 0
Success rate: 70.0%

Provider	Model	Scenarios
`wafer`	`deepseek-v4-flash`	success: structured-output:stream, structured-output, json-output:stream, json-output, tool-call, reasoning, params; failure: reasoning:stream, params:stream, tool-call:stream
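
For readers checking the summary figures, the tally can be reproduced from the scenario lists in the table. The following is a minimal sketch; the helper name `success_rate` is hypothetical and not part of the test harness, and the scenario names are copied from the table above.

```python
# Scenario names copied from the results table above.
passed = [
    "structured-output:stream", "structured-output",
    "json-output:stream", "json-output",
    "tool-call", "reasoning", "params",
]
failed = ["reasoning:stream", "params:stream", "tool-call:stream"]


def success_rate(n_passed: int, n_total: int) -> float:
    """Return the pass percentage, rounded to one decimal place."""
    return round(100.0 * n_passed / n_total, 1)


total = len(passed) + len(failed)
rate = success_rate(len(passed), total)

# Count how many of the failed scenarios are streaming variants.
streaming_failures = [s for s in failed if s.endswith(":stream")]

print(f"Total: {total}, Passed: {len(passed)}, Failed: {len(failed)}")
print(f"Success rate: {rate}%")
print(f"Streaming failures: {len(streaming_failures)} of {len(failed)}")
```

All three failed scenarios are streaming variants, although two other streaming scenarios (structured-output:stream and json-output:stream) passed in the same run.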

Failures (3)

wafer/deepseek-v4-flash — reasoning:stream (failure)

Error

Traceback (most recent call last):
  File "/tmp/tmpefs4m2d9/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.PermissionDeniedError: Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-wafer/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "How to calculate 3^3^3^3? Think step by step and show all reasoning."},
    ],
    reasoning_effort="medium",
    stream=True,
)
_reasoning_detected = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if getattr(delta, "reasoning_content", None) is not None:
            _reasoning_detected = True
        if getattr(delta, "reasoning", None) is not None:
            _reasoning_detected = True

    _usage = getattr(chunk, "usage", None)
    if _usage is not None:
        _details = getattr(_usage, "completion_tokens_details", None)
        if _details and getattr(_details, "reasoning_tokens", 0) > 0:
            _reasoning_detected = True

if not _reasoning_detected:
    raise Exception("VALIDATION FAILED: reasoning stream - no reasoning information in stream")
print("\nVALIDATION: reasoning stream SUCCESS")

wafer/deepseek-v4-flash — params:stream (failure)

Error

Traceback (most recent call last):
  File "/tmp/tmpz9vkvqby/snippet.py", line 5, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.PermissionDeniedError: Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

response = client.chat.completions.create(
    model="test-v2-wafer/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
    stream=True,
)

for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)

wafer/deepseek-v4-flash — tool-call:stream (failure)

Error

Traceback (most recent call last):
  File "/tmp/tmpf91smyi7/snippet.py", line 27, in <module>
    response = client.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 1147, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.PermissionDeniedError: Error code: 403 - {'status': 'failure', 'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'error': {'message': 'User gateway-tester-v2-d2795429-f is not authorized to access model test-v2-wafer/deepseek-v4-flash or model does not exist', 'type': 'AuthorizationError', 'code': '403'}, 'error_origin_level': 'authorization'}

Code snippet

from openai import OpenAI

client = OpenAI(api_key="***", base_url="https://internal.devtest.truefoundry.tech/api/llm")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name, e.g. London",
                    },
                },
                "required": ["location"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]

response = client.chat.completions.create(
    model="test-v2-wafer/deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "Use the get_weather tool to check the weather in London. You must call the tool, do not respond with plain text."},
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
)
_tool_calls_made = False
for chunk in response:
    if chunk.choices and len(chunk.choices) > 0:
        delta = chunk.choices[0].delta
        if delta.content is not None:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            _tool_calls_made = True
            for _tc in delta.tool_calls:
                if _tc.function:
                    print(_tc.function.arguments or "", end="", flush=True)

if not _tool_calls_made:
    raise Exception("VALIDATION FAILED: tool-call stream - no tool calls received")
print("\nVALIDATION: tool-call stream SUCCESS")
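
All three failures above raised the same 403 at the `client.chat.completions.create` call, so none of the streaming validations in the snippets actually ran. A triage helper could separate this kind of access error from other failures. The sketch below is an assumption about how that might look: the function name `classify_failure` is hypothetical, and the keys it reads are only those visible in the error body logged above.

```python
def classify_failure(payload: dict) -> str:
    """Label a gateway error payload as an access problem or something else.

    The keys read here ('error_origin_level', 'error', 'type', 'code') are
    taken from the 403 body in the tracebacks above; other payloads may differ.
    """
    error = payload.get("error") or {}
    if payload.get("error_origin_level") == "authorization":
        return "access"
    if error.get("type") == "AuthorizationError" or str(error.get("code")) == "403":
        return "access"
    return "other"


# Error body copied from the tracebacks above.
msg = (
    "User gateway-tester-v2-d2795429-f is not authorized to access model "
    "test-v2-wafer/deepseek-v4-flash or model does not exist"
)
payload = {
    "status": "failure",
    "message": msg,
    "error": {"message": msg, "type": "AuthorizationError", "code": "403"},
    "error_origin_level": "authorization",
}

print(classify_failure(payload))
```

A result of "access" here says only that the request was rejected at the authorization level. As the message itself notes, that covers both a missing permission and a model that does not exist.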

Successes (7)

wafer/deepseek-v4-flash — structured-output:stream (success)

Output

{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}

VALIDATION: structured-output stream SUCCESS

wafer/deepseek-v4-flash — structured-output (success)

Output

{"name":"Science Fair","date":"Friday","participants":["Alice","Bob"]}
VALIDATION: structured-output SUCCESS

wafer/deepseek-v4-flash — json-output:stream (success)

Output

{
  "colors": [
    {
      "name": "Crimson",
      "hex": "#DC143C"
    },
    {
      "name": "Teal",
      "hex": "#008080"
    },
    {
      "na
... (truncated, 87 chars omitted)

wafer/deepseek-v4-flash — json-output (success)

Output

{
  "colors": [
    {
      "name": "Red",
      "hex": "#FF0000"
    },
    {
      "name": "Green",
      "hex": "#00FF00"
    },
    {
      "name"
... (truncated, 77 chars omitted)

wafer/deepseek-v4-flash — tool-call (success)

Output

Function: get_weather
Arguments: {"location": "London"}
VALIDATION: tool-call SUCCESS

wafer/deepseek-v4-flash — reasoning (success)

Output

The expression \(3^3^3^3\) is interpreted as a power tower with right-associative exponentiation, meaning it is computed from the top down:

\[
3^{3^{
... (truncated, 458 chars omitted)

wafer/deepseek-v4-flash — params (success)

Output

The capital of France is Paris.

Truefoundry Models Bot added 2 commits June 24, 2026 09:35

feat(wafer/deepseek-v4-flash): add new models [bot]

c35ea49

feat(wafer): update model YAMLs [bot]

4642da6

Merge branch 'main' into bot/add-wafer-deepseek-v4-flash-20260624-093519

f481449

architkumar-truefoundry approved these changes Jun 26, 2026

architkumar-truefoundry merged commit c3fdfad into main Jun 26, 2026
8 checks passed

architkumar-truefoundry deleted the bot/add-wafer-deepseek-v4-flash-20260624-093519 branch June 26, 2026 08:02

architkumar-truefoundry merged 3 commits into main from bot/add-wafer-deepseek-v4-flash-20260624-093519
