Docker Model Runner HTTP API not responsive on Windows - "Empty reply from server" #748

@Neox92

Environment:

- Windows 11
- Docker Desktop v4.64.0

Problem: The Docker Model Runner HTTP API server does not respond to HTTP requests, even though all the relevant settings are enabled:

- ✅ Enable Docker Model Runner
- ✅ Enable host-side TCP support (Port 12434)
- ✅ Enable GPU-backed inference

The model loads correctly into the GPU (dedicated VRAM is being used).

The model runs fine through the CLI or Docker Desktop (docker model run gpt-oss "test" works), but curl http://localhost:12434/engines/v1/models returns "Empty reply from server".

Workaround discovered: When I changed the port from 12434 to any other port (e.g., 12357), curl http://localhost:12357/engines/v1/models works immediately without any issues.
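For anyone who wants to compare the two ports without curl, here is a minimal Python sketch. It assumes only the endpoint path and the port numbers from this report. The `probe` helper and its status labels are hypothetical names made up for illustration and are not part of Docker Model Runner. It separates "connection refused" (nothing listening) from "empty reply" (something accepts the TCP connection and then closes it without sending an HTTP response), which is the case curl reports here.

```python
import http.client
import socket


def probe(port, host="localhost", path="/engines/v1/models", timeout=5.0):
    """Send one HTTP GET and classify what happens (hypothetical helper)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        return f"http {resp.status}"
    except http.client.RemoteDisconnected:
        # Peer closed the connection without sending any response bytes;
        # this is the situation curl describes as "Empty reply from server".
        return "empty reply"
    except ConnectionRefusedError:
        # Nothing is accepting connections on this port.
        return "refused"
    except ConnectionResetError:
        return "reset"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    finally:
        conn.close()


if __name__ == "__main__":
    # 12434 is the configured port from this report; 12357 is the port
    # that worked after changing the setting.
    for port in (12434, 12357):
        print(port, probe(port))
```

If 12434 prints "empty reply" while the alternate port prints "http 200", that would match the behavior described above. It does not by itself say which process is answering on 12434.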
