Clarify the relationship between HttpOptions.timeout() (X-Server-Timeout header) and OkHttp callTimeout #867

@LukeYu22

Description

I'm using google-genai:1.32.0 with Vertex AI for image generation (models.generateContent with responseModalities=IMAGE). I need help understanding the interaction between two timeout mechanisms in the SDK.

Two timeout paths in the SDK

1. Client-level HttpOptions.timeout() → OkHttp callTimeout

Set during client initialization:

HttpOptions httpOptions = HttpOptions.builder()
        .timeout(300000)  // 300s
        .retryOptions(HttpRetryOptions.builder().attempts(2).build())
        .build();

Client client = Client.builder()
        .project("my-project")
        .location("global")
        .httpOptions(httpOptions)
        .build();

From the source code (ApiClient.createHttpClient), this sets OkHttpClient.Builder.callTimeout(Duration.ofMillis(timeout)), which covers the entire call lifecycle, including all retry attempts.

2. Per-request HttpOptions.timeout() → X-Server-Timeout HTTP header

Set on GenerateContentConfig:

GenerateContentConfig config = GenerateContentConfig.builder()
        .responseModalities("IMAGE")
        .httpOptions(HttpOptions.builder()
                .timeout(180000)  // 180s
                .build())
        .build();

client.models.generateContent(model, prompt, config);

From the source code (ApiClient.getTimeoutHeader), this converts the timeout from milliseconds to seconds (timeout / 1000) and sends it as the HTTP header X-Server-Timeout: 180. It does NOT override the OkHttp callTimeout.
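To make the conversion concrete, here is a minimal plain-Java sketch of how I understand it. The class and method names are hypothetical and this is not the SDK's code; integer division is an assumption based on the "timeout / 1000" I saw, and the SDK may round differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ServerTimeoutHeaderSketch {
    // Hypothetical helper mirroring the "timeout / 1000" conversion described above.
    // Assumption: integer division; a null timeout means no header is sent.
    static Map<String, String> timeoutHeader(Integer timeoutMillis) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (timeoutMillis != null) {
            headers.put("X-Server-Timeout", String.valueOf(timeoutMillis / 1000));
        }
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(timeoutHeader(180000)); // {X-Server-Timeout=180}
        System.out.println(timeoutHeader(null));   // {}
    }
}
```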

My questions

Q1: What does X-Server-Timeout do on the server side?

Does the Vertex AI server respect this header? If so:

  • Does it abort image generation processing after the specified seconds?
  • What HTTP status code / error does it return when server-side timeout is reached?
  • Is this behavior the same for all model types (text, image, video)?

Q2: How do these two timeouts interact with RetryInterceptor?

The callTimeout covers all retry attempts combined (not per-attempt). With attempts=2:

Scenario: X-Server-Timeout: 180, callTimeout: 300s

t=0s     attempt 1 starts
t=180s   server returns timeout error (respecting X-Server-Timeout: 180)
t=182s   attempt 2 starts (after ~2s backoff)
         BUT only 118s remains in callTimeout budget
t=300s   callTimeout fires → InterruptedIOException: timeout
         attempt 2 is killed (never had a chance to complete)

Is this the expected behavior? It seems like the second attempt can never succeed if the first attempt exhausts most of the callTimeout budget.
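The budget arithmetic in this scenario can be sketched in plain Java (no SDK or OkHttp calls; the names are hypothetical). It assumes every earlier attempt runs for the full server timeout and that the backoff is a fixed 2s, which is only my approximation of what RetryInterceptor does.

```java
class CallTimeoutBudgetSketch {
    // Seconds of callTimeout budget left when the given attempt (1-based) starts,
    // assuming each earlier attempt ran for serverTimeoutSec and was followed by backoffSec.
    static long remainingBudgetSeconds(long callTimeoutSec, long serverTimeoutSec,
                                       long backoffSec, int attempt) {
        long elapsed = (attempt - 1) * (serverTimeoutSec + backoffSec);
        return Math.max(0, callTimeoutSec - elapsed);
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 2; attempt++) {
            long left = remainingBudgetSeconds(300, 180, 2, attempt);
            System.out.printf("attempt %d: %ds of callTimeout budget left%n", attempt, left);
        }
    }
}
```

With callTimeout=300s and X-Server-Timeout: 180, attempt 2 starts with 118s left, which is less than the 180s the server is allowed to take.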

Q3: What is the recommended configuration?

For long-running operations like image generation (which can take 30-120s), what is the recommended way to configure timeouts?

Specifically:

  • Should we set a per-request timeout at all, or just rely on the client-level callTimeout?
  • If we want retries (attempts=2), should callTimeout be at least X-Server-Timeout × attempts + backoff?
  • Is there a way to set a per-attempt timeout instead of a per-call timeout?
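For the second bullet, the lower bound I have in mind is attempts × per-attempt timeout + (attempts − 1) × backoff. A small sketch of that calculation follows; the helper name is hypothetical, and the fixed backoff is an assumption (the real backoff may be exponential or jittered).

```java
import java.time.Duration;

class MinCallTimeoutSketch {
    // Smallest callTimeout that lets every attempt run for the full per-attempt
    // (server-side) timeout, assuming a fixed backoff between attempts.
    static Duration minCallTimeout(Duration perAttempt, int attempts, Duration backoff) {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be >= 1");
        }
        return perAttempt.multipliedBy(attempts).plus(backoff.multipliedBy(attempts - 1));
    }

    public static void main(String[] args) {
        Duration min = minCallTimeout(Duration.ofSeconds(180), 2, Duration.ofSeconds(2));
        System.out.println(min.getSeconds() + "s"); // 362s
    }
}
```

If this reasoning is right, the value passed to the client-level HttpOptions.timeout() would need to be at least 362000 ms for attempts=2 with a 180s per-request timeout.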

Q4: What happens when timeout is not set?

  • Client-level: What is the default callTimeout? From bytecode analysis, it appears to only be set when HttpOptions.timeout() is present.
  • Per-request: If no timeout is set on per-request HttpOptions, no X-Server-Timeout header is sent. What is the server's default processing timeout for image generation?

Environment

  • SDK version: google-genai:1.32.0
  • API: Vertex AI (not Gemini Developer API)
  • Model: gemini-2.5-flash-image (image generation)
  • Use case: High-concurrency image generation (~120 concurrent requests)

Stack trace (when callTimeout is hit)

GenAiIOException: Failed to execute HTTP request.
    at c.g.g.HttpApiClient.executeRequest(HttpApiClient.java:83)
Caused by: java.io.InterruptedIOException: timeout
    at okhttp3.internal.connection.RealCall.timeoutExit(RealCall.kt:398)
    at okhttp3.internal.connection.RealCall.callDone(RealCall.kt:360)
Caused by: java.io.IOException: Canceled
    at okhttp3.internal.connection.RealCall.noMoreExchanges$okhttp(RealCall.kt:325)
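For reference, this is roughly how a caller could tell this failure apart from other errors by walking the cause chain. It is a heuristic sketch in plain Java: the helper name is hypothetical, RuntimeException stands in for GenAiIOException, and matching on the "timeout" message is an assumption based only on the trace above (other timeouts may surface similarly).

```java
import java.io.IOException;
import java.io.InterruptedIOException;

class CallTimeoutDetectionSketch {
    // Returns true if any cause in the chain is an InterruptedIOException with the
    // message "timeout", as in the stack trace above. Heuristic only.
    static boolean looksLikeCallTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof InterruptedIOException && "timeout".equals(c.getMessage())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuild the chain from the trace; RuntimeException is a stand-in for GenAiIOException.
        InterruptedIOException timeout = new InterruptedIOException("timeout");
        timeout.initCause(new IOException("Canceled"));
        Throwable wrapped = new RuntimeException("Failed to execute HTTP request.", timeout);

        System.out.println(looksLikeCallTimeout(wrapped));                     // true
        System.out.println(looksLikeCallTimeout(new IOException("Canceled"))); // false
    }
}
```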

What I've tried

| Config | Result |
| --- | --- |
| callTimeout=60s, attempts=5 (default) | Fast overall, but some image generation requests fail (timeout too short) |
| callTimeout=180s, attempts=2, per-request timeout=180s | Still times out: attempt 2 never has enough budget |
| callTimeout=300s, attempts=2, per-request timeout=180s | Same issue: 180 + 2 + 180 = 362 > 300 |

Thank you for any guidance!

Labels

  • priority: p3 (Desirable enhancement or fix. May not be included in next release.)
  • status: awaiting user response
  • type: question (Request for information or clarification. Not an issue.)
