Description
I'm using google-genai:1.32.0 with Vertex AI for image generation (models.generateContent with responseModalities=IMAGE). I need help understanding the interaction between two timeout mechanisms in the SDK.
Two timeout paths in the SDK
1. Client-level HttpOptions.timeout() → OkHttp callTimeout
Set during client initialization:
```java
import com.google.genai.Client;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.HttpRetryOptions;

// Client-level options: the timeout is in milliseconds.
HttpOptions httpOptions = HttpOptions.builder()
    .timeout(300000) // 300s
    .retryOptions(HttpRetryOptions.builder().attempts(2).build())
    .build();
Client client = Client.builder()
    .project("my-project")
    .location("global")
    .httpOptions(httpOptions)
    .build();
```
From the source code (`ApiClient.createHttpClient`), this sets `OkHttpClient.Builder.callTimeout(Duration.ofMillis(timeout))`, which covers the entire call lifecycle, including all retry attempts.
2. Per-request HttpOptions.timeout() → X-Server-Timeout HTTP header
Set on GenerateContentConfig:
```java
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.HttpOptions;

// Per-request options: the timeout is in milliseconds.
GenerateContentConfig config = GenerateContentConfig.builder()
    .responseModalities("IMAGE")
    .httpOptions(HttpOptions.builder()
        .timeout(180000) // 180s
        .build())
    .build();
client.models.generateContent(model, prompt, config);
```
From the source code (`ApiClient.getTimeoutHeader`), this computes `timeout / 1000` and sends the result as an `X-Server-Timeout: 180` HTTP header. It does NOT override the OkHttp `callTimeout`.
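To make the conversion concrete, here is a minimal sketch of the millisecond-to-seconds mapping as I understand it. `toHeaderValue` is a hypothetical helper written for this issue, not the SDK's actual code, and the use of integer division (rather than rounding up) is an assumption on my part.

```java
import java.util.Optional;

class ServerTimeoutHeader {
    // Hypothetical helper (not SDK code): maps a millisecond timeout to the
    // value that would be sent in the X-Server-Timeout header.
    static Optional<String> toHeaderValue(Integer timeoutMillis) {
        if (timeoutMillis == null) {
            return Optional.empty(); // no per-request timeout, so no header
        }
        // Assumption: plain integer division, so sub-second remainders are dropped.
        return Optional.of(String.valueOf(timeoutMillis / 1000));
    }

    public static void main(String[] args) {
        System.out.println(toHeaderValue(180000).orElse("<no header>")); // prints 180
        System.out.println(toHeaderValue(null).orElse("<no header>"));   // prints <no header>
    }
}
```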
My questions
Q1: What does X-Server-Timeout do on the server side?
Does the Vertex AI server respect this header? If so:
- Does it abort image generation processing after the specified seconds?
- What HTTP status code / error does it return when server-side timeout is reached?
- Is this behavior the same for all model types (text, image, video)?
Q2: How do these two timeouts interact with RetryInterceptor?
The callTimeout covers all retry attempts combined (not per-attempt). With attempts=2:
Scenario: `X-Server-Timeout: 180`, `callTimeout`: 300s
```
t=0s    attempt 1 starts
t=180s  server returns timeout error (respecting X-Server-Timeout: 180)
t=182s  attempt 2 starts (after ~2s backoff)
        BUT only 118s remains in callTimeout budget
t=300s  callTimeout fires → InterruptedIOException: timeout
        attempt 2 is killed (never had a chance to complete)
```
Is this the expected behavior? It seems like the second attempt can never succeed if the first attempt exhausts most of the callTimeout budget.
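The budget arithmetic in this scenario can be reproduced with a small JDK-only simulation. This is a sketch under stated assumptions: every failed attempt runs for the full server timeout, the backoff is a fixed 2s, and `availablePerAttempt` is a name I made up for illustration. It does not call the SDK or OkHttp.

```java
class CallBudgetSimulation {
    // Returns, for each attempt, how many seconds it can run before either the
    // server timeout or the shared call budget cuts it off.
    static long[] availablePerAttempt(long callTimeoutSec, long serverTimeoutSec,
                                      long backoffSec, int attempts) {
        long[] available = new long[attempts];
        long start = 0; // seconds since the call began
        for (int i = 0; i < attempts; i++) {
            long remaining = Math.max(0, callTimeoutSec - start);
            available[i] = Math.min(serverTimeoutSec, remaining);
            // Assumption: the attempt uses the whole server timeout, then backs off.
            start += serverTimeoutSec + backoffSec;
        }
        return available;
    }

    public static void main(String[] args) {
        long[] available = availablePerAttempt(300, 180, 2, 2);
        for (int i = 0; i < available.length; i++) {
            System.out.println("attempt " + (i + 1) + " can run for " + available[i] + "s");
        }
        // prints:
        // attempt 1 can run for 180s
        // attempt 2 can run for 118s
    }
}
```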
Q3: What is the recommended configuration?
For long-running operations like image generation (which can take 30-120s), what is the recommended way to configure timeouts?
Specifically:
- Should we set the per-request `timeout` at all, or just rely on the client-level `callTimeout`?
- If we want retries (`attempts=2`), should `callTimeout` be at least `X-Server-Timeout × attempts + backoff`?
- Is there a way to set a per-attempt timeout instead of a per-call timeout?
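To illustrate what I mean by a per-attempt timeout, here is a JDK-only sketch that bounds each attempt separately and retries at the application level. `callWithRetries` is a hypothetical helper, not an SDK feature; the task stands in for a call such as `client.models.generateContent(...)`, and the fixed backoff is a simplifying assumption.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

class PerAttemptTimeout {
    // Runs the task up to `attempts` times; each attempt gets its own time limit.
    static <T> T callWithRetries(ExecutorService pool, Callable<T> task, int attempts,
                                 Duration perAttempt, Duration backoff) throws Exception {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be >= 1");
        }
        Exception last = null;
        for (int i = 1; i <= attempts; i++) {
            Future<T> future = pool.submit(task);
            try {
                return future.get(perAttempt.toMillis(), TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the attempt that ran too long
                last = e;
            } catch (ExecutionException e) {
                // Unwrap the task's own failure when it is a regular exception.
                last = (e.getCause() instanceof Exception) ? (Exception) e.getCause() : e;
            }
            if (i < attempts) {
                Thread.sleep(backoff.toMillis()); // fixed backoff (assumption)
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger calls = new AtomicInteger();
        try {
            String result = PerAttemptTimeout.<String>callWithRetries(pool, () -> {
                if (calls.incrementAndGet() == 1) {
                    Thread.sleep(2_000); // first attempt hangs past its limit
                }
                return "ok";
            }, 2, Duration.ofMillis(200), Duration.ofMillis(50));
            System.out.println(result + " after " + calls.get() + " attempts");
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With this shape, the first attempt running long no longer eats into the second attempt's time. Whether the SDK offers an equivalent built-in option is exactly what I'm asking.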
Q4: What happens when timeout is not set?
- Client-level: What is the default `callTimeout`? From bytecode analysis, it appears to be set only when `HttpOptions.timeout()` is present.
- Per-request: If no `timeout` is set on the per-request `HttpOptions`, no `X-Server-Timeout` header is sent. What is the server's default processing timeout for image generation?
Environment
- SDK version: `google-genai:1.32.0`
- API: Vertex AI (not Gemini Developer API)
- Model: `gemini-2.5-flash-image` (image generation)
- Use case: High-concurrency image generation (~120 concurrent requests)
Stack trace (when callTimeout is hit)
```
GenAiIOException: Failed to execute HTTP request.
    at c.g.g.HttpApiClient.executeRequest(HttpApiClient.java:83)
Caused by: java.io.InterruptedIOException: timeout
    at okhttp3.internal.connection.RealCall.timeoutExit(RealCall.kt:398)
    at okhttp3.internal.connection.RealCall.callDone(RealCall.kt:360)
Caused by: java.io.IOException: Canceled
    at okhttp3.internal.connection.RealCall.noMoreExchanges$okhttp(RealCall.kt:325)
```
What I've tried
| Config | Result |
|---|---|
| `callTimeout=60s`, `attempts=5` (default) | Fast overall, but some image generation requests fail (too short) |
| `callTimeout=180s`, `attempts=2`, per-request `timeout=180s` | Still times out: attempt 2 never has enough budget |
| `callTimeout=300s`, `attempts=2`, per-request `timeout=180s` | Same issue: 180 + 2 + 180 = 362 > 300 |
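
The arithmetic behind the last row can be written as a small helper. This is only a sketch of the budget formula I'm guessing at (per-attempt timeout × attempts, plus one backoff between consecutive attempts); `minimumCallTimeout` is a hypothetical name, and the fixed 2s backoff is an assumption since the real retry backoff may be exponential or jittered.

```java
import java.time.Duration;

class CallTimeoutBudget {
    // Smallest call-level timeout that lets every attempt run for its full
    // per-attempt limit, with one backoff pause between consecutive attempts.
    static Duration minimumCallTimeout(Duration perAttempt, int attempts, Duration backoff) {
        if (attempts < 1) {
            throw new IllegalArgumentException("attempts must be >= 1");
        }
        return perAttempt.multipliedBy(attempts).plus(backoff.multipliedBy(attempts - 1));
    }

    public static void main(String[] args) {
        Duration needed = minimumCallTimeout(Duration.ofSeconds(180), 2, Duration.ofSeconds(2));
        System.out.println(needed.getSeconds() + "s"); // prints 362s
    }
}
```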
Thank you for any guidance!