Add streaming task log support to BaseExecutor and FileTaskHandler #69299
Open
jason810496 wants to merge 1 commit into
Reading a running task's log through an executor materializes the whole log in the API server before the bounded LogStreamAccumulator can bound memory, so large logs spike the API server heap. This adds an interface executors can implement to stream log lines lazily instead.
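The opt-in and fallback behavior can be illustrated with a minimal sketch. The names supports_streaming_logs, get_streaming_task_log and get_task_log come from this PR, but the classes below are stand-ins rather than Airflow's real BaseExecutor or FileTaskHandler. The method signatures, the simplified list-of-lines return type, and the helper read_executor_log are assumptions made for illustration only.

```python
from collections.abc import Iterator


class ExecutorSketch:
    """Stand-in for an executor base class (NOT Airflow's BaseExecutor)."""

    # Executors opt in by overriding this flag; the default stays False.
    supports_streaming_logs: bool = False

    def get_task_log(self, ti, try_number: int) -> list[str]:
        """Legacy path: return every log line at once (simplified return type)."""
        return []

    def get_streaming_task_log(self, ti, try_number: int) -> Iterator[str]:
        """Streaming path: yield log lines lazily (assumed signature)."""
        raise NotImplementedError


class LegacyExecutor(ExecutorSketch):
    """An executor that has not implemented streaming."""

    def get_task_log(self, ti, try_number: int) -> list[str]:
        return [f"line {i}" for i in range(3)]


class StreamingExecutor(ExecutorSketch):
    """An executor that advertises and implements streaming."""

    supports_streaming_logs = True

    def get_streaming_task_log(self, ti, try_number: int) -> Iterator[str]:
        for i in range(3):
            yield f"line {i}"


def read_executor_log(executor, ti, try_number: int) -> Iterator[str]:
    """Hypothetical helper: prefer streaming, else fall back to the legacy call."""
    if getattr(executor, "supports_streaming_logs", False):
        # Lines are pulled one at a time, so the consumer bounds memory.
        yield from executor.get_streaming_task_log(ti, try_number)
    else:
        # The legacy call has already built the full list by this point.
        yield from executor.get_task_log(ti, try_number)


if __name__ == "__main__":
    for executor in (LegacyExecutor(), StreamingExecutor()):
        print(type(executor).__name__, list(read_executor_log(executor, None, 1)))
```

Both executors produce the same lines for the caller. The difference is that the streaming one never holds the whole log in a single list, which is what lets a bounded consumer downstream keep memory flat.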
This was referenced Jul 3, 2026
part of the streaming task log series
Why
When the API server reads logs for a RUNNING task, the executor's get_task_log returns the whole log fully materialized, so it sits on the heap all at once before the bounded LogStreamAccumulator downstream (5000 messages resident, rest spilled to disk) can do its job. Large logs spike the API server's anonymous heap and can OOM it.
What
Add get_streaming_task_log and a supports_streaming_logs class attribute (default False) to BaseExecutor. FileTaskHandler._read prefers the streaming method when the executor advertises supports_streaming_logs, and falls back to the legacy get_task_log otherwise, so provider and custom executors that haven't implemented it keep working unchanged.
Benchmark
A/B measurement of API server memory serving the same ~415 MB ndjson log (1M lines) of a RUNNING KubernetesExecutor task through GET .../logs/{try_number} with Accept: application/x-ndjson, sampling the API server cgroup. A = materializing read (before this series), B = streaming read (this PR + the KubernetesExecutor follow-up).
[Results table not recovered; its header referenced memory.current.]
Without streaming, the full log lives on the heap at once (~2.1 GiB anonymous, non-reclaimable memory, which is what OOMs the API server). With streaming, the executor yields into LogStreamAccumulator; B's remaining RSS growth is mostly reclaimable page cache from the accumulator's disk spill.
Interpretation caveat: KubernetesExecutor.RUNNING_POD_LOG_LINES = 100 normally caps the executor read to the last 100 lines, which would hide the difference; the benchmark lifted the cap to 100,000,000 in both builds. Production keeps the cap at 100, so this measures the mechanism's headroom for executors returning large logs, not a shipped memory reduction today (the running_pod_log_lines config PR is what makes the cap tunable). Single run per build, and only the ndjson path streams end to end (Accept: application/json buffers the whole response regardless).
Was generative AI tooling used to co-author this PR?