Add streaming task log support to KubernetesExecutor #69300
Draft
jason810496 wants to merge 3 commits into
Reading a running task's log through an executor materializes the whole log in the API server before the bounded LogStreamAccumulator can cap memory use, so large logs spike the API server heap. This PR adds an interface that executors can implement to stream log lines lazily instead.
This was referenced Jul 3, 2026
Fetching a running task's pod log materialized every line in the API server before serving it. Streaming the lines lazily through the new BaseExecutor.get_streaming_task_log interface lets the bounded log accumulator cap resident memory while serving large logs.
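To illustrate why lazy streaming lets a bounded accumulator cap resident memory, here is a minimal sketch. It does not use the real LogStreamAccumulator or the Kubernetes client: `read_pod_log_lines` and `BoundedAccumulator` are hypothetical stand-ins, and the chunk size is an arbitrary assumption.

```python
from __future__ import annotations

from collections import deque
from typing import Iterable, Iterator


def read_pod_log_lines(total_lines: int) -> Iterator[str]:
    """Hypothetical log source: yields one line at a time instead of building a list."""
    for i in range(total_lines):
        yield f"line {i}"


class BoundedAccumulator:
    """Hypothetical stand-in for a bounded accumulator (not the real LogStreamAccumulator)."""

    def __init__(self, max_lines: int) -> None:
        self._buffer: deque[str] = deque()
        self._max_lines = max_lines
        self.peak_resident = 0  # largest number of lines held at once

    def consume(self, lines: Iterable[str]) -> Iterator[list[str]]:
        """Pull lines lazily and hand out a chunk whenever the buffer is full."""
        for line in lines:
            self._buffer.append(line)
            self.peak_resident = max(self.peak_resident, len(self._buffer))
            if len(self._buffer) >= self._max_lines:
                yield list(self._buffer)
                self._buffer.clear()
        if self._buffer:  # flush the final partial chunk
            yield list(self._buffer)
            self._buffer.clear()


acc = BoundedAccumulator(max_lines=100)
# Each chunk is consumed and dropped, so only one chunk is resident at a time.
served = sum(len(chunk) for chunk in acc.consume(read_pod_log_lines(1_000)))
```

If the source returned a fully built list instead of a generator, all 1,000 lines would already be in memory before the accumulator saw the first one, which is the behavior this PR avoids.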
part of the streaming task log series
Why
KubernetesExecutor.get_task_log materializes the whole pod log in the API server. Implementing the new BaseExecutor.get_streaming_task_log yields lines lazily into the bounded LogStreamAccumulator: ~11.6x lower peak heap growth (+2093.9 MiB vs +179.9 MiB) serving a ~415 MB running-task log (full benchmark in #69299).
What
get_streaming_task_log on KubernetesExecutor; get_task_log stays and delegates to it, so older cores keep working.
supports_streaming_logs on KubernetesExecutor, CeleryKubernetesExecutor, and LocalKubernetesExecutor (the wrappers route kubernetes-queue tasks).
Was generative AI tooling used to co-author this PR?
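The delegation described under What (get_task_log staying in place and delegating to get_streaming_task_log, with supports_streaming_logs as an opt-in marker) might look roughly like the sketch below. The class names, the single `task_id` parameter, the return types, and the `read_log` helper are all simplifying assumptions for illustration; they are not the actual Airflow signatures, and treating supports_streaming_logs as a boolean class attribute is likewise an assumption.

```python
from __future__ import annotations

from typing import Iterator


class BaseExecutorSketch:
    """Hypothetical base class; executors opt in to streaming via the flag."""

    supports_streaming_logs = False  # assumed to be a boolean class attribute

    def get_task_log(self, task_id: str) -> list[str]:
        return []


class StreamingExecutorSketch(BaseExecutorSketch):
    """Hypothetical executor that implements the streaming interface."""

    supports_streaming_logs = True

    def get_streaming_task_log(self, task_id: str) -> Iterator[str]:
        # Yield lines one by one; nothing is buffered here.
        for i in range(3):
            yield f"{task_id}: line {i}"

    def get_task_log(self, task_id: str) -> list[str]:
        # Older callers still receive a fully materialized list.
        return list(self.get_streaming_task_log(task_id))


def read_log(executor: BaseExecutorSketch, task_id: str) -> Iterator[str]:
    """Hypothetical caller: prefer streaming, fall back to the materialized log."""
    if getattr(executor, "supports_streaming_logs", False):
        return executor.get_streaming_task_log(task_id)
    return iter(executor.get_task_log(task_id))
```

Keeping get_task_log as a thin wrapper means a caller that only knows the old method gets the same lines, while a caller that checks the flag can consume them lazily.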