fix: collect and manage generated files from job executions #50

Merged: aron-muon merged 3 commits into aron-muon:main from nosportugal:task-proper-return on Apr 12, 2026

@gafda (Contributor) commented Apr 10, 2026

TL;DR

Fix generated files not always being returned to LibreChat after code execution.

Root causes & fixes:

  • Keyword heuristic removed: file detection was gated on keywords like savefig, open(, etc. in the code; it now always runs after execution.
  • Job-path files collected before pod cleanup: cold-path languages (Go, Rust, Java, …) returned handle=None, so files were never fetched before the pod was destroyed; files are now downloaded eagerly inside execute_with_job.
  • Code file filter expanded: main.go and main.rs were not excluded from the generated-files list; all 12 language source filenames are now filtered.
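
The code file filter described in the last bullet can be illustrated with a minimal sketch. The names `CODE_FILENAMES` and `filter_generated_files` are hypothetical, and only `main.go` and `main.rs` come from this PR; the other entries in the set are illustrative guesses rather than the project's actual list of 12 filenames.

```python
from pathlib import PurePosixPath

# Hypothetical skip set. main.go and main.rs are named in this PR; the
# remaining entries are illustrative assumptions, not the project's list.
CODE_FILENAMES = frozenset({"main.go", "main.rs", "main.py", "Main.java"})


def filter_generated_files(paths):
    """Return the paths whose basename is not a known source filename."""
    return [p for p in paths if PurePosixPath(p).name not in CODE_FILENAMES]
```

Matching on the basename rather than the full path means the filter behaves the same whatever working directory the execution backend uses.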

This pull request enhances the handling of files generated during code execution, especially for Kubernetes Job-based executions where the pod is destroyed after completion. The main improvements ensure that generated files are detected, downloaded, and made available even after the execution pod is gone. The changes also expand and harden the logic for skipping language source files and improve test coverage for these scenarios.

File handling improvements for Job-based executions:

  • Added logic to detect and download generated files before Job pod cleanup, storing their contents for later retrieval (src/services/kubernetes/job_executor.py, src/services/kubernetes/models.py, src/services/execution/runner.py).
  • Implemented a method (pop_job_file_content) to retrieve pre-downloaded file content for Job executions, allowing orchestrator services to access files after the pod is destroyed (src/services/execution/runner.py, src/services/interfaces.py, src/services/orchestrator.py).
  • Updated orchestrator logic to use pre-downloaded file content as a fallback when the execution pod is no longer available (src/services/orchestrator.py).
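
The ordering described in these bullets (download first, clean up second, fall back to the cached bytes later) might look roughly like the sketch below. `FakeJobPod`, `run_job_and_collect`, and `fetch_file` are hypothetical stand-ins invented for illustration; they are not the project's classes and do not call the Kubernetes API.

```python
import asyncio


class FakeJobPod:
    """Hypothetical stand-in for a Job pod that holds files until deleted."""

    def __init__(self, files):
        self._files = dict(files)
        self.deleted = False

    async def list_files(self):
        return list(self._files)

    async def download(self, path):
        if self.deleted:
            raise RuntimeError("pod already deleted")
        return self._files[path]

    async def delete(self):
        self.deleted = True
        self._files.clear()


async def run_job_and_collect(pod):
    """Download every file while the pod still exists, then clean up."""
    collected = {}
    try:
        for path in await pod.list_files():
            collected[path] = await pod.download(path)
    finally:
        # Cleanup always runs, but only after the download attempt.
        await pod.delete()
    return collected


async def fetch_file(path, pod, predownloaded):
    """Prefer the live pod; fall back to pre-downloaded bytes when it is gone."""
    if pod is not None and not pod.deleted:
        return await pod.download(path)
    return predownloaded.get(path)
```

Putting the deletion in a `finally` block keeps the pod from leaking if a download fails, while still guaranteeing that collection is attempted before the pod disappears.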

Source file detection and filtering:

  • Centralized and expanded the set of language source filenames to skip during generated file detection, ensuring all relevant code files are excluded (src/services/execution/runner.py, src/services/kubernetes/job_executor.py).

Test enhancements:

  • Expanded unit tests to cover all code file types in file detection and added tests for the new pop_job_file_content logic and orchestrator fallback behavior (tests/unit/test_execution_runner.py, tests/unit/test_orchestrator.py).

These changes make file handling more robust for both interactive and batch code execution scenarios, ensuring users can reliably access generated outputs regardless of the execution backend.

* Add functionality to detect and download generated files from Job pods before cleanup.
* Introduce  method in JobExecutor for file retrieval.
* Update ExecutionResult model to include  attribute.
* Implement  method in ExecutionServiceInterface for accessing pre-downloaded file content.
* Enhance orchestrator to utilize pre-downloaded content when container is unavailable.
* Add unit tests for new functionality and ensure proper handling of code files.
Copilot AI review requested due to automatic review settings April 10, 2026 15:52
@gafda gafda requested a review from aron-muon as a code owner April 10, 2026 15:52
@gafda (Contributor, Author) commented Apr 10, 2026

@aron-muon, here's a small change to help ensure that generated/changed files are returned.

Copilot AI left a comment

Pull request overview

This PR improves reliability of returning files generated during code execution (especially for Kubernetes Job-based executions where pods are cleaned up immediately), by collecting generated files earlier and adding a runner/orchestrator fallback path for retrieving those files after pod deletion.

Changes:

  • Always runs generated-file detection after execution (removes keyword gating) and expands code-source file filtering across languages.
  • For Job-based executions, downloads generated files before pod cleanup and caches their bytes for later retrieval.
  • Adds pop_job_file_content plumbing plus new unit tests covering Job fallback behavior and expanded code-file skipping.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/services/kubernetes/job_executor.py | Collects & downloads generated files during Job execution before cleanup. |
| src/services/kubernetes/models.py | Extends ExecutionResult to optionally carry Job-generated file metadata/content. |
| src/services/execution/runner.py | Centralizes code-filename skip set; caches Job file bytes and exposes pop_job_file_content. |
| src/services/interfaces.py | Adds a default pop_job_file_content hook to the execution service interface. |
| src/services/orchestrator.py | Falls back to runner’s pre-downloaded Job file bytes when the pod/container is missing. |
| tests/unit/test_execution_runner.py | Expands generated-file detection tests and adds tests for pop_job_file_content. |
| tests/unit/test_orchestrator.py | Adds coverage for orchestrator Job-file fallback when no container exists. |

Comments suppressed due to low confidence (1)

src/services/kubernetes/job_executor.py:551

  • Generated-file collection for Job executions is currently gated on result.exit_code == 0. Pool-based executions attempt generated-file detection regardless of exit code, so this can reintroduce missing outputs when code writes files and then fails. Consider collecting generated files whenever job.runner_url is available (possibly with additional safeguards) rather than only on success.
            )

            # Execute code
            result = await self.execute(

* Change job file content storage to use a tuple of (session_id, path) for unique identification.
* Update related methods to accommodate session-based file content retrieval.
* Enhance unit tests to verify session-specific behavior for job file content management.
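
A minimal sketch of the session-keyed storage this commit describes, assuming an in-memory dict. The `JobFileCache` class, its `store` method, and the exact signature of `pop_job_file_content` shown here are assumptions for illustration, not the project's actual code.

```python
class JobFileCache:
    """Hypothetical in-memory cache keyed by (session_id, path)."""

    def __init__(self):
        self._content = {}

    def store(self, session_id, path, data):
        """Remember the bytes downloaded for one file in one session."""
        self._content[(session_id, path)] = data

    def pop_job_file_content(self, session_id, path):
        """Return and remove the cached bytes, or None if nothing is cached."""
        return self._content.pop((session_id, path), None)
```

Keying on the tuple keeps two sessions that both write the same path from overwriting each other, and popping rather than reading frees the bytes once they have been handed over.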
@aron-muon changed the title from "Collect and manage generated files from job executions" to "fix: collect and manage generated files from job executions" Apr 12, 2026
@aron-muon (Owner) left a comment

Thank you for the contribution!

@aron-muon aron-muon merged commit 00e14ca into aron-muon:main Apr 12, 2026
28 checks passed
@github-actions

🎉 This PR is included in version 3.2.2 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
