fix(csa-session): recon sessions write to session-local output path (#910) by RyderFreeman4Logos · Pull Request #955 · RyderFreeman4Logos/cli-sub-agent

RyderFreeman4Logos · 2026-04-20T10:51:03Z

Summary

Canonicalize recon pattern/skill prompts to use \$CSA_SESSION_DIR/output/...
Add post-session audit that detects repo-tracked file mutations (warn only, no auto-revert)
Rule 027: PATTERN.md + workflow.toml updated in lockstep (patterns/mktd)

Scope

13 files, +313 / -50 LOC. New modules: pipeline_post_exec_audit.rs (129 LOC), manager_audit.rs (75 LOC).

Test plan

cargo test --workspace green
weave compile green
just pre-commit green

Closes #910.

…910) Read-only recon CSA sessions repeatedly wrote to the repo-tracked output/summary.md instead of session-local $CSA_SESSION_DIR/output/summary.md because the recon prompt used a relative path and CWD was the project root. Orchestrators had to `git checkout output/` after every recon — noisy and risked accidental commit of recon work product. Canonicalize recon output paths in pattern/skill prompts to use $CSA_SESSION_DIR/output/... (absolute). Verified CSA_SESSION_DIR is already exported in the child environment. Added post-session audit: when a session ends, compare repo-tracked file mtimes against session_start_time; warn (via tracing::warn and session-local audit-warnings.md) on any mutations. No auto-revert — operator retains manual control. Rule 027 applied — PATTERN.md and workflow.toml updated in lockstep. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request implements a repository write audit mechanism to detect unintended modifications to tracked files during read-only or reconnaissance sessions. Key changes include a new auditing module in cli-sub-agent, logic to identify session types based on prompt keywords, and updates to the session manager to record audit warnings as artifacts. The review feedback suggests refactoring unstable Rust let_chains for better compatibility, using static constants for string marker arrays to optimize performance, and addressing potential test flakiness caused by thread::sleep when asserting on file timestamps.

gemini-code-assist · 2026-04-20T10:54:57Z

+    if let Some(artifact_path) =
+        csa_session::write_audit_warning_artifact(&ctx.session_dir, &mutated_paths)?
+        && let Ok(rel_path) = artifact_path.strip_prefix(&ctx.session_dir)
+    {
+        session_result.artifacts.push(SessionArtifact::new(
+            rel_path.to_string_lossy().into_owned(),
+        ));
+    }


The if let ... && let ... syntax is part of the let_chains feature, which is currently unstable in Rust. While it might compile with a nightly toolchain, it's generally recommended to stick to stable features for better compatibility and readability.

Consider refactoring this to nested if let statements, which is idiomatic in stable Rust.

Suggested change

if let Some(artifact_path) =

csa_session::write_audit_warning_artifact(&ctx.session_dir, &mutated_paths)?

&& let Ok(rel_path) = artifact_path.strip_prefix(&ctx.session_dir)

{

session_result.artifacts.push(SessionArtifact::new(

rel_path.to_string_lossy().into_owned(),

));

}

if let Some(artifact_path) =

csa_session::write_audit_warning_artifact(&ctx.session_dir, &mutated_paths)?

{

if let Ok(rel_path) = artifact_path.strip_prefix(&ctx.session_dir) {

session_result.artifacts.push(SessionArtifact::new(

rel_path.to_string_lossy().into_owned(),

));

}

}

gemini-code-assist · 2026-04-20T10:54:57Z

+    let explicit_readonly = [
+        "read-only",
+        "readonly",
+        "do not edit",
+        "don't edit",
+        "must not edit",
+        "without editing",
+        "do not modify",
+        "don't modify",
+    ];
+    if explicit_readonly
+        .iter()
+        .any(|marker| prompt_lower.contains(marker))
+    {


For performance and clarity, this array of string literals can be defined as a static constant. This avoids reallocating it on the stack every time this function is called.

Suggested change

let explicit_readonly = [

"read-only",

"readonly",

"do not edit",

"don't edit",

"must not edit",

"without editing",

"do not modify",

"don't modify",

];

if explicit_readonly

.iter()

.any(|marker| prompt_lower.contains(marker))

{

static EXPLICIT_READONLY: &[&str] = &[

"read-only",

"readonly",

"do not edit",

"don't edit",

"must not edit",

"without editing",

"do not modify",

"don't modify",

];

if EXPLICIT_READONLY

.iter()

.any(|marker| prompt_lower.contains(marker))

{

gemini-code-assist · 2026-04-20T10:54:57Z

+    let recon_markers = [
+        "recon",
+        "reconnaissance",
+        "analyze",
+        "analyse",
+        "analysis",
+        "summarize",
+        "summary",
+        "inspect",
+        "investigate",
+    ];
+    let write_markers = [
+        "implement",
+        "edit",
+        "fix",
+        "modify",
+        "update",
+        "patch",
+        "write code",
+        "create file",
+        "commit",
+        "merge",
+        "refactor",
+    ];
+    recon_markers
+        .iter()
+        .any(|marker| prompt_lower.contains(marker))
+        && !write_markers
+            .iter()
+            .any(|marker| prompt_lower.contains(marker))


Similarly, these marker arrays can also be static constants to avoid reallocating them on each function call.

static RECON_MARKERS: &[&str] = &[ "recon", "reconnaissance", "analyze", "analyse", "analysis", "summarize", "summary", "inspect", "investigate", ]; static WRITE_MARKERS: &[&str] = &[ "implement", "edit", "fix", "modify", "update", "patch", "write code", "create file", "commit", "merge", "refactor", ]; RECON_MARKERS .iter() .any(|marker| prompt_lower.contains(marker)) && !WRITE_MARKERS .iter() .any(|marker| prompt_lower.contains(marker))

gemini-code-assist · 2026-04-20T10:54:57Z

+    run_git(td.path(), &["add", "tracked.txt"]);
+    run_git(td.path(), &["commit", "-m", "init"]);
+    let session_start = std::time::SystemTime::now();
+    std::thread::sleep(std::time::Duration::from_millis(20));


Using thread::sleep in tests that rely on file modification timestamps can lead to flakiness, especially with a short duration like 20ms. The test might fail on a slow or heavily loaded system if the file write on the next line happens within the same timestamp granularity as session_start.

A more robust approach would be to manually set the file's modification time to be explicitly after session_start. You could use a helper function to touch the file with a future timestamp. For example:

// In a test helper module use std::time::{Duration, SystemTime}; pub fn touch_file(path: &std::path::Path, offset_secs: u64) { let new_time = SystemTime::now() + Duration::from_secs(offset_secs); filetime::set_file_mtime(path, filetime::FileTime::from_system_time(new_time)).unwrap(); }

This would require adding the filetime crate as a dev-dependency. If you prefer to avoid new dependencies, consider increasing the sleep duration to something safer, like 1 second, though it will make the test slower.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ec422fca5e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-20T10:55:55Z

            tool_state.last_action_summary = classified_summary;
        }
    }
+    audit::maybe_record_repo_write_audit(&ctx, &mut session_result)?;


Keep repo-write audit best-effort

The new repo-write audit is wired as a hard failure path: process_execution_result propagates any error from maybe_record_repo_write_audit with ?, and pipeline_session_exec turns post-exec errors into terminal session failures. That means an observational audit problem (for example git invocation failure, non-UTF8 path output, or inability to write audit-warnings.md) can mark an otherwise successful run as failed, which contradicts the warning-only behavior described for this feature.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-20T10:55:55Z

+        let candidate = project_root.join(rel_path);
+        let Ok(metadata) = fs::metadata(&candidate) else {


Resolve mutated paths from repository root

audit_repo_tracked_writes treats each git diff --name-only entry as relative to project_root, but Git emits repository-root-relative paths (e.g. from repo/sub, output is sub/file.rs). When project_root is a nested directory, project_root.join(rel_path) points to a non-existent path like repo/sub/sub/file.rs, so metadata lookup fails and real tracked mutations are silently skipped.

Useful? React with 👍 / 👎.

…d 2) Round-1 introduced should_audit_repo_tracked_writes() gated on task_type == Some("run"), but mktd planning sessions go through plan_cmd_exec.rs:229 with task_type "plan". The audit therefore silently skipped the exact mktd scenarios the round-1 PATTERN.md / workflow.toml updates promised to enforce. Extend the gate to cover run/plan/plan-step explicitly; continue skipping review/debate (they legitimately write to repo-tracked output) and unknown task_type (stay conservative). Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Round-2 audit used git diff --name-only --diff-filter=M HEAD, which falsely reports clean when a session edits and commits a tracked file and silently drops added, deleted, and renamed files. This switches repo-write auditing to a pre/post snapshot model by comparing the session's creation HEAD against the final HEAD and unioning that committed diff with current porcelain status for uncommitted changes. The audit now reports added, modified, deleted, and renamed tracked paths in both audit-warnings.md and the manager result sidecar, while legacy sessions without a creation HEAD snapshot warn and no-op. Sidecar persistence was wired into save_result so the structured audit survives result loading. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ne (#910 round 4) Round-3 captured pre_session_head and computed audit as post_porcelain UNION diff(pre_head..HEAD). Pre-existing dirty files were falsely attributed to the session because post_porcelain includes them unchanged. Capture pre_session_porcelain alongside pre_session_head. Compute attributable uncommitted changes as set-difference (post_porcelain \ pre_porcelain) keyed on (path, status-code). Union with the commit-range diff. Conservative: a file dirty with the same status code in both snapshots is NOT attributed (documented limitation; round-5 could add content-hash refinement if needed). Legacy sessions without pre_session_porcelain fall back to commits-only audit + tracing::warn. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…und 5) save_result_in wrote a new output/result.toml when manager_fields was populated, but left a pre-existing sidecar on disk untouched when a subsequent save arrived with empty manager_fields. The stale file was rehydrated by load_result_in on next read, silently attributing earlier-run audit/report metadata to the clean run. Explicitly remove output/result.toml and its artifact entry when the current save has no manager sidecar. Regression tests cover stale contract-sidecar cleanup plus save -> populated manager fields -> save empty -> reload returns empty manager_fields. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… round 6) Round-5 made save_result_in unlink output/result.toml whenever manager_fields.as_sidecar() returned None. That overcorrected the stale-sidecar problem: an envelope-only save with empty manager_fields could wipe legitimate manager metadata written by a different path. Introduce SaveOptions::clear_stale_manager_sidecar with a safe default of false. Default save_result_in behavior now preserves existing sidecars and their contract artifact entries; explicit cleanup goes through clear_manager_sidecar(). Tests now cover preservation by default and explicit clear behavior, and the sidecar-preservation tests were split out to keep the monolith guard green. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…audit (#910 round 7) Two related correctness fixes for round-6 residuals: 1. manager_result.rs published the envelope before the sidecar, leaving the envelope referencing a missing/stale sidecar if the sidecar write or clear failed. Reorder: write sidecar atomically via temp+rename first, then publish envelope last. On sidecar failure, envelope stays unchanged. 2. pipeline_post_exec audit errors propagated via ?, turning a successful read-only session into a failed one whenever the audit helper could not parse or persist. Audit is observational; swallow its Result at call sites and emit tracing::warn! instead. Three audit boundaries (compute / persist / envelope-mutate) all best-effort. Tests cover sidecar-write-failure preserving envelope and audit-failure preserving execution success. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… round 8) Round-7 made the write path atomic (sidecar temp+rename, then envelope publish). The clear path was asymmetric: clear_manager_sidecar() ran BEFORE the new envelope was written. On envelope-write failure, disk left an old envelope referencing a now-deleted sidecar. Mirror the write-path ordering: on clear, write the new envelope (with output/result.toml already removed from artifacts) atomically FIRST, and unlink the sidecar only after envelope publication succeeds. A stray sidecar on disk if unlink itself fails is acceptable — envelope remains authoritative. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…round 9) Round-8 left the audit comparing against git_head_at_creation / pre_session_porcelain captured once at session birth. A reused session where turn 1 modified tracked files and turn 2 resumed without edits would still emit a repo_write_audit warning on turn 2 because the baseline was stale by a whole turn. Capture a fresh (HEAD, porcelain) snapshot at the start of every execution, thread it through PostExecContext, and prefer it over the session-creation baseline in the audit compute step. Session-creation fields remain as a resilient fallback when per-execution capture fails. Closes #910. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

RyderFreeman4Logos · 2026-04-20T13:39:09Z

Merge audit trail — 9 rounds iterative fixes + design-residual circuit-break

This PR went through 9 rounds of review with codex. Each round surfaced a NEW concrete correctness bug (per CLAUDE.md lesson #821 rule of legitimate iterative quality work):

Round 1: output path canonicalization → output/ instead of repo-root
Round 2: audit gating extended beyond task_type="run" to cover plan/plan-step
Round 3: git diff --name-only --diff-filter=M HEAD replaced with pre/post snapshot covering A/M/D/R + commits
Round 4: pre-session porcelain baseline subtracted to avoid false-positive pre-existing-dirty attribution
Round 5: stale sidecar deletion on empty manager_fields save
Round 6: overcorrection of round-5 reverted via SaveOptions::clear_stale_manager_sidecar opt-in
Round 7: atomic sidecar→envelope write ordering + best-effort audit (swallow errors at 3 boundaries)
Round 8: symmetric atomic envelope→sidecar-unlink clear ordering
Round 9: per-execution audit baseline (refresh at turn start, not session creation)

Round-10 review findings (circuit-break rationale)

Session 01KPNHKCR6C09Y2BAW34S7S9SM returned:

HIGH (manager_result.rs:89): preserve-by-default leaks stale sidecar across turns via load_result_in overlay.
MEDIUM (manager_result.rs:127): clear_manager_sidecar failure is silently swallowed.

The HIGH is a same-point design flip with round-5's original finding (preserve vs overwrite sidecar semantics). Per CLAUDE.md lesson #821, same-point contradictions trigger circuit-break rather than further rounds.

Both are filed as followup issues for design decisions:

csa-session: load_result_in unconditionally overlays sidecar regardless of envelope artifacts #956 — load_result_in sidecar overlay reconciliation (HIGH)
csa-session: save_result_in reports Ok even when clear_manager_sidecar unlink fails #957 — silent unlink swallow (MEDIUM, blocks on csa-session: load_result_in unconditionally overlays sidecar regardless of envelope artifacts #956)

Merging at round 9 with atomicity + per-execution baseline fixes intact. The load-side reconciliation is a separate design decision requiring user input on preserve-vs-overwrite semantics.

Closes #910.

gemini-code-assist bot reviewed Apr 20, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Apr 20, 2026

View reviewed changes

RyderFreeman4Logos and others added 8 commits April 20, 2026 04:04

This was referenced Apr 20, 2026

csa-session: load_result_in unconditionally overlays sidecar regardless of envelope artifacts #956

Open

csa-session: save_result_in reports Ok even when clear_manager_sidecar unlink fails #957

Open

RyderFreeman4Logos merged commit d17c274 into main Apr 20, 2026
6 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(csa-session): recon sessions write to session-local output path (#910)#955

fix(csa-session): recon sessions write to session-local output path (#910)#955
RyderFreeman4Logos merged 9 commits intomainfrom
fix/910-recon-output-session-local

RyderFreeman4Logos commented Apr 20, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Apr 20, 2026

Uh oh!

gemini-code-assist bot Apr 20, 2026

Uh oh!

gemini-code-assist bot Apr 20, 2026

Uh oh!

gemini-code-assist bot Apr 20, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 20, 2026

Uh oh!

chatgpt-codex-connector bot Apr 20, 2026

Uh oh!

RyderFreeman4Logos commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		let candidate = project_root.join(rel_path);
		let Ok(metadata) = fs::metadata(&candidate) else {

Conversation

RyderFreeman4Logos commented Apr 20, 2026

Summary

Scope

Test plan

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

RyderFreeman4Logos commented Apr 20, 2026

Merge audit trail — 9 rounds iterative fixes + design-residual circuit-break

Round-10 review findings (circuit-break rationale)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant