Skip to content

fix(csa-session): recon sessions write to session-local output path (#910)#955

Merged
RyderFreeman4Logos merged 9 commits intomainfrom
fix/910-recon-output-session-local
Apr 20, 2026
Merged

fix(csa-session): recon sessions write to session-local output path (#910)#955
RyderFreeman4Logos merged 9 commits intomainfrom
fix/910-recon-output-session-local

Conversation

@RyderFreeman4Logos
Copy link
Copy Markdown
Owner

Summary

  • Canonicalize recon pattern/skill prompts to use \$CSA_SESSION_DIR/output/...
  • Add post-session audit that detects repo-tracked file mutations (warn only, no auto-revert)
  • Rule 027: PATTERN.md + workflow.toml updated in lockstep (patterns/mktd)

Scope

13 files, +313 / -50 LOC. New modules: pipeline_post_exec_audit.rs (129 LOC), manager_audit.rs (75 LOC).

Test plan

  • cargo test --workspace green
  • weave compile green
  • just pre-commit green

Closes #910.

…910)

Read-only recon CSA sessions repeatedly wrote to the repo-tracked
output/summary.md instead of session-local
$CSA_SESSION_DIR/output/summary.md because the recon prompt used a
relative path and CWD was the project root. Orchestrators had to
`git checkout output/` after every recon — noisy and risked
accidental commit of recon work product.

Canonicalize recon output paths in pattern/skill prompts to use
$CSA_SESSION_DIR/output/... (absolute). Verified CSA_SESSION_DIR
is already exported in the child environment.

Added post-session audit: when a session ends, compare repo-tracked
file mtimes against session_start_time; warn (via tracing::warn and
session-local audit-warnings.md) on any mutations. No auto-revert —
operator retains manual control.

Rule 027 applied — PATTERN.md and workflow.toml updated in lockstep.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a repository write audit mechanism to detect unintended modifications to tracked files during read-only or reconnaissance sessions. Key changes include a new auditing module in cli-sub-agent, logic to identify session types based on prompt keywords, and updates to the session manager to record audit warnings as artifacts. The review feedback suggests refactoring unstable Rust let_chains for better compatibility, using static constants for string marker arrays to optimize performance, and addressing potential test flakiness caused by thread::sleep when asserting on file timestamps.

Comment on lines +25 to +32
if let Some(artifact_path) =
csa_session::write_audit_warning_artifact(&ctx.session_dir, &mutated_paths)?
&& let Ok(rel_path) = artifact_path.strip_prefix(&ctx.session_dir)
{
session_result.artifacts.push(SessionArtifact::new(
rel_path.to_string_lossy().into_owned(),
));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The if let ... && let ... syntax is part of the let_chains feature, which is currently unstable in Rust. While it might compile with a nightly toolchain, it's generally recommended to stick to stable features for better compatibility and readability.

Consider refactoring this to nested if let statements, which is idiomatic in stable Rust.

Suggested change
if let Some(artifact_path) =
csa_session::write_audit_warning_artifact(&ctx.session_dir, &mutated_paths)?
&& let Ok(rel_path) = artifact_path.strip_prefix(&ctx.session_dir)
{
session_result.artifacts.push(SessionArtifact::new(
rel_path.to_string_lossy().into_owned(),
));
}
if let Some(artifact_path) =
csa_session::write_audit_warning_artifact(&ctx.session_dir, &mutated_paths)?
{
if let Ok(rel_path) = artifact_path.strip_prefix(&ctx.session_dir) {
session_result.artifacts.push(SessionArtifact::new(
rel_path.to_string_lossy().into_owned(),
));
}
}

Comment on lines +50 to +63
let explicit_readonly = [
"read-only",
"readonly",
"do not edit",
"don't edit",
"must not edit",
"without editing",
"do not modify",
"don't modify",
];
if explicit_readonly
.iter()
.any(|marker| prompt_lower.contains(marker))
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For performance and clarity, this array of string literals can be defined as a static constant. This avoids reallocating it on the stack every time this function is called.

Suggested change
let explicit_readonly = [
"read-only",
"readonly",
"do not edit",
"don't edit",
"must not edit",
"without editing",
"do not modify",
"don't modify",
];
if explicit_readonly
.iter()
.any(|marker| prompt_lower.contains(marker))
{
static EXPLICIT_READONLY: &[&str] = &[
"read-only",
"readonly",
"do not edit",
"don't edit",
"must not edit",
"without editing",
"do not modify",
"don't modify",
];
if EXPLICIT_READONLY
.iter()
.any(|marker| prompt_lower.contains(marker))
{

Comment on lines +67 to +96
let recon_markers = [
"recon",
"reconnaissance",
"analyze",
"analyse",
"analysis",
"summarize",
"summary",
"inspect",
"investigate",
];
let write_markers = [
"implement",
"edit",
"fix",
"modify",
"update",
"patch",
"write code",
"create file",
"commit",
"merge",
"refactor",
];
recon_markers
.iter()
.any(|marker| prompt_lower.contains(marker))
&& !write_markers
.iter()
.any(|marker| prompt_lower.contains(marker))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similarly, these marker arrays can also be static constants to avoid reallocating them on each function call.

    static RECON_MARKERS: &[&str] = &[
        "recon",
        "reconnaissance",
        "analyze",
        "analyse",
        "analysis",
        "summarize",
        "summary",
        "inspect",
        "investigate",
    ];
    static WRITE_MARKERS: &[&str] = &[
        "implement",
        "edit",
        "fix",
        "modify",
        "update",
        "patch",
        "write code",
        "create file",
        "commit",
        "merge",
        "refactor",
    ];
    RECON_MARKERS
        .iter()
        .any(|marker| prompt_lower.contains(marker))
        && !WRITE_MARKERS
            .iter()
            .any(|marker| prompt_lower.contains(marker))

run_git(td.path(), &["add", "tracked.txt"]);
run_git(td.path(), &["commit", "-m", "init"]);
let session_start = std::time::SystemTime::now();
std::thread::sleep(std::time::Duration::from_millis(20));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using thread::sleep in tests that rely on file modification timestamps can lead to flakiness, especially with a short duration like 20ms. The test might fail on a slow or heavily loaded system if the file write on the next line happens within the same timestamp granularity as session_start.

A more robust approach would be to manually set the file's modification time to be explicitly after session_start. You could use a helper function to touch the file with a future timestamp. For example:

// In a test helper module
use std::time::{Duration, SystemTime};
pub fn touch_file(path: &std::path::Path, offset_secs: u64) {
    let new_time = SystemTime::now() + Duration::from_secs(offset_secs);
    filetime::set_file_mtime(path, filetime::FileTime::from_system_time(new_time)).unwrap();
}

This would require adding the filetime crate as a dev-dependency. If you prefer to avoid new dependencies, consider increasing the sleep duration to something safer, like 1 second, though it will make the test slower.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ec422fca5e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

tool_state.last_action_summary = classified_summary;
}
}
audit::maybe_record_repo_write_audit(&ctx, &mut session_result)?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep repo-write audit best-effort

The new repo-write audit is wired as a hard failure path: process_execution_result propagates any error from maybe_record_repo_write_audit with ?, and pipeline_session_exec turns post-exec errors into terminal session failures. That means an observational audit problem (for example git invocation failure, non-UTF8 path output, or inability to write audit-warnings.md) can mark an otherwise successful run as failed, which contradicts the warning-only behavior described for this feature.

Useful? React with 👍 / 👎.

Comment thread crates/csa-session/src/manager_audit.rs Outdated
Comment on lines +33 to +34
let candidate = project_root.join(rel_path);
let Ok(metadata) = fs::metadata(&candidate) else {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Resolve mutated paths from repository root

audit_repo_tracked_writes treats each git diff --name-only entry as relative to project_root, but Git emits repository-root-relative paths (e.g. from repo/sub, output is sub/file.rs). When project_root is a nested directory, project_root.join(rel_path) points to a non-existent path like repo/sub/sub/file.rs, so metadata lookup fails and real tracked mutations are silently skipped.

Useful? React with 👍 / 👎.

RyderFreeman4Logos and others added 8 commits April 20, 2026 04:04
…d 2)

Round-1 introduced should_audit_repo_tracked_writes() gated on
task_type == Some("run"), but mktd planning sessions go through
plan_cmd_exec.rs:229 with task_type "plan". The audit therefore
silently skipped the exact mktd scenarios the round-1 PATTERN.md /
workflow.toml updates promised to enforce.

Extend the gate to cover run/plan/plan-step explicitly; continue
skipping review/debate (they legitimately write to repo-tracked
output) and unknown task_type (stay conservative).

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round-2 audit used git diff --name-only --diff-filter=M HEAD, which falsely reports clean when a session edits and commits a tracked file and silently drops added, deleted, and renamed files. This switches repo-write auditing to a pre/post snapshot model by comparing the session's creation HEAD against the final HEAD and unioning that committed diff with current porcelain status for uncommitted changes.

The audit now reports added, modified, deleted, and renamed tracked paths in both audit-warnings.md and the manager result sidecar, while legacy sessions without a creation HEAD snapshot warn and no-op. Sidecar persistence was wired into save_result so the structured audit survives result loading.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ne (#910 round 4)

Round-3 captured pre_session_head and computed audit as
post_porcelain UNION diff(pre_head..HEAD). Pre-existing dirty
files were falsely attributed to the session because post_porcelain
includes them unchanged.

Capture pre_session_porcelain alongside pre_session_head. Compute
attributable uncommitted changes as set-difference
(post_porcelain \ pre_porcelain) keyed on (path, status-code).
Union with the commit-range diff. Conservative: a file dirty with
the same status code in both snapshots is NOT attributed (documented
limitation; round-5 could add content-hash refinement if needed).

Legacy sessions without pre_session_porcelain fall back to
commits-only audit + tracing::warn.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…und 5)

save_result_in wrote a new output/result.toml when manager_fields was populated, but left a pre-existing sidecar on disk untouched when a subsequent save arrived with empty manager_fields. The stale file was rehydrated by load_result_in on next read, silently attributing earlier-run audit/report metadata to the clean run.

Explicitly remove output/result.toml and its artifact entry when the current save has no manager sidecar. Regression tests cover stale contract-sidecar cleanup plus save -> populated manager fields -> save empty -> reload returns empty manager_fields.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… round 6)

Round-5 made save_result_in unlink output/result.toml whenever manager_fields.as_sidecar() returned None. That overcorrected the stale-sidecar problem: an envelope-only save with empty manager_fields could wipe legitimate manager metadata written by a different path.

Introduce SaveOptions::clear_stale_manager_sidecar with a safe default of false. Default save_result_in behavior now preserves existing sidecars and their contract artifact entries; explicit cleanup goes through clear_manager_sidecar(). Tests now cover preservation by default and explicit clear behavior, and the sidecar-preservation tests were split out to keep the monolith guard green.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…audit (#910 round 7)

Two related correctness fixes for round-6 residuals:

1. manager_result.rs published the envelope before the sidecar,
   leaving the envelope referencing a missing/stale sidecar if the
   sidecar write or clear failed. Reorder: write sidecar atomically
   via temp+rename first, then publish envelope last. On sidecar
   failure, envelope stays unchanged.

2. pipeline_post_exec audit errors propagated via ?, turning a
   successful read-only session into a failed one whenever the
   audit helper could not parse or persist. Audit is observational;
   swallow its Result at call sites and emit tracing::warn! instead.
   Three audit boundaries (compute / persist / envelope-mutate) all
   best-effort.

Tests cover sidecar-write-failure preserving envelope and
audit-failure preserving execution success.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… round 8)

Round-7 made the write path atomic (sidecar temp+rename, then envelope
publish). The clear path was asymmetric: clear_manager_sidecar() ran
BEFORE the new envelope was written. On envelope-write failure, disk
left an old envelope referencing a now-deleted sidecar.

Mirror the write-path ordering: on clear, write the new envelope
(with output/result.toml already removed from artifacts) atomically
FIRST, and unlink the sidecar only after envelope publication
succeeds. A stray sidecar on disk if unlink itself fails is
acceptable — envelope remains authoritative.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…round 9)

Round-8 left the audit comparing against git_head_at_creation / pre_session_porcelain captured once at session birth. A reused session where turn 1 modified tracked files and turn 2 resumed without edits would still emit a repo_write_audit warning on turn 2 because the baseline was stale by a whole turn.

Capture a fresh (HEAD, porcelain) snapshot at the start of every execution, thread it through PostExecContext, and prefer it over the session-creation baseline in the audit compute step. Session-creation fields remain as a resilient fallback when per-execution capture fails.

Closes #910.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@RyderFreeman4Logos
Copy link
Copy Markdown
Owner Author

Merge audit trail — 9 rounds iterative fixes + design-residual circuit-break

This PR went through 9 rounds of review with codex. Each round surfaced a NEW concrete correctness bug (per CLAUDE.md lesson #821 rule of legitimate iterative quality work):

  1. Round 1: output path canonicalization → output/ instead of repo-root
  2. Round 2: audit gating extended beyond task_type="run" to cover plan/plan-step
  3. Round 3: git diff --name-only --diff-filter=M HEAD replaced with pre/post snapshot covering A/M/D/R + commits
  4. Round 4: pre-session porcelain baseline subtracted to avoid false-positive pre-existing-dirty attribution
  5. Round 5: stale sidecar deletion on empty manager_fields save
  6. Round 6: overcorrection of round-5 reverted via SaveOptions::clear_stale_manager_sidecar opt-in
  7. Round 7: atomic sidecar→envelope write ordering + best-effort audit (swallow errors at 3 boundaries)
  8. Round 8: symmetric atomic envelope→sidecar-unlink clear ordering
  9. Round 9: per-execution audit baseline (refresh at turn start, not session creation)

Round-10 review findings (circuit-break rationale)

Session 01KPNHKCR6C09Y2BAW34S7S9SM returned:

  • HIGH (manager_result.rs:89): preserve-by-default leaks stale sidecar across turns via load_result_in overlay.
  • MEDIUM (manager_result.rs:127): clear_manager_sidecar failure is silently swallowed.

The HIGH is a same-point design flip with round-5's original finding (preserve vs overwrite sidecar semantics). Per CLAUDE.md lesson #821, same-point contradictions trigger circuit-break rather than further rounds.

Both are filed as followup issues for design decisions:

Merging at round 9 with atomicity + per-execution baseline fixes intact. The load-side reconciliation is a separate design decision requiring user input on preserve-vs-overwrite semantics.

Closes #910.

@RyderFreeman4Logos RyderFreeman4Logos merged commit d17c274 into main Apr 20, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CSA recon sessions repeatedly violate session-local output constraint (write to repo-tracked output/summary.md)

1 participant