#580 Make low-risk deep-review-pro agents conditional by hubertgajewski · Pull Request #591 · hubertgajewski/orwellstat

hubertgajewski · 2026-06-05T05:59:55Z

Summary

makes deep-review-security, deep-review-project-checklist, and deep-review-docs dispatch through deterministic trigger rules instead of always
updates affected agent prompts, AI assistant docs, benchmark fixtures, and fixture validation tests
records issue [enhancement] Make low-risk deep-review-pro agents conditional #580 benchmark evidence, including exact dispatch-count deltas, a deterministic prompt-footprint token proxy, and the Codex sub-agent token-usage limitation
documents the shared benchmark cost-proxy policy so [enhancement] Send agent-specific subdiffs to deep-review-pro agents #581-[enhancement] Add large-diff risk bucketing to deep-review-pro #586 use consistent fallback calculations when exact usage is unavailable

Closes #580
Contributes to #587

Test plan

python3 -m coverage run -m unittest scripts.test_benchmark_deep_review_pro && python3 -m coverage report -m scripts/benchmark-deep-review-pro.py scripts/test_benchmark_deep_review_pro.py
git diff --cached --check
python3 -m json.tool docs/deep-review-pro-benchmark/fixtures.json >/tmp/orwellstat-fixtures-json-check.txt
python3 scripts/benchmark-deep-review-pro.py --before "$tmp/before" --after "$tmp/after" --out-dir "$tmp/report" with empty captured-run directories, verifying fixture dispatch counts and unavailable token fields are reported explicitly
Prompt-footprint estimate recorded in docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md from actual agent prompts, roster domains, and fixture diffs using a documented ceil(characters / 4) heuristic
Reviewed epic [epic] Reduce deep-review-pro token cost without lowering review quality #587 child stories [enhancement] Send agent-specific subdiffs to deep-review-pro agents #581-[enhancement] Add large-diff risk bucketing to deep-review-pro #586 and documented matching fallback cost proxies in docs/deep-review-pro-benchmark/README.md
Reviewer/Claude environment can rerun /deep-review-pro benchmark captures if exact Claude <usage> token postscripts are required beyond the Codex-recorded unavailable-fields report

Summary by CodeRabbit

Release Notes

Documentation
- Updated deep-review orchestrator guides documenting conditional dispatch triggers for reviewer agents based on code change patterns
- Expanded benchmark documentation with updated fixture expectations, cost tracking methodology, and scenario-specific dispatch behavior
Tests
- Added dispatch cell parsing and validation to verify correct reviewer agent execution across different change types

coderabbitai · 2026-06-05T06:00:08Z

📝 Walkthrough

Walkthrough

The PR implements conditional dispatch for three low-risk /deep-review-pro agents by replacing unconditional always dispatch with deterministic, scope-derived triggers. deep-review-security, deep-review-project-checklist, and deep-review-docs run only when their named triggers match the resolved scope; docs, fixtures, benchmarks, and tests updated accordingly.

Changes

Conditional Dispatch Implementation

Layer / File(s)	Summary
Conditional dispatch triggers and orchestration logic `.claude/skills/deep-review-pro/SKILL.md`	Master roster dispatch cells for `deep-review-security`, `deep-review-project-checklist`, and `deep-review-docs` moved from `always` to named triggers (`security-risk trigger`, `project-checklist trigger`, `docs trigger`). Scope resolution denies are refactored into `SANDBOX_DENY_COMPONENT_PATTERNS` and a Dispatch trigger definitions section deterministically computes `CHANGED_PATHS`, `NEW_PATHS`, and `ADDED_LINES` and evaluates non-`always` rows, emitting `SKIPPED: <Dispatch cell> not satisfied` when triggers fail.
Agent dispatch gating and integration `.claude/agents/deep-review-docs.md`, `.claude/agents/deep-review-project-checklist.md`, `.claude/agents/deep-review-security.md`	Each agent prompt now documents its orchestrator dispatch trigger and explicit skip criteria (docs trigger, project-checklist trigger, security-risk trigger) and reaffirms required behaviors (e.g., tainted-data tracing for security) before emitting findings.
Dispatch behavior documentation `docs/AI_ASSISTANTS.md`, `docs/deep-review-pro-benchmark/README.md`	Docs updated to describe which specialist agents run under which triggers, how skipped agents are recorded in aggregate results, and new sections for recorded reports and cost-proxy policy.
Benchmark fixtures and test validation `docs/deep-review-pro-benchmark/fixtures.json`, `docs/deep-review-pro-benchmark/fixtures/script-code-only.diff`, `docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md`, `scripts/test_benchmark_deep_review_pro.py`	Fixtures updated (docs-only, playwright-test) and a new `script-code-only` fixture added; `scripts/test_benchmark_deep_review_pro.py` adds regex parsing for dispatch cells and unit tests validating parser failure modes and exact dispatched/skipped agent sets; benchmark report added documenting before/after dispatch deltas and validation assertions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

#580 — Direct match: implements conditional dispatch for low-risk deep-review-pro agents per the issue acceptance criteria.

Possibly related PRs

hubertgajewski/orwellstat#588 — Related benchmark tooling changes validating dispatch/fixture behavior.
hubertgajewski/orwellstat#535 — Related edits to the deep-review-docs agent prompt.

Poem

🐰 I tiptoe through the trigger light,
Waiting for the docs or signs of fright.
When paths align and risks appear,
I hop to help, both swift and clear.
Tokens saved — the burrow cheers!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: replacing unconditional dispatch of three low-risk agents with deterministic trigger-based dispatch, which is reflected throughout the changeset.
Linked Issues check	✅ Passed	The pull request successfully implements all acceptance criteria from issue `#580`: conditional dispatch rules for deep-review-project-checklist (code-only skip), deep-review-docs (no-new-files skip), and deep-review-security (conservative source/workflow/manifest/env run) with documented triggers, updated prompts, and fixture-based validation.
Out of Scope Changes check	✅ Passed	All changes are directly related to implementing conditional dispatch for low-risk agents: updates to agent prompts, SKILL.md orchestrator logic, benchmark fixtures, documentation, and fixture validation tests align with issue `#580` requirements.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/580

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.claude/skills/deep-review-pro/SKILL.md (1)
228-228: ⚡ Quick win

Clarify untracked file content-scan policy for security trigger.

Line 228 states "any untracked file that is not clearly docs/generated/test-only, or whose content cannot be safely scanned before skipping" as a risk trigger. This suggests content inspection may be required for untracked files, but line 188 states "When a path is needed only for dispatch classification, inspect the path string; do not read denied paths."

If an untracked file matches a deny pattern (e.g., .env, *secret*), should the trigger:

Fire immediately based on path alone (conservative, safe)

Skip content scan and classify by path extension only

Be treated as an error case

The current wording doesn't make the precedence clear. Consider explicitly stating: "any untracked file whose path matches a deny pattern or whose path is not clearly docs/generated/test-only (by extension or directory)" to avoid ambiguity about content scanning.
📝 Proposed clarification
-- any untracked file that is not clearly docs/generated/test-only, or whose content cannot be safely scanned before skipping
+- any untracked file whose path matches a deny pattern, or whose path is not clearly docs/generated/test-only by extension or directory (e.g., not under `docs/`, not ending in `.md`, `.snap`, `.diff`, etc.)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/skills/deep-review-pro/SKILL.md at line 228, The wording around
untracked-file handling is ambiguous between the rule "any untracked file that
is not clearly docs/generated/test-only, or whose content cannot be safely
scanned before skipping" and the rule "When a path is needed only for dispatch
classification, inspect the path string; do not read denied paths"; update the
SKILL.md wording to explicitly state precedence and behavior by
replacing/augmenting the sentence that begins "any untracked file that is not
clearly docs/generated/test-only..." with a clear rule such as: if an untracked
file's path matches a deny pattern (e.g., .env, *secret*) it should trigger the
security rule based on path alone (no content read), otherwise only files
clearly in docs/generated/test-only by extension or directory are excluded from
scanning; reference the original phrases "any untracked file that is not clearly
docs/generated/test-only" and "When a path is needed only for dispatch
classification, inspect the path string; do not read denied paths" so reviewers
can locate and replace the ambiguous text.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/test_benchmark_deep_review_pro.py`:
- Around line 66-76: The parser read_deep_review_pro_dispatch_cells currently
silently overwrites entries when parsing SKILL.md; update it to validate each
parsed row is well-formed (ensure cells has expected length and the first cell
matches pattern `deep-review-*`) and to fail fast by raising an exception if a
malformed row is encountered or if a duplicate key would be inserted into
dispatch_cells; reference the function name read_deep_review_pro_dispatch_cells,
the dispatch_cells dict, and the parsing of cells so the fix checks cells length
and format and uses an explicit error (instead of last-write-wins) when a
duplicate key or malformed row is detected.

---

Nitpick comments:
In @.claude/skills/deep-review-pro/SKILL.md:
- Line 228: The wording around untracked-file handling is ambiguous between the
rule "any untracked file that is not clearly docs/generated/test-only, or whose
content cannot be safely scanned before skipping" and the rule "When a path is
needed only for dispatch classification, inspect the path string; do not read
denied paths"; update the SKILL.md wording to explicitly state precedence and
behavior by replacing/augmenting the sentence that begins "any untracked file
that is not clearly docs/generated/test-only..." with a clear rule such as: if
an untracked file's path matches a deny pattern (e.g., .env, *secret*) it should
trigger the security rule based on path alone (no content read), otherwise only
files clearly in docs/generated/test-only by extension or directory are excluded
from scanning; reference the original phrases "any untracked file that is not
clearly docs/generated/test-only" and "When a path is needed only for dispatch
classification, inspect the path string; do not read denied paths" so reviewers
can locate and replace the ambiguous text.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8a500350-1915-4258-8610-9ebf892b5051

📥 Commits

Reviewing files that changed from the base of the PR and between 4398fc9 and 27cb122.

📒 Files selected for processing (10)

.claude/agents/deep-review-docs.md
.claude/agents/deep-review-project-checklist.md
.claude/agents/deep-review-security.md
.claude/skills/deep-review-pro/SKILL.md
docs/AI_ASSISTANTS.md
docs/deep-review-pro-benchmark/README.md
docs/deep-review-pro-benchmark/fixtures.json
docs/deep-review-pro-benchmark/fixtures/script-code-only.diff
docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md
scripts/test_benchmark_deep_review_pro.py

coderabbitai

🧹 Nitpick comments (1)

docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md (1)

51-51: 💤 Low value

Consider varying sentence structure for readability.

The static analysis tool correctly identified three consecutive sentences beginning with "It". While this is a minor style issue, varying the sentence structure slightly would improve readability.

✍️ Optional rewording suggestion

-This follows the benchmark README's Cost Proxy Policy. It is not billing data. It excludes harness overhead, cache effects, conversation history, model-specific tokenization, and agent output tokens. It is useful only as a deterministic before/after proxy for the prompt text this change stops sending.
+This follows the benchmark README's Cost Proxy Policy. The estimate is not billing data and excludes harness overhead, cache effects, conversation history, model-specific tokenization, and agent output tokens. It serves only as a deterministic before/after proxy for the prompt text this change stops sending.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md` at line
51, The three consecutive sentences in the doc fragment ("It is not billing
data. It excludes harness overhead, cache effects, conversation history,
model-specific tokenization, and agent output tokens. It is useful only as a
deterministic before/after proxy for the prompt text this change stops
sending.") should be rewritten to vary sentence openings for readability; locate
that sentence block and combine or rephrase so the first "It" becomes a clause
(e.g., "This is not billing data and excludes..."), the second lists exclusions
concisely, and the final sentence clarifies purpose (e.g., "serves only as a
deterministic before/after proxy...") to avoid three sentences starting with
"It".

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md`:
- Line 51: The three consecutive sentences in the doc fragment ("It is not
billing data. It excludes harness overhead, cache effects, conversation history,
model-specific tokenization, and agent output tokens. It is useful only as a
deterministic before/after proxy for the prompt text this change stops
sending.") should be rewritten to vary sentence openings for readability; locate
that sentence block and combine or rephrase so the first "It" becomes a clause
(e.g., "This is not billing data and excludes..."), the second lists exclusions
concisely, and the final sentence clarifies purpose (e.g., "serves only as a
deterministic before/after proxy...") to avoid three sentences starting with
"It".

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7304658f-5e34-4bcc-8271-58067fb49a0f

📥 Commits

Reviewing files that changed from the base of the PR and between 27cb122 and d648db9.

📒 Files selected for processing (4)

.claude/skills/deep-review-pro/SKILL.md
docs/deep-review-pro-benchmark/README.md
docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md
scripts/test_benchmark_deep_review_pro.py

🚧 Files skipped from review as they are similar to previous changes (2)

docs/deep-review-pro-benchmark/README.md
.claude/skills/deep-review-pro/SKILL.md

#580 Make deep-review-pro low-risk dispatch conditional

27cb122

hubertgajewski had a problem deploying to staging June 5, 2026 05:59 — with GitHub Actions Failure

hubertgajewski temporarily deployed to staging June 5, 2026 05:59 — with GitHub Actions Inactive

hubertgajewski temporarily deployed to staging June 5, 2026 06:00 — with GitHub Actions Inactive

hubertgajewski temporarily deployed to staging June 5, 2026 06:01 — with GitHub Actions Inactive

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

Comment thread scripts/test_benchmark_deep_review_pro.py Outdated

#580 Record prompt-footprint estimate

263f1bf

hubertgajewski temporarily deployed to staging June 5, 2026 06:28 — with GitHub Actions Inactive

hubertgajewski had a problem deploying to staging June 5, 2026 06:29 — with GitHub Actions Error

hubertgajewski temporarily deployed to staging June 5, 2026 06:41 — with GitHub Actions Inactive

hubertgajewski temporarily deployed to staging June 5, 2026 06:42 — with GitHub Actions Inactive

#580 Keep deny pattern docs single-sourced

ee963e3

hubertgajewski temporarily deployed to staging June 5, 2026 06:47 — with GitHub Actions Inactive

hubertgajewski temporarily deployed to staging June 5, 2026 06:48 — with GitHub Actions Inactive

#580 Address deep-review-pro review comments

d648db9

coderabbitai Bot reviewed Jun 5, 2026

View reviewed changes

This was referenced Jun 5, 2026

#581 Send scoped subdiffs to deep-review-pro agents #592

Merged

#582 Reuse deep-review-pro rerun results #593

Merged

#583 Reduce deep-review-pro output verbosity #594

Merged

#585 Add static deep-review pre-pass #600

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#580 Make low-risk deep-review-pro agents conditional#591

#580 Make low-risk deep-review-pro agents conditional#591
hubertgajewski merged 6 commits into
mainfrom
feature/580

hubertgajewski commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hubertgajewski commented Jun 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hubertgajewski commented Jun 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 5, 2026 •

edited

Loading