Skip to content

#580 Make low-risk deep-review-pro agents conditional#591

Merged
hubertgajewski merged 6 commits into
mainfrom
feature/580
Jun 5, 2026
Merged

#580 Make low-risk deep-review-pro agents conditional#591
hubertgajewski merged 6 commits into
mainfrom
feature/580

Conversation

@hubertgajewski

@hubertgajewski hubertgajewski commented Jun 5, 2026

Copy link
Copy Markdown
Owner

Summary

Closes #580
Contributes to #587

Test plan

  • python3 -m coverage run -m unittest scripts.test_benchmark_deep_review_pro && python3 -m coverage report -m scripts/benchmark-deep-review-pro.py scripts/test_benchmark_deep_review_pro.py
  • git diff --cached --check
  • python3 -m json.tool docs/deep-review-pro-benchmark/fixtures.json >/tmp/orwellstat-fixtures-json-check.txt
  • python3 scripts/benchmark-deep-review-pro.py --before "$tmp/before" --after "$tmp/after" --out-dir "$tmp/report" with empty captured-run directories, verifying fixture dispatch counts and unavailable token fields are reported explicitly
  • Prompt-footprint estimate recorded in docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md from actual agent prompts, roster domains, and fixture diffs using a documented ceil(characters / 4) heuristic
  • Reviewed epic [epic] Reduce deep-review-pro token cost without lowering review quality #587 child stories [enhancement] Send agent-specific subdiffs to deep-review-pro agents #581-[enhancement] Add large-diff risk bucketing to deep-review-pro #586 and documented matching fallback cost proxies in docs/deep-review-pro-benchmark/README.md
  • Reviewer/Claude environment can rerun /deep-review-pro benchmark captures if exact Claude <usage> token postscripts are required beyond the Codex-recorded unavailable-fields report

Summary by CodeRabbit

Release Notes

  • Documentation

    • Updated deep-review orchestrator guides documenting conditional dispatch triggers for reviewer agents based on code change patterns
    • Expanded benchmark documentation with updated fixture expectations, cost tracking methodology, and scenario-specific dispatch behavior
  • Tests

    • Added dispatch cell parsing and validation to verify correct reviewer agent execution across different change types

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The PR implements conditional dispatch for three low-risk /deep-review-pro agents by replacing unconditional always dispatch with deterministic, scope-derived triggers. deep-review-security, deep-review-project-checklist, and deep-review-docs run only when their named triggers match the resolved scope; docs, fixtures, benchmarks, and tests updated accordingly.

Changes

Conditional Dispatch Implementation

Layer / File(s) Summary
Conditional dispatch triggers and orchestration logic
.claude/skills/deep-review-pro/SKILL.md
Master roster dispatch cells for deep-review-security, deep-review-project-checklist, and deep-review-docs moved from always to named triggers (security-risk trigger, project-checklist trigger, docs trigger). Scope resolution denies are refactored into SANDBOX_DENY_COMPONENT_PATTERNS and a Dispatch trigger definitions section deterministically computes CHANGED_PATHS, NEW_PATHS, and ADDED_LINES and evaluates non-always rows, emitting SKIPPED: <Dispatch cell> not satisfied when triggers fail.
Agent dispatch gating and integration
.claude/agents/deep-review-docs.md, .claude/agents/deep-review-project-checklist.md, .claude/agents/deep-review-security.md
Each agent prompt now documents its orchestrator dispatch trigger and explicit skip criteria (docs trigger, project-checklist trigger, security-risk trigger) and reaffirms required behaviors (e.g., tainted-data tracing for security) before emitting findings.
Dispatch behavior documentation
docs/AI_ASSISTANTS.md, docs/deep-review-pro-benchmark/README.md
Docs updated to describe which specialist agents run under which triggers, how skipped agents are recorded in aggregate results, and new sections for recorded reports and cost-proxy policy.
Benchmark fixtures and test validation
docs/deep-review-pro-benchmark/fixtures.json, docs/deep-review-pro-benchmark/fixtures/script-code-only.diff, docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md, scripts/test_benchmark_deep_review_pro.py
Fixtures updated (docs-only, playwright-test) and a new script-code-only fixture added; scripts/test_benchmark_deep_review_pro.py adds regex parsing for dispatch cells and unit tests validating parser failure modes and exact dispatched/skipped agent sets; benchmark report added documenting before/after dispatch deltas and validation assertions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #580 — Direct match: implements conditional dispatch for low-risk deep-review-pro agents per the issue acceptance criteria.

Possibly related PRs

Poem

🐰 I tiptoe through the trigger light,
Waiting for the docs or signs of fright.
When paths align and risks appear,
I hop to help, both swift and clear.
Tokens saved — the burrow cheers!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: replacing unconditional dispatch of three low-risk agents with deterministic trigger-based dispatch, which is reflected throughout the changeset.
Linked Issues check ✅ Passed The pull request successfully implements all acceptance criteria from issue #580: conditional dispatch rules for deep-review-project-checklist (code-only skip), deep-review-docs (no-new-files skip), and deep-review-security (conservative source/workflow/manifest/env run) with documented triggers, updated prompts, and fixture-based validation.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing conditional dispatch for low-risk agents: updates to agent prompts, SKILL.md orchestrator logic, benchmark fixtures, documentation, and fixture validation tests align with issue #580 requirements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/580

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.claude/skills/deep-review-pro/SKILL.md (1)

228-228: ⚡ Quick win

Clarify untracked file content-scan policy for security trigger.

Line 228 states "any untracked file that is not clearly docs/generated/test-only, or whose content cannot be safely scanned before skipping" as a risk trigger. This suggests content inspection may be required for untracked files, but line 188 states "When a path is needed only for dispatch classification, inspect the path string; do not read denied paths."

If an untracked file matches a deny pattern (e.g., .env, *secret*), should the trigger:

  1. Fire immediately based on path alone (conservative, safe)
  2. Skip content scan and classify by path extension only
  3. Be treated as an error case

The current wording doesn't make the precedence clear. Consider explicitly stating: "any untracked file whose path matches a deny pattern or whose path is not clearly docs/generated/test-only (by extension or directory)" to avoid ambiguity about content scanning.

📝 Proposed clarification
-- any untracked file that is not clearly docs/generated/test-only, or whose content cannot be safely scanned before skipping
+- any untracked file whose path matches a deny pattern, or whose path is not clearly docs/generated/test-only by extension or directory (e.g., not under `docs/`, not ending in `.md`, `.snap`, `.diff`, etc.)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.claude/skills/deep-review-pro/SKILL.md at line 228, The wording around
untracked-file handling is ambiguous between the rule "any untracked file that
is not clearly docs/generated/test-only, or whose content cannot be safely
scanned before skipping" and the rule "When a path is needed only for dispatch
classification, inspect the path string; do not read denied paths"; update the
SKILL.md wording to explicitly state precedence and behavior by
replacing/augmenting the sentence that begins "any untracked file that is not
clearly docs/generated/test-only..." with a clear rule such as: if an untracked
file's path matches a deny pattern (e.g., .env, *secret*) it should trigger the
security rule based on path alone (no content read), otherwise only files
clearly in docs/generated/test-only by extension or directory are excluded from
scanning; reference the original phrases "any untracked file that is not clearly
docs/generated/test-only" and "When a path is needed only for dispatch
classification, inspect the path string; do not read denied paths" so reviewers
can locate and replace the ambiguous text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/test_benchmark_deep_review_pro.py`:
- Around line 66-76: The parser read_deep_review_pro_dispatch_cells currently
silently overwrites entries when parsing SKILL.md; update it to validate each
parsed row is well-formed (ensure cells has expected length and the first cell
matches pattern `deep-review-*`) and to fail fast by raising an exception if a
malformed row is encountered or if a duplicate key would be inserted into
dispatch_cells; reference the function name read_deep_review_pro_dispatch_cells,
the dispatch_cells dict, and the parsing of cells so the fix checks cells length
and format and uses an explicit error (instead of last-write-wins) when a
duplicate key or malformed row is detected.

---

Nitpick comments:
In @.claude/skills/deep-review-pro/SKILL.md:
- Line 228: The wording around untracked-file handling is ambiguous between the
rule "any untracked file that is not clearly docs/generated/test-only, or whose
content cannot be safely scanned before skipping" and the rule "When a path is
needed only for dispatch classification, inspect the path string; do not read
denied paths"; update the SKILL.md wording to explicitly state precedence and
behavior by replacing/augmenting the sentence that begins "any untracked file
that is not clearly docs/generated/test-only..." with a clear rule such as: if
an untracked file's path matches a deny pattern (e.g., .env, *secret*) it should
trigger the security rule based on path alone (no content read), otherwise only
files clearly in docs/generated/test-only by extension or directory are excluded
from scanning; reference the original phrases "any untracked file that is not
clearly docs/generated/test-only" and "When a path is needed only for dispatch
classification, inspect the path string; do not read denied paths" so reviewers
can locate and replace the ambiguous text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8a500350-1915-4258-8610-9ebf892b5051

📥 Commits

Reviewing files that changed from the base of the PR and between 4398fc9 and 27cb122.

📒 Files selected for processing (10)
  • .claude/agents/deep-review-docs.md
  • .claude/agents/deep-review-project-checklist.md
  • .claude/agents/deep-review-security.md
  • .claude/skills/deep-review-pro/SKILL.md
  • docs/AI_ASSISTANTS.md
  • docs/deep-review-pro-benchmark/README.md
  • docs/deep-review-pro-benchmark/fixtures.json
  • docs/deep-review-pro-benchmark/fixtures/script-code-only.diff
  • docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md
  • scripts/test_benchmark_deep_review_pro.py

Comment thread scripts/test_benchmark_deep_review_pro.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md (1)

51-51: 💤 Low value

Consider varying sentence structure for readability.

The static analysis tool correctly identified three consecutive sentences beginning with "It". While this is a minor style issue, varying the sentence structure slightly would improve readability.

✍️ Optional rewording suggestion
-This follows the benchmark README's Cost Proxy Policy. It is not billing data. It excludes harness overhead, cache effects, conversation history, model-specific tokenization, and agent output tokens. It is useful only as a deterministic before/after proxy for the prompt text this change stops sending.
+This follows the benchmark README's Cost Proxy Policy. The estimate is not billing data and excludes harness overhead, cache effects, conversation history, model-specific tokenization, and agent output tokens. It serves only as a deterministic before/after proxy for the prompt text this change stops sending.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md` at line
51, The three consecutive sentences in the doc fragment ("It is not billing
data. It excludes harness overhead, cache effects, conversation history,
model-specific tokenization, and agent output tokens. It is useful only as a
deterministic before/after proxy for the prompt text this change stops
sending.") should be rewritten to vary sentence openings for readability; locate
that sentence block and combine or rephrase so the first "It" becomes a clause
(e.g., "This is not billing data and excludes..."), the second lists exclusions
concisely, and the final sentence clarifies purpose (e.g., "serves only as a
deterministic before/after proxy...") to avoid three sentences starting with
"It".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md`:
- Line 51: The three consecutive sentences in the doc fragment ("It is not
billing data. It excludes harness overhead, cache effects, conversation history,
model-specific tokenization, and agent output tokens. It is useful only as a
deterministic before/after proxy for the prompt text this change stops
sending.") should be rewritten to vary sentence openings for readability; locate
that sentence block and combine or rephrase so the first "It" becomes a clause
(e.g., "This is not billing data and excludes..."), the second lists exclusions
concisely, and the final sentence clarifies purpose (e.g., "serves only as a
deterministic before/after proxy...") to avoid three sentences starting with
"It".

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7304658f-5e34-4bcc-8271-58067fb49a0f

📥 Commits

Reviewing files that changed from the base of the PR and between 27cb122 and d648db9.

📒 Files selected for processing (4)
  • .claude/skills/deep-review-pro/SKILL.md
  • docs/deep-review-pro-benchmark/README.md
  • docs/deep-review-pro-benchmark/reports/580-conditional-dispatch.md
  • scripts/test_benchmark_deep_review_pro.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/deep-review-pro-benchmark/README.md
  • .claude/skills/deep-review-pro/SKILL.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[enhancement] Make low-risk deep-review-pro agents conditional

1 participant