#585 Add static deep-review pre-pass #600
📝 Walkthrough
Introduce an orchestrator-level static pre-pass, narrow specialist agents to semantic responsibilities, add dispatch-static-v1/compact-static-v1 and WORKTREE checkpoint support in benchmark scripts, update support dispatch selectors, update benchmark fixtures/reports, and expand tests to validate the new static pre-pass outputs and routing.
Changes
Static Pre-Pass Orchestration and Agent Refactoring
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
f82095c to cad1af0
🧹 Nitpick comments (1)
scripts/test_benchmark_deep_review_epic_matrix.py (1)
136-139: ⚡ Quick win: Use a contract-stable assertion instead of prose text matching.
Line 137 currently couples this test to a human-readable domain phrase ("path-alias / loadEnv"), which is brittle and can fail on harmless wording edits.
Suggested change:
    - self.assertIn(
    -     "path-alias / loadEnv",
    -     roster["deep-review-project-checklist"]["domain"],
    - )
    + self.assertIn("deep-review-project-checklist", roster)
    + self.assertEqual(
    +     roster["deep-review-project-checklist"]["dispatch"],
    +     "project-checklist trigger",
    + )
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@scripts/test_benchmark_deep_review_epic_matrix.py`:
- Around line 136-139: The test is brittle because it matches a human-readable
phrase; replace the prose match with a contract-stable assertion: instead of
assertIn("path-alias / loadEnv",
roster["deep-review-project-checklist"]["domain"]), extract the canonical/stable
identifier for that domain from the same entry (e.g., a slug/id/tag field) or
normalize the domain string into a stable form and compare it to a test constant
(e.g., EXPECTED_DOMAIN_TAG) using assertEqual or assertIn against a list of
stable tags; update the assertion referring to
roster["deep-review-project-checklist"]["domain"] (and add a small constant like
EXPECTED_DOMAIN_TAG) so future wording changes won’t break the test.
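One way to read the suggestion above, as a minimal sketch only: the roster entry carries a machine-readable tag next to its prose domain text, and the test asserts on the tag. The `domain_tag` field, the value of `EXPECTED_DOMAIN_TAG`, and the `build_roster` helper are hypothetical stand-ins; the real roster may expose a different stable field (the suggested diff uses `dispatch`).

```python
import unittest

# Hypothetical stable identifier; not taken from the repository.
EXPECTED_DOMAIN_TAG = "path-alias-loadenv"


def build_roster():
    """Return a stand-in roster; the real one is built by the benchmark script."""
    return {
        "deep-review-project-checklist": {
            # Human-readable text that may be reworded freely.
            "domain": "Project checklist: path-alias / loadEnv conventions",
            # Assumed machine-readable field that acts as the contract.
            "domain_tag": EXPECTED_DOMAIN_TAG,
        }
    }


class RosterContractTest(unittest.TestCase):
    def test_project_checklist_tag_is_stable(self):
        roster = build_roster()
        self.assertIn("deep-review-project-checklist", roster)
        # Assert on the tag, not on the prose stored in "domain".
        self.assertEqual(
            roster["deep-review-project-checklist"]["domain_tag"],
            EXPECTED_DOMAIN_TAG,
        )
```

With this shape, rewording the `domain` text leaves the test green, while renaming or dropping the agent entry still fails it.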
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 008648f6-98d4-46c3-90c4-4fae0e2493b8
📒 Files selected for processing (15)
- .claude/agents/deep-review-ci.md
- .claude/agents/deep-review-project-checklist.md
- .claude/agents/deep-review-qa.md
- .claude/agents/deep-review-security.md
- .claude/agents/deep-review-typescript.md
- .claude/skills/deep-review-pro/SKILL.md
- docs/AI_ASSISTANTS.md
- docs/deep-review-pro-benchmark/README.md
- docs/deep-review-pro-benchmark/fixtures.json
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
- scripts/benchmark_deep_review_epic_matrix.py
- scripts/deep_review_benchmark_support.py
- scripts/test_benchmark_deep_review_epic_matrix.py
- scripts/test_benchmark_deep_review_pro.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/skills/deep-review-pro/SKILL.md:
- Around line 418-424: The fenced code block that begins with "### aggregate" is
missing a language tag (causing markdownlint MD040); change the opening fence
from ``` to ```text so the block is fenced as ```text ... ``` and include the
same inner lines unchanged (the block starting at the "### aggregate" header).
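To catch this class of finding before review, a rough local check can scan a Markdown file for opening fences that carry no language tag. The sketch below is a simplified approximation of what markdownlint's MD040 rule reports, not its implementation; it handles backtick fences only and ignores tilde fences. The function name `untagged_fences` is made up for illustration.

```python
import re

# A fence line: up to three spaces of indent, at least three backticks,
# then an optional info string (the language tag).
FENCE = re.compile(r"^\s{0,3}(`{3,})\s*(\S*)")


def untagged_fences(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that lack a language tag."""
    flagged = []
    open_len = 0  # 0 means we are currently outside a fenced block
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        ticks, info = match.groups()
        if open_len == 0:
            # Opening fence: remember its length and check for a tag.
            open_len = len(ticks)
            if not info:
                flagged.append(lineno)
        elif len(ticks) >= open_len and not info:
            # Closing fence: at least as long as the opener, no info string.
            open_len = 0
    return flagged
```

Running it over `SKILL.md` and checking whether the reported line falls in the 418-424 range would confirm whether the finding is still valid before applying the `text` tag.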
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: dcefc071-9621-47cc-a89c-024c0105a599
📒 Files selected for processing (16)
- .claude/agents/deep-review-ci.md
- .claude/agents/deep-review-project-checklist.md
- .claude/agents/deep-review-qa.md
- .claude/agents/deep-review-security.md
- .claude/agents/deep-review-typescript.md
- .claude/skills/deep-review-pro/SKILL.md
- docs/AI_ASSISTANTS.md
- docs/CONFIGURATION.md
- docs/deep-review-pro-benchmark/README.md
- docs/deep-review-pro-benchmark/fixtures.json
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
- scripts/benchmark_deep_review_epic_matrix.py
- scripts/deep_review_benchmark_support.py
- scripts/test_benchmark_deep_review_epic_matrix.py
- scripts/test_benchmark_deep_review_pro.py
✅ Files skipped from review due to trivial changes (5)
- docs/CONFIGURATION.md
- .claude/agents/deep-review-security.md
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
- docs/deep-review-pro-benchmark/README.md
🚧 Files skipped from review as they are similar to previous changes (5)
- .claude/agents/deep-review-typescript.md
- docs/deep-review-pro-benchmark/fixtures.json
- docs/AI_ASSISTANTS.md
- .claude/agents/deep-review-project-checklist.md
- scripts/benchmark_deep_review_epic_matrix.py
…tic pre-pass Co-authored-by: Cursor <cursoragent@cursor.com>
🧹 Nitpick comments (1)
docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md (1)
9-11: ⚡ Quick win: Clarify which measurement surface includes aggregate output.
Line 9 states "aggregate output is unchanged from #583 compact mode," and the "Prompt-Footprint Estimate" table (lines 53-64) measures prompt input only. However, the "Epic Comparable Benchmark" incremental delta (lines 21-24) shows combined chars of 2,037,880 → 1,975,708, and the starting value is 11,561 chars larger than the prompt-only total of 2,026,319 on line 64.
This suggests the Epic Comparable Benchmark section includes aggregate output in the combined totals, while the Prompt-Footprint Estimate table does not. Consider adding a brief note near line 15 or line 53 clarifying which tables measure prompt-only vs. combined (prompt + aggregate output) totals to prevent reader confusion.
Also applies to: 53-64
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md`:
- Around line 9-11: Summary: clarify which tables measure prompt-only vs
prompt+aggregate output. Update the "Epic Comparable Benchmark" section to
explicitly state that its incremental deltas include combined prompt + aggregate
output, and add a short clarifying footnote or parenthetical to the
"Prompt-Footprint Estimate" table header stating it measures prompt-input only;
adjust the text near the "Epic Comparable Benchmark" and "Prompt-Footprint
Estimate" headings so readers see the measurement surface immediately and ensure
the incremental-delta wording in "Epic Comparable Benchmark incremental delta"
references the combined-char totals.
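The arithmetic behind this finding can be checked directly from the figures quoted in the comment. The snippet below only reproduces the subtraction; attributing the gap to aggregate output is the reviewer's hypothesis, not something the numbers prove. The variable names are illustrative.

```python
# Figures quoted in the review comment above.
COMBINED_BEFORE = 2_037_880    # Epic Comparable Benchmark, combined chars before
COMBINED_AFTER = 1_975_708     # Epic Comparable Benchmark, combined chars after
PROMPT_ONLY_TOTAL = 2_026_319  # Prompt-Footprint Estimate total (line 64)

# Gap between the combined surface and the prompt-only surface.
surface_gap = COMBINED_BEFORE - PROMPT_ONLY_TOTAL
# Incremental delta reported on the combined surface.
incremental_delta = COMBINED_AFTER - COMBINED_BEFORE

print(f"surface gap: {surface_gap:,} chars")              # surface gap: 11,561 chars
print(f"incremental delta: {incremental_delta:,} chars")  # incremental delta: -62,172 chars
```

The 11,561-char gap is what the two tables disagree on, so a note naming each table's measurement surface would account for it.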
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: bfb77c74-5dc6-47f0-8891-d59e9193a2c3
📒 Files selected for processing (9)
- docs/CONFIGURATION.md
- docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md
- docs/deep-review-pro-benchmark/reports/585-static-prepass.md
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
- scripts/benchmark_deep_review_epic_matrix.py
- scripts/deep_review_benchmark_support.py
- scripts/test_benchmark_deep_review_epic_matrix.py
- scripts/test_benchmark_deep_review_pro.py
✅ Files skipped from review due to trivial changes (3)
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
- docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
- docs/CONFIGURATION.md
🚧 Files skipped from review as they are similar to previous changes (4)
- scripts/test_benchmark_deep_review_pro.py
- scripts/benchmark_deep_review_epic_matrix.py
- scripts/test_benchmark_deep_review_epic_matrix.py
- scripts/deep_review_benchmark_support.py
Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
- deep-review-pro static pre-pass for TypeScript compile checks, formatting, actionlint/shellcheck coverage, secret scanning, and coverage-matrix validation.
- post-585 checkpoint.
Test Plan
- python3 scripts/benchmark_deep_review_epic_matrix.py
- python3 scripts/test_benchmark_deep_review_epic_matrix.py
- python3 scripts/test_benchmark_deep_review_pro.py
- python3 -m compileall scripts/benchmark_deep_review_epic_matrix.py scripts/deep_review_benchmark_support.py scripts/test_benchmark_deep_review_epic_matrix.py scripts/test_benchmark_deep_review_pro.py
- git diff origin/main..HEAD --check
- python3 scripts/benchmark_deep_review_epic_matrix.py --issue-section 585
Closes #585
Contributes to #587
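As a rough illustration of the idea in the Summary, an orchestrator-level pre-pass can run deterministic checks once and hand only the outcomes to the specialist agents. This is a sketch under stated assumptions, not the implementation in this PR: the `StaticCheck` and `CheckResult` types, the `run_prepass` and `failed_checks` functions, the check list, and the example command lines (including `scripts/example.sh`) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class StaticCheck:
    name: str               # label used in the pre-pass report
    command: Sequence[str]  # illustrative command line, not the repo's actual one


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool


# Hypothetical check list loosely following the Summary; commands are assumptions.
CHECKS = (
    StaticCheck("typescript-compile", ("tsc", "--noEmit")),
    StaticCheck("actionlint", ("actionlint",)),
    StaticCheck("shellcheck", ("shellcheck", "scripts/example.sh")),
)


def run_prepass(
    checks: Sequence[StaticCheck],
    runner: Callable[[Sequence[str]], int],
) -> list[CheckResult]:
    """Run each check through `runner` (which returns an exit code) and record pass/fail."""
    return [CheckResult(check.name, runner(check.command) == 0) for check in checks]


def failed_checks(results: Sequence[CheckResult]) -> list[str]:
    """Names of checks that did not pass, in the order they ran."""
    return [result.name for result in results if not result.passed]
```

Injecting `runner` keeps the sketch testable without the tools installed; a real runner could be `lambda cmd: subprocess.run(cmd).returncode`.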
Epic Comparable Benchmark
These rows are generated from 587-epic-token-cost-matrix.md so this issue can be compared with every other #587 child story using the same units.
Use this section for cross-ticket comparison. Story-specific tables below are retained as local evidence and may use a narrower prompt-only, output-only, dispatch-only, or rerun/cache proxy surface.
Incremental Delta: post-584 -> post-585
Cumulative Delta: original-580 -> post-585