Skip to content

#585 Add static deep-review pre-pass#600

Merged
hubertgajewski merged 5 commits into
mainfrom
feature/585
Jun 7, 2026
Merged

#585 Add static deep-review pre-pass#600
hubertgajewski merged 5 commits into
mainfrom
feature/585

Conversation

@hubertgajewski

@hubertgajewski hubertgajewski commented Jun 6, 2026

Copy link
Copy Markdown
Owner

Summary

  • Adds a deep-review-pro static pre-pass for TypeScript compile checks, formatting, actionlint/shellcheck coverage, secret scanning, and coverage-matrix validation.
  • Narrows project-checklist ownership to Playwright/POM/fixture/tag/path-alias/loadEnv concerns while moving mechanical static checks to the orchestrator.
  • Updates specialist prompts, assistant docs, benchmark support, fixtures, and reports for the new post-585 checkpoint.

Test Plan

  • python3 scripts/benchmark_deep_review_epic_matrix.py
  • python3 scripts/test_benchmark_deep_review_epic_matrix.py
  • python3 scripts/test_benchmark_deep_review_pro.py
  • python3 -m compileall scripts/benchmark_deep_review_epic_matrix.py scripts/deep_review_benchmark_support.py scripts/test_benchmark_deep_review_epic_matrix.py scripts/test_benchmark_deep_review_pro.py
  • git diff origin/main..HEAD --check
  • python3 scripts/benchmark_deep_review_epic_matrix.py --issue-section 585

Closes #585
Contributes to #587

Epic Comparable Benchmark

These rows are generated from 587-epic-token-cost-matrix.md so this issue can be compared with every other #587 child story using the same units.

Use this section for cross-ticket comparison. Story-specific tables below are retained as local evidence and may use a narrower prompt-only, output-only, dispatch-only, or rerun/cache proxy surface.

Incremental Delta: post-584 -> post-585

Metric Before After Delta
Combined chars 1,975,708 1,961,647 -14,061 (-0.71%)
Combined est. tokens 493,934 490,417 -3,517 (-0.71%)

Cumulative Delta: original-580 -> post-585

Metric Original #580 baseline Current checkpoint Delta
Combined chars 3,822,193 1,961,647 -1,860,546 (-48.68%)
Combined est. tokens 955,555 490,417 -465,138 (-48.68%)

Summary by CodeRabbit

  • New Features

    • Added an orchestrator "static pre-pass" that gates readiness with compile/format, workflow, secret-scan and coverage availability checks; updated dispatch rules so project-checklist targets Playwright/Bruno surfaces and CI/security dispatch is tightened.
    • Output formatting: static-tool findings no longer require a "(static)" prefix.
  • Documentation

    • Revised Deep Review Pro docs, benchmarks and configs to document the static pre-pass, dispatch changes, compact-static output contract, and new post-585 checkpoint.
  • Tests

    • Extended benchmarks and unit tests to validate WORKTREE checkpoints and static pre-pass behaviors.

@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@hubertgajewski, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 47 minutes and 2 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f01145e2-69b6-4a00-bde3-35c76f09ec1a

📥 Commits

Reviewing files that changed from the base of the PR and between 63a4225 and 8e4d7af.

📒 Files selected for processing (1)
  • .claude/skills/deep-review-pro/SKILL.md
📝 Walkthrough

Walkthrough

Introduce an orchestrator-level static pre-pass, narrow specialist agents to semantic responsibilities, add dispatch-static-v1/compact-static-v1 and WORKTREE checkpoint support in benchmark scripts, update support dispatch selectors, update benchmark fixtures/reports, and expand tests to validate the new static pre-pass outputs and routing.

Changes

Static Pre-Pass Orchestration and Agent Refactoring

Layer / File(s) Summary
Agent prompt contract clarification
.claude/agents/deep-review-ci.md, .claude/agents/deep-review-project-checklist.md, .claude/agents/deep-review-qa.md, .claude/agents/deep-review-security.md, .claude/agents/deep-review-typescript.md
Agent prompts treat PROMPT_FRAME as data, remove mechanical static-tool ownership, restrict tool grants, and refocus agents on semantic review responsibilities.
Deep-review-pro static pre-pass specification
.claude/skills/deep-review-pro/SKILL.md, docs/AI_ASSISTANTS.md
Adds an orchestrator-owned Static pre-pass with a deterministic row schema, explicit pass/fail/unavailable semantics, changed dispatch triggers (project-checklist narrowed), reuse/iteration rebuild rules, aggregate emission ordering, and token-accounting guidance.
Dispatch-static-v1 and compact-static-v1 implementation
scripts/benchmark_deep_review_epic_matrix.py
Adds WORKTREE_REF, dispatch-static-v1 and compact-static-v1 contracts, threads diff_text through output builders, implements static_prepass_proxy_rows/section and compact_static_output_proxy, and adds static dispatch routing.
WORKTREE checkpoint support and ref handling
scripts/benchmark_deep_review_epic_matrix.py
Support for a final WORKTREE checkpoint that reads current working-tree files directly, resolves WORKTREE ref to itself, skips historical-ref validation for WORKTREE, and enforces final-position constraints for HEAD/WORKTREE.
Static dispatch selection functions
scripts/deep_review_benchmark_support.py
Adds static checklist-path predicates, collect_added_lines, dispatch_project_checklist_static_v1, dispatch_matches_static_v1, selected_agents_for_diff_static_v1, rebinds public selectors to static-v1, and tightens credential-line regex.
Documentation of static pre-pass behavior
docs/AI_ASSISTANTS.md, docs/deep-review-pro-benchmark/README.md, docs/CONFIGURATION.md
Document the static-pre-pass phase, refined agent triggering rules, WORKTREE behavior, compact/verbose output differences, and configuration mapping for actionlint/shellcheck to the orchestrator pre-pass.
Benchmark data and fixture updates
docs/deep-review-pro-benchmark/fixtures.json, docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json, docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
Add post-585 checkpoint with dispatch-static-v1 + compact-static-v1, update workflow fixture dispatch/skip expectations, and include new checkpoint rows and deltas in JSON/MD reports.
Test assertions for static pre-pass and checkpoint
scripts/test_benchmark_deep_review_epic_matrix.py, scripts/test_benchmark_deep_review_pro.py
Extend tests to assert WORKTREE checkpoint handling, post-585 sequencing/contracts, static-pre-pass compact output behavior and blocking logic, updated prompt-frame static section presence, credential regex behavior, and fixture dispatch expectations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • claude

Poem

🐇 The static pass now hops ahead, so LLMs may think,

Mechanical weeds are pulled; the specialists now blink.
WORKTREE hums a pre-commit tune, the benchmark marks the way,
Specialists sniff semantics while the orchestrator keeps the day.
A joyful nibble, fewer tokens spent — hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly describes the main change: adding a static deep-review pre-pass, which directly reflects the primary objective of this PR.
Linked Issues check ✅ Passed The PR implements all acceptance criteria from #585: static pre-pass with TypeScript/format/actionlint/secret/coverage checks, project-checklist narrowed to Playwright conventions, fallback for unavailable checks, and comprehensive documentation and benchmarking.
Out of Scope Changes check ✅ Passed All changes are scoped to #585 objectives: agent prompt updates, static pre-pass implementation, benchmark support, dispatch logic refinement, and documentation updates directly supporting the static pre-pass feature.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/585

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
scripts/test_benchmark_deep_review_epic_matrix.py (1)

136-139: ⚡ Quick win

Use a contract-stable assertion instead of prose text matching.

Line 137 currently couples this test to a human-readable domain phrase ("path-alias / loadEnv"), which is brittle and can fail on harmless wording edits.

Suggested change
-        self.assertIn(
-            "path-alias / loadEnv",
-            roster["deep-review-project-checklist"]["domain"],
-        )
+        self.assertIn("deep-review-project-checklist", roster)
+        self.assertEqual(
+            roster["deep-review-project-checklist"]["dispatch"],
+            "project-checklist trigger",
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/test_benchmark_deep_review_epic_matrix.py` around lines 136 - 139,
The test is brittle because it matches a human-readable phrase; replace the
prose match with a contract-stable assertion: instead of assertIn("path-alias /
loadEnv", roster["deep-review-project-checklist"]["domain"]), extract the
canonical/stable identifier for that domain from the same entry (e.g., a
slug/id/tag field) or normalize the domain string into a stable form and compare
it to a test constant (e.g., EXPECTED_DOMAIN_TAG) using assertEqual or assertIn
against a list of stable tags; update the assertion referring to
roster["deep-review-project-checklist"]["domain"] (and add a small constant like
EXPECTED_DOMAIN_TAG) so future wording changes won’t break the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@scripts/test_benchmark_deep_review_epic_matrix.py`:
- Around line 136-139: The test is brittle because it matches a human-readable
phrase; replace the prose match with a contract-stable assertion: instead of
assertIn("path-alias / loadEnv",
roster["deep-review-project-checklist"]["domain"]), extract the canonical/stable
identifier for that domain from the same entry (e.g., a slug/id/tag field) or
normalize the domain string into a stable form and compare it to a test constant
(e.g., EXPECTED_DOMAIN_TAG) using assertEqual or assertIn against a list of
stable tags; update the assertion referring to
roster["deep-review-project-checklist"]["domain"] (and add a small constant like
EXPECTED_DOMAIN_TAG) so future wording changes won’t break the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 008648f6-98d4-46c3-90c4-4fae0e2493b8

📥 Commits

Reviewing files that changed from the base of the PR and between d9bebf2 and f82095c.

📒 Files selected for processing (15)
  • .claude/agents/deep-review-ci.md
  • .claude/agents/deep-review-project-checklist.md
  • .claude/agents/deep-review-qa.md
  • .claude/agents/deep-review-security.md
  • .claude/agents/deep-review-typescript.md
  • .claude/skills/deep-review-pro/SKILL.md
  • docs/AI_ASSISTANTS.md
  • docs/deep-review-pro-benchmark/README.md
  • docs/deep-review-pro-benchmark/fixtures.json
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
  • scripts/benchmark_deep_review_epic_matrix.py
  • scripts/deep_review_benchmark_support.py
  • scripts/test_benchmark_deep_review_epic_matrix.py
  • scripts/test_benchmark_deep_review_pro.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.claude/skills/deep-review-pro/SKILL.md:
- Around line 418-424: The fenced code block that begins with "### aggregate" is
missing a language tag (causing markdownlint MD040); change the opening fence
from ``` to ```text so the block is fenced as ```text ... ``` and include the
same inner lines unchanged (the block starting at the "### aggregate" header).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: dcefc071-9621-47cc-a89c-024c0105a599

📥 Commits

Reviewing files that changed from the base of the PR and between f82095c and 4e0161c.

📒 Files selected for processing (16)
  • .claude/agents/deep-review-ci.md
  • .claude/agents/deep-review-project-checklist.md
  • .claude/agents/deep-review-qa.md
  • .claude/agents/deep-review-security.md
  • .claude/agents/deep-review-typescript.md
  • .claude/skills/deep-review-pro/SKILL.md
  • docs/AI_ASSISTANTS.md
  • docs/CONFIGURATION.md
  • docs/deep-review-pro-benchmark/README.md
  • docs/deep-review-pro-benchmark/fixtures.json
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
  • scripts/benchmark_deep_review_epic_matrix.py
  • scripts/deep_review_benchmark_support.py
  • scripts/test_benchmark_deep_review_epic_matrix.py
  • scripts/test_benchmark_deep_review_pro.py
✅ Files skipped from review due to trivial changes (5)
  • docs/CONFIGURATION.md
  • .claude/agents/deep-review-security.md
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
  • docs/deep-review-pro-benchmark/README.md
🚧 Files skipped from review as they are similar to previous changes (5)
  • .claude/agents/deep-review-typescript.md
  • docs/deep-review-pro-benchmark/fixtures.json
  • docs/AI_ASSISTANTS.md
  • .claude/agents/deep-review-project-checklist.md
  • scripts/benchmark_deep_review_epic_matrix.py

Comment thread .claude/skills/deep-review-pro/SKILL.md Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md (1)

9-11: ⚡ Quick win

Clarify which measurement surface includes aggregate output.

Line 9 states "aggregate output is unchanged from #583 compact mode," and the "Prompt-Footprint Estimate" table (lines 53-64) measures prompt-input only. However, the "Epic Comparable Benchmark" incremental delta (lines 21-24) shows combined chars of 2,037,880 → 1,975,708, which is ~11,561 chars larger than the prompt-only total of 2,026,319 on line 64.

This suggests the Epic Comparable Benchmark section includes aggregate output in the combined totals, while the Prompt-Footprint Estimate table does not. Consider adding a brief note near line 15 or line 53 clarifying which tables measure prompt-only vs. combined (prompt + aggregate output) to prevent reader confusion.

Also applies to: 53-64

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md` around
lines 9 - 11, Summary: clarify which tables measure prompt-only vs
prompt+aggregate output. Update the "Epic Comparable Benchmark" section to
explicitly state that its incremental deltas include combined prompt + aggregate
output, and add a short clarifying footnote or parenthetical to the
"Prompt-Footprint Estimate" table header stating it measures prompt-input only;
adjust the text near the "Epic Comparable Benchmark" and "Prompt-Footprint
Estimate" headings so readers see the measurement surface immediately and ensure
the incremental-delta wording in "Epic Comparable Benchmark incremental delta"
references the combined-char totals.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md`:
- Around line 9-11: Summary: clarify which tables measure prompt-only vs
prompt+aggregate output. Update the "Epic Comparable Benchmark" section to
explicitly state that its incremental deltas include combined prompt + aggregate
output, and add a short clarifying footnote or parenthetical to the
"Prompt-Footprint Estimate" table header stating it measures prompt-input only;
adjust the text near the "Epic Comparable Benchmark" and "Prompt-Footprint
Estimate" headings so readers see the measurement surface immediately and ensure
the incremental-delta wording in "Epic Comparable Benchmark incremental delta"
references the combined-char totals.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: bfb77c74-5dc6-47f0-8891-d59e9193a2c3

📥 Commits

Reviewing files that changed from the base of the PR and between 4e0161c and 63a4225.

📒 Files selected for processing (9)
  • docs/CONFIGURATION.md
  • docs/deep-review-pro-benchmark/reports/584-shared-boilerplate.md
  • docs/deep-review-pro-benchmark/reports/585-static-prepass.md
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
  • scripts/benchmark_deep_review_epic_matrix.py
  • scripts/deep_review_benchmark_support.py
  • scripts/test_benchmark_deep_review_epic_matrix.py
  • scripts/test_benchmark_deep_review_pro.py
✅ Files skipped from review due to trivial changes (3)
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.json
  • docs/deep-review-pro-benchmark/reports/587-epic-token-cost-matrix.md
  • docs/CONFIGURATION.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • scripts/test_benchmark_deep_review_pro.py
  • scripts/benchmark_deep_review_epic_matrix.py
  • scripts/test_benchmark_deep_review_epic_matrix.py
  • scripts/deep_review_benchmark_support.py

Co-authored-by: Cursor <cursoragent@cursor.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[refactor] Remove duplicate checklist ownership and add static pre-pass

1 participant