test(chat): add visual qa artifact reports#414
Conversation
📝 Walkthrough>Runner: throws or returnsRunner->>writeVisualQaReport: { status: passed/failed, metrics, failure } |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/contract/scripts/verify-chat-visual-qa.mjs`:
- Line 51: The verification in verify-chat-visual-qa.mjs only rejects
approved-real when paired with real_tested:true, so it can still slip through
with real_tested:false. Update the assertion in the script check to explicitly
reject any occurrence of approved-real in the script text, alongside the
existing real_tested and pnpm tauri build guards, so the scope restriction is
enforced regardless of other flags.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 514be412-39c3-47ad-ae86-ac3603bac721
📒 Files selected for processing (6)
app/desktop/scripts/manual-chat-flow-check.mjsapp/web/scripts/manual-chat-flow-check.mjsdocs/progress/MASTER.mdscripts/verify/chat-acceptance.mjstests/contract/scripts/verify-chat-acceptance.mjstests/contract/scripts/verify-chat-visual-qa.mjs
| assert(script.includes('Visual QA screenshot:'), `${label} stdout prints screenshot path`); | ||
| assert(script.includes('Visual QA metrics:'), `${label} stdout prints metrics path`); | ||
| assert(script.includes('Visual QA report:'), `${label} stdout prints report path`); | ||
| assert(!/real_tested:\s*true|approved-real.+real_tested:\s*true|pnpm\s+tauri\s+build/i.test(script), `${label} script does not claim approved-real or packaged Desktop evidence`); |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Block approved-real explicitly.
This regex still passes if the script starts emitting approved-real while keeping real_tested: false, so it does not actually enforce the stated scope restriction.
Suggested fix
- assert(!/real_tested:\s*true|approved-real.+real_tested:\s*true|pnpm\s+tauri\s+build/i.test(script), `${label} script does not claim approved-real or packaged Desktop evidence`);
+ assert(!/real_tested:\s*true/i.test(script), `${label} script does not claim real_tested=true`);
+ assert(!/approved-real/i.test(script), `${label} script does not claim approved-real evidence`);
+ assert(!/pnpm\s+tauri\s+build/i.test(script), `${label} script does not claim packaged Desktop evidence`);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert(!/real_tested:\s*true|approved-real.+real_tested:\s*true|pnpm\s+tauri\s+build/i.test(script), `${label} script does not claim approved-real or packaged Desktop evidence`); | |
| assert(!/real_tested:\s*true/i.test(script), `${label} script does not claim real_tested=true`); | |
| assert(!/approved-real/i.test(script), `${label} script does not claim approved-real evidence`); | |
| assert(!/pnpm\s+tauri\s+build/i.test(script), `${label} script does not claim packaged Desktop evidence`); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/contract/scripts/verify-chat-visual-qa.mjs` at line 51, The
verification in verify-chat-visual-qa.mjs only rejects approved-real when paired
with real_tested:true, so it can still slip through with real_tested:false.
Update the assertion in the script check to explicitly reject any occurrence of
approved-real in the script text, alongside the existing real_tested and pnpm
tauri build guards, so the scope restriction is enforced regardless of other
flags.
Summary
Evidence
node tests/contract/scripts/verify-chat-visual-qa.mjs --repo-root .node tests/contract/scripts/verify-chat-acceptance.mjs --repo-root .corepack.cmd pnpm --dir app/desktop run test:visual:chat-flowcorepack.cmd pnpm --dir app/web run test:visual:chat-flowcorepack.cmd pnpm --dir app run test:acceptance:chat-flow -- --command-timeout-sec 420pwsh ./scripts/verify/verify-doc-ssot.ps1pwsh ./scripts/verify/verify-project-skills.ps1pwsh ./scripts/verify/verify-real-e2e-contract.ps1git diff --checkArtifact Examples
app/desktop/.tmp/manual-chat-flow-uiux/desktop-1440x810-chat-flow.pngapp/desktop/.tmp/manual-chat-flow-uiux/desktop-1440x810-chat-flow.metrics.jsonapp/desktop/.tmp/manual-chat-flow-uiux/desktop-chat-flow-visual-qa.jsonapp/web/.tmp/manual-chat-flow-uiux/web-1440x810-chat-flow.pngapp/web/.tmp/manual-chat-flow-uiux/web-1440x810-chat-flow.metrics.jsonapp/web/.tmp/manual-chat-flow-uiux/web-chat-flow-visual-qa.json.tmp/chat-acceptance/run-76904/chat-acceptance-manifest.jsonBoundary
real_tested=false. This proves fixture/unit, Playwright UI, and Visual QA evidence only. It does not claim real TokenDance ID login, real CLI/model/API execution, packaged Desktop, signing, release upload, or production deployment.Closes #390
Summary by CodeRabbit
New Features
Bug Fixes
Tests