Add gcov-based test pruning with file-level coverage cache#1316
sbryngelson wants to merge 10 commits into MFlowCode:master
Conversation
Outdated review (resolved): This review was on an earlier revision. All findings have been addressed.
Force-pushed c4efbc6 to 68b035b
Claude Code Review (Head SHA: c6cf6c6, 14 files changed)

Findings:
1. Tests are blocked unless `(needs.rebuild-cache.result == 'success' || needs.rebuild-cache.result == 'skipped')`.
2. The cache rebuild runs `git rebase origin/master` followed by `git push origin HEAD:refs/heads/master`.
3. The Phase 2 parallelism cap is hardcoded: `phase2_jobs = min(n_jobs, 16)`. The cap of 16 concurrent tests (each spawning MPI processes under gcov) is documented as a memory guard, but it is not configurable.

No issues found elsewhere.
Pull request overview
Introduces file-level gcov coverage–based test pruning to speed up PR CI runs by running only tests whose covered .fpp files overlap with the PR’s changed files, with CI support to rebuild and distribute the coverage cache when needed.
Changes:
- Adds a coverage cache builder and an `--only-changes` pruning path to the `mfc.sh test` command.
- Extends CI to (optionally) rebuild/upload the coverage cache and to run PR tests with `--only-changes`.
- Updates build/test infrastructure to improve gcov fidelity (Fypp line markers, gcov-friendly flags) and adds offline unit tests for the coverage logic.
Reviewed changes
Copilot reviewed 12 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `toolchain/mfc/test/coverage.py` | Implements coverage cache build + changed-file-based pruning logic. |
| `toolchain/mfc/test/test.py` | Wires `--only-changes` and `--build-coverage-cache` into test execution and filtering. |
| `toolchain/mfc/cli/commands.py` | Adds CLI flags: `--build-coverage-cache`, `--only-changes`, `--changes-branch`. |
| `toolchain/mfc/test/test_coverage_unit.py` | Adds offline unit tests for the coverage module. |
| `toolchain/mfc/test/test_coverage_cache.json.gz` | Adds the committed coverage cache artifact. |
| `toolchain/mfc/test/case.py` | Refactors post_process-related parameter mods for reuse by the coverage builder. |
| `toolchain/mfc/test/cases.py` | Skips the 1D_qbmm example due to a gfortran 12 overflow issue. |
| `CMakeLists.txt` | Adjusts gcov builds (Fypp line markers; disables LTO for gcov; uses `-O1` in Release for accuracy/portability). |
| `.github/workflows/test.yml` | Adds the cache rebuild job + artifact download; runs PR tests with `--only-changes`. |
| `.github/workflows/common/test.sh` | Enables pruning only on PR events in SLURM-based CI runs. |
| `.github/workflows/common/rebuild-cache.sh` | Adds a SLURM script to rebuild the gcov cache. |
| `.github/scripts/submit-slurm-job.sh` | Exports `GITHUB_EVENT_NAME` into the SLURM job environment. |
| `.github/file-filter.yml` | Adds a `cases_py` filter and includes common workflow paths. |
| `.gitignore` | Ignores the legacy raw (non-gz) coverage cache file. |
Comments suppressed due to low confidence (1)
`toolchain/mfc/test/test.py:90`
- `__filter` operates on the objects returned by `list_cases()` (`TestCaseBuilder` instances), and later `test()` converts them via `.to_case()`. The current return type hint uses `List[TestCase]`, which is misleading and makes type-checking harder; consider changing the annotations to `List[TestCaseBuilder]` (and importing that type) for both the input and the returned lists.

```python
def __filter(cases_) -> typing.Tuple[typing.List[TestCase], typing.List[TestCase]]:
    cases = cases_[:]
    selected_cases = []
    skipped_cases = []
```
`toolchain/mfc/test/coverage.py` (outdated diff):
```python
def _parse_gcov_json_output(raw_bytes: bytes, root_dir: str) -> set:
    """
    Parse gcov JSON output and return the set of .fpp file paths with coverage.
    Handles both gzip-compressed (gcov 13+) and raw JSON (gcov 12) formats.
    Handles concatenated JSON objects from batched gcov calls (multiple .gcno
    files passed to a single gcov invocation).
    Only .fpp files with at least one executed line are included.
    """
    try:
        text = gzip.decompress(raw_bytes).decode("utf-8", errors="replace")
    except (gzip.BadGzipFile, OSError):
        text = raw_bytes.decode("utf-8", errors="replace")

    result = set()
    real_root = os.path.realpath(root_dir)
    parsed_any = False

    # Parse potentially concatenated JSON objects (one per .gcno file).
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        while pos < len(text) and text[pos] in " \t\n\r":
            pos += 1
        if pos >= len(text):
            break
        try:
            data, end_pos = decoder.raw_decode(text, pos)
            pos = end_pos
            parsed_any = True
        except json.JSONDecodeError:
            remaining = len(text) - pos
            if remaining > 0:
                cons.print(f"[yellow]Warning: gcov JSON parse error at offset {pos} ({remaining} bytes remaining) — partial coverage recorded for this test.[/yellow]")
            break

        for file_entry in data.get("files", []):
            file_path = file_entry.get("file", "")
            if not file_path.endswith(".fpp"):
                continue
            if any(line.get("count", 0) > 0 for line in file_entry.get("lines", [])):
                try:
                    rel_path = os.path.relpath(os.path.realpath(file_path), real_root)
                except ValueError:
                    rel_path = file_path
                # Only keep src/ paths — build/staging/ artifacts from
                # case-optimized builds are auto-generated and never
                # appear in PR diffs.
                if rel_path.startswith("src/"):
                    result.add(rel_path)

    # If no JSON was parsed at all (complete garbage input), return None
    # so the caller omits this test from the cache (conservatively included
    # on future runs). An empty set after successful parsing means the test
    # genuinely covers no .fpp files.
    if not parsed_any:
        return None

    return result
```
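The concatenated-object handling above exists because `json.loads` rejects input containing more than one top-level JSON document, while batched gcov invocations emit one object per `.gcno` file. A self-contained illustration of the `raw_decode` loop, using hypothetical sample data shaped like gcov's JSON output:

```python
import json

# Two concatenated JSON objects, as produced when several .gcno files are
# passed to one gcov invocation (hypothetical minimal sample).
text = (
    '{"files": [{"file": "src/a.fpp", "lines": [{"count": 3}]}]}\n'
    '{"files": [{"file": "src/b.fpp", "lines": [{"count": 0}]}]}\n'
)

decoder = json.JSONDecoder()
pos, covered = 0, set()
while pos < len(text):
    # Skip whitespace between concatenated objects.
    while pos < len(text) and text[pos] in " \t\n\r":
        pos += 1
    if pos >= len(text):
        break
    data, pos = decoder.raw_decode(text, pos)
    for f in data.get("files", []):
        if f["file"].endswith(".fpp") and any(
                l.get("count", 0) > 0 for l in f.get("lines", [])):
            covered.add(f["file"])

print(covered)  # only src/a.fpp has an executed line
```

`raw_decode` returns the parsed object plus the index where parsing stopped, which is what lets the loop walk through the stream one document at a time.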
| cache["_meta"] = { | ||
| "created": datetime.datetime.now(datetime.timezone.utc).isoformat(), | ||
| "cases_hash": cases_hash, | ||
| "gcov_version": gcov_version, | ||
| } |
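A consumer of the cache could use this `_meta` entry to decide whether the cache is still trustworthy. The check below is a sketch under assumptions: the function name `cache_is_stale` and the age cutoff are illustrative, not part of the PR.

```python
import datetime

def cache_is_stale(meta: dict, current_cases_hash: str,
                   current_gcov_version: str, max_age_days: int = 30) -> bool:
    """Heuristic staleness check for the coverage cache (illustrative only)."""
    if meta.get("cases_hash") != current_cases_hash:
        return True   # test definitions changed since the cache was built
    if meta.get("gcov_version") != current_gcov_version:
        return True   # gcov output may differ across versions
    created = datetime.datetime.fromisoformat(meta["created"])
    age = datetime.datetime.now(datetime.timezone.utc) - created
    return age.days > max_age_days

meta = {
    "created": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    "cases_hash": "abc123",
    "gcov_version": "13.2",
}
```

Storing `created` via `isoformat()` keeps the timestamp round-trippable with `datetime.fromisoformat`, including the UTC offset.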
| if echo "$DIFF" | \ | ||
| grep -qE '^[+-][[:space:]]*(use[[:space:],]+[a-zA-Z_]|#:include[[:space:]]|include[[:space:]]+['"'"'"])'; then | ||
| echo "dep_changed=true" >> "$GITHUB_OUTPUT" | ||
| echo "Fortran dependency change detected — will rebuild coverage cache." | ||
| else | ||
| echo "dep_changed=false" >> "$GITHUB_OUTPUT" | ||
| fi |
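The regex above fires only on added or removed diff lines that touch the dependency graph: Fortran `use` statements, Fypp `#:include` directives, and plain `include` lines. A standalone check with a hypothetical two-line diff (the `DIFF` content here is made up for illustration):

```shell
# Hypothetical diff: an added `use` line should trigger a rebuild,
# an ordinary code change should not.
DIFF='+  use m_global_parameters
-  x = x + 1'

if echo "$DIFF" | \
    grep -qE '^[+-][[:space:]]*(use[[:space:],]+[a-zA-Z_]|#:include[[:space:]]|include[[:space:]]+['"'"'"])'; then
    RESULT="dep_changed=true"
else
    RESULT="dep_changed=false"
fi
echo "$RESULT"
```

Note the anchoring on `^[+-]`: unchanged context lines in the diff start with a space, so they can never match.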
```python
cons.print()
cons.print("[bold cyan]Coverage Change Analysis[/bold cyan]")
cons.print("-" * 50)
cons.print("[green]No Fortran source changes detected — skipping all tests.[/green]")
cons.print("-" * 50)
cons.print()
skipped_cases += cases
cases = []
```
Codecov Report: ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##           master    #1316      +/-   ##
==========================================
- Coverage   45.36%   45.01%   -0.36%
==========================================
  Files          70       70
  Lines       20499    20561      +62
  Branches     1953     1961       +8
==========================================
- Hits         9300     9256      -44
- Misses      10074    10177     +103
- Partials     1125     1128       +3
```

View full report in Codecov by Sentry.
…rify empty gcno_copies return
…uld_run_all_tests
Force-pushed c6cf6c6 to f5d7344
Summary

- File-level gcov coverage cache maps tests to their covered `.fpp` source files (gzip JSON, committed to repo)
- `--only-changes` flag prunes tests by intersecting PR-changed files against the coverage cache
- `--build-coverage-cache` flag + 3-phase parallel cache builder (prepare, run, gcov collect)
- `rebuild-cache` CI job runs on Phoenix via SLURM when `cases.py` or the Fortran dependency graph changes (`use`/`include` statements)
- `ALWAYS_RUN_ALL` files trigger the full suite

Replaces #1284.
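The "gzip JSON, committed to repo" format mentioned above (the checked-in `test_coverage_cache.json.gz`) round-trips with nothing beyond the standard library. A minimal sketch with a hypothetical cache payload:

```python
import gzip
import json

# Hypothetical cache content shaped like the committed artifact:
# gzip-compressed JSON mapping test IDs to covered .fpp paths,
# plus a "_meta" entry.
cache = {
    "_meta": {"cases_hash": "abc123"},
    "T1": ["src/simulation/m_rhs.fpp"],
}

# Write and read back the compressed blob.
blob = gzip.compress(json.dumps(cache).encode("utf-8"))
loaded = json.loads(gzip.decompress(blob).decode("utf-8"))
```

Committing the gzip form keeps the artifact small while remaining trivially readable by both the toolchain and ad-hoc scripts.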
Test plan

- `rebuild-cache` job triggers (dep_changed detection)
- `--only-changes` pruning

🤖 Generated with Claude Code