Add LMDI signal-noise diagnosis for A-matrix physical residual #405
Draft
MoLi7 wants to merge 13 commits into
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds rebuild_run_index_from_drive.py, which lists Google Sheets in a diagnostics Drive folder and parses each title into (approach, baseline, sheet_id, year?, scenario?) rows for ef_run_index.csv. Closes the manual-flow gap: users who triggered diagnostics via the GH Actions UI can now auto-build the run index instead of hand-typing sheet IDs.

Title regex handles both formats currently in use:
Manual: [DATE, BASELINE based, A matrix with APPROACH] EFs diagnostics
Dispatch: [DATE, YEAR, BASELINE based, A matrix with APPROACH, SCENARIO] EFs diagnostics

Also rewords the FileNotFoundError in compile_ef_diagnostics.py to point at the new script as the auto-rebuild path, with hand-write as a fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
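For reviewers, a minimal sketch of how the two title formats above could be parsed into run-index rows. The pattern, the `parse_title` helper, and the sample titles are illustrative assumptions, not the regex that ships in rebuild_run_index_from_drive.py; in particular it assumes that DATE, BASELINE, APPROACH and SCENARIO contain no commas and that YEAR is four digits.

```python
import re
from typing import Optional

# Hypothetical pattern covering both title formats described in the commit.
# YEAR and SCENARIO are optional, so Manual titles leave them as None.
TITLE_RE = re.compile(
    r"^\[(?P<date>[^,\]]+),\s*"
    r"(?:(?P<year>\d{4}),\s*)?"
    r"(?P<baseline>[^,\]]+?) based,\s*"
    r"A matrix with (?P<approach>[^,\]]+?)"
    r"(?:,\s*(?P<scenario>[^,\]]+?))?"
    r"\]\s*EFs diagnostics$"
)


def parse_title(title: str, sheet_id: str) -> Optional[dict]:
    """Return one run-index row, or None if the title matches neither format."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        return None
    year = match.group("year")
    return {
        "approach": match.group("approach"),
        "baseline": match.group("baseline"),
        "sheet_id": sheet_id,  # supplied by the Drive listing, not the title
        "year": int(year) if year else None,
        "scenario": match.group("scenario"),
    }


# Sample titles are made up for illustration.
manual_row = parse_title(
    "[2025-01-10, v1 based, A matrix with foo] EFs diagnostics", "sheet-a"
)
dispatch_row = parse_title(
    "[2025-01-10, 2021, v1 based, A matrix with foo, high] EFs diagnostics", "sheet-b"
)
```

Returning `None` for unmatched titles lets the caller skip unrelated sheets in the folder instead of failing the whole rebuild.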
…patcher Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-series dispatch Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…race Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ta-overwrite Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each diagnostics cell runs at model_base_year=Y so D_new/N_new are denominated in year-Y dollars. Cross-year comparison requires a single dollar reference; add D_new_ref/N_new_ref columns that apply inflation_adjust_ef_denom_to_new_base_year to land every cell on REFERENCE_DOLLAR_YEAR (2023). Step 6 single-year rows (empty year column) skip the step and behave unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dispatcher previously crashed mid-run when `gh run list` returned a non-zero exit code (a transient API hiccup mid-batch). It now retries 3× per status with a 5s backoff and falls back to a "still busy" sentinel, so the poll loop keeps spinning instead of unwinding the dispatch and losing the queue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
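A minimal sketch of the retry-then-sentinel behaviour described in this commit, for readers who want the control flow at a glance. The function names, the `STILL_BUSY` value, the exact `gh` arguments, and the reading of "3×" as three attempts in total are assumptions for illustration; the dispatcher's real code is not shown in this PR text.

```python
import subprocess
import time
from typing import Callable

STILL_BUSY = "still-busy"  # hypothetical sentinel value


def _gh_run_list(status: str) -> subprocess.CompletedProcess:
    """Default runner: ask the GitHub CLI for runs in the given status."""
    return subprocess.run(
        ["gh", "run", "list", "--status", status, "--json", "databaseId"],
        capture_output=True,
        text=True,
    )


def list_runs_with_retry(
    status: str,
    runner: Callable[[str], subprocess.CompletedProcess] = _gh_run_list,
    sleep: Callable[[float], None] = time.sleep,
    attempts: int = 3,
    backoff_s: float = 5.0,
) -> str:
    """Return the command's stdout, or STILL_BUSY if every attempt fails."""
    for attempt in range(1, attempts + 1):
        result = runner(status)
        if result.returncode == 0:
            return result.stdout
        if attempt < attempts:
            sleep(backoff_s)  # fixed backoff between attempts
    # Never raise: the poll loop treats the sentinel as "try again later".
    return STILL_BUSY
```

Injecting `runner` and `sleep` keeps the function testable without calling the real CLI or waiting on the backoff.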
mypy flags ta.cast as redundant since pd.to_numeric/concat/__getitem__ return inferable types. The casts were added to placate Pyright, which isn't a CI check. Switching back to no-cast satisfies mypy + black; ruff unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Removed 6 redundant casts that mypy flagged. The casts placated Pyright but mypy infers the same types without them.
- pd.ExcelFile.sheet_names is typed as list[int|str]; explicitly str() the tab name before passing to _parse_tab and pd.read_excel.
- Apply black formatting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.

cc:
Closes:
What changed? Why?
Adds a 7-script pipeline under `bedrock/analysis/a_matrix_time_series/signal_noise/` that diagnoses whether the A-matrix physical residual (LMDI `Q_phys = A_summary / A_pi`) carries real signal or BEA-revision noise.

The pipeline runs in three phases:
- `derive_A_snapshots.py` → `compute_lmdi_phys.py` → `plot_lmdi_phys.py`: per-year A snapshots for `summary_tables` and `commodity_price_index`; cell-level `Q_phys` + LMDI aggregation to output sector / NAICS-3.
- `compute_consistency_tests.py` → `plot_consistency_tests.py` → `extract_signal_clean_naics3.py`: lag-1 autocorr, within-NAICS-3 coherence (LMDI-weighted ICC), magnitude/shape distribution, and a 3-threshold pass/fail flag.
- `validate_klems.py`: Pearson correlation against BEA-BLS KLEMS TFP and Materials/Output ratio at NAICS-3 level.

Side changes:
- `bedrock/utils/config/usa_config.py`: adds `2018` to the `model_base_year` Literal, because Phase A.1 needs the full 2017–2024 window.
- `bedrock/analysis/a_matrix_time_series/compare_method_stability.py`: expands the 2-panel `|YoY|` boxplot into 3 panels: pooled, per-transition, and an ECDF reading view.

Testing
`compute_lmdi_phys` produces 816,590 active cells × 7 transitions; `compute_consistency_tests` headlines (ICC 0.182/0.187 dom/imp, pooled lag-1 r −0.054/−0.195) reproduce stably; `extract_signal_clean_naics3` round-trips. Black, ruff, mypy clean across `bedrock/analysis/a_matrix_time_series/` and `bedrock/utils/config/usa_config.py`.
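
For orientation, a minimal sketch of the cell-level residual `Q_phys = A_summary / A_pi` and of a pooled lag-1 autocorrelation like the one reported above. This is not the code in `compute_lmdi_phys.py` or `compute_consistency_tests.py`: the function names are hypothetical, the year-over-year change is assumed to be a log ratio, and the LMDI weighting is left out entirely.

```python
import math


def q_phys(a_summary: float, a_pi: float) -> float:
    """Physical residual for one cell: summary-table A over price-index A."""
    return a_summary / a_pi


def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation; raises if either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("zero variance: correlation is undefined")
    return cov / math.sqrt(var_x * var_y)


def pooled_lag1_r(q_by_cell: dict[str, list[float]]) -> float:
    """Correlate each transition with the next one, pooled over all cells."""
    current, following = [], []
    for series in q_by_cell.values():
        # Assumed definition of a transition: log ratio of consecutive years.
        changes = [math.log(b / a) for a, b in zip(series, series[1:])]
        current.extend(changes[:-1])
        following.extend(changes[1:])
    return pearson(current, following)


# Toy series (made up): a cell whose Q_phys flips up and down every year.
zigzag = {"cell-1": [1.0, 2.0, 1.0, 2.0, 1.0]}
r = pooled_lag1_r(zigzag)
```

With eight annual snapshots (2017–2024) each cell yields seven transitions and therefore six lag-1 pairs. A pooled r near −1 means changes reverse in the following year, and a value near 0 means consecutive changes are unrelated.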