feat(analysis): step 7 dispatch — workflow input + EF time-series dispatcher#380
Merged
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds rebuild_run_index_from_drive.py, which lists Google Sheets in a diagnostics Drive folder and parses each title into (approach, baseline, sheet_id, year?, scenario?) rows for ef_run_index.csv. Closes the manual-flow gap: users who triggered diagnostics via the GH Actions UI can now auto-build the run index instead of hand-typing sheet IDs. Title regex handles both formats currently in use:
Manual: [DATE, BASELINE based, A matrix with APPROACH] EFs diagnostics
Dispatch: [DATE, YEAR, BASELINE based, A matrix with APPROACH, SCENARIO] EFs diagnostics
Also rewords the FileNotFoundError in compile_ef_diagnostics.py to point at the new script as the auto-rebuild path, with hand-write as a fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
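The two title formats named in this commit message can be matched by a single pattern with optional YEAR and SCENARIO groups. The following is only a sketch of that idea, not the regex in `rebuild_run_index_from_drive.py`: the names `TITLE_RE` and `parse_title` are hypothetical, and it assumes DATE contains no comma and YEAR is four digits.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern: YEAR and SCENARIO are optional, so one regex covers
# both the manual and the dispatch title formats.
# Assumption: DATE has no comma in it (e.g. 2026-05-12).
TITLE_RE = re.compile(
    r"^\[(?P<date>[^,\]]+), "
    r"(?:(?P<year>\d{4}), )?"
    r"(?P<baseline>[^,\]]+) based, "
    r"A matrix with (?P<approach>[^,\]]+)"
    r"(?:, (?P<scenario>[^,\]]+))?"
    r"\] EFs diagnostics$"
)


def parse_title(title: str, sheet_id: str) -> Optional[Dict[str, str]]:
    """Return one run-index row for a Sheet title, or None if it does not match."""
    m = TITLE_RE.match(title.strip())
    if m is None:
        return None
    return {
        "approach": m["approach"],
        "baseline": m["baseline"],
        "sheet_id": sheet_id,
        # Manual-format titles carry no year/scenario; store empty strings.
        "year": m["year"] or "",
        "scenario": m["scenario"] or "",
    }
```

Titles that match neither format return `None`, so a caller can skip unrelated Sheets in the Drive folder.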
…patcher Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-series dispatch Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…race Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ta-overwrite Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each diagnostics cell runs at model_base_year=Y so D_new/N_new are denominated in year-Y dollars. Cross-year comparison requires a single dollar reference; add D_new_ref/N_new_ref columns that apply inflation_adjust_ef_denom_to_new_base_year to land every cell on REFERENCE_DOLLAR_YEAR (2023). Step 6 single-year rows (empty year column) skip the step and behave unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dispatcher previously crashed mid-run when `gh run list` returned non-zero exit (transient API hiccup mid-batch). Now retries 3× per status with a 5s backoff and falls back to a "still busy" sentinel so the poll loop keeps spinning instead of unwinding the dispatch and losing the queue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
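The retry-then-sentinel behaviour described in this commit can be sketched as below. This is an illustration, not the dispatcher's code: `count_runs` and `STILL_BUSY` are hypothetical names, "3×" is read here as three attempts in total, and the `run`/`sleep` parameters exist only so the function can be exercised without calling `gh`.

```python
import json
import subprocess
import time
from typing import Any, Callable

STILL_BUSY = -1  # hypothetical sentinel: "could not tell, assume runs are active"


def count_runs(
    status: str,
    *,
    attempts: int = 3,
    backoff_s: float = 5.0,
    run: Callable[..., Any] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Count workflow runs in `status`; return STILL_BUSY if `gh` keeps failing."""
    cmd = [
        "gh", "run", "list",
        "--workflow", "generate_diagnostics.yml",
        "--status", status,
        "--json", "databaseId",
    ]
    for attempt in range(attempts):
        proc = run(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return len(json.loads(proc.stdout or "[]"))
        if attempt < attempts - 1:
            sleep(backoff_s)  # fixed backoff before the next attempt
    # Never raise: the poll loop treats the sentinel as "keep waiting".
    return STILL_BUSY
```

Returning a sentinel instead of raising keeps the poll loop alive through a transient API failure, at the cost of waiting one extra poll interval when the failure persists.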
mypy flags ta.cast as redundant since pd.to_numeric/concat/__getitem__ return inferable types. The casts were added to placate Pyright, which isn't a CI check. Switching back to no-cast satisfies mypy + black; ruff unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9be63f6 to 50931d0
81eed45 to ddb35d7
50931d0 to 627fb5b
…/cornerstone-data/bedrock into mo__step7-ef-time-series-dispatch
WesIngwersen
requested changes
May 12, 2026
I'm getting a read error: when it reads the year from the tab for scenarios, it gets a float where it expects an int.
Pulling tabs for scenario=bundle_v0_2, approach=commodity_price_index, year=2019.0, baseline=ceda
Traceback (most recent call last):
  File "c:\Users\ingwersw\bedrock\bedrock\analysis\a_matrix_time_series\compile_ef_diagnostics.py", line 295, in <module>
    main()
  File "c:\Users\ingwersw\bedrock\bedrock\analysis\a_matrix_time_series\compile_ef_diagnostics.py", line 231, in main
    joined, source_year=int(year), ref_year=REFERENCE_DOLLAR_YEAR
                        ^^^^^^^^^
ValueError: invalid literal for int() with base 10: '2019.0'
(bedrock)
Member
Addressed with 1a0b35c.
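The contents of 1a0b35c are not shown in this thread. As a sketch of one way to handle the failure in the traceback, a year read back as `'2019.0'` can be parsed through `float` before converting to `int`. The helper name `coerce_year` is hypothetical.

```python
def coerce_year(value: object) -> int:
    """Return an int year from values like 2019, 2019.0, '2019' or '2019.0'.

    Raises ValueError for empty, non-numeric, or fractional input, so a
    caller should skip rows whose year column is empty before calling this.
    """
    as_float = float(str(value).strip())
    if not as_float.is_integer():
        raise ValueError(f"year must be a whole number, got {value!r}")
    return int(as_float)
```

Going through `float` accepts the `.0` suffix a CSV round trip can add, while the `is_integer()` check still rejects a value such as `2019.5` instead of truncating it silently.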
WesIngwersen
approved these changes
May 12, 2026
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cc:
Closes:
What changed? Why?
Step 7 / Phase 1: infra to dispatch the EF time-series across `(scenario, approach, year)` cells without YAML proliferation, with year-aligned GHG inventory data and Sheets-API-quota-safe serialization.

Workflow + config plumbing
Two new optional workflow inputs on `generate_diagnostics.yml`, mirrored as CLI options on `bedrock/utils/validation/generate_diagnostics.py`:

- `model_base_year`: overrides `cfg.model_base_year`. Drives the A-matrix scaling target year + inflation target year.
- `usa_ghg_data_year`: overrides `cfg.usa_ghg_data_year`. Drives the GHG (E) inventory year + gross-output x-vector year.

Both flow through `set_global_usa_config`'s `diagnostics_cli_overrides` path. The `USAConfig.model_base_year` Literal expands to `{2019…2024}` (was `{2022, 2023, 2024}`); the `usa_ghg_data_year` Literal expands to `{2019…2024}` (was `{2023, 2024}`). Both keys are added to `DIAGNOSTICS_CLI_OVERRIDE_KEYS`.

GHG FBS year templatization
`load_E_from_flowsa()` in `bedrock/transform/allocation/derived.py` previously hard-coded `'GHG_national_Cornerstone_2023'` / `'GHG_national_CEDA_2023'`. This PR templatizes the `new_ghg_method` and CEDA fallback branches with `cfg.usa_ghg_data_year`; those FBS YAMLs and GCS parquets exist for 2019–2023. Variant FBSes (`*_coa_allocation`, `*_electricity`, etc.) only exist for 2023, so the function raises a clear `ValueError` if any `update_*_method` flag is set with year ≠ 2023, instead of failing later with an opaque "FBS not found".

Workflow-level serialization
Add `concurrency: { group: generate_diagnostics, cancel-in-progress: false }` to `generate_diagnostics.yml`. Belt-and-suspenders against the Sheets API write quota (60/min/user): it guarantees only one `generate_diagnostics` job is writing to Sheets at a time, regardless of dispatch source.

Dispatch script
`bedrock/analysis/a_matrix_time_series/dispatch_ef_time_series.py`. Per `(scenario, approach, year)` cell:

- Sheet titled `[{run_date}, {model_year}, {baseline} based, {approach_label}, {scenario}] EFs diagnostics`.
- `generate_diagnostics` triggered via `gh workflow run` with `config_name`, `model_base_year`, `usa_ghg_data_year`, `sheet_id`, `use_useeio_baseline`. Both year overrides get the same value per cell.
- Row recorded in `output/results/ef_run_index.csv` (audit trail).

Idempotent: already-recorded cells are skipped, so re-running picks up only unfilled cells. CEDA-only baseline as the starting cut.
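The skip-if-recorded behaviour can be sketched as a set difference between the requested cells and the cells already in the run index. This is an illustration only: `recorded_cells` and `pending_cells` are hypothetical names, and the sketch assumes the index has `scenario`, `approach` and `year` columns with the year stored as plain text such as `2019`.

```python
import csv
from typing import Iterable, List, Set, Tuple

Cell = Tuple[str, str, str]  # (scenario, approach, year-as-text)


def recorded_cells(lines: Iterable[str]) -> Set[Cell]:
    """Collect the cells already present in the run index.

    `lines` is any iterable of CSV text lines, e.g. an open file handle.
    """
    return {
        (row.get("scenario") or "", row.get("approach") or "", row.get("year") or "")
        for row in csv.DictReader(lines)
    }


def pending_cells(
    wanted: Iterable[Tuple[str, str, int]], lines: Iterable[str]
) -> List[Tuple[str, str, int]]:
    """Return the wanted cells that are not yet in the index, in order."""
    done = recorded_cells(lines)
    return [c for c in wanted if (c[0], c[1], str(c[2])) not in done]
```

In use, the index would be opened with `open(path, newline="")` and passed as `lines`; a missing file can be treated as an empty index by passing `[]`.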
Two scenarios with explicit names:

- `isolate_a_matrix`: vary only A-matrix scaling, hold everything else to v0 defaults. Reuses the four Step 6 candidate YAMLs. Currently parked.
- `bundle_v0_2` (default): single config `2025_usa_cornerstone_full_model` representing the full v0.2 release-candidate stack. 5 cells (1 approach × 5 years).

Throttle modes (`--throttle`):

- `poll` (default): block until prior workflow runs clear via `gh run list`.
- `sleep:N`: fixed N-second sleep between triggers.
- `none`: fire immediately (only safe with bumped Sheets quota).

Re-dispatch from CSV (`--re-dispatch-from-csv`): re-trigger workflows for cells already in `ef_run_index.csv`. Used to recover from rate-limit batch failures; re-uses existing Sheets.

Compile-script extension
`compile_ef_diagnostics.py` now tolerates the optional `scenario`/`year` columns. When populated, per-pair tab names get a `{scenario}_{year}_` prefix and summary rows stamp the dimensions. Step 6's existing CSV (no scenario/year) still works: the missing columns are backfilled to empty strings on load.

Testing
- `bundle_v0_2` × 2019–2023 with the agreed title format.
- `compile_ef_diagnostics.py` still compiles Step 6's existing 7-row index unchanged.
- `black --check`, `ruff check .`, `mypy bedrock` (385 files) all clean.

Pre-merge note
Live dispatch needs the workflow on the target ref. Either land this PR to `main` first and run with `--git-ref main`, or run with `--git-ref mo__step7-ef-time-series-dispatch` to dispatch against the branch (current state).
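To make the ref requirement concrete, the per-cell trigger can be sketched as a `gh workflow run` argument list. This is not the dispatcher's code: `build_trigger_cmd` is a hypothetical helper, the mapping of the script's `--git-ref` onto `gh`'s `--ref` is an assumption, and so is the lowercase `true`/`false` encoding of the boolean input. The input names are the ones listed in the description.

```python
from typing import List


def build_trigger_cmd(
    *,
    git_ref: str,
    config_name: str,
    year: int,
    sheet_id: str,
    use_useeio_baseline: bool = False,
    workflow: str = "generate_diagnostics.yml",
) -> List[str]:
    """Build the argv for one (scenario, approach, year) cell."""
    fields = {
        "config_name": config_name,
        # Both year overrides get the same value per cell.
        "model_base_year": str(year),
        "usa_ghg_data_year": str(year),
        "sheet_id": sheet_id,
        # Assumption: the workflow input expects "true"/"false" text.
        "use_useeio_baseline": str(use_useeio_baseline).lower(),
    }
    # gh resolves the workflow file on --ref, which is why the workflow
    # inputs must already exist on that branch.
    cmd = ["gh", "workflow", "run", workflow, "--ref", git_ref]
    for key, value in fields.items():
        cmd += ["-f", f"{key}={value}"]
    return cmd
```

Passing the list to `subprocess.run(cmd, check=True)` would fire the workflow; building it as a list rather than a shell string avoids quoting problems with Sheet IDs or config names.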