feat(analysis): step 7 dispatch — workflow input + EF time-series dispatcher#380

Merged
MoLi7 merged 23 commits into main from mo__step7-ef-time-series-dispatch on May 13, 2026

@MoLi7 MoLi7 commented May 5, 2026

Member


What changed? Why?

Step 7 / Phase 1 — infra to dispatch the EF time-series across (scenario, approach, year) cells without YAML proliferation, with year-aligned GHG inventory data and Sheets-API-quota-safe serialization.

Workflow + config plumbing

Two new optional workflow inputs on generate_diagnostics.yml, mirrored as CLI options on bedrock/utils/validation/generate_diagnostics.py:

  • model_base_year — overrides cfg.model_base_year. Drives the A-matrix scaling target year + inflation target year.
  • usa_ghg_data_year — overrides cfg.usa_ghg_data_year. Drives the GHG (E) inventory year + gross-output x-vector year.

Both flow through set_global_usa_config's diagnostics_cli_overrides path. USAConfig.model_base_year Literal expands to {2019…2024} (was {2022, 2023, 2024}); usa_ghg_data_year Literal expands to {2019…2024} (was {2023, 2024}). Both keys added to DIAGNOSTICS_CLI_OVERRIDE_KEYS.
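As a rough illustration of the override path described above, the sketch below validates CLI-supplied years against the expanded ranges before merging them into a config. The function `apply_cli_overrides`, the dict-based config, and the two-element key set are assumptions made for this example; the real `USAConfig` and `DIAGNOSTICS_CLI_OVERRIDE_KEYS` may look different and likely hold more keys.

```python
from typing import Literal, get_args

# Assumed type aliases: the PR only states the allowed ranges.
ModelBaseYear = Literal[2019, 2020, 2021, 2022, 2023, 2024]
GhgDataYear = Literal[2019, 2020, 2021, 2022, 2023, 2024]

# Only the two keys added by this PR are shown here.
DIAGNOSTICS_CLI_OVERRIDE_KEYS = frozenset({"model_base_year", "usa_ghg_data_year"})
ALLOWED_YEARS = {
    "model_base_year": set(get_args(ModelBaseYear)),
    "usa_ghg_data_year": set(get_args(GhgDataYear)),
}


def apply_cli_overrides(cfg: dict, overrides: dict) -> dict:
    """Return a copy of cfg with validated, non-None overrides applied."""
    merged = dict(cfg)
    for key, value in overrides.items():
        if value is None:  # option not passed on the CLI: keep the config value
            continue
        if key not in DIAGNOSTICS_CLI_OVERRIDE_KEYS:
            raise KeyError(f"{key!r} is not an overridable diagnostics key")
        if value not in ALLOWED_YEARS[key]:
            raise ValueError(f"{key}={value} is outside {sorted(ALLOWED_YEARS[key])}")
        merged[key] = value
    return merged
```

Treating `None` as "not passed" keeps both inputs optional, so a workflow run without the new inputs behaves as it did before.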

GHG FBS year templatization

load_E_from_flowsa() in bedrock/transform/allocation/derived.py previously hard-coded 'GHG_national_Cornerstone_2023' / 'GHG_national_CEDA_2023'. This PR templatizes the new_ghg_method and CEDA fallback branches with cfg.usa_ghg_data_year; those FBS YAMLs and GCS parquets exist for 2019–2023. Variant FBSes (*_coa_allocation, *_electricity, etc.) exist only for 2023, so the function raises a clear ValueError if any update_*_method flag is set with year ≠ 2023, instead of failing later with an opaque "FBS not found".
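A minimal sketch of the year templating and the early guard, under stated assumptions: `resolve_ghg_fbs_name` is a hypothetical helper (the real logic lives inside load_E_from_flowsa()), and mapping `new_ghg_method=True` to the Cornerstone name and `False` to the CEDA fallback is inferred from the description rather than read from the code.

```python
from typing import Dict


def resolve_ghg_fbs_name(year: int, new_ghg_method: bool, update_flags: Dict[str, bool]) -> str:
    """Pick the base GHG FBS name for a data year, failing fast on variants."""
    active = sorted(name for name, enabled in update_flags.items() if enabled)
    if active and year != 2023:
        # Variant FBSes exist only for 2023, so raise before any lookup happens.
        raise ValueError(
            f"Variant FBSes exist only for 2023; got usa_ghg_data_year={year} "
            f"with flags set: {active}"
        )
    family = "Cornerstone" if new_ghg_method else "CEDA"  # assumed branch mapping
    return f"GHG_national_{family}_{year}"
```

Raising at name-resolution time puts the year and the offending flags in the message, which is the information the later "FBS not found" error would lack.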

Workflow-level serialization

Add concurrency: { group: generate_diagnostics, cancel-in-progress: false } to generate_diagnostics.yml. Belt-and-suspenders against the Sheets API write quota (60/min/user) — guarantees only one generate_diagnostics job is writing to Sheets at a time, regardless of dispatch source.

Dispatch script

bedrock/analysis/a_matrix_time_series/dispatch_ef_time_series.py. Per (scenario, approach, year) cell:

  1. Creates a Sheet in the v0.3 Diagnostics Drive folder with deterministic title [{run_date}, {model_year}, {baseline} based, {approach_label}, {scenario}] EFs diagnostics.
  2. Triggers generate_diagnostics via gh workflow run with config_name, model_base_year, usa_ghg_data_year, sheet_id, use_useeio_baseline. Both year overrides get the same value per cell.
  3. Appends a row to output/results/ef_run_index.csv (audit trail).
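The first two steps can be sketched as two pure functions: one that formats the deterministic Sheet title and one that assembles the `gh workflow run` argument list. The function names, the date string format, the `"A matrix with ..."` approach label, and the lowercase boolean serialization are assumptions for illustration; only the title template and the input names come from the description above.

```python
from typing import List


def build_sheet_title(run_date: str, model_year: int, baseline: str,
                      approach_label: str, scenario: str) -> str:
    """Format the deterministic Sheet title for one cell."""
    return (
        f"[{run_date}, {model_year}, {baseline} based, "
        f"{approach_label}, {scenario}] EFs diagnostics"
    )


def build_dispatch_command(config_name: str, year: int, sheet_id: str,
                           use_useeio_baseline: bool, git_ref: str) -> List[str]:
    """Assemble the gh CLI call; both year inputs get the same value per cell."""
    inputs = {
        "config_name": config_name,
        "model_base_year": str(year),
        "usa_ghg_data_year": str(year),
        "sheet_id": sheet_id,
        "use_useeio_baseline": str(use_useeio_baseline).lower(),  # assumed format
    }
    cmd = ["gh", "workflow", "run", "generate_diagnostics.yml", "--ref", git_ref]
    for name, value in inputs.items():
        cmd += ["-f", f"{name}={value}"]
    return cmd
```

Returning a list instead of a shell string lets the command be passed to `subprocess.run` without quoting issues, and makes a dry run as simple as printing it.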

Idempotent — already-recorded cells are skipped, so re-running picks up only unfilled cells. CEDA-only baseline as the starting cut.
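The idempotency rule amounts to a set difference between the requested grid and the cells already present in the run index. Here is a small sketch using only the standard library; the column names `scenario`, `approach`, and `year` match the dimensions named above, while `recorded_cells` and `pending_cells` are hypothetical helpers, and the CSV is passed as text to keep the example self-contained.

```python
import csv
import io
from itertools import product
from typing import Iterable, List, Set, Tuple

KEY_FIELDS = ("scenario", "approach", "year")


def recorded_cells(index_csv_text: str) -> Set[Tuple[str, ...]]:
    """Collect (scenario, approach, year) keys already in the run index."""
    reader = csv.DictReader(io.StringIO(index_csv_text))
    return {tuple((row.get(field) or "") for field in KEY_FIELDS) for row in reader}


def pending_cells(scenarios: Iterable[str], approaches: Iterable[str],
                  years: Iterable[int], index_csv_text: str) -> List[Tuple[str, str, int]]:
    """Return only the grid cells that have no row in the run index yet."""
    done = recorded_cells(index_csv_text)
    return [
        (scenario, approach, year)
        for scenario, approach, year in product(scenarios, approaches, years)
        if (scenario, approach, str(year)) not in done
    ]
```

With two of the five bundle_v0_2 years already recorded, `pending_cells` would return the remaining three, so a re-run after a partial failure touches only unfilled cells.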

Two scenarios with explicit names:

  • isolate_a_matrix — vary only A-matrix scaling, hold everything else to v0 defaults. Reuses the four Step 6 candidate YAMLs. Currently parked.
  • bundle_v0_2 (default) — single config 2025_usa_cornerstone_full_model representing the full v0.2 release-candidate stack. 5 cells (1 approach × 5 years).

Throttle modes (--throttle):

  • poll (default) — block until prior workflow runs clear via gh run list.
  • sleep:N — fixed N-second sleep between triggers.
  • none — fire immediately (only safe with bumped Sheets quota).
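One way the `--throttle` value could be parsed is shown below. `Throttle` and `parse_throttle` are names invented for this sketch; only the three accepted spellings (`poll`, `sleep:N`, `none`) come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Throttle:
    mode: str        # "poll", "sleep", or "none"
    seconds: int = 0  # only meaningful for mode == "sleep"


def parse_throttle(value: str = "poll") -> Throttle:
    """Parse a --throttle argument into a mode and an optional sleep length."""
    if value in ("poll", "none"):
        return Throttle(mode=value)
    mode, sep, arg = value.partition(":")
    if mode == "sleep" and sep and arg.isdigit():
        return Throttle(mode="sleep", seconds=int(arg))
    raise ValueError(f"Unrecognized throttle {value!r}; expected poll, sleep:N, or none")
```

Rejecting a bare `sleep` or a non-numeric `N` at parse time avoids discovering the typo only after the first workflow has already been triggered.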

Re-dispatch from CSV (--re-dispatch-from-csv): re-trigger workflows for cells already in ef_run_index.csv — used to recover from rate-limit batch failures, re-uses existing Sheets.

Compile-script extension

compile_ef_diagnostics.py now tolerates the optional scenario/year columns. When populated, per-pair tab names get a {scenario}_{year}_ prefix and summary rows stamp the dimensions. Step 6's existing CSV (no scenario/year) still works — backfilled to empty strings on load.
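The backward-compatible load can be sketched as follows. `load_run_index` and `tab_name` are hypothetical helpers, the base tab name is a placeholder, and requiring both `scenario` and `year` before adding the prefix is an assumption; the description only says the prefix is applied "when populated".

```python
import csv
import io
from typing import Dict, List

OPTIONAL_COLUMNS = ("scenario", "year")


def load_run_index(csv_text: str) -> List[Dict[str, str]]:
    """Read the run index, backfilling missing optional columns with ''."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in OPTIONAL_COLUMNS:
            row[column] = (row.get(column) or "").strip()
        rows.append(row)
    return rows


def tab_name(row: Dict[str, str], base: str) -> str:
    """Prefix the tab name with {scenario}_{year}_ only when both are populated."""
    if row["scenario"] and row["year"]:
        return f"{row['scenario']}_{row['year']}_{base}"
    return base
```

A Step 6 index without the new columns then yields unprefixed tab names, which is what keeps the existing 7-row index compiling unchanged.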

Testing

  • Dry-run dispatches the expected 5 cells for bundle_v0_2 × 2019–2023 with the agreed title format.
  • compile_ef_diagnostics.py still compiles Step 6's existing 7-row index unchanged.
  • black --check, ruff check ., mypy bedrock (385 files) all clean.
  • Live dispatch run on this branch (post-merge of feat(analysis): step 6 phase 2/3 — EF diagnostics compile + plot scripts #379, with the new GHG-year wiring) currently in flight to validate end-to-end.

Pre-merge note

Live dispatch needs the workflow on the target ref. Either land this PR to main first and run with --git-ref main, or run with --git-ref mo__step7-ef-time-series-dispatch to dispatch against the branch (current state).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

MoLi7 commented May 5, 2026

Member Author

Adds rebuild_run_index_from_drive.py — lists Google Sheets in a
diagnostics Drive folder and parses each title into (approach, baseline,
sheet_id, year?, scenario?) rows for ef_run_index.csv. Closes the manual-
flow gap: users who triggered diagnostics via the GH Actions UI can now
auto-build the run index instead of hand-typing sheet IDs.

Title regex handles both formats currently in use:
  Manual:    [DATE, BASELINE based, A matrix with APPROACH] EFs diagnostics
  Dispatch:  [DATE, YEAR, BASELINE based, A matrix with APPROACH, SCENARIO] EFs diagnostics

Also rewords the FileNotFoundError in compile_ef_diagnostics.py to point
at the new script as the auto-rebuild path, with hand-write as a
fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
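A single regular expression with optional groups can cover both title formats listed in the commit message above. This is a sketch, not the pattern used in rebuild_run_index_from_drive.py: `TITLE_RE` and `parse_title` are illustrative names, and the pattern assumes the date contains no comma and the year is four digits.

```python
import re
from typing import Dict, Optional

TITLE_RE = re.compile(
    r"^\[(?P<date>[^,\]]+), "
    r"(?:(?P<year>\d{4}), )?"             # present only in dispatch titles
    r"(?P<baseline>[^,\]]+) based, "
    r"A matrix with (?P<approach>[^,\]]+)"
    r"(?:, (?P<scenario>[^,\]]+))?"       # present only in dispatch titles
    r"\] EFs diagnostics$"
)


def parse_title(title: str, sheet_id: str) -> Optional[Dict[str, str]]:
    """Turn a Sheet title into a run-index row, or None if it does not match."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return {
        "approach": match.group("approach"),
        "baseline": match.group("baseline"),
        "sheet_id": sheet_id,
        "year": match.group("year") or "",
        "scenario": match.group("scenario") or "",
    }
```

Returning `None` for unrelated files lets the caller skip other Sheets in the Drive folder instead of failing on the first title that does not follow either format.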
@MoLi7 MoLi7 changed the base branch from mo__step6-ef-diagnostics to graphite-base/380 May 6, 2026 16:13
MoLi7 and others added 8 commits May 6, 2026 09:13
…patcher

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-series dispatch

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…race

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ta-overwrite

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each diagnostics cell runs at model_base_year=Y so D_new/N_new are denominated
in year-Y dollars. Cross-year comparison requires a single dollar reference;
add D_new_ref/N_new_ref columns that apply inflation_adjust_ef_denom_to_new_base_year
to land every cell on REFERENCE_DOLLAR_YEAR (2023). Step 6 single-year rows
(empty year column) skip the step and behave unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
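To make the dollar-year reasoning in the commit message above concrete: an emission factor is emissions per dollar, so restating the denominator in reference-year dollars multiplies the factor by the ratio of the source-year price level to the reference-year price level. The sketch below uses a single scalar price index with made-up values. It is not the signature or behavior of inflation_adjust_ef_denom_to_new_base_year, which may well use sector-specific indices.

```python
from typing import Dict, Optional

REFERENCE_DOLLAR_YEAR = 2023


def to_reference_dollars(ef: float, source_year: Optional[int],
                         price_index: Dict[int, float],
                         ref_year: int = REFERENCE_DOLLAR_YEAR) -> float:
    """Re-denominate an EF (emissions per year-Y dollar) into ref-year dollars."""
    if source_year is None:
        # Single-year rows with an empty year column skip the adjustment.
        return ef
    # One year-Y dollar equals price_index[ref] / price_index[Y] ref-year dollars,
    # so the per-dollar factor scales by the inverse of that ratio.
    return ef * price_index[source_year] / price_index[ref_year]


# Placeholder index values for illustration only, not real deflators.
illustrative_index = {2019: 100.0, 2023: 120.0}
```

Under these placeholder values, a factor expressed per 2019 dollar shrinks when restated per 2023 dollar, because a 2023 dollar represents less real activity.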
Dispatcher previously crashed mid-run when `gh run list` returned non-zero
exit (transient API hiccup mid-batch). Now retries 3× per status with a 5s
backoff and falls back to a "still busy" sentinel so the poll loop keeps
spinning instead of unwinding the dispatch and losing the queue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
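The retry-then-fallback behavior in the commit message above could look roughly like this. `runs_pending` is a hypothetical helper, the command runner and sleep function are injectable so the logic can be exercised without the gh CLI, and representing the "still busy" sentinel as a plain `True` is a simplification chosen for this sketch.

```python
import json
import subprocess
import time
from typing import Callable


def runs_pending(status: str,
                 run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
                 attempts: int = 3,
                 backoff_s: float = 5.0,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Report whether workflow runs with this status exist, tolerating gh failures."""
    cmd = [
        "gh", "run", "list",
        "--workflow", "generate_diagnostics.yml",
        "--status", status,
        "--json", "databaseId",
    ]
    for attempt in range(attempts):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return len(json.loads(result.stdout or "[]")) > 0
        if attempt < attempts - 1:
            sleep(backoff_s)  # fixed backoff before the next attempt
    # Every attempt failed: answer "still busy" so the poll loop keeps waiting
    # instead of raising and losing the remaining queue.
    return True
```

Failing toward "busy" is the conservative choice here: the worst case is an extra poll cycle, whereas failing toward "idle" could trigger overlapping writes against the Sheets quota.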
mypy flags ta.cast as redundant since pd.to_numeric/concat/__getitem__
return inferable types. The casts were added to placate Pyright, which
isn't a CI check. Switching back to no-cast satisfies mypy + black; ruff
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MoLi7 MoLi7 force-pushed the mo__step7-ef-time-series-dispatch branch from 9be63f6 to 50931d0 Compare May 6, 2026 16:14
@MoLi7 MoLi7 changed the base branch from graphite-base/380 to mo__step6-ef-diagnostics May 6, 2026 16:14
@WesIngwersen WesIngwersen removed their request for review May 6, 2026 21:40
MoLi7 and others added 10 commits May 11, 2026 09:59
@MoLi7 MoLi7 force-pushed the mo__step6-ef-diagnostics branch from 81eed45 to ddb35d7 Compare May 12, 2026 20:17
@MoLi7 MoLi7 force-pushed the mo__step7-ef-time-series-dispatch branch from 50931d0 to 627fb5b Compare May 12, 2026 20:17

@WesIngwersen WesIngwersen left a comment

Member

I'm getting a read error where it's reading in year data from the tab for scenarios: it's getting a float where it expects an int.

Pulling tabs for scenario=bundle_v0_2, approach=commodity_price_index, year=2019.0, baseline=ceda
Traceback (most recent call last):
  File "c:\Users\ingwersw\bedrock\bedrock\analysis\a_matrix_time_series\compile_ef_diagnostics.py", line 295, in <module>
    main()
  File "c:\Users\ingwersw\bedrock\bedrock\analysis\a_matrix_time_series\compile_ef_diagnostics.py", line 231, in main
    joined, source_year=int(year), ref_year=REFERENCE_DOLLAR_YEAR
                        ^^^^^^^^^
ValueError: invalid literal for int() with base 10: '2019.0'
(bedrock) 
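The traceback shows `int()` receiving the string '2019.0', which is what a year column typically looks like after being read as a float and converted to text. One defensive way to coerce such values is sketched below. `coerce_year` is a hypothetical helper and is not necessarily how the issue was fixed in this PR.

```python
from typing import Optional


def coerce_year(value: object) -> Optional[int]:
    """Convert '2019', '2019.0', 2019.0, or 2019 to 2019; blank or NaN to None."""
    text = str(value).strip()
    if text == "" or text.lower() == "nan":
        return None  # rows without a year (e.g. single-year runs) have no year
    number = float(text)
    if not number.is_integer():
        raise ValueError(f"Year must be a whole number, got {value!r}")
    return int(number)
```

Going through `float` first accepts the '2019.0' spelling, while the `is_integer` check still rejects values such as '2019.5' instead of silently truncating them.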

@WesIngwersen

Addressed with 1a0b35c

@WesIngwersen WesIngwersen self-requested a review May 12, 2026 20:56
Base automatically changed from mo__step6-ef-diagnostics to main May 12, 2026 21:51
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MoLi7 MoLi7 merged commit 8c6f38c into main May 13, 2026
5 checks passed
@MoLi7 MoLi7 deleted the mo__step7-ef-time-series-dispatch branch May 13, 2026 00:02