Harden US L0 refit H5 reconstruction #235

Merged
MaxGhenis merged 1 commit into main from codex/l0-refit-source-column-guard-20260701
Jul 1, 2026

@MaxGhenis

Contributor

Summary

  • require US release source-stage tax-unit columns when exporting an L0/refit H5 by default
  • add an explicit --root-attrs-h5 provenance source for Populace-owned root attrs such as congressional-district vintage metadata
  • record source-column checks and root-attr source provenance in the reconstruction manifest
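A minimal sketch of the guard-and-record pattern described in the bullets above, not the actual code in l0_refit_export.py. The function names and manifest keys are hypothetical; the two column names are the ones this description reports as missing from the earlier raw-base reconstruction, and treating them as part of the required set is an assumption.

```python
# Hypothetical required set: these two names come from the PR description;
# the real list of required source-stage tax-unit columns may differ.
REQUIRED_SOURCE_COLUMNS = (
    "selected_marketplace_plan_benchmark_ratio",
    "takes_up_aca_if_eligible",
)


def check_required_source_columns(available, required=REQUIRED_SOURCE_COLUMNS):
    """Raise if a required column is absent; otherwise return a manifest record."""
    missing = sorted(set(required) - set(available))
    if missing:
        raise ValueError(f"missing required source columns: {missing}")
    return {
        "required_source_columns_checked": True,
        "required_source_columns": list(required),
    }


def build_manifest(available_columns, root_attrs_source, copied_root_attrs):
    """Combine the column check with root-attr provenance (hypothetical keys)."""
    record = check_required_source_columns(available_columns)
    record["root_attrs_source"] = root_attrs_source
    record["copied_root_attrs"] = sorted(copied_root_attrs)
    return record
```

Failing before any file is written means a reconstruction without these columns never reaches the scoring step.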

Validation

  • uv run ruff format packages/populace-build/src/populace/build/us_runtime/l0_refit_export.py packages/populace-build/tests/test_us_l0_refit_export.py
  • uv run ruff check packages/populace-build/src/populace/build/us_runtime/l0_refit_export.py packages/populace-build/tests/test_us_l0_refit_export.py
  • uv run pytest packages/populace-build/tests/test_us_l0_refit_export.py packages/populace-build/tests/test_us_fiscal_refresh_builder.py::test_l0_refit_export_subsets_clean_base_frame

Reconstructed sparse H5 check

  • output: out/sparse-default-release-20260701/artifacts/populace_us_2024.h5
  • output sha256: d4264cc76d57ef76e28d4dc43ecfe3f8411b2e017d86321c2bfebbc2d3a25ea3
  • base_h5 sha256: b84e9b083a282b29bd10560c04f1c5c00ef353521496a2bba183dd84fe6e05b0
  • root_attrs_h5 sha256: ec290055a1856e8528b13818e506501f160398b8660f67a3edc6bbea869fbe08
  • weights sha256: 3c2a872c7e624218f9fe2aa920210419c23c4b5b9c33cb38455f5f04fe3e3e16
  • selected households: 57,240 of 337,704
  • required source columns checked: true
  • copied root attrs: populace_congressional_district_vintage_crosswalk_sha256, populace_congressional_district_vintage_target
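The sha256 values listed above can be re-derived from the files themselves. The helper below is a generic sketch using Python's standard hashlib, reading in chunks so a large H5 file does not have to fit in memory; the function name is hypothetical and it is not taken from the Populace codebase.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the result for the output path with the recorded output sha256 confirms that a local copy matches the artifact described here.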

Current-surface score

  • score dir: out/sparse-default-release-20260701/score-reconstructed-enriched-h5-current-surface-main-20260701
  • targets: 32,637
  • final_loss: 0.061503469580690834
  • fraction_within_10pct: 0.7071115604988204
  • gates: base_population_scale=true, health_input_signal=true, target_profile_coverage=true
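For readers unfamiliar with the score fields, here is a rough sketch of how a fraction_within_10pct style metric and a gate summary could be computed. The definition used (share of targets whose estimate has a relative error of at most 10%) is an assumption, not the scorer's confirmed formula, and both function names are hypothetical.

```python
def fraction_within(estimates, targets, tolerance=0.10):
    """Share of targets whose estimate is within `tolerance` relative error.

    Assumed definition; the actual scorer may treat zero targets or
    weighting differently.
    """
    if len(estimates) != len(targets):
        raise ValueError("estimates and targets must have the same length")
    if not targets:
        raise ValueError("at least one target is required")
    hits = sum(
        1
        for estimate, target in zip(estimates, targets)
        if abs(estimate - target) <= tolerance * abs(target)
    )
    return hits / len(targets)


def failed_gates(gates):
    """Return the sorted names of gates whose value is not truthy."""
    return sorted(name for name, passed in gates.items() if not passed)
```

Under this reading, an empty result from failed_gates corresponds to the all-true gate line reported above.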

The earlier raw-base reconstruction scored a final_loss of 0.06695155785659872 but failed the health_input_signal gate because it lacked selected_marketplace_plan_benchmark_ratio and takes_up_aca_if_eligible. With the source-column requirement now on by default, this PR makes that failure impossible unless the check is explicitly bypassed.

@MaxGhenis MaxGhenis merged commit b3e29b8 into main Jul 1, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the codex/l0-refit-source-column-guard-20260701 branch July 1, 2026 10:28