50 changes: 34 additions & 16 deletions packages/populace-build/README.md
@@ -15,11 +15,9 @@ names its donor survey and fails loudly — no silent fallbacks), and the
short-term capital gains to −$3.9T);
- **export surface** — every replacement artifact can prove that its
exported variables match a reference surface, with only documented
-structural extras or reviewed exclusions (for UK, this is the eFRS
-compatibility check);
+structural extras or reviewed exclusions;
- **target surface** — the calibration target set covers the reference
-target surface and may only be wider, not narrower (for UK, Populace must
-calibrate to at least the eFRS target surface);
+target surface and may only be wider, not narrower;
- **per-family fit** — the calibration's within-10% share is reported per
source family, while only broad family-level misses block publication so
one family cannot hide inside the global average;
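The export-surface and target-surface rules in the list above reduce to set comparisons. The following is a minimal sketch under stated assumptions, not the `gates.py` implementation: the function names, arguments, and return shapes are hypothetical and chosen only to illustrate the rules.

```python
# Hypothetical sketch: names and return shapes are assumptions, not the gates.py API.

def check_export_surface(candidate_columns, reference_columns,
                         declared_extras=(), reviewed_exclusions=()):
    """Return (undeclared_extras, unreviewed_missing) as sorted lists."""
    candidate = {str(name) for name in candidate_columns}
    reference = {str(name) for name in reference_columns}
    # Extra columns pass only if the build declared them.
    undeclared_extras = candidate - reference - set(declared_extras)
    # Missing reference columns pass only with a named reviewed exclusion.
    unreviewed_missing = reference - candidate - set(reviewed_exclusions)
    return sorted(undeclared_extras), sorted(unreviewed_missing)


def check_target_surface(candidate_targets, reference_targets):
    """Return reference targets the candidate omits; a wider candidate is fine."""
    return sorted(set(reference_targets) - set(candidate_targets))


extras, missing = check_export_surface(
    candidate_columns=["age", "income", "area_code"],
    reference_columns=["age", "income", "rent"],
    declared_extras=["area_code"],
)
uncovered = check_target_surface(["t1", "t2", "t3"], ["t1", "t2"])
```

In this toy case `area_code` is accepted because it is declared, `rent` is reported as missing because no exclusion names it, and the wider target set leaves nothing uncovered.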
@@ -41,11 +39,13 @@ modules; guard tests enforce this so country content stays declarative.
## UK local-geography path

`populace.build.uk_runtime.local_geography` holds the Populace-owned replacement shape
-for UK constituency and local-authority geography. It uses the same stacked
-local-area layout as the US local ECPS flow:
+for UK constituency and local-authority geography. It supports both the stacked
+local-area layout used by the US local ECPS flow and an assigned row-wise path
+for UK builds that have a finest-available area assignment:

```text
-column = area_index * n_households + household_index
+stacked: column = area_index * n_households + household_index
+assigned: column = household_index, and target rows only see households assigned to that area code
```
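The column layouts in the block above can be sketched in Python. The helper names and area codes below are hypothetical and are not taken from `local_geography`; they only restate the index arithmetic.

```python
# Hypothetical helpers: names are illustrative, not Populace APIs.

def stacked_column(area_index, household_index, n_households):
    """Column of one (area, household) pair in the stacked layout."""
    if not 0 <= household_index < n_households:
        raise IndexError("household_index out of range")
    return area_index * n_households + household_index


def unstack_column(column, n_households):
    """Invert stacked_column: return (area_index, household_index)."""
    return divmod(column, n_households)


def assigned_columns(area_code, household_area_codes):
    """Columns a target row for area_code can see in the assigned layout."""
    return [i for i, code in enumerate(household_area_codes) if code == area_code]


col = stacked_column(area_index=2, household_index=1, n_households=4)
pair = unstack_column(col, n_households=4)
visible = assigned_columns("A1", ["A1", "B2", "A1"])
```

The stacked layout needs `n_areas * n_households` columns because every household appears once per area, whereas the assigned layout needs only `n_households` columns.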

The solved weights export to a long sidecar with `(area_type, area_code,
@@ -54,22 +54,25 @@ the format PolicyEngine can group by directly for constituency and local
authority outputs, and it avoids preserving the legacy dense
`areas x households` matrix artifact.

-The module does not import the incumbent UK data package. Engine runners and
+The module does not import an incumbent UK data package. Engine runners and
target providers pass household metric tables and aligned target tables into
-`build_stacked_local_matrix`; this keeps Populace clean while the target source
-files move over. The helper `sort_households_by_id` also codifies the 2024-25
-FRS fix: household attributes and weights must be sorted by the same stable
-household ID before any positional assignment.
+`build_stacked_local_matrix`, `build_assigned_local_matrix`, or
+`build_local_candidate`; this keeps Populace as the owner of the build surface
+while historical incumbent comparisons remain external migration benchmarks.
+The helper `sort_households_by_id` also
+codifies the 2024-25 FRS fix: household attributes and weights must be sorted
+by the same stable household ID before any positional assignment.
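The ordering requirement can be illustrated with plain Python lists. This is a sketch of the idea only: the real signature of `sort_households_by_id` is not shown in this diff, so `sort_rows_by_id` and the sample values are hypothetical.

```python
# Hypothetical illustration; not the signature of sort_households_by_id.

def sort_rows_by_id(ids, values):
    """Return (ids, values) reordered by ascending ID using a stable sort."""
    order = sorted(range(len(ids)), key=ids.__getitem__)
    return [ids[i] for i in order], [values[i] for i in order]


# Attributes and weights arrive in different row orders.
attr_ids, tenure = sort_rows_by_id([3, 1, 2], ["rent", "own", "social"])
weight_ids, weights = sort_rows_by_id([1, 2, 3], [10.0, 20.0, 30.0])

# Positional pairing is only safe once both sides share the same ID order.
if attr_ids != weight_ids:
    raise ValueError("household IDs do not align after sorting")
paired = dict(zip(attr_ids, zip(tenure, weights)))
```

Without the sort, zipping the two inputs positionally would attach household 1's weight to household 3's attributes.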

`populace.build.uk_runtime.local_targets` declares the constituency and local-authority
metric surface used by the local build: HMRC employment/self-employment amount
and count rows, ONS age bands, Universal Credit household rows, constituency
UC-by-children rows, and the LA income/tenure/rent rows. It accepts a
PolicyEngine-UK-like simulation object and returns household-indexed metric
-tables; it still takes target values as explicit input tables. `local_solver`
-wraps the Populace calibrator's log-weight optimizer for stacked local weights
-and records per-area/per-metric diagnostics before the solved weights are
-exported with `stacked_weights_to_long`.
+tables. It can derive the metric subset from Ledger target profiles while
+target values remain explicit build inputs. `local_solver` wraps the Populace
+calibrator's log-weight optimizer for stacked and assigned local weights and
+records per-area/per-metric diagnostics before the solved weights are exported
+with `stacked_weights_to_long` or `assigned_weights_to_long`.

`populace.build.uk_runtime.local_runner` is the Populace-owned candidate build path. It
loads explicit area and target tables, aligns a sorted household frame with
@@ -99,6 +102,21 @@ postcode sources. It writes the cloned row-wise H5, a geography coverage CSV,
and `rowwise_build_manifest.json` with input/output hashes, row counts, target
coverage, weight preservation, and weakest local-support diagnostics.
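Recording input and output hashes in a manifest could look roughly like the sketch below. It covers only the hash and row-count parts; the key names, file names, and function names are assumptions and do not describe the actual layout of `rowwise_build_manifest.json`.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(inputs, outputs, manifest_path, row_count):
    """Write a JSON manifest; the key names here are hypothetical."""
    manifest = {
        "inputs": {Path(p).name: sha256_of(p) for p in inputs},
        "outputs": {Path(p).name: sha256_of(p) for p in outputs},
        "row_count": row_count,
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest


# Demo with throwaway files standing in for real build artifacts.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.csv"
    result = Path(tmp) / "output.csv"
    source.write_bytes(b"household_id,weight\n1,10.0\n")
    result.write_bytes(b"household_id,weight\n1,12.5\n")
    manifest = write_manifest([source], [result], Path(tmp) / "manifest.json", row_count=1)
    reloaded = json.loads((Path(tmp) / "manifest.json").read_text())
```

Hashing both sides lets a later run confirm that an output was produced from exactly the recorded inputs.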

+Like the US plan, UK migration comparisons against earlier production datasets
+belong in release/benchmark harnesses outside this package. The build code here
+must not import or depend on the incumbent UK data package; `source_manifest.py`
+rejects incumbent country data-package references in declarative source specs.
+
+The packaged `uk/source_stages.json` is the Populace-owned raw-input parity
+contract for the UK build: FRS base tables, WAS wealth/debt/vehicles with the
+cash ISA and stocks-and-shares ISA split, LCFS consumption and bus fare spend,
+ETB VAT and public services, DfT bus/rail amount anchors, NHS usage, SPI
+high-income income/reliefs, FRS-only pension/savings/reported-benefit fill,
+Advani-Summers capital gains, salary sacrifice, SLC student-loan plan
+assignment, and row-wise OA/LA/constituency geography. It is a declarative
+resource, not a country Python runtime; shared Populace runtimes load and
+execute specs.
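A shared runtime that loads declarative country resources might be sketched as follows. Everything here is an assumption made for illustration: the manifest file name `manifest.json`, the `{"stages": [...]}` layout, and the function `load_country_resources` are placeholders and do not describe the real `source_stages.json` schema or any Populace API.

```python
import json
import tempfile
from pathlib import Path


def load_country_resources(package_dir, manifest_name):
    """Load every JSON resource a country manifest lists (hypothetical helper)."""
    package_dir = Path(package_dir)
    manifest = json.loads((package_dir / manifest_name).read_text())
    resources = {}
    for name in manifest["resources"]:
        # Country packages stay spec-only, so refuse anything that is not JSON.
        if not name.endswith(".json"):
            raise ValueError(f"non-declarative resource listed: {name}")
        resources[name] = json.loads((package_dir / name).read_text())
    return manifest, resources


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # "manifest.json" and the {"stages": [...]} layout are placeholders only.
    (root / "manifest.json").write_text(json.dumps(
        {"schema_version": 1, "country": "uk", "resources": ["source_stages.json"]}))
    (root / "source_stages.json").write_text(json.dumps({"stages": ["frs_base"]}))
    manifest, resources = load_country_resources(root, "manifest.json")
```

Keeping the loader in shared code means a country package contributes data only, which matches the spec-only policy stated for the UK package.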

## US plan status

`populace.build.us_runtime` declares the US build: stage order, donor graph with
10 changes: 5 additions & 5 deletions packages/populace-build/src/populace/build/gates.py
@@ -24,7 +24,8 @@
member names, not raw source-system codes.
- :func:`export_surface_gate` and :func:`target_surface_gate` — replacement
builds can prove they cover a reference artifact's export variables and
-calibration targets, e.g. UK Populace against eFRS.
+calibration targets. Reference artifacts are comparison surfaces, not build
+inputs.

Scoring uses :func:`relative_error_loss` — the calibrator's own objective —
so there is no calibrator-vs-scorer objective mismatch: what the solver
@@ -750,10 +751,9 @@ def export_surface_gate(
This is stricter than :func:`parity_gate`: parity checks whether populated
reference layers are also populated, while this gate checks the exported
variable *surface* itself. It is intended for live release blocking where a
-country has a known incumbent-compatible artifact, such as UK Populace
-matching eFRS exported variables. Extra columns are refused unless the
-build declares them as structural/compatibility additions; missing
-reference columns require a named reviewed exclusion.
+country has a known reference export surface. Extra columns are refused
+unless the build declares them as structural/compatibility additions;
+missing reference columns require a named reviewed exclusion.
"""
candidate = {str(name) for name in candidate_columns}
reference = {str(name) for name in reference_columns}
Original file line number Diff line number Diff line change
@@ -2,5 +2,5 @@
"schema_version": 1,
"country": "uk",
"policy": "spec-only country package; Python execution lives in shared runtime modules",
-"resources": []
+"resources": ["source_stages.json"]
}