
Add Populace Ledger firm comparison #19

Merged: vahid-ahmadi merged 4 commits into main from codex/populace-ledger-comparison-20260630 on Jun 30, 2026

MaxGhenis (Contributor) commented Jun 30, 2026

Summary

  • add firm-microsim-populace-ledger, a paper-facing comparison command for the pinned Populace/Ledger UK firm-generation snapshot
  • check in Markdown and JSON provenance artifacts showing 2024-25 Ledger-vs-paper target parity and Populace-vs-paper synthetic-population deltas
  • document that this is a migration check tied to merged upstream Ledger and Populace snapshots, not a silent replacement for the paper's archived generator/results
  • keep firm_microsim.generate import-compatible while avoiding a heavy torch import for the lightweight comparison CLI
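
One common way to keep a module import-compatible while deferring a heavy dependency is a module-level `__getattr__` hook (PEP 562). The sketch below is illustrative only and is not claimed to be how `firm_microsim.generate` does it: `make_lazy_module`, `demo_generate`, and `to_hsv` are hypothetical names, and the stdlib module `colorsys` stands in for a heavy dependency such as torch.

```python
import importlib
import types


def make_lazy_module(name, heavy_dependency, exports):
    """Return a module that imports `heavy_dependency` only on first use.

    `exports` maps public attribute names to attribute names in the heavy module.
    """
    module = types.ModuleType(name)

    def lazy_getattr(attr):
        # Called only when normal attribute lookup on the module fails (PEP 562).
        if attr not in exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        backend = importlib.import_module(heavy_dependency)  # deferred import
        value = getattr(backend, exports[attr])
        setattr(module, attr, value)  # cache so later lookups skip this hook
        return value

    module.__getattr__ = lazy_getattr
    return module


# Stand-in demo: `colorsys` plays the role of the heavy dependency.
lazy = make_lazy_module("demo_generate", "colorsys", {"to_hsv": "rgb_to_hsv"})
print(lazy.to_hsv(1.0, 0.0, 0.0))
```

With this pattern, code that only imports the module (for example a lightweight CLI) never pays for the heavy import, while existing callers that access the deferred names keep working.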

Migration snapshot

  • PolicyEngine/ledger#67 at cd98b5cb7b1604fbf7750689a429bbc356e5603a; merged on 2026-06-30 at ac643afa0c1d45fc4abd0268dc5aa7c843440b38
  • PolicyEngine/populace#223 at fa20daf75ff023e5e88731a140f456f58e0b864e; merged on 2026-06-30 at 8271d767244161631253ad1d9ad792a82e2b96b4
  • Ledger facts file used for the checked run: 1,439 facts, SHA256 58b6c2752adec5baa6a6260fe8cd9e9b85d0a78b5ba76ea28201f4c7986dce50

Key result

The Ledger-backed target surface matches the paper's processed 2024-25 numeric inputs exactly after normalizing away presentation-only labels/totals/Unknown: 6 tables checked, 0 mismatches, max numeric difference 0.

The Populace-generated population does not exactly reproduce the paper population. The checked full Populace optimizer snapshot lands at 93.8% overall under its validator versus the paper's 90.5%, but this is documented as not like-for-like: HMRC turnover-band accuracy uses different band sets, and sector distribution reflects project-specific calibration-target definitions. The directly comparable rows are ONS population, employment bands, and VAT liability by turnover band.
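
A parity check of the kind described for the Ledger-vs-paper targets can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's comparison code: tables are assumed to be plain mappings from row label to number, the presentation-only rows are assumed to be labelled `Total` and `Unknown`, and the function names and sample numbers are made up.

```python
from typing import Dict, Mapping

# Assumed labels for presentation-only rows; the real label set may differ.
PRESENTATION_ONLY = {"total", "unknown"}


def normalize(table: Mapping[str, float]) -> Dict[str, float]:
    """Lower-case labels and drop presentation-only rows."""
    return {
        label.strip().lower(): float(value)
        for label, value in table.items()
        if label.strip().lower() not in PRESENTATION_ONLY
    }


def compare_tables(paper, ledger):
    """Count mismatching cells and track the largest absolute difference."""
    names = sorted(set(paper) | set(ledger))
    mismatches = 0
    max_diff = 0.0
    for name in names:
        left = normalize(paper.get(name, {}))
        right = normalize(ledger.get(name, {}))
        for key in set(left) | set(right):
            if key not in left or key not in right:
                mismatches += 1  # a row present on one side only
                continue
            diff = abs(left[key] - right[key])
            max_diff = max(max_diff, diff)
            if diff != 0:
                mismatches += 1
    return {
        "tables_checked": len(names),
        "mismatches": mismatches,
        "max_numeric_difference": max_diff,
    }


# Made-up numbers, for illustration only.
paper_tables = {"employment_bands": {"0-4": 4, "5-9": 6, "Total": 10, "Unknown": 0}}
ledger_tables = {"employment_bands": {"0-4": 4.0, "5-9": 6.0}}
print(compare_tables(paper_tables, ledger_tables))
```

Requiring an exact zero difference, rather than a tolerance, mirrors the "max numeric difference 0" claim above; a tolerance would be the natural change if the two sources ever differed by rounding.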

Review response

  • H1: added a comparability column and removed the implicit like-for-like overall ranking claim.
  • H2: documented the checked constants as manually captured snapshots and added a regression test asserting that the checked-in Markdown/JSON artifacts match the output of the reference renderer and provenance payload.
  • H3: updated the Populace/Ledger language after the upstream Ledger and Populace PRs merged, including merge commits in checked provenance.
  • H4: reran the paper 2024-25 generator and updated the paper snapshot: weighted population is 2,577,078 and the VAT liability by sector diagnostic is 44.5%, not 21.7%.
  • M2: the source-parity verification command uses --reference-population to reproduce the checked artifacts and Ledger-vs-paper parity without rerunning the full optimizer; the full optimizer metrics are the separately captured pinned Populace snapshot recorded in the artifacts.
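
The artifact regression test mentioned under H2 could take roughly the following shape. This is a hedged sketch, not the test in the PR: `artifact_diff` is a hypothetical helper, and the rendered text would come from whatever renderer the project uses.

```python
import difflib
from pathlib import Path


def artifact_diff(checked_path, rendered: str) -> str:
    """Return a unified diff between a checked-in artifact and fresh output.

    An empty string means the artifact is up to date.
    """
    checked = Path(checked_path).read_text(encoding="utf-8")
    return "".join(
        difflib.unified_diff(
            checked.splitlines(keepends=True),
            rendered.splitlines(keepends=True),
            fromfile=str(checked_path),
            tofile="rendered",
        )
    )


def assert_artifact_current(checked_path, rendered: str) -> None:
    """Fail with a readable diff when the checked-in artifact has drifted."""
    diff = artifact_diff(checked_path, rendered)
    assert diff == "", f"checked artifact is stale:\n{diff}"
```

Returning the diff text, rather than a bare boolean, makes a failing test show exactly which lines drifted.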

Upstream rebuild

After both upstream PRs merged, I rebuilt from clean detached upstream worktrees: Ledger origin/main at ac643af and Populace origin/main at 8271d76. The fresh Ledger bundle emitted the same 1,439 consumer facts with the same SHA256, and the full 1,000-iteration Populace optimizer rebuild diffed cleanly against the checked Markdown and JSON artifacts.
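
The fact count and SHA256 check can be reproduced with the standard library. The sketch below assumes the facts file holds one fact per non-empty line, which is the usual JSONL layout but is not confirmed by this PR; `fingerprint_jsonl` is a hypothetical helper name.

```python
import hashlib


def fingerprint_jsonl(path):
    """Return (non-empty line count, SHA256 hex digest) for a JSONL file."""
    digest = hashlib.sha256()
    count = 0
    with open(path, "rb") as handle:
        for line in handle:
            digest.update(line)  # hash the raw bytes, including newlines
            if line.strip():
                count += 1  # assumption: one fact per non-empty line
    return count, digest.hexdigest()
```

For the checked run, the values to compare against would be the 1,439 facts and the SHA256 `58b6c2752adec5baa6a6260fe8cd9e9b85d0a78b5ba76ea28201f4c7986dce50` recorded in the migration snapshot.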

Hugging Face status

Authenticated HF API check found policyengine/populace-uk-private, but it contains the household UK release (populace_uk_2023.h5) and no firm files. I found no visible public or private policyengine/*firm* Populace firm dataset.

Verification

  • env -u UV_FROZEN uv run --extra dev ruff check .
  • env -u UV_FROZEN uv run --extra dev python -m pytest tests (19 passed)
  • git diff --check
  • full upstream rebuild from fresh Ledger facts: PYTHONPATH=/Users/maxghenis/.codex-worktrees/populace-upstream-firm-rebuild-20260630/packages/populace-build/src:/Users/maxghenis/.codex-worktrees/populace-upstream-firm-rebuild-20260630/packages/populace-calibrate/src:/Users/maxghenis/.codex-worktrees/populace-upstream-firm-rebuild-20260630/packages/populace-fit/src:/Users/maxghenis/.codex-worktrees/populace-upstream-firm-rebuild-20260630/packages/populace-frame/src:/Users/maxghenis/.codex-worktrees/populace-upstream-firm-rebuild-20260630/packages/populace-data/src env -u UV_FROZEN uv run --extra dev python -m firm_microsim.populace_ledger --facts-jsonl /tmp/uk-firm-ledger-upstream-bundle/consumer_facts.jsonl --iterations 1000 --output /tmp/populace_ledger_upstream_full_rebuild.txt --json-output /tmp/populace_ledger_upstream_full_rebuild.json
  • diff -u results/populace_ledger_comparison.txt /tmp/populace_ledger_upstream_full_rebuild.txt && diff -u results/populace_ledger_provenance.json /tmp/populace_ledger_upstream_full_rebuild.json
  • reference-population parity rebuild from fresh Ledger facts also diffed cleanly
  • cd paper && python3 build_site.py
  • rm -rf paper/out && cd paper && quarto render index.qmd --to pdf passed, with existing footnote destination warnings only

vahid-ahmadi merged commit f3d0762 into main on Jun 30, 2026
2 checks passed
vahid-ahmadi added a commit that referenced this pull request Jun 30, 2026
Sync onto the new main (PRs #19, #20 merged) and reflect Section 7's value-added
recast of the behavioural simulator.

- New slide "The firm's problem: VAT on value added (formulation A)":
  pi(y)=(1-delta)(1-tau f(y))y - C(y;n,e), with delta the deductible-input share
  (value added (1-delta)y), C the own-factor cost (inputs already netted, so
  reading C as total cost would double-count), optimum n[(1-delta)(1-tau)]^e, and
  the new formulation_a_optima figure (delta in {0,0.4,0.8} across notch and taper).
- Behavioural slide: note the simulator is a value-added base (formulation A).
  The e-sensitivity numbers (GBP508m->GBP292m, etc.) are unchanged.
- Limitations: the "turnover-tax approximation" bullet now states the simulator
  taxes value added with a parametric delta (ONS Supply-Use to come), still
  abstracting from voluntary registration.

Rebuilt slides.pdf (30 pages).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
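
The optimum quoted in the commit message, n[(1-delta)(1-tau)]^e, can be checked numerically under two assumptions that the message does not state: an isoelastic own-factor cost C(y;n,e) = n/(1+1/e) * (y/n)^(1+1/e), and f(y) held at 1 (a fully liable firm away from the notch and taper). Under those assumptions the first-order condition (1-delta)(1-tau) = (y/n)^(1/e) gives the quoted expression. The parameter values below are illustrative, not taken from the paper, apart from delta = 0.4, which is one of the values named for the figure.

```python
def own_factor_cost(y, n, e):
    """Assumed isoelastic cost; the commit message does not give C's form."""
    return n / (1.0 + 1.0 / e) * (y / n) ** (1.0 + 1.0 / e)


def profit(y, n, e, tau, delta, liable_share=1.0):
    """pi(y) = (1-delta)(1 - tau*f(y))y - C(y;n,e), with f(y) fixed at liable_share."""
    return (1.0 - delta) * (1.0 - tau * liable_share) * y - own_factor_cost(y, n, e)


def closed_form_optimum(n, e, tau, delta):
    """Optimum quoted in the commit message: n[(1-delta)(1-tau)]^e."""
    return n * ((1.0 - delta) * (1.0 - tau)) ** e


def grid_optimum(n, e, tau, delta, steps=20_000):
    """Brute-force argmax of profit on an even grid over (0, 2n]."""
    upper = 2.0 * n
    best = max(
        range(1, steps + 1),
        key=lambda i: profit(upper * i / steps, n, e, tau, delta),
    )
    return upper * best / steps


# Illustrative parameters (hypothetical except delta, see lead-in).
n, e, tau, delta = 100_000.0, 0.5, 0.2, 0.4
print(closed_form_optimum(n, e, tau, delta), grid_optimum(n, e, tau, delta))
```

Because the assumed profit function is strictly concave in y, the grid maximiser lies within one grid step of the closed-form value, which is what the comparison relies on.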