Merged
55 changes: 52 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,12 @@ The result is ~2.94M firm rows weighted to ~2.5M UK firms. Because the populatio
is calibrated **to** the HMRC aggregates, agreement with them is an internal
consistency check, not external validation.

The official target surface is also being mirrored into PolicyEngine Ledger and
Populace. This repository keeps the paper's archived CSV inputs and generator for
reproducibility, and includes a Populace/Ledger comparison command so the pinned
migration snapshot can be audited without silently changing the published paper
population.

## Data vintages — single version, one-line switch

The pipeline is **single-version**: there is one `VAT_THRESHOLD`, not separate
Expand Down Expand Up @@ -83,6 +89,7 @@ firm-microsim --vintage 2024-25 # one vintage only (£90k)
firm-microsim --threshold 88 --seed 7 --output my_run.csv
firm-microsim-report # calibration report only
firm-microsim-figures # descriptive figures only
firm-microsim-populace-ledger # Populace/Ledger comparison
```

```python
Expand Down Expand Up @@ -145,16 +152,58 @@ firm-microsim-report
| **Overall (5 calibrated dimensions)** | **89.9%** | **90.5%** |

**VAT liability by *sector*** is **not** a calibration target — it is reported as
an informational diagnostic only (47.1% / 21.7%). The model fixes firm inputs
an informational diagnostic only (47.1% / 44.5%). The model fixes firm inputs
and sets liability = turnover − input but does not yet calibrate the
**input/output tax structure**, so per-sector net liability is structurally
unhittable and, while targeted, competed with the dimensions above (it scored
43.9% / −121.1% and dragged the naive mean down). It is gated off via
unhittable and is gated off via
`Config.calibrate_vat_liability_sector = False`. Restoring it after input/output
calibration is tracked in issues
[#1](https://github.com/PolicyEngine/firm-microsim-paper/issues/1) and
[#2](https://github.com/PolicyEngine/firm-microsim-paper/issues/2).
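
The headline figure can be reproduced by hand. The sketch below assumes the overall score is the unweighted mean of the five calibrated dimension scores; the 2024-25 per-dimension values are the ones recorded in `results/populace_ledger_provenance.json`, while the dictionary layout and variable names are illustrative and not the validator's API.

```python
from statistics import fmean

# 2024-25 paper scores (percent), copied from results/populace_ledger_provenance.json.
calibrated = {
    "hmrc_bands": 92.7,
    "ons_population": 94.2,
    "employment": 89.7,
    "sector": 94.5,
    "vat_liability_band": 81.4,
}
# Informational diagnostic only; it is not a calibration target.
vat_liability_sector = 44.5

# Assumption: "overall" is the plain mean of the five calibrated dimensions.
overall = fmean(calibrated.values())
# What the mean would look like if the diagnostic were folded back in.
naive = fmean([*calibrated.values(), vat_liability_sector])

print(f"overall (5 calibrated dims): {overall:.1f}%")
print(f"naive mean incl. diagnostic: {naive:.1f}%")
```

Folding the sector diagnostic back in pulls the mean down by several points, which is the practical reason it is reported separately.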

## Populace/Ledger migration check

`firm-microsim-populace-ledger` reports the current migration comparison. The
checked reference run used the 2024-25 Ledger target surface from
[PolicyEngine/ledger#67](https://github.com/PolicyEngine/ledger/pull/67)
at `cd98b5cb7b1604fbf7750689a429bbc356e5603a` and Populace's experimental UK
firm generator from
[PolicyEngine/populace#223](https://github.com/PolicyEngine/populace/pull/223)
at `fa20daf75ff023e5e88731a140f456f58e0b864e`. Both upstream PRs merged on
June 30, 2026: Ledger at merge commit
`ac643afa0c1d45fc4abd0268dc5aa7c843440b38`, and Populace at merge commit
`8271d767244161631253ad1d9ad792a82e2b96b4`. The reference population uses
1,000 calibration iterations:

```bash
firm-microsim-populace-ledger \
--output results/populace_ledger_comparison.txt \
--json-output results/populace_ledger_provenance.json
```

When `populace-build` is installed from the Populace source tree, the same command
can recompute the table and paper-CSV parity from Ledger consumer facts:

```bash
firm-microsim-populace-ledger \
--facts-jsonl /path/to/uk_firm_consumer_facts.jsonl \
--iterations 1000 \
--output results/populace_ledger_comparison.txt \
--json-output results/populace_ledger_provenance.json
```
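
Conceptually, the parity half of that recomputation is a keyed table comparison. The following is a minimal sketch, not the code behind `firm-microsim-populace-ledger`: `compare_tables`, the `(SIC, band)` key layout, and the toy counts are all hypothetical, and only the reported field names (`same_keys`, `values_equal`, `max_abs_numeric_diff`) are borrowed from `results/populace_ledger_provenance.json`.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # hypothetical key: (SIC code, band label)


def compare_tables(paper: Dict[Key, float], ledger: Dict[Key, float]) -> dict:
    """Compare two normalized tables that are keyed the same way."""
    same_keys = paper.keys() == ledger.keys()
    shared = paper.keys() & ledger.keys()
    # Largest absolute difference over the keys both tables contain.
    max_diff = max((abs(paper[k] - ledger[k]) for k in shared), default=0.0)
    return {
        "rows": len(ledger),
        "paper_rows_after_filter": len(paper),
        "same_keys": same_keys,
        "values_equal": same_keys and max_diff == 0.0,
        "max_abs_numeric_diff": max_diff,
    }


# Toy counts, made up for illustration only.
paper_table = {("01", "0-49"): 120.0, ("01", "50-99"): 45.0}
ledger_table = {("01", "0-49"): 120.0, ("01", "50-99"): 45.0}
drifted_table = {("01", "0-49"): 120.0, ("01", "50-99"): 47.5}

match = compare_tables(paper_table, ledger_table)
drift = compare_tables(paper_table, drifted_table)
print(match)
print(drift)
```

Requiring identical key sets before declaring the values equal keeps a dropped or renamed row from passing as a match.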

The current reference comparison shows exact parity between the Ledger-backed
targets and the paper's processed 2024-25 numeric inputs: six normalized source
tables checked, zero mismatches, max numeric difference 0. It does **not** exactly
replicate the paper's generated synthetic population. Populace's shared optimizer
lands at 93.8% overall accuracy under its own validator versus the paper's 90.5%,
but that overall pair is **not like-for-like**: the HMRC turnover-band scores use
different band sets, and the sector-distribution scores reflect different
calibration-target definitions. The directly comparable rows are ONS population,
employment bands, and VAT liability by turnover band. The Populace/Ledger path is
now based on merged upstream inputs, but it remains a migration check rather than
a silent replacement for the paper's archived generator and results.
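
For the ONS population row, the published scores are consistent with a simple relative-error formula. The sketch below assumes accuracy = 100 × (1 − |weighted − target| / target); that formula is inferred from the numbers in `results/populace_ledger_comparison.txt` and `results/populace_ledger_provenance.json`, not read from either validator, and `population_accuracy` is a hypothetical helper.

```python
def population_accuracy(weighted: float, target: float) -> float:
    """Assumed score: 100 minus the absolute relative error, in percent."""
    return 100.0 * (1.0 - abs(weighted - target) / target)


ONS_TARGET = 2_734_615  # ONS firm count from the comparison table

# Weighted populations as recorded in the provenance JSON.
paper = population_accuracy(2_577_078, ONS_TARGET)
populace = population_accuracy(2_945_776.75, ONS_TARGET)

print(f"paper: {paper:.1f}%  populace: {populace:.1f}%")
```

Under this reading, Populace overshoots the ONS target by roughly 211,000 firms while the paper undershoots it by about 158,000, which is why Populace scores lower on this row despite its higher scores elsewhere.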

## Figures

Figures follow the project house style: single clean panels (no embedded titles,
27 changes: 27 additions & 0 deletions paper/Appendix/a_data.tex
Expand Up @@ -29,6 +29,33 @@ \subsection{Data construction detail}
diagnostic, because the current generator does not calibrate sector-specific
input/output VAT structure.

\paragraph{Ledger and Populace migration check.} The official target tables are
being moved into PolicyEngine Ledger, with Populace providing the shared
synthetic-population generator. I therefore keep the paper's archived processed
CSVs as the reproduction source for the reported results, and treat the pinned
Populace/Ledger path as an auditable migration check rather than a silent
replacement. The snapshot used here is PolicyEngine/ledger pull request 67
at commit \texttt{cd98b5c} and PolicyEngine/populace pull request 223 at commit
\texttt{fa20daf}; both upstream pull requests merged on June 30, 2026. For the
2024--25 vintage, the Ledger-backed targets match the paper's
processed numeric inputs exactly after dropping presentation-only labels,
totals, and the HMRC ``Unknown'' column that the generator does not calibrate:
six normalized source tables checked, zero mismatches, and maximum numeric
difference zero. A 1,000-iteration Populace run from those targets generated
2,946,015 firm rows. Its own validator reports a 93.8 percent overall
calibration score, compared with the paper validator's 90.5 percent score for
the same vintage, but this overall comparison is not like-for-like: HMRC
turnover-band accuracy uses different band sets, and sector distribution
reflects different calibration-target definitions. The directly comparable rows
are ONS population, employment bands, and VAT liability by turnover band.
Populace hits VAT liability by turnover band more closely (99.5 versus 81.4
percent), while its weighted population (2,945,777 versus the paper's 2,577,078)
lies further from the ONS target of 2,734,615 firms.
VAT liability by sector remains an informational diagnostic
in both runs. The comparison table and structured provenance are reproduced by
\texttt{firm-microsim-populace-ledger} and checked into
\texttt{results/populace\_ledger\_comparison.txt} and
\texttt{results/populace\_ledger\_provenance.json}.

\paragraph{Counterfactual exclusion window and polynomial degree.} The no-VAT
counterfactual density of Section~\ref{sec:bunching} is fitted by polynomial
regression on the observed density outside a manipulation window around the
6 changes: 6 additions & 0 deletions paper/Sections/data.tex
Expand Up @@ -11,6 +11,12 @@ \section{Data}
level. The generator takes the registration threshold as a parameter and is
documented in Appendix~\ref{app:data}.

The archived CSV inputs in this repository remain the reproduction source for
the paper's reported numbers; a pinned migration snapshot also represents the
same 2024--25 numeric target surface through PolicyEngine Ledger and the
experimental Populace firm generator, and Appendix~\ref{app:data} reports the
parity check between that shared source-of-truth path and the paper inputs.

\paragraph{Construction.} Index firms by $i$. For each sector $s$ and ONS
turnover band $b=[\underline{b},\overline{b}]$, the \emph{UK Business: Activity,
Size and Location} table gives a firm count $N_{s,b}$, and I draw $N_{s,b}$ firms
1 change: 1 addition & 0 deletions pyproject.toml
Expand Up @@ -35,6 +35,7 @@ firm-microsim-dynamic = "firm_microsim.dynamic.__main__:main"
firm-microsim-placebo = "firm_microsim.analysis.placebo_bunching:cli"
firm-microsim-dominated-region = "firm_microsim.analysis.dominated_region_mass:cli"
firm-microsim-reform-menu = "firm_microsim.analysis.reform_menu_common_base:cli"
firm-microsim-populace-ledger = "firm_microsim.populace_ledger:cli"

[tool.hatch.build.targets.wheel]
packages = ["src/firm_microsim"]
4 changes: 2 additions & 2 deletions results/calibration_accuracy.txt
Expand Up @@ -28,7 +28,7 @@ dimensions and excludes the VAT-liability-by-sector diagnostic.
Vintage 2024-25 | threshold £90k
================================================================
rows (firm types): 2,945,974
weighted population: 2,577,076 firms
weighted population: 2,577,078 firms
----------------------------------------------------------------
Dimension accuracy error
----------------------------------------------------------------
Expand All @@ -41,7 +41,7 @@ dimensions and excludes the VAT-liability-by-sector diagnostic.
Overall (5 calibrated dims) 90.5% 9.5%
----------------------------------------------------------------
Informational diagnostic (not a calibration target):
VAT Liability by Sector 21.7% 78.3%
VAT Liability by Sector 44.5% 55.5%
================================================================

Done: 2/2 vintage(s) reported.
35 changes: 35 additions & 0 deletions results/populace_ledger_comparison.txt
@@ -0,0 +1,35 @@
# Populace/Ledger firm-generation comparison

Reference run:

- Status: pinned to merged Ledger and Populace snapshots
- Vintage: 2024-25
- Seed: 42
- Populace iterations: 1,000
- Ledger input surface: 1,439 consumer facts
- Ledger facts SHA256: `58b6c2752adec5baa6a6260fe8cd9e9b85d0a78b5ba76ea28201f4c7986dce50`
- Ledger target snapshot: https://github.com/PolicyEngine/ledger/pull/67 at `cd98b5cb7b1604fbf7750689a429bbc356e5603a` (MERGED on 2026-06-30, merge commit `ac643afa0c1d45fc4abd0268dc5aa7c843440b38`)
- Populace snapshot: https://github.com/PolicyEngine/populace/pull/223 at `fa20daf75ff023e5e88731a140f456f58e0b864e` (MERGED on 2026-06-30, merge commit `8271d767244161631253ad1d9ad792a82e2b96b4`)
- Normalized source-table parity: 6 tables checked, 0 mismatched, max absolute numeric difference 0

The Ledger-backed targets match the paper's processed 2024-25 numeric input tables exactly after dropping presentation-only labels, totals, and the HMRC Unknown column that the generator does not calibrate.

| Metric | Comparability | Truth / target where it exists | Paper 2024-25 | Populace/Ledger 2024-25 |
| --- | --- | ---: | ---: | ---: |
| Rows | Descriptive | N/A, synthetic support size | 2,945,974 | 2,946,015 |
| Weighted population | Direct | 2,734,615 ONS firms | 2,577,078 | 2,945,777 |
| HMRC turnover bands | Not like-for-like | 2,171,200 VAT-registered firms excluding Unknown | 92.7% | 99.9% |
| ONS population | Direct | 2,734,615 ONS firms | 94.2% | 92.3% |
| Employment bands | Direct | ONS employment-band distribution, sum 2,734,615 | 89.7% | 92.3% |
| Sector distribution | Project-specific | 2,330,230 VAT-registered firms by SIC sector | 94.5% | 85.0% |
| VAT liability by band | Direct | GBP 177.17bn net VAT liability by turnover band | 81.4% | 99.5% |
| Overall | Not like-for-like | N/A, mean of calibrated accuracy scores | 90.5% | 93.8% |
| VAT liability by sector diagnostic | Diagnostic | GBP 177.29bn net VAT liability by SIC sector | 44.5% | 42.2% |

Interpretation: this is not a silent replacement for the paper's published synthetic population. The target surface is identical, and the directly comparable rows are ONS population, employment bands, and VAT liability by turnover band. HMRC turnover-band accuracy, sector distribution, and overall accuracy are computed under project-specific definitions, so the overall scores are not a like-for-like quality ranking. VAT liability by sector remains an informational diagnostic, not a calibrated target.

Recompute command:

```bash
firm-microsim-populace-ledger --facts-jsonl /tmp/uk_firm_consumer_facts.jsonl --reference-population --output results/populace_ledger_comparison.txt --json-output results/populace_ledger_provenance.json
```
103 changes: 103 additions & 0 deletions results/populace_ledger_provenance.json
@@ -0,0 +1,103 @@
{
"ledger_paper_parity": {
"facts_count": 1439,
"facts_sha256": "58b6c2752adec5baa6a6260fe8cd9e9b85d0a78b5ba76ea28201f4c7986dce50",
"max_abs_numeric_diff": 0.0,
"mismatched_tables": [],
"tables": [
{
"max_abs_numeric_diff": 0.0,
"name": "ons_turnover_by_sic_band",
"paper_rows_after_filter": 88,
"rows": 88,
"same_keys": true,
"values_equal": true
},
{
"max_abs_numeric_diff": 0.0,
"name": "ons_employment_by_sic_band",
"paper_rows_after_filter": 88,
"rows": 88,
"same_keys": true,
"values_equal": true
},
{
"max_abs_numeric_diff": 0.0,
"name": "hmrc_population_by_turnover_band",
"paper_rows_after_filter": 1,
"rows": 1,
"same_keys": true,
"values_equal": true
},
{
"max_abs_numeric_diff": 0.0,
"name": "hmrc_population_by_sic",
"paper_rows_after_filter": 88,
"rows": 88,
"same_keys": true,
"values_equal": true
},
{
"max_abs_numeric_diff": 0.0,
"name": "hmrc_liability_by_turnover_band",
"paper_rows_after_filter": 1,
"rows": 1,
"same_keys": true,
"values_equal": true
},
{
"max_abs_numeric_diff": 0.0,
"name": "hmrc_liability_by_sic",
"paper_rows_after_filter": 88,
"rows": 88,
"same_keys": true,
"values_equal": true
}
],
"tables_checked": 6
},
"migration_snapshot": {
"arch_data_commit": "cd98b5cb7b1604fbf7750689a429bbc356e5603a",
"arch_data_merge_commit": "ac643afa0c1d45fc4abd0268dc5aa7c843440b38",
"arch_data_pr": "https://github.com/PolicyEngine/ledger/pull/67",
"arch_data_state_at_check": "MERGED on 2026-06-30",
"comparison_command": "firm-microsim-populace-ledger --facts-jsonl /tmp/uk_firm_consumer_facts.jsonl --reference-population --output results/populace_ledger_comparison.txt --json-output results/populace_ledger_provenance.json",
"populace_commit": "fa20daf75ff023e5e88731a140f456f58e0b864e",
"populace_merge_commit": "8271d767244161631253ad1d9ad792a82e2b96b4",
"populace_pr": "https://github.com/PolicyEngine/populace/pull/223",
"populace_state_at_check": "MERGED on 2026-06-30",
"source_packages": [
"ons-uk-business-firm-targets-2025",
"ons-uk-business-firm-sector-targets-2025",
"hmrc-vat-firm-targets-2024-25",
"hmrc-vat-firm-sector-targets-2024-25"
]
},
"paper_2024_25": {
"employment": 89.7,
"hmrc_bands": 92.7,
"ons_population": 94.2,
"overall": 90.5,
"rows": 2945974,
"sector": 94.5,
"vat_liability_band": 81.4,
"vat_liability_sector": 44.5,
"weighted_population": 2577078.0
},
"populace_ledger_2024_25": {
"employment": 92.27463056446766,
"hmrc_bands": 99.92894355826047,
"ons_population": 92.27819089707326,
"overall": 93.79009306789898,
"rows": 2946015,
"sector": 85.0102525769766,
"vat_liability_band": 99.45844774271694,
"vat_liability_sector": 42.17514113123149,
"weighted_population": 2945776.75
},
"run_parameters": {
"populace_iterations": 1000,
"seed": 42
},
"vintage": "2024-25"
}
44 changes: 42 additions & 2 deletions src/firm_microsim/__init__.py
Expand Up @@ -9,9 +9,12 @@
>>> df = firm_microsim.generate(threshold=85) # doctest: +SKIP
"""

from __future__ import annotations

import sys
import types

from .config import DEFAULT_CONFIG, VAT_THRESHOLD, Config
from .generate import generate
from .validate import ValidationReport

__version__ = "1.0.0"

Expand All @@ -23,3 +26,40 @@
"ValidationReport",
"__version__",
]


def generate(*args, **kwargs):
    """Generate a synthetic firm population without importing torch at package import."""

    from .generate import generate as _generate

    return _generate(*args, **kwargs)


def __getattr__(name: str):
    """Lazily expose heavyweight public helpers."""

    if name == "ValidationReport":
        from .validate import ValidationReport

        return ValidationReport
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")


class _FirmMicrosimModule(types.ModuleType):
    """Keep the historic package-level generate function stable.

    Importing the ``firm_microsim.generate`` submodule makes Python assign that
    module to ``firm_microsim.generate``. Before lazy imports this package
    eagerly re-exported the callable and kept that public attribute stable; this
    hook preserves that behavior without importing torch during package import.
    """

    def __setattr__(self, name: str, value):
        if name == "generate" and isinstance(value, types.ModuleType):
            super().__setattr__("_generate_submodule", value)
            return
        super().__setattr__(name, value)


sys.modules[__name__].__class__ = _FirmMicrosimModule
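
The interaction between the lazy wrapper and the `__setattr__` hook is easiest to see in isolation. The sketch below builds a throwaway package in a temporary directory and applies the same pattern; `lazy_demo_pkg`, `_StableModule`, and the stand-in `generate` submodule are hypothetical names for the demo and do not exist in `firm_microsim`.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

PKG = "lazy_demo_pkg"  # hypothetical package name, used only for this demo

INIT_SRC = textwrap.dedent(
    '''
    import sys
    import types


    def generate(*args, **kwargs):
        # Deferred import: the "heavy" submodule loads on first call only.
        from .generate import generate as _generate

        return _generate(*args, **kwargs)


    class _StableModule(types.ModuleType):
        def __setattr__(self, name, value):
            # The import system binds a loaded submodule onto its parent
            # package; divert that binding so the callable survives.
            if name == "generate" and isinstance(value, types.ModuleType):
                super().__setattr__("_generate_submodule", value)
                return
            super().__setattr__(name, value)


    sys.modules[__name__].__class__ = _StableModule
    '''
)

# Stand-in for the real, expensive-to-import generator module.
SUBMODULE_SRC = textwrap.dedent(
    '''
    def generate(threshold=90):
        return {"threshold": threshold}
    '''
)

# Write the demo package to disk and make it importable.
root = Path(tempfile.mkdtemp())
(root / PKG).mkdir()
(root / PKG / "__init__.py").write_text(INIT_SRC)
(root / PKG / "generate.py").write_text(SUBMODULE_SRC)
sys.path.insert(0, str(root))

pkg = importlib.import_module(PKG)
loaded_at_import = f"{PKG}.generate" in sys.modules

result = pkg.generate(threshold=85)
loaded_after_call = f"{PKG}.generate" in sys.modules
still_callable = callable(pkg.generate)

print(loaded_at_import, loaded_after_call, still_callable, result)
```

Without the `__setattr__` override, the first call would rebind the package attribute `generate` to the submodule, and a second `pkg.generate(...)` call would fail because a module object is not callable.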