Summary
The dataset calibration process is inflating the UK population significantly above ONS targets. The base FRS 2025 dataset has ~69M people, but after calibration this jumps to ~74M, about 6% above the ONS mid-2024 actual estimate of 69.3M.
Evidence
| Dataset | 2025 Population |
| --- | --- |
| Base dataset (frs_2025_with_ss.h5) | 68.97M |
| Calibrated dataset (frs_2025_calibrated_v3.h5) | 73.57M |
| ONS mid-2024 actual estimate | 69.3M |
| ONS 2022-based projection for 2025 | ~70M |
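For reference, the size of the overshoot can be worked out directly from the figures in the table. This is only arithmetic on the reported numbers; the helper name `pct_diff` is made up for illustration and is not part of the codebase.

```python
# Figures taken from the table above, in millions of people.
base = 68.97          # frs_2025_with_ss.h5
calibrated = 73.57    # frs_2025_calibrated_v3.h5
ons_mid_2024 = 69.3   # ONS mid-2024 estimate


def pct_diff(value, reference):
    """Percentage by which `value` exceeds `reference`."""
    return 100 * (value - reference) / reference


print(f"Calibrated vs base: {pct_diff(calibrated, base):+.1f}%")                   # +6.7%
print(f"Calibrated vs ONS mid-2024: {pct_diff(calibrated, ons_mid_2024):+.1f}%")  # +6.2%
```

So calibration adds about 4.6M people to the base dataset, which itself is already close to the ONS figure.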
Root Cause Investigation
- The `uk_population` target IS included in the calibration loss function (`loss.py` line 318)
- The population index in policyengine-uk (`ons.population`) uses reasonable growth rates
- BUT the calibration is not constraining population properly - other targets are pulling weights in a direction that inflates total population
Potential Solutions
- Increase the weight on population targets in the calibration loss function
- Add a hard constraint that total population must match the target
- Review conflicting targets that may be inflating population (e.g., regional age bands sum to more than national total)
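A rough sketch of what the three options above could look like, in plain Python. None of this reflects the actual structure of `loss.py`: the function names, the squared-relative-error loss form, and the toy numbers are all assumptions made for illustration.

```python
def weighted_relative_loss(estimates, targets, target_weights):
    """Option 1 (assumed loss form): squared relative error per target,
    scaled by a per-target weight so population can be emphasised."""
    return sum(
        target_weights.get(name, 1.0) * ((estimates[name] - target) / target) ** 2
        for name, target in targets.items()
    )


def rescale_to_population(weights, people_per_household, target_population):
    """Option 2: enforce the total as a hard constraint by uniformly
    rescaling household weights after calibration."""
    current = sum(w * n for w, n in zip(weights, people_per_household))
    factor = target_population / current
    return [w * factor for w in weights]


def regional_targets_exceed_national(regional_targets, national_target, tolerance=0.01):
    """Option 3: flag regional targets whose sum exceeds the national
    target by more than `tolerance` (relative)."""
    total = sum(regional_targets.values())
    return (total - national_target) / national_target > tolerance


# Toy example with hypothetical numbers (not real FRS data).
weights = [1000.0, 2000.0, 1500.0]
people = [1, 3, 2]
rescaled = rescale_to_population(weights, people, target_population=9000.0)
print(sum(w * n for w, n in zip(rescaled, people)))  # weighted population after rescaling

print(regional_targets_exceed_national({"region_a": 40.0, "region_b": 33.0}, 70.0))
```

Note that uniform rescaling also shifts every other weighted aggregate by the same factor, so it would trade a population fix for drift in the other targets. Checking for inconsistent targets first is probably the cheaper diagnostic.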
Impact
This is causing CI test failures in PR #216:
- Test expected: 69.5M (ONS 2022-based)
- Test got: 73.7M
Data Sources