Correct net-VAT liability scaling; rewrite paper on the corrected model#21
Conversation
v_i = 0.20 x (turnover - inputs) in generator, calibration, and validator; rho recentred to mean ~0.6 and clamped to [0.1, 0.95] so value added is strictly positive; secondary-notch mass block in dominated_region_mass. Cherry-picked from origin/fix-vat-liability-scaling (issue #15) without the tex changes, which are superseded by the rewrite in this branch. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Replace the damped fixed-point forward solve (no fixed point at reform notches; iterates oscillate) with a closed-form region-confined solve: each firm re-optimises within the schedule region containing its observed turnover, using the formulation-A response ratio y* = y_obs [(1-tau f1)/(1-tau f0)]^e (deductible share cancels). e->0 nesting is now exact; a pure threshold raise provably has zero intensive-margin offset (behavioural == static at every e) — asserted in the crosscheck; baseline reproduction asserted in reform_revenue. - Remove the non-identified elasticity outputs (sigma/Pi/eps) and the tau/2 wedge normalisation inherited from a deleted sigmoid model from the bunching estimator; keep the geometry (b, b_llat, E, Delta_R, y_R). - Fix bootstrap replicate weights to sum to the population mass W rather than the row count n (E replicates were scaled by n/W). - Redesign the recovery test: KW-consistent injection with missing mass extending beyond the exclusion window, scored by the headline estimator (the previous injection was invisible to the counterfactual fit by construction). - Relabel e sweep values as assumptions (0.05 external KW anchor); remove the false 'calibrated to reduced-form estimates' provenance. - report.py: fall back to the generic output CSV name; taper guard in the dynamic CLI. Addresses issues #15 and the methodology findings in the review. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Both vintages rebuilt with v = 0.20(turnover - inputs); all artifacts regenerated: calibration report (89.4%/89.4%), static sweep + anchor (new static_sweep.txt artifact with voluntary-retention sensitivity), reform menu (raise -698m, taper -466m, 10% band -460m, 15% band -230m on the common 184.7bn base), behavioural table (raise == static at every e; band offsets second-order), bunching (E=0 at 85k; spurious E=196k at the 90k band edge on the 2024-25 vintage), placebo (E=0 under all treatments), redesigned recovery test, dominated-region masses, iso-optimum verification. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Abstract/intro/data/static/bunching/model/behavioural/conclusion and both appendices rewritten against the corrected pipeline and the review findings: - Section 5 reframed: the corrected data contain no bunching at 85k (E=0, b=-0.139) and a spurious spec-robust signal at the 90k band edge (E~196k); the earlier draft's 'reproduced step' is documented as a liability-scaling artifact and its correction reported openly. - Anchor comparison rewritten: full-deregistration (-248m, 2025-26) and 43% voluntary-retention (-141m) conventions bracket HMRC's -185m in every forecast year; 'fiscal drag lifts the frozen baseline' replaced with the uprated counterfactual threshold path the code implements. - Behavioural section rewritten around the region-confined solver: raise-to-100k behavioural == static at every e (derived and asserted); reduced-rate offsets under 4%; taper exclusion restated structurally; the orphaned 'pathological FOC solve' claim removed; e sweep relabelled as assumptions (KW 0.05 anchor; LLAT UK 0.09-0.14 noted). - Dominated region: formulation-A invariance replaces the erroneous per-firm tau0/(1-tau0) robustness paragraph (which implied widths above the statutory cap); masses updated to the corrected population. - Static: new sweep (2024-25 vintage, explicitly introduced), menu on the 184.7bn base, direct-vs-smooth method reconciliation, +5.8% base overshoot disclosed against HMRC liability-target sums. - Neutrality/style: metric-scoped verdicts only; hedges consolidated; self-praise removed; findings renumbered to four consistently. - Citations: LLAT setting corrected (2004-14 sample, 58k-81k thresholds); Chetty et al. credited for the polynomial counterfactual; URLs/DOIs added throughout; benedek2015 to techreport; belloncopestake year fixed; HMT 2018 call for evidence and Council Directive (EU) 2020/285 added; ONS panel relabelled to the VAT/PAYE-registered frame with a BPE note; 150k step FRS attribution removed (band edge; step absent in corrected data). - Title: 'An Open Firm-Level Microsimulation of the UK VAT Registration Threshold'. README tables updated; upstream Populace inheritance of the mis-scaling flagged. Closes #15. Supersedes #17. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Adds the OBR Mar-2023 EFO Chart C data (already in-repo) as calibration targets for the 2023-24 vintage: five bins at/above the threshold as direct counts (same universe as the coarse HMRC band they refine), twenty below as shape targets normalised over the £65k-£85k window (the chart universe is narrower than the ONS frame there). The 2023-24 profile is interpolated between the 2019-20 outturn and the 2025-26 frozen-threshold projection. The near-threshold density now rises into the threshold and steps down across it - the administratively observed bunching profile, held as an explicit cited target. Placebo B regenerates without the fine targets and returns E=0. Responds to Nikhil's review comment that the previous optimizer-equilibrium shape looked economically backwards. Numbers on the OBR-target build: menu -783/-550/-497/-248 on £184.8bn; anchor conventions overshoot HMRC early (-366/-209 vs -185) and match by 2026-27 (-125.2 vs -125), sign flip reproduced; dominated-band mass ~160k; bunching at 85k reads the target-inherited E=14,663 (placebo->0); 90k spurious signal unchanged; recovery 55/75/82%; calibration holds at 89.4%. All propagated through paper, artifacts, and figures; anchor figure y-limits now scale with data. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…s; add --fast stratified sampling Root-cause fix for the seam Nikhil's review surfaced: the HMRC GBP1-to-Threshold liability total is remitted by voluntary registrants (~GBP2,150 average net, input-reclaim traders) the liability model does not represent, but was calibrated against the whole below-threshold population at the model's ~8% net rate (1.76x over-subscribed), draining weights in the GBP50k-85k region. Now an informational diagnostic, like liability by sector. One exclusion fixes three artifacts: the weight cliff at the OBR window edge (shoulder now continuous at ~1.05-1.15 average weight), the 2024-25 vintage's weak liability calibration (79% -> 92%; overall 89.4% -> 92.8%), and the spec-robust E~196k monster at the 90k band edge (now E=11,072 with b_LLAT=1.23 - within 10% of the administrative 1.361, the paper's cautionary exhibit in its sharpest form). Bootstrap SEs removed throughout: the synthetic file is a deterministic construction, so row-resampling has no estimand and its dispersion scales with the analyst-chosen row count. Replaced by specification grids plus generator-seed sensitivity (results/seed_sensitivity.txt: E +/-2%, reform costs +/-GBP1m across three full-size seeds). --fast mode: stratified thinning (30% in the GBP15k-150k window, 5% tails, per-stratum floors and exact ratio base weights so all targets stay true totals; optimizer initialised at and penalised toward base). 15 seconds per vintage vs ~13 minutes; aggregates within 0.3%, local bunching stats within ~5%. Definitive numbers: menu -784/-550/-497/-249 on GBP184.8bn; anchor -358/-366/-219/-75/+101 full-dereg, -204/-208/-125/-43/+57 retention (2026-27 matches -125.1 vs -125); sweep base GBP200.9bn; E=7,873 (b_LLAT 2.55) target-inherited at 85k, placebos -> 0; recovery 15/50/62%. All propagated through paper, README, artifacts, figures. Closes #23. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…sons Responds to review: the turnover-distribution levels reflect the ONS VAT/PAYE-registered frame; unregistered sole traders - most of the below-threshold mass and the cross-threshold cliff in all-business administrative charts (e.g. Tax Policy Associates 2018-19) - are outside it, so level comparisons to such charts are not like-for-like. Above the threshold the universes coincide and levels match (ours 10.7-11.5k/1k vs ~10.4k all-business). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
cc @MaxGhenis — flagging some data-generation design questions on this branch before we lock it in. Since this PR rewrites the generation pipeline substantially, I think we should capture sector heterogeneity of VAT liability properly at the same time. Right now it isn't, and the new setup makes it structurally harder: 1. Sector heterogeneity is a three-level caricatureThe input/output ratio is a single global Beta(4,2) mapped to [0.2, 0.8], plus two hand-coded SIC sets (17 "negative-liability" sectors shifted up by U[0, 0.3], 5 "high-liability" sectors shifted down by U[0, 0.2]) — calibrated to nothing ( 2. The new clamp makes negative sectors unreachable by constructionWe have the right data in-repo ( This matters for the paper directly: sector composition near £85–90k sets the net-rate distribution τ₀ there, which drives the anchor cost, every behavioural wedge, and the firm-level dominated-region analysis. 3. The HMRC overshoot is mostly convention, not shapeAnchor now −358/−365 vs HMRC −150/−185 — but this branch's own artifact shows that with 43% voluntary retention (Liu et al.) of released-firm liability the series becomes −204 / −208 / −125 / −43 / +57 vs HMRC's −150 / −185 / −125 / −50 / +65: within ~£25m every year. So the headline gap is dominated by the assume-everyone-deregisters convention, not by the Beta parameters. The residual is where sector/VA-share composition near the threshold matters. 4. The near-threshold density shape is an optimizer artifactOn the corrected model the 2023-24 file has a density step up at £85k (below/above ≈ 0.74 — the opposite of admin data), and the 2024-25 vintage piles ~54k firms/£1k just below £90k vs ~6.9k above (7.8×). Root cause: turnover is drawn ~uniform within coarse ONS bands and the Adam reweighting has no target constraining within-band density shape, so it manufactures spikes to reconcile count and liability totals. No Beta(a,b) choice fixes this — it lives in the weights. Proposed fixes (in order of value)
One dependency to sequence deliberately: with per-sector rates and negative liabilities restored, the behavioural layer's creditor handling becomes first-order again (creditors rationally never deregister), so the solver convention needs to be fixed at the same time. |
Both sides of the threshold now enter as shape-only targets scaled to the synthetic frame's own mass per side (below over £65-85k, above over £85-90k). The previous build imported OBR chart levels as direct counts above the threshold while frame-scaling the shape below, which inverted the cross-threshold ordering (more mass just above than just below the notch - economically backwards). The cross-threshold step now comes from the frame's own band structure; the OBR data supply only the within-side geometry. Regenerated the 2023-24 vintage and every downstream artifact, and propagated through the paper: - Menu (common £85k base): raise-to-100k -753m, taper -520m, 10% band -484m, 15% band -242m - Anchor (85k->90k): retention convention now matches HMRC's published 2025-26 costing to £1m (-184.1m vs -185m); full-dereg -323m - Near-threshold density 13,839 -> 10,319 per £1k across the notch (correct ordering, -25% step) - Bunching: E=7,933, b_LLAT=1.91 (target-inherited as before); placebos still 0; recovery 15/49/62% - Dominated-region mass 156,040; reduced-rate totals 152,675 (15%) / 154,840 (10%) - Seed sensitivity re-run under the new scaling via new reproducible scripts/seed_sensitivity.py (E +/-161, costs +/-£2m) - data.tex/conclusion.tex now describe the side-consistent convention; turnover-distribution caption flags the £90k fine-window edge as a target boundary Responds to review discussion of the cross-threshold ordering. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Propagates the side-consistent frame-scaling build (paper PR #21) on top of Vahid's presentation trims: menu -753/-520/-484/-242; anchor retention matches HMRC 2025-26 to £1m (-184.1m vs -185m), full-dereg -323m; E=7,933; dominated region ~156k firms; behavioural table -753 flat, offsets <6%; appendix build slide states the shape-only side-consistent convention. Data figures re-synced. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Summary
Completes the net-VAT liability fix from #15 (superseding draft PR #17), re-runs the entire pipeline on the corrected model, and rewrites the paper against the corrected numbers plus a seven-referee review (methodology, red-team, domain, citations, neutrality, style, reproducibility).
The defect and its consequences
vat_liability_kwasturnover − input(no ×0.20), inflating per-firm liabilities ~5×, with ρ ∈ [0.6, 1.5] leaving ~47% of firms with negative value added. The weight optimiser compensated on band totals (aggregate scores barely moved) while distorting the near-threshold density — manufacturing the paper's "reproduction" of the administrative bunching step.Model fixes
v = 0.20 × (turnover − input); ρ ∈ [0.1, 0.95], mean VA share 40%, zero negative-VA firms (from Fix vat liability scaling #17, applied to current main)y* = y_obs[(1−τf₁)/(1−τf₀)]^e(deductible share cancels). e→0 nesting now exact; baseline reproduction asserted; raise-to-£100k behavioural ≡ static at every e — a derived invariance, asserted in the crosscheckdominated_region_mass; newresults/static_sweep.txtartifact (the sweep table was previously figure-pixels only); voluntary-retention sensitivity on the anchorHeadline results on the corrected model
Paper rewrite
Every section updated. Section 5 reframed around the correction: band-calibrated synthetic data cannot support bunching inference in either direction — the corrected data show nothing at £85k while the same generator shows a huge spurious signal at the £90k band edge, robust in all 16 sensitivity cells. The earlier draft's artifact is documented openly as the sharpest demonstration of the thesis. Value-added "robustness" paragraph (which implied dominated widths above the £21,250 statutory cap) replaced by the formulation-A invariance result. Anchor mechanism corrected (uprated counterfactual threshold path, not "fiscal drag"). Citations: LLAT sample setting corrected (2004–14, £58k–£81k thresholds — not "the same £85,000 notch"); Chetty et al. credited for the polynomial counterfactual; HMT 2018 call for evidence + Council Directive (EU) 2020/285 added; URLs/DOIs throughout. Full referee reports available on request.
Upstream
Closes #15. Supersedes #17.
🤖 Generated with Claude Code