PolicyEngine · vahid-ahmadi · Jun 28, 2026 · Jun 28, 2026 · Jun 28, 2026 · Jun 28, 2026
diff --git a/README.md b/README.md
@@ -46,7 +46,7 @@ threshold:
    heavily. VAT registration is then assigned: mandatory above the threshold,
    voluntary below at the HMRC-calibrated rate.
 
-The result is ~2.94M firm rows weighted to ~2.0M UK firms. Because the population
+The result is ~2.94M firm rows weighted to ~2.5M UK firms. Because the population
 is calibrated **to** the HMRC aggregates, agreement with them is an internal
 consistency check, not external validation.
 

diff --git a/paper/Appendix/a_data.tex b/paper/Appendix/a_data.tex
@@ -33,7 +33,7 @@ \subsection{Data construction detail}
 counterfactual density of Section~\ref{sec:bunching} is fitted by polynomial
 regression on the observed density outside a manipulation window around the
 threshold. The baseline excludes $[T^{*}-15\text{k}, T^{*}+15\text{k}]$ and fits
-a cubic. Widening the window guards against contamination of the counterfactual
+a degree-7 polynomial. Widening the window guards against contamination of the counterfactual
 by the behavioural response but reduces the data available for the fit; raising
 the degree improves in-sample fit but risks over-fitting the tails. The
 sensitivity of the excess-mass and bunching-ratio estimates to window width and

diff --git a/paper/Appendix/a_inference.tex b/paper/Appendix/a_inference.tex
@@ -63,7 +63,14 @@ \subsubsection*{Bootstrap standard errors}
 distribution, and the $95\%$ confidence interval is the $2.5$th--$97.5$th
 percentile interval. Resampling firms (rather than residuals) propagates both
 sampling variation and the variation induced by re-binning. Table~\ref{tab:boot}
-reports the results.
+reports the results. Because these replications resample the $\approx 2.94$
+million \emph{constructed} synthetic firm rows, whose number is itself a
+modelling choice, the resulting standard errors and $95\%$ intervals quantify
+only the estimator's dispersion under resampling of a synthetic file of
+analyst-chosen size; they scale with that arbitrary row count and should
+\emph{not} be read as population sampling error about the universe of UK firms.
+They are reported for completeness and reproducibility, not as confidence
+statements about UK firms.
 
 \begin{table}[t]
 \centering
@@ -95,7 +102,16 @@ \subsubsection*{Non-identification: the elasticities are not behavioural}
 density step in the calibration targets rather than from any behavioural
 response (the placebo test below makes this precise). The elasticities are
 therefore reported only as a record of what the estimator returns when applied
-to these data; the paper draws no behavioural conclusion from them. This is an
+to these data; the paper draws no behavioural conclusion from them. A reader
+should note that the iso-elastic turnover elasticities $e\in\{0.05,0.17,0.32\}$
+swept in the body (Section~\ref{sec:behavioural}) are a \emph{different} object
+from the notch-width elasticity ($\approx 0.45$) and marginal-buncher elasticity
+($\approx 0.66$) reported here: the former are the median and mean of a per-firm
+marginal-buncher mapping under the iso-elastic firm specification, while the
+latter are aggregate estimator outputs computed under different normalisations.
+All of these numbers are mechanical artefacts of the calibration and are
+non-identified on these synthetic data---they admit no precise numerical mapping
+between the two sets---so none is used anywhere as a behavioural estimate. This is an
 instance of the general non-identification of bunching elasticities absent
 exogenous variation or strong functional-form assumptions
 \citep{blomquistetal2021, bertanhaetal2023}.
@@ -125,12 +141,22 @@ \subsubsection*{Sensitivity to polynomial degree and exclusion window}
 \hline
 Degree & Window & $b$ & $E$ & $\sigma$ & Notch-width (median) \\
 \hline
+5 & \pounds10k & $0.150$ & $13{,}339$ & $5.62$ & $0.47$ \\
+5 & \pounds15k & $0.139$ & $18{,}823$ & $5.16$ & $0.71$ \\
+5 & \pounds20k & $0.141$ & $25{,}596$ & $4.95$ & $0.87$ \\
+5 & \pounds25k & $0.152$ & $34{,}172$ & $5.11$ & $1.04$ \\
 6 & \pounds10k & $0.077$ & $7{,}262$ & $5.42$ & $0.41$ \\
 6 & \pounds15k & $0.045$ & $6{,}575$ & $4.99$ & $0.44$ \\
 6 & \pounds20k & $0.015$ & $3{,}822$ & $4.21$ & $0.35$ \\
+6 & \pounds25k & $-0.028$ & $471$ & --- & $0.15$ \\
 7 & \pounds10k & $0.091$ & $8{,}539$ & $5.07$ & $0.47$ \\
 \textbf{7} & \textbf{\pounds15k} & $\mathbf{0.060}$ & $\mathbf{8{,}712}$ & $\mathbf{4.85}$ & $\mathbf{0.45}$ \\
 7 & \pounds20k & $0.034$ & $7{,}313$ & $4.62$ & $0.43$ \\
+7 & \pounds25k & $-0.004$ & $3{,}089$ & --- & $0.25$ \\
+8 & \pounds10k & $0.118$ & $10{,}806$ & $4.92$ & $0.42$ \\
+8 & \pounds15k & $0.118$ & $16{,}329$ & $4.53$ & $0.59$ \\
+8 & \pounds20k & $0.166$ & $29{,}364$ & $4.96$ & $0.77$ \\
+8 & \pounds25k & $0.301$ & $59{,}931$ & $7.57$ & $0.95$ \\
 \hline
 \end{tabular}
 \end{table}

diff --git a/paper/Sections/behavioural.tex b/paper/Sections/behavioural.tex
@@ -7,7 +7,7 @@ \section{Behavioural reform costs}
 Section~\ref{sec:bunching} then shows that the bunching in my synthetic data is
 mechanical, so a behavioural elasticity is not identified from it---the reduced-form
 response is a placebo. Section~\ref{sec:model} supplies the one object that needs
-no behavioural calibration at all: the exact, model-free dominated region. What is
+no behavioural calibration at all: the exact, elasticity-free dominated region. What is
 missing from this sequence is the behavioural layer itself---a costing that lets
 firms re-optimise turnover when a reform changes the effective rate they face. This
 section supplies that layer. I build an iso-elastic structural simulator that, for
@@ -89,20 +89,27 @@ \subsection{An iso-elastic structural simulator}
 \subsection{Results}
 \label{ssec:behavioural-results}
 
-I sweep $e\in\{0.05,\,0.17,\,0.32\}$ and take $e=0.17$ as the headline. The low
+I sweep $e\in\{0.05,\,0.17,\,0.32\}$ and report the full sweep, treating
+$e=0.17$ as an illustrative midpoint rather than a headline estimate. The low
 value $0.05$ is of the order reported by \citet{klevenwaseem2013}; $0.32$ is the
 mean implied by mapping the marginal-buncher condition through my notch; $0.17$ is
-the median of that mapping. I flag the provenance plainly: the $0.17$ and $0.32$
+the median of that mapping. (These per-firm-mapping summaries are a different object
+from the single aggregate marginal-buncher elasticity $\approx0.66$ tabulated in
+Appendix~\ref{app:inference}, and differ from it numerically; both are mechanical,
+non-identified artefacts of the calibration.) I flag the provenance plainly: the $0.17$ and $0.32$
 values are read off the marginal-buncher condition applied to my notch geometry,
 which inherits the mechanical bunching that Section~\ref{sec:bunching} shows to be
-non-behavioural, so they are model-implied, not data-identified; I use them only to
-delimit a plausible sweep, anchored at the lower end by the external
-\citet{klevenwaseem2013} value, and a reader preferring a fully external range may
-substitute literature values for $e$ directly without changing the conditional
-structure of the costing. The same mapping pins the marginal buncher
+non-behavioural, so they are model-implied, not data-identified; the range's only
+externally-anchored point is its lower end (the \citet{klevenwaseem2013} value of
+$\approx 0.05$). I use them only to delimit a plausible sweep, and a reader
+preferring a fully external range---for instance \citet{liuetal2021}'s UK VAT-base
+elasticity---may substitute literature values for $e$ directly without changing the
+conditional structure of the costing. The same mapping pins the marginal buncher
 $n_H(e)$---the highest ability that still finds it optimal to bunch at the
-threshold---at \pounds112{,}795 for $e=0.05$ (a frictionless excess of
-$\Delta y^{*}\approx\pounds27.8$k), \pounds127{,}382 for $e=0.17$
+threshold---at \pounds112{,}795 for $e=0.05$ (a frictionless excess
+$\Delta y_{H}\equiv n_H-T^{*}\approx\pounds27.8$k, distinct from the empirical
+bunching span $\Delta y^{*}=\pounds5.6$k of Appendix~\ref{app:inference}),
+\pounds127{,}382 for $e=0.17$
 ($\approx\pounds42.4$k), and \pounds143{,}527 for $e=0.32$ ($\approx\pounds58.5$k),
 consistent with the analytic dominated-region width $a=\pounds21{,}250$ of
 Section~\ref{sec:model}.
@@ -118,7 +125,7 @@ \subsection{Results}
 static loss. The offset is substantial: at $e=0.17$ the raised threshold costs
 \(-\)\pounds292m against a static \(-\)\pounds508m, and the $10\%$ and $15\%$ bands
 cost \(-\)\pounds273m and \(-\)\pounds135m against static \(-\)\pounds343m and
-\(-\)\pounds171m. At the headline $e=0.17$, the number of firms re-optimising is
+\(-\)\pounds171m. At the illustrative midpoint $e=0.17$, the number of firms re-optimising is
 $50{,}155$ under the raised threshold, $61{,}187$ under the $10\%$ reduced rate, and
 $51{,}226$ under the $15\%$ reduced rate. Figure~\ref{fig:reform_dist} traces these
 costs across the full range of $e$, with each curve nesting onto its static value as
@@ -152,7 +159,7 @@ \subsection{Results}
 ($e\to0$ limit) figure for comparison. Each behavioural figure re-solves every
 firm's iso-elastic optimum~\eqref{eq:isoelastic} under the reform schedule, holding
 recovered abilities and $e$ fixed. The final column reports the number of firms that
-re-optimise at the headline $e=0.17$. Every behavioural figure is conditional on the
+re-optimise at the illustrative midpoint $e=0.17$. Every behavioural figure is conditional on the
 assumed, non-identified $e$. The graduated taper is excluded from the behavioural
 costing; see the note below the table and the discussion that follows.}
 \label{tab:behavioural_costs}

diff --git a/paper/Sections/bunching.tex b/paper/Sections/bunching.tex
@@ -20,7 +20,7 @@ \section{Bunching at the VAT threshold}
 \subsection{Methodology}
 
 Following \citet{liuetal2021}, I group firms into fine turnover bins of
-\pounds100 and estimate a counterfactual density---the distribution that would
+\pounds1{,}000 and estimate a counterfactual density---the distribution that would
 obtain absent the registration notch---by polynomial regression on bin counts,
 excluding a manipulation window around the threshold $y^{*}$:
 \[
@@ -53,7 +53,7 @@ \subsection{Results}
 the mass-conservation restriction that excess mass below the threshold equal
 missing mass above it---reported in Appendix~\ref{app:inference} and reproducible
 from the released code on the regenerated synthetic data. The observed
-density displays a clear step at the threshold: more mass in the \pounds100 bins
+density displays a clear step at the threshold: more mass in the \pounds1{,}000 bins
 immediately below the cutoff than just above it.
 
 I want to be precise about what this number is. It is the magnitude of a density

diff --git a/paper/Sections/conclusion.tex b/paper/Sections/conclusion.tex
@@ -43,7 +43,13 @@ \section{Conclusion}
 accounting identity rather than a new structural result, but one spanning a
 populous range of roughly $137{,}000$ firms. Setting cost against distortion is
 instructive: raising the threshold is the most expensive option yet merely
-relocates the zone, whereas the taper removes it entirely for less. Third, the open
+relocates the zone, and a banded reduced rate does not shrink the distortion
+either---it lowers the largest single notch (from \pounds17{,}000 to \pounds12{,}750
+at a $15\%$ band) but, because the band reverts to the standard rate at its
+\pounds105{,}000 top, adds a secondary notch, leaving the total dominated-turnover
+width essentially unchanged (\pounds21{,}562 at $15\%$, \pounds22{,}569 at $10\%$,
+both marginally above the \pounds21{,}250 baseline). Only the graduated taper,
+being continuous, removes the distortion entirely, and for less. Third, the open
 data reproduce the published density step at the threshold, but a placebo test
 shows the apparent excess mass is inherited from the coarse HMRC band targets
 rather than firm behaviour---it collapses from $8{,}712$ to roughly $99$ weighted
@@ -57,10 +63,13 @@ \section{Conclusion}
 identified on these synthetic data---the placebo shows the bunching is
 mechanical---I report the behavioural cost as an $e$-sensitivity range that nests
 the static cost as $e\to0$. Conditional on an assumed $e$, modelled responses partly
-offset the static costs: at $e=0.17$ the raise to \pounds100{,}000 falls from
-\pounds508m static to about \pounds292m, the $10\%$ band from \pounds343m to
-\pounds273m, and the $15\%$ band from \pounds171m to \pounds135m, as firms broaden
-the base when the effective rate falls. I report no behavioural cost for the
+offset the static costs: at an illustrative midpoint $e=0.17$ the raise to
+\pounds100{,}000 falls from \pounds508m static to about \pounds292m, the $10\%$
+band from \pounds343m to \pounds273m, and the $15\%$ band from \pounds171m to
+\pounds135m, as firms broaden the base when the effective rate falls. The
+$e$-range $\{0.05,0.17,0.32\}$ is a sensitivity range whose only externally
+anchored point is the lower end $e=0.05$; the $0.17$ and $0.32$ values are implied
+by the notch geometry, not identified from the data. I report no behavioural cost for the
 graduated taper: its behavioural response is not credibly identified, because its
 base-broadening and marginal-rate channels pull in opposite directions and the
 iso-elastic cost cannot net them, so only its static cost and dominated-region
@@ -69,9 +78,18 @@ \section{Conclusion}
 
 \paragraph{Limitations and future work.} The model is a turnover-tax-notch
 approximation: it treats VAT as a cost on turnover, whereas real VAT is levied on
-value added with input reclaim, and roughly half of below-threshold firms register
+value added with input reclaim, and roughly 43\% of below-threshold firms register
 voluntarily \citep{liuetal2021}, so the effective notch is smaller than a flat
-treatment implies. The microdata are synthetic and calibrated to coarse aggregate
+treatment implies. The dominated-region width indexes the misallocation distortion
+only and omits the compliance and administrative-cost saving that a \emph{higher}
+threshold delivers by removing small firms from VAT---the classic rationale for a
+high registration threshold \citep{keenmintz2004}---so the efficiency comparison is
+partial; a full welfare/MVPF accounting would net these compliance costs against
+the misallocation distortion. The menu reforms are also costed only in aggregate:
+because VAT liability by sector is excluded from the calibration optimiser
+(Section~\ref{sec:data}), the framework does not yet report the sectoral incidence
+of a taper or reduced rate---which sectors gain and by how much---a cut of obvious
+policy interest that I leave to future work. The microdata are synthetic and calibrated to coarse aggregate
 bands---which is why the behavioural magnitudes are reported only conditional on an
 assumed elasticity rather than as estimates---and the analysis is
 partial-equilibrium, abstracting from pass-through into prices \citep{benedek2015,

diff --git a/paper/Sections/data.tex b/paper/Sections/data.tex
@@ -23,7 +23,9 @@ \section{Data}
 distribution. Each firm is given input expenditure $x_i = \rho_i\,y_i$, with the
 input--output ratio $\rho_i$ drawn from a rescaled Beta distribution with
 sector-specific shifts and bounded to $\rho_i \in [0.6,1.5]$; its value added, the
-net VAT base, is $v_i = y_i - x_i$. Employment is assigned from the ONS
+net VAT base, is $v_i = y_i - x_i$ (the upper bound $\rho_i=1.5$ permits $x_i>y_i$,
+and hence negative value added, for a minority of input-heavy firms, representing
+net input-VAT-reclaim positions). Employment is assigned from the ONS
 employment-band shares.
 
 The base population reproduces the ONS structure but not