Imagine a newly seated government with a treasury, a parliament, and a country to run. It must raise money, and every way of raising it takes more from the economy than it puts in the treasury. It must decide what to buy with that money — which goods it will provide itself and which it will leave to the market. It must decide whom to insure against the accidents of life, and how generously, knowing that insurance changes the behavior of the insured. It must decide how much to move from richer households to poorer ones, knowing that some of what it moves disappears on the way. And it must decide which of these jobs belongs to the national capital and which to the town hall. Public economics is the study of how to make those decisions with apparatus rather than instinct.
The earlier chapters built the tools this one uses. Surplus, deadweight loss, and tax incidence came from the welfare chapter; public goods, externalities, adverse selection, and moral hazard came from the market-failures chapter. There the question was diagnostic: where does the unaided market fail? Here the question is prescriptive: given that it fails, and given that the state's own instruments are imperfect, what should the state actually do? That shift — from diagnosis to design — is the whole subject, and it runs into a single tension at every turn. Almost everything a government does to make the distribution of welfare more equal also makes the economy less efficient, and almost everything it does to raise efficiency does nothing for the distribution. The equity-efficiency tradeoff is the spine of the field, and it is named in the first section and never put down.
The treatment runs in the order the government itself faces the questions: who really bears a tax and what it costs to levy one, what the state should provide, what tax schedule is best once we admit that taxes distort behavior, how to insure citizens against risk, how much to redistribute and at what cost, how to design transfers without destroying the incentive to work, and which level of government should do each of these. Throughout, the apparatus is presented as it is taught today — the formal models live in the boxes marked for the formal reader, and a reader who never opens one will still leave able to argue every result.
Prerequisites: surplus, deadweight loss, and tax incidence (Ch. 3); public goods, the Samuelson condition, externalities, adverse selection, and moral hazard (Ch. 4). Helpful but not required: the labor-leisure and intertemporal choice of the consumer (Ch. 5), the government budget constraint (Ch. 16), and the elasticity evidence of the econometrics chapter (Ch. 10).
A government decides to raise revenue from the sale of a good and writes the tax into law as a charge on the seller. The sellers protest that they will bear it; the buyers congratulate themselves that they will not. Both are wrong, and the reason they are wrong is the first result of the subject. Who legally hands the money to the tax authority — the statutory incidence — has nothing to do with who ends up poorer for the tax — the economic incidence. The market reallocates the burden through the price, and it does so according to a rule that ignores the wording of the statute entirely.
When a per-unit tax of $t$ is imposed, the price the buyer pays and the price the seller keeps part ways by exactly $t$. How far the buyer's price rises and the seller's falls depends on which side of the market can more easily walk away. A buyer with elastic demand — many substitutes, no urgency — shrinks purchases rather than pay much more, so the seller's price must fall to keep the good moving, and the seller bears the tax. A seller with elastic supply — able to redeploy resources elsewhere — exits rather than absorb the tax, so the buyer's price must rise, and the buyer bears it. The general rule is that the inelastic side bears more, in proportion to the elasticities themselves.
With $\varepsilon_d$ the (absolute) demand elasticity and $\varepsilon_s$ the supply elasticity, the buyer absorbs a fraction $\varepsilon_s/(\varepsilon_s+\varepsilon_d)$ of the per-unit tax. The expression is symmetric in a revealing way: the share borne by each side is governed by the other side's elasticity. Nothing in it refers to the statutory assignment, which confirms the independence result — levying the same $t$ on the buyer instead of the seller leaves Eq. 25.1 unchanged.
Why it matters: A tax is a wedge driven between what the buyer pays and what the seller receives. Whoever can most easily change their behavior in response — stop buying, stop selling — escapes most of it, and the burden slides onto the side that is stuck. A tax on cigarettes falls mostly on smokers because smokers do not quit over a few cents; a tax on a luxury with close substitutes falls mostly on the firms that make it, because customers simply buy something else. The law names a payer; the market overrules it. Slide the elasticities on the figure and watch the burden split move while the statutory toggle changes nothing at all.
Raising revenue would be costless if behavior never changed. It is not costless precisely because behavior does change: the tax pushes the quantity traded below the level at which the good was worth more to its buyer than it cost its seller, and every one of those forgone trades was a gain that now goes unmade. The lost surplus is the tax's deadweight loss — the resources destroyed rather than transferred. Unlike the revenue, which moves from taxpayer to treasury, the deadweight loss benefits no one. It is the efficiency cost of taxation, and its size has a shape that governs everything that follows.
The excess burden of a small tax is approximately one-half the relevant compensated elasticity $\varepsilon$ times the squared proportional tax $(t/p)^2$ times the expenditure $pQ$. The decisive feature is the square: doubling the rate roughly quadruples the deadweight loss. The first units of taxation cost almost nothing — the triangle starts at a point — while each additional point of rate is more expensive than the last. The compensated (rather than uncompensated) elasticity appears because deadweight loss is the pure substitution distortion, with the income effect of the tax stripped out.
Why it matters: The cost of a tax grows much faster than the tax itself. A small tax sits in a corner where the trades it kills were barely worth making, so it destroys almost nothing while raising real money — this is why small broad taxes are the workhorse of public finance. But the cost climbs with the square of the rate: a rate twice as high does roughly four times the damage. That single fact is why economists distrust very high tax rates on anything elastic, and why the optimal-tax problem two sections from now is worth solving at all. On the figure, drag the rate slider and watch the triangle bloom — when you double the rate, the area readout roughly quadruples.
Figure 25.1. Tax incidence and the Harberger triangle. The tax wedge splits the price; the burden falls more heavily on the inelastic side, independent of which side the law charges. The shaded triangle is the deadweight loss; raising the rate slider grows it with the square of the rate. Drag sliders to explore.
These two facts — that the burden lands on the inelastic and that the cost rises with the square of the rate — already set up the field's organizing tension. A government that cared only about efficiency would tax the most inelastic things it could find, since those taxes change behavior least and so destroy the least surplus. But the most inelastic things are often necessities, bought in similar quantities by rich and poor alike, so taxing them heavily falls hardest on the poor. Efficiency points one way and equity the other. The remaining sections are the discipline's attempts to find, for each kind of public decision, the least-bad compromise. Raising the revenue, it turns out, costs something real; the next question is what the state should buy with it.
The new government of Meridia needs revenue and reaches first for what looks easy: a tax on bread, which everyone buys and no one stops buying. The finance ministry is pleased — demand for bread is almost perfectly inelastic, so the tax raises money with almost no lost trades, a small Harberger triangle. Then the social-affairs ministry objects. Precisely because bread is inelastic, Meridian households cannot escape the tax by buying less; and because poor households spend a larger share of income on bread, the burden lands hardest on them. The same inelasticity that makes the tax efficient makes it regressive. Meridia has met the equity-efficiency tradeoff on its first day, and has not yet decided anything.
Should the government build the lighthouse, or subsidize a private firm to build it? The question only has bite because of a property of the lighthouse the market-failures chapter named: its light, once lit, shines for every ship at once, and one ship's use of it subtracts nothing from another's. A good with that property — non-rival in consumption, and hard to withhold from non-payers — cannot be sold the way bread is sold, because the price mechanism that allocates bread has nothing to grip. Each ship would rather let another captain pay for the light and sail by it free. The market under-provides the lighthouse not because anyone is irrational but because rational free-riding is the equilibrium. The provision question begins where the market-failure diagnosis ends.
For an ordinary private good, the market finds the efficient quantity by adding up demand horizontally: at a given price, each person buys until their marginal benefit equals that price, and the quantities sum. A public good inverts this. Because everyone consumes the same single quantity at once, the relevant question is not how many units to allocate across people but how much benefit the whole population draws from one more unit of the shared good. So the marginal benefits are added vertically — every household's marginal valuation of the same extra unit, stacked — and the efficient level is where that vertical sum meets the marginal cost of providing it. This is the Samuelson condition, and the switch from horizontal to vertical summation is the entire conceptual content of public provision.
Here $\text{MRS}_i$ is household $i$'s marginal rate of substitution between the public good and a private numeraire — its marginal willingness to pay for one more unit — and $\text{MRT}$ is the marginal rate of transformation, the units of the private good given up to produce one more unit of the public good (its marginal cost). The marginal rate of transformation is introduced here for the first time; it is the production-side counterpart to the marginal rate of substitution and measures the slope of the economy's production-possibility frontier. The free-riding equilibrium instead has each household contributing only until its own $\text{MRS}_i = \text{MRT}$, ignoring the benefit its contribution confers on everyone else, which lands provision far below the Samuelson level.
Why it matters: For a loaf of bread, you ask how many loaves to make and parcel them out one to a buyer. For a streetlight, everyone on the street gets the same light at the same time, so the question is not how to divide it but how bright to make it — and the answer is to add up how much that extra brightness is worth to every resident at once, and keep adding light until that total stops being worth the cost. That vertical stacking of everyone's small benefit is why a streetlight worth building is one that no single resident would pay for alone. Voluntary contributions fall short because each person counts only their own slice of the benefit. On the figure, toggle between vertical and horizontal summation and watch the efficient level jump; the free-rider marker sits well below it.
Figure 25.2. Public-good provision by vertical summation. Three households' marginal-benefit schedules are stacked vertically; the efficient level is where the vertical sum crosses marginal cost. The free-rider level — each contributing only to its own benefit — sits below it. Toggle to horizontal summation to see how a private good is priced instead. Drag sliders to explore.
The Samuelson condition tells the government how much of a pure public good to provide, but most of what governments actually supply is not pure. Two intermediate categories matter. A merit good is one the state provides or subsidizes not because the market would fail to supply it — schooling and basic healthcare can be sold privately — but because, left alone, people are judged to buy too little of it for their own good or for reasons of paternalistic or behavioral concern. The argument is no longer about non-rivalry; it is about whether individuals' own choices track their own welfare. That rationale leans on the behavioral apparatus — present bias, mispredicted future tastes, framing — developed in the behavioral chapter, and is invoked here without re-deriving it.
Between the pure public good (provide it publicly, finance it by tax) and the pure private good (let the market price it) sit the club goods — non-rival until they congest, but excludable, so a price can be charged at the gate. A toll bridge with light traffic serves an extra car at no cost to the others, like a public good; once it jams, each car slows the rest, like a private one. Buchanan's club-size problem balances these: a larger club spreads the fixed cost over more members but worsens congestion, and the optimal membership is where the saving from one more member's dues just offsets the crowding that member adds. The provision menu, then, runs from market to club to public, and the state's first design decision is to place each good on it correctly. Having decided what to provide, the government must raise the revenue to pay for it — and since taxes distort, the schedule it chooses is the hardest problem in the field.
Apparatus stop. The walkthrough argues over whether the state should let a market house people or provide housing itself. The provision menu here — market, merit, public — is the tool that decision needs.
Whether to treat housing as something a market should price or something the state should provide is a placement question on the provision menu of this section. Housing is mostly rival and excludable — closer to a private good than a lighthouse — so the pure-public-good case is weak. The live arguments run instead through the merit-good channel (do people under-provide for their own housing security in ways a paternalistic state should correct?) and through externalities and distribution (homelessness imposes costs on others; secure housing has been argued to raise health and schooling outcomes). The Samuelson condition does not settle it; the merit-good and distributional rationales do the work.
This section gives the walkthrough its vocabulary — public, merit, club, market — and the discipline of asking which rationale actually applies before reaching for direct provision. The contested call (is housing's under-consumption a genuine merit-good case, and is direct public provision better than a housing subsidy that lets a market clear?) is the walkthrough's to argue, not the chapter's. The apparatus is presented straight; the live position lives in the walkthrough.
Applied-field stop. The walkthrough tracks where behavioral findings reshaped real fields. The merit-good rationale for public provision is one of those places.
The merit-good case for state provision is the cleanest place public economics absorbed behavioral economics. The classical justification for provision is non-rivalry or an externality; the merit-good justification is that people, judged against their own long-run welfare, under-consume schooling, preventive care, or retirement saving because of present bias and mispredicted future tastes. That is a behavioral claim doing public-finance work — it converts a paternalism that the rational-agent model could not license into a defensible design argument.
This section names the behavioral rationale and uses it to justify merit-good provision, but it does not re-derive prospect theory or present bias — that apparatus lives in the behavioral chapter and is cross-linked, not rebuilt. The walkthrough's job is to argue how far the paternalistic license should run; the chapter's job is to show where it legitimately enters the provision decision.
Meridia's coast needs a lighthouse and its towns need preschools. The lighthouse is easy once the ministry sees the structure: no shipping firm will pay for a light every other firm sails by free, and the light costs nothing extra per ship, so Meridia provides it directly and funds it from general revenue — the textbook public good. Preschools are harder. A private market for them could exist, so this is not a free-rider failure; the case for public funding rests on the claim that parents, left alone, under-invest in early education relative to their children's long-run gain. Meridia treats preschool as a merit good and subsidizes it, while leaving the toll bridge across the river as a club good — priced at the gate, sized so the dues collected from one more daily user just offset the congestion that user adds.
Suppose the government must raise a fixed sum and can tax several goods. If taxes were costless it would not matter how the burden was split, but the previous section's square law makes it matter intensely: a tax's deadweight loss rises with the rate and with how much the taxed quantity responds. The efficient way to raise a given revenue, then, is to lean on the goods whose quantities respond least — the inelastic ones — because taxing them changes behavior, and so destroys surplus, the least. This is the Ramsey rule, and in its simplest form it says to set the tax rates so that the proportional reduction in compensated demand is the same for every good, which under the standard assumptions means taxing each good roughly in inverse proportion to its elasticity.
In the general Ramsey problem the planner minimizes total excess burden subject to a revenue constraint, and the first-order conditions equalize the proportional compensated quantity reduction across goods. Under the simplifying assumption that demands are independent (no cross-price effects), this collapses to the inverse-elasticity rule shown: the proportional tax on good $k$ varies inversely with its own compensated elasticity $\varepsilon_k$. The result is purely an efficiency prescription — it minimizes the distortion of raising the revenue and says nothing about who is made poorer.
Why it matters: If you must collect a fixed sum and every tax destroys some trades, collect it where the fewest trades will die — tax what people will keep buying anyway. That is the whole efficiency logic. And it has a sting the government cannot avoid: the things people keep buying regardless of price are the necessities, and taxing necessities heavily is exactly what falls hardest on the poor. The most efficient commodity-tax system is regressive almost by construction. On the figure, search the split between two goods and watch the total excess burden bottom out where the inelastic good carries the heavier rate — then notice that the inelastic good is the necessity.
Figure 25.3. Ramsey inverse-elasticity tax assignment. Total excess burden as a function of how a fixed revenue target is split between an inelastic good and an elastic good; it is minimized when more revenue is drawn from the inelastic good. The regressivity annotation fires when the necessity carries the heavier rate. Drag sliders to explore.
The commodity-tax problem treats the regressivity as a side effect to be regretted. The income tax confronts it directly, because the income tax is the instrument equity actually runs on. James Mirrlees posed the problem in its modern form: a government wants to redistribute from high earners to low earners but cannot observe ability, only income. If it taxes high incomes heavily, the able have less reason to earn — and worse, a high earner can always choose to work less and look like a low earner, so any redistributive schedule must be designed so that no type prefers to mimic a lower one. This is a screening problem, the same incentive-compatibility logic the mechanism-design chapter develops, applied to the whole population's earnings. The optimal income tax is the schedule that buys the most redistribution net of the output it discourages.
For the very top of the distribution the problem has a clean and famous answer. Peter Diamond and Emmanuel Saez showed that the optimal top marginal rate balances three forces. The first pushes the rate up: a high earner has a large income above any threshold, so each extra point of rate raises a lot of revenue from that inframarginal base. The second pushes it down: a higher rate makes top earners report less income — work less, avoid more — and how strongly they respond is the elasticity of taxable income, $e$. The third sets the floor: how much society values an extra dollar in a top earner's hands rather than redistributed, the social welfare weight on the rich. When that weight is essentially zero — when the government cares only about the revenue raised from the top to spend on everyone else — the formula reduces to a single expression in the elasticity and the shape of the top tail.
In the revenue-maximizing (zero top welfare weight) limit, the optimal top marginal rate $\tau^{*}$ depends on the elasticity of taxable income $e$ and the Pareto-tail parameter $a$ describing the thickness of the top of the income distribution (empirically $a \approx 1.5$–$2$). The rate falls monotonically as $e$ rises: the more top earners respond to taxation, the lower the revenue-maximizing rate. With a positive welfare weight on top earners the optimum is lower still. The whole policy disagreement about top rates is, formally, a disagreement about the value of $e$ — which is why the next paragraph treats it as a contested band rather than a number.
Why it matters: The best top tax rate is a tug-of-war between three pulls. Pulling the rate up: a top earner has a huge income above the threshold, so every extra point of tax raises real money with no further behavioral loss on the dollars they keep earning anyway. Pulling it down: tax them harder and they report less — they work less, they shelter more, they leave — and how hard they pull back is the one number, the elasticity, that nobody has pinned down. Setting the floor: how much you think a dollar is worth in a billionaire's pocket versus a poor family's. The rate where the rope settles is the optimum. You do not need the formula to see it move — on the figure, raise the elasticity and watch the optimal rate fall, because behavior fighting back is exactly what makes high rates self-defeating.
Figure 25.4. The Diamond-Saez optimal top marginal rate as a function of the elasticity of taxable income. The optimal rate falls as the elasticity rises. The shaded band marks the contested mainstream range of the elasticity — the disagreement over the right top rate is, at bottom, this band. Drag sliders to explore.
Where the mainstream lands on the top rate is genuinely unsettled, and honesty requires saying where it leans rather than averaging the disagreement away. The decisive input is the elasticity of taxable income. Saez and collaborators have argued for relatively low elasticities for ordinary earnings — implying optimal top rates in the region of fifty to seventy percent — on the grounds that much of the apparent response to high rates is avoidance and income-shifting that better tax design can curb rather than a real reduction in productive effort. Others read the same evidence as showing larger behavioral responses, which pushes the optimal rate well below that range. The mainstream center of gravity sits at a substantial but not confiscatory top rate, with a frank acknowledgment that the number rests on an elasticity that the profession has not pinned down. The figure shows the disagreement honestly: not a point, but a band.
Apparatus home. The walkthrough argues over taxing stocks of wealth versus flows of income. The incidence, deadweight-loss, and optimal-income-tax machinery it leans on is taught here at full depth.
The wealth-versus-income argument runs on three tools this chapter owns. Incidence (§25.1) settles who actually bears a tax once markets adjust — a wealth tax on capital can fall on workers if capital is mobile. The deadweight-loss square law (§25.1) governs how costly each instrument is at the margin. And the Diamond-Saez framework (§25.3) reframes the top-rate question as a balance of the inframarginal base, the behavioral elasticity, and the social weight — the same logic extends, with a different and higher elasticity, to taxing the return on wealth. The walkthrough depth-defers to this section rather than re-deriving any of it.
The live disagreement — whether a wealth tax's higher behavioral elasticity and administrative leakage make it worse than a well-designed capital-income tax, or whether wealth concentration is a target income taxation cannot reach — is the walkthrough's to argue. This chapter supplies the elasticity-band discipline and the optimal-tax frame; it names where the mainstream leans on the top income rate and stops there. The wealth-tax verdict is downstream, in the walkthrough.
Meridia's commodity taxes were efficient and regressive, so the government turns to the income tax to do the redistributive work. Its first instinct — a very high rate on top earners — runs straight into the screening problem: the most able can always earn less and look ordinary, so the schedule has to be designed around the prospect of mimicry. Meridia's analysts compute the revenue-maximizing top rate and find it pivots entirely on one number, how much top earners' reported incomes shrink when taxed. They cannot pin it down, so they report a range rather than a point and set a substantial top rate near the upper end of the elasticity-implied band, with the explicit caveat that if the behavioral response proves larger than assumed, the rate is too high. Honesty about the unknown elasticity, not false precision, is the deliverable.
A worker wants insurance against losing a job, a household against catastrophic illness, an old person against outliving their savings. In principle a private market could sell each of these. In practice the markets thin out or vanish, and the reason is the same one the market-failures chapter named: adverse selection. The people who most want unemployment insurance are those most likely to be laid off; the people who most want health insurance are those who expect to fall ill. An insurer who cannot tell them apart must price for the worst risks, which drives the good risks out, which raises the price further, until the market unravels. The state has one lever no private insurer has — compulsion. By requiring everyone into the pool, it stops the good risks from leaving, and the pool survives. This is the distinctive economic case for social insurance, and it is a case the unaided market genuinely cannot meet.
Compulsion solves adverse selection but reintroduces the other classic insurance problem: moral hazard. An insured worker searches less hard for a new job; an insured patient consumes more care; an annuitant is freed from the discipline of self-funded retirement. The more generous the insurance, the more it smooths consumption when the bad event strikes — its benefit — and the more it dulls the incentive to avoid or shorten the bad event — its cost. The optimal generosity sits where these balance. Martin Baily and Raj Chetty formalized this for unemployment insurance: the optimal replacement rate is the one at which the marginal consumption-smoothing benefit of a more generous benefit just equals its marginal moral-hazard cost.
In the Baily-Chetty condition the optimal replacement rate equates a benefit term — proportional to the coefficient of relative risk aversion $\gamma$ times the proportional consumption drop $\Delta c/c$ that the benefit prevents — to a cost term, the elasticity of the insured behavior (here the duration of unemployment $D$) with respect to the benefit. The condition is a sufficient-statistics result: it requires only the observable consumption drop, a measure of risk aversion, and the behavioral elasticity, not a fully specified structural model. As the behavioral elasticity rises, the cost term grows and the optimal replacement rate falls.
Why it matters: Insurance is a seesaw. On one side is the relief it buys — a laid-off worker who would otherwise see their living standard collapse instead keeps the lights on, and that smoothing is worth a great deal when the fall would be steep and the worker fears it. On the other side is the slack it induces — paid most of their wage to be unemployed, a worker takes longer to find the next job. The right replacement rate is where the seesaw balances: insure against the suffering, but not so generously that you are paying people to stay out of work. And the balance point moves: the more strongly behavior responds to the benefit, the further the seesaw tips toward less insurance. On the figure, raise the moral-hazard elasticity and watch the optimal replacement rate slide down.
Figure 25.5. The Baily-Chetty insurance-incentive balance. The marginal consumption-smoothing benefit (falling in the replacement rate) and the marginal moral-hazard cost (rising in it) cross at the optimal replacement rate. Raising the moral-hazard elasticity steepens the cost curve and slides the crossing point — the optimal rate — down. Drag sliders to explore.
The welfare state's three largest programs are this same apparatus applied to three risks. Unemployment insurance is the Baily-Chetty problem in its native form, and the empirical work centers on the duration elasticity — how much longer the insured stay unemployed — which is the moral-hazard cost the optimal replacement rate trades against. Public pensions answer a different failure: the private annuity market is thin because of adverse selection (those who buy annuities expect to live long), so the state pools longevity risk compulsorily and lets people smooth consumption across a life they cannot predict the length of. Public health insurance answers adverse selection in its sharpest form — the sick most want coverage — and the case for compulsory or single-payer pooling is precisely that it defeats the unraveling a voluntary health-insurance market suffers. Each program is one structural insight, not a tour of benefit rules.
That pension example exposes a distinction that runs through the entire welfare state and is easy to miss. The same transfer can be doing two completely different things. When a pension takes from a worker's earning years and pays them in retirement, the recipient is the contributor's own future self, and the transfer is insurance — smoothing one person's consumption across time and across the risk of an unknown lifespan. When a pension takes more from high lifetime earners and pays more to low lifetime earners, the recipient is a different household, and the transfer is redistribution — moving resources across the distribution. Real programs mix both, and which one is first-order differs by program. A flat contributory pension is mostly insurance; a means-tested old-age benefit is mostly redistribution. Seeing the welfare state as two programs sharing one budget line is the conceptual key to the redistribution section that follows.
Figure 25.6. The same pension, decomposed. A stylized pay-as-you-go pension's total transfer split into its across-time insurance component (the worker's own smoothed life-cycle consumption) and its across-household redistribution component (from high to low lifetime earners). As the benefit formula is made more progressive, the redistribution slice grows and the insurance slice shrinks. Toggle the progressivity to see the mix shift.
Apparatus home for two of the three rationales. The walkthrough separates insurance, redistribution, and macro stabilization. The insurance machinery — and the insurance-vs-redistribution distinction — is taught here.
The walkthrough's argument turns on keeping three jobs of the welfare state distinct. Two of them are this chapter's: the insurance rationale (compulsory pooling defeats adverse selection; the Baily-Chetty replacement rate sets its generosity, §25.4) and the redistribution rationale (the social welfare function and the leaky bucket, §25.5). The insurance-vs-redistribution decomposition — the same transfer reading as one or the other depending on who the recipient is — is the spine the walkthrough peeks here.
The third rationale — that the welfare state acts as an automatic macroeconomic stabilizer, propping up demand in downturns through transfers that rise as incomes fall — is not taught here. That is the multiplier and automatic-stabilizer apparatus of the intro-macro and monetary-fiscal chapters, and the walkthrough routes its stabilization beat there, not to this chapter. This section owns the microeconomics of insuring and redistributing; the macro stabilization role stays where it is built.
Meridia's private unemployment insurers had collapsed — only workers who feared layoffs bought policies, prices rose, the rest left, the market unraveled. The government makes coverage compulsory, restoring the pool, then has to choose how generous to be. Its analysts measure the consumption drop that unemployed Meridians suffer (large, which argues for generosity) and the degree to which benefits lengthen the time people stay unemployed (the duration elasticity, which argues for restraint), and set the replacement rate where the two balance. For pensions they pool longevity risk compulsorily, since no Meridian knows how long they will live; for healthcare they mandate a single pool, since the sick would otherwise be the only buyers. When a critic complains that the pension also takes more from the rich than it returns to them, Meridia's analysts agree — and note that this part of the pension is not insurance at all but redistribution, the subject of the next decision.
How much should a government redistribute? The question has two parts that are easy to run together and must be kept apart. The first is a value question: how much does society care about an extra dollar in a poor household's hands relative to a rich one's? The second is a positive question: how much of a dollar taken from the rich actually arrives in the poor household's hands, and how much is lost on the way? Economics has no answer to the first — it is a judgment about social priorities — but it gives that judgment a precise form, the social welfare function, and it has a great deal to say about the second.
The social welfare function $W$ maps the profile of individual utilities into a social ordering. Two special cases anchor the dial: the utilitarian $\sum_i u_i$, which is indifferent to the distribution of utility itself and redistributes only because money has diminishing marginal utility, and the Rawlsian $\min_i u_i$, which evaluates society by its worst-off member and so redistributes until the bottom can be raised no further. A constant-elasticity form $W = \sum_i u_i^{1-\rho}/(1-\rho)$ spans the two as the inequality-aversion parameter $\rho$ runs from $0$ (utilitarian) toward $\infty$ (Rawlsian).
The positive part of the question is Arthur Okun's leaky bucket. Carrying water from the rich to the poor, the bucket leaks: some of every transferred dollar is lost to administrative cost and, more importantly, to the behavioral distortions the earlier sections measured — the taxes that fund the transfer deaden effort, and the transfer itself can blunt the recipient's incentive to earn. So a dollar taken from the rich arrives as less than a dollar. Whether redistribution is worth it depends jointly on how much society weights the recipient (the social welfare function) and how leaky the bucket is (the empirical magnitude of the distortions). With a small leak and a strong concern for the poor, almost any transfer is worth making; with a large leak and weak concern, even a modest transfer fails the test. The verdict can flip with either input.
The social-welfare change from a marginal dollar of redistribution is the distributional gain — the difference between the social marginal value $g$ of a dollar to the recipient and to the donor, which is large under a Rawlsian $W$ and small under a near-utilitarian $W$ once incomes are close — minus the leak $L$, the per-dollar loss to administration and to the behavioral deadweight loss of the funding tax and the transfer's own incentive effects. Redistribution is worth expanding while $dW > 0$ and stops at $dW = 0$, which is the equity-efficiency frontier for transfers. Both terms are needed: the sign of $dW$ can be reversed by changing either $W$ or $L$.
Why it matters: Every redistributed dollar arrives as less than a dollar — some leaks out as paperwork, but mostly as the work and effort that the funding tax and the handout together discourage. Whether the transfer is worth it comes down to two things, and reasonable people who agree on all the economics can still disagree because they disagree on these two. The first is how much you weight a dollar in a poor family's pocket against a dollar in a rich one's — a value, not a fact. The second is how leaky the bucket actually is — a fact, but a contested one. On the figure, set a small leak and a Rawlsian concern for the worst-off and the transfer is plainly worth it; crank the leak up or flatten your concern toward pure utilitarianism with incomes already close, and the same transfer turns net-negative. The verdict flips.
Figure 25.7. The leaky bucket and the equity-efficiency frontier. The net social-welfare change from a dollar of redistribution, as the leak (administrative plus behavioral loss) and the social welfare function vary. The curve traces net welfare against leak size; the verdict turns positive or negative as the SWF and the leak change. Drag sliders and toggle the SWF to flip the verdict.
Where does the mainstream land on the size of the leak? As with the top tax rate, the honest answer is a calibrated position rather than a settled number. The mainstream view is that the leak is real but, for moderate redistribution, smaller than the strongest efficiency-pessimist claims — most empirical estimates of the behavioral response to the taxes and transfers that fund a moderate welfare state are large enough to matter but not large enough to make redistribution self-defeating. The crucial caveat is that the leak does not stay constant: because deadweight loss rises with the square of the rate, the bucket leaks much faster at high rates of taxation and high transfer-withdrawal rates. A moderate welfare state carries a modest leak; pushing redistribution toward the limits of the schedule runs into the square law, and there the efficiency cost climbs steeply. Naming that shape — modest in the middle, steep at the extremes — is the calibrated position the chapter takes.
One honest qualification belongs here before the chapter moves to instrument design. Every optimum in this chapter — the optimal tax, the optimal level of provision, the optimal replacement rate, the optimal transfer — has assumed a government that wants to maximize social welfare and is competent to do it. That is a strong assumption. A rival tradition asks whether real governments behave that way at all, or whether they are captured by organized interests, distorted by the incentives of officials and voters, and prone to a government failure that is the mirror image of the market failure the chapter began with. The apparatus here takes the benevolent-planner posture because that is what lets the design problem be posed; it does not establish that actual states achieve the optima it computes. That counter-argument is a tradition in its own right, treated where the history of economic thought takes it up, and the reader should carry the caveat forward rather than mistake the optima for descriptions of what governments do.
With insurance settled, Meridia confronts redistribution proper. Its cabinet splits not over the economics but over the value question: some weight the worst-off heavily, near the Rawlsian pole, and want large transfers; others, closer to utilitarian once incomes are not far apart, want little. The analysts cannot settle that — it is a judgment — but they can price the leak. They estimate that for the moderate transfers under discussion, each redistributed dollar arrives as roughly seventy cents, the rest lost to administration and to dulled work incentives. At that leak, the Rawlsian faction's transfers clear the bar and the near-utilitarian faction's barely do; both sides learn that their disagreement is about values plus the contested leak, not about whether redistribution "works." And both accept the analysts' warning that if they push the rates much higher, the square law means the bucket will leak far faster than seventy cents on the dollar.
Once a government decides to redistribute, it must choose how. The first choice is between targeting and universalism. A means-tested transfer goes only to those below an income threshold, concentrating resources where need is greatest and so delivering more equity per dollar spent. But targeting has three costs. It is administratively expensive to verify eligibility; it carries stigma that suppresses take-up among the eligible; and, most importantly, withdrawing the benefit as income rises imposes an implicit marginal tax rate on the recipient. A dollar earned reduces the benefit, so the recipient keeps far less than a dollar — the phase-out is a tax in everything but name, and a steep one. A universal transfer avoids these by going to everyone regardless of income, but it spends heavily on households that do not need it and must be funded by higher general taxation.
Milton Friedman's negative income tax was the clean idea that cuts through the targeting-versus-universalism dilemma: a single integrated schedule that pays a guaranteed minimum to those with no income and taxes earnings at a constant rate, so that the transfer phases out smoothly rather than at a benefit cliff. Its practical descendant is the earned income tax credit, which goes one step further by making the transfer conditional on work. The EITC has three segments. In the phase-in, the credit grows with earnings — the government adds to each dollar earned, so the effective marginal tax rate is negative, and the program pays people to work. In the plateau, the credit is flat. In the phase-out, the credit is withdrawn as income rises, imposing a positive implicit marginal tax rate. The contrast with a means-tested transfer is exact and instructive: means-testing taxes the first dollar a poor household earns at a high implicit rate; the EITC subsidizes those first dollars and only claws back later, gently.
The effective marginal tax rate (Eq. 25.11) adds the statutory income-tax rate $\tau$ to the benefit phase-out rate $\phi$, so a benefit withdrawn at rate $\phi$ taxes earnings exactly as a statutory tax of $\phi$ would. The EITC schedule (Eq. 25.12) pays a subsidy at rate $s$ per dollar in the phase-in (where the effective marginal rate is $-s$, a subsidy to work), holds the credit flat across the plateau (effective marginal rate equal to the statutory rate alone), and withdraws it at rate $\phi$ in the phase-out (effective marginal rate $\tau + \phi$). A pure means-tested transfer is the special case with no phase-in: the full phase-out rate applies from the first dollar earned.
Why it matters: Targeting the poor by means-testing has a hidden trap: because the benefit shrinks as you earn, every dollar a poor person earns is partly clawed back, so they face a steeper effective tax on their first dollar than a rich person faces on their last. The earned income tax credit turns this upside down at the bottom — it pays you more for working more, so the first dollars earned are subsidized, not taxed, and only further up the scale does it withdraw the credit gently. Means-testing punishes the first dollar earned; the credit rewards it, then withdraws softly. On the figure, walk a worker's earnings from zero upward and watch the effective tax rate dive negative across the EITC's phase-in, sit flat on the plateau, and climb in the phase-out — then overlay a means-tested transfer and see it start high from the very first dollar.
Figure 25.8. The effective marginal tax rate across an EITC schedule. Negative in the phase-in (work is subsidized), zero plus the statutory rate on the plateau, positive in the phase-out (the credit claws back). Overlay a means-tested transfer to see its high implicit rate apply from the first dollar earned. Drag sliders and toggle the overlay.
Meridia's redistribution could go to households below a poverty line, sharply targeted. But the analysts show the cabinet the trap: withdrawing that benefit as recipients earn would tax their first earned dollar at over fifty percent, and Meridians on the program would face a steeper effective rate on a dollar of work than the country's top earners face on theirs. So Meridia instead builds a work-conditioned credit on Friedman's logic — it tops up low earnings, so the first dollars a poor Meridian earns are subsidized rather than clawed back, and only at higher earnings does the credit withdraw gently. The cabinet accepts that this spends some money on households just above the bottom, the price of not punishing work. Which level of government should run it is the last question.
Every decision so far has assumed a single government. Most countries have several layers — national, regional, local — and the last question is which layer should do which job. This is the assignment problem, and it has a clean efficiency logic with one sharp catch. The efficiency case for decentralization is that local governments can match public goods to local tastes. A coastal town wants harbors; a mountain town wants ski-patrol roads; a single national bundle would over-provide one and under-provide the other. If households can move, they will sort themselves into the jurisdiction whose bundle of public goods and taxes best fits their preferences — voting with their feet. Charles Tiebout showed that this mobility can, in principle, reveal preferences for local public goods that voting in a single jurisdiction cannot, and discipline local governments through the threat of exit.
The catch is that the same mobility that makes decentralized provision efficient makes decentralized redistribution impossible. If a town raises a local tax to fund transfers to its poor, its rich households can simply move to the next town, taking their tax base with them, while poor households move in to claim the benefit. The redistributive tax erodes its own base. Redistribution can only be sustained at a level no one can easily exit — the national government — because exit is what disciplines local provision and what destroys local redistribution at the same time. The decentralization theorem, Wallace Oates's result, captures the resulting assignment: provide a public good locally when preferences differ across jurisdictions and spillovers are small, centrally when they are uniform or spill across borders, and assign redistribution to the center regardless. The figure makes the redistribution catch concrete: switch on a local redistributive tax and watch the high-income households leave.
Why it matters: Local government is good at one thing and hopeless at another, and it is the same fact that makes it both. People can move between towns, so towns must compete to offer the public goods residents want — that is what keeps local provision responsive. But that same ability to move means a town that tries to tax its rich to help its poor will simply watch its rich leave for the next town. You cannot redistribute at a level people can walk away from. So the rule falls out almost by itself: let towns provide the goods that local tastes differ over, and leave redistribution to the level no one can exit. On the figure, switch on a local tax meant to redistribute and watch the high-income households vanish to the untaxed jurisdiction, collapsing the very revenue the tax was supposed to raise.
Figure 25.9. Tiebout sorting and the redistribution catch. Households sort across jurisdictions by preference. When jurisdiction A imposes a local redistributive tax, its high-income households exit to untaxed jurisdictions, collapsing the local tax base the redistribution depended on. Toggle the local tax to see the exit.
Meridia's coastal and mountain provinces want different things, so the national government devolves local public goods — harbors, ski roads, parks — to the provinces, which compete to attract residents by tailoring the mix. But when one prosperous province tries to fund generous transfers from a local tax on its wealthy residents, those residents quietly relocate to a neighboring province, and the redistribution collapses with the base. Meridia learns Oates's lesson directly: it keeps redistribution and the social-insurance pools national, where no household can opt out by moving, and leaves the preference-sensitive public goods to the provinces. The government's seven decisions are made — and each was a point chosen on the equity-efficiency tradeoff that the chapter opened with.
The optimal-tax program, 1971–2011. The modern theory of optimal income taxation began with James Mirrlees's 1971 solution to the screening problem of an unobservable-ability population and reached its most usable form with Peter Diamond and Emmanuel Saez's 2011 statement of the top-rate formula in terms of estimable sufficient statistics. The intellectual descent of that program — from Pigou's welfare economics through Frank Ramsey's 1927 commodity-tax rule to Mirrlees and Diamond-Saez — belongs to the history of economic thought rather than to the apparatus, and is taken up there.
The welfare state was built, not derived. The compulsory social-insurance pools justified above as apparatus were constructed in waves — Bismarck's German social insurance in the 1880s, the Beveridge-era British settlement, and the broad postwar OECD expansion — a history narrated in the Economic History book's postwar golden age chapter rather than re-told here.
| Label | Equation | Description |
|---|---|---|
| Eq. 25.1 | $\text{buyer's share} = \varepsilon_s/(\varepsilon_s+\varepsilon_d)$ | Tax incidence by elasticity (the inelastic side bears more) |
| Eq. 25.2 | $\text{EB} \approx \tfrac{1}{2}\varepsilon(t/p)^2 pQ$ | Excess burden — deadweight loss rises with the square of the rate |
| Eq. 25.4 | $\sum_i \text{MRS}_i = \text{MRT}$ | Samuelson condition for efficient public-good provision |
| Eq. 25.5 | $t_k/p_k \propto 1/\varepsilon_k$ | Ramsey inverse-elasticity commodity-tax rule |
| Eq. 25.7 | $\tau^* = 1/(1+ae)$ | Diamond-Saez optimal top marginal rate |
| Eq. 25.8 | $\gamma \cdot \Delta c/c = d\log D / d\log R$ | Baily-Chetty optimal replacement rate |
| Eq. 25.9 | $W = W(u_1,\ldots,u_n)$; $\sum u_i$ / $\min u_i$ | Social welfare function (utilitarian / Rawlsian) |
| Eq. 25.10 | $dW = (g_{\text{poor}}-g_{\text{rich}}) - L$ | Leaky-bucket net welfare of a marginal transfer |
| Eq. 25.11 | $\text{effective MTR} = \tau + \phi$ | Effective marginal tax rate of a means-tested transfer |
| Eq. 25.12 | $\text{credit}(w)$: piecewise phase-in / plateau / phase-out | EITC schedule |
Ramsey (1927); Samuelson (1954); Buchanan (1965); Mirrlees (1971); Atkinson & Stiglitz (1976); Diamond & Saez (2011); Saez (2001); Baily (1978); Chetty (2006, 2008); Rothschild & Stiglitz (1976); Okun (1975); Friedman (1962); Tiebout (1956); Oates (1972); Musgrave (1959).