Chapter 25 Public Economics

Intro

Imagine a newly seated government with a treasury, a parliament, and a country to run. It must raise money, and every way of raising it takes more from the economy than it puts in the treasury. It must decide what to buy with that money — which goods it will provide itself and which it will leave to the market. It must decide whom to insure against the accidents of life, and how generously, knowing that insurance changes the behavior of the insured. It must decide how much to move from richer households to poorer ones, knowing that some of what it moves disappears on the way. And it must decide which of these jobs belongs to the national capital and which to the town hall. Public economics is the study of how to make those decisions with apparatus rather than instinct.

The earlier chapters built the tools this one uses. Surplus, deadweight loss, and tax incidence came from the welfare chapter; public goods, externalities, adverse selection, and moral hazard came from the market-failures chapter. There the question was diagnostic: where does the unaided market fail? Here the question is prescriptive: given that it fails, and given that the state's own instruments are imperfect, what should the state actually do? That shift — from diagnosis to design — is the whole subject, and it runs into a single tension at every turn. Almost everything a government does to make the distribution of welfare more equal also makes the economy less efficient, and almost everything it does to raise efficiency does nothing for the distribution. The equity-efficiency tradeoff is the spine of the field, and it is named in the first section and never put down.

The treatment runs in the order the government itself faces the questions: who really bears a tax and what it costs to levy one, what the state should provide, what tax schedule is best once we admit that taxes distort behavior, how to insure citizens against risk, how much to redistribute and at what cost, how to design transfers without destroying the incentive to work, and which level of government should do each of these. Throughout, the apparatus is presented as it is taught today — the formal models live in the boxes marked for the formal reader, and a reader who never opens one will still leave able to argue every result.

By the end of this chapter, you will be able to:
  1. Explain why a tax's economic burden is set by elasticities, not by who legally pays it, and why deadweight loss rises with the square of the tax rate
  2. Find the efficient level of a public good using the Samuelson condition, and distinguish public, merit, and club goods from the goods a market provides well
  3. State the Ramsey commodity-tax rule and the Mirrlees income-tax problem, and read the Diamond-Saez top rate as a balance of three forces
  4. Justify social insurance from adverse selection and set its optimal generosity from the Baily-Chetty balance of consumption-smoothing against moral hazard
  5. Use the social welfare function and Okun's leaky bucket to characterize the optimal amount of redistribution
  6. Trace the effective marginal tax rate through means-tested transfers and the earned income tax credit, and explain their opposite work incentives
  7. Apply the assignment problem and the Tiebout model to decide which level of government should tax, provide, and redistribute

Prerequisites: surplus, deadweight loss, and tax incidence (Ch. 3); public goods, the Samuelson condition, externalities, adverse selection, and moral hazard (Ch. 4). Helpful but not required: the labor-leisure and intertemporal choice of the consumer (Ch. 5), the government budget constraint (Ch. 16), and the elasticity evidence of the econometrics chapter (Ch. 10).

25.1 The Economics of Taxation: Incidence, Deadweight Loss, and the Equity-Efficiency Tradeoff

A government decides to raise revenue from the sale of a good and writes the tax into law as a charge on the seller. The sellers protest that they will bear it; the buyers congratulate themselves that they will not. Both are wrong, and the reason they are wrong is the first result of the subject. Who legally hands the money to the tax authority — the statutory incidence — has nothing to do with who ends up poorer for the tax — the economic incidence. The market reallocates the burden through the price, and it does so according to a rule that ignores the wording of the statute entirely.

Statutory vs. economic incidence. Statutory incidence is the legal assignment of a tax — who is required to remit it to the government. Economic incidence is the real distribution of the burden — whose purchasing power actually falls once prices adjust. The two coincide only by accident; the price mechanism determines the economic incidence regardless of the legal one.

When a per-unit tax of $t$ is imposed, the price the buyer pays and the price the seller keeps part ways by exactly $t$. How far the buyer's price rises and the seller's falls depends on which side of the market can more easily walk away. A buyer with elastic demand (many substitutes, no urgency) shrinks purchases rather than pay much more, so the seller's price must fall to keep the good moving, and the seller bears the tax. A seller with elastic supply (able to redeploy resources elsewhere) exits rather than absorb the tax, so the buyer's price must rise, and the buyer bears it. The general rule is that the less elastic side bears more: each side's share of the burden is proportional to the other side's elasticity.

$$\text{buyer's share of } t = \frac{\varepsilon_s}{\varepsilon_s + \varepsilon_d}, \qquad \text{seller's share} = \frac{\varepsilon_d}{\varepsilon_s + \varepsilon_d}$$ (Eq. 25.1)

With $\varepsilon_d$ the (absolute) demand elasticity and $\varepsilon_s$ the supply elasticity, the buyer absorbs a fraction $\varepsilon_s/(\varepsilon_s+\varepsilon_d)$ of the per-unit tax. The expression is symmetric in a revealing way: the share borne by each side is governed by the other side's elasticity. Nothing in it refers to the statutory assignment, which confirms the independence result — levying the same $t$ on the buyer instead of the seller leaves Eq. 25.1 unchanged.
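The split in Eq. 25.1 and the independence result can be checked numerically. The sketch below is only an illustration: the linear demand and supply curves, their coefficients, and the function names `incidence_shares` and `prices_with_tax` are assumptions introduced here, not part of the chapter's apparatus.

```python
def incidence_shares(eps_d, eps_s):
    """Split of a small per-unit tax from Eq. 25.1: (buyer_share, seller_share)."""
    if eps_d < 0 or eps_s < 0 or eps_d + eps_s == 0:
        raise ValueError("elasticities must be non-negative and not both zero")
    total = eps_s + eps_d
    return eps_s / total, eps_d / total


def prices_with_tax(t, remitted_by, a=100.0, b=2.0, c=10.0, d=4.0):
    """Hypothetical linear market: demand Q = a - b*P_buyer, supply Q = c + d*P_seller.

    Returns (price paid by buyers, price kept by sellers) for a per-unit tax t.
    """
    if remitted_by == "seller":
        posted = (a - c + d * t) / (b + d)  # buyers pay the posted price
        return posted, posted - t
    if remitted_by == "buyer":
        posted = (a - c - b * t) / (b + d)  # sellers keep the posted price
        return posted + t, posted
    raise ValueError("remitted_by must be 'buyer' or 'seller'")


# Pre-tax equilibrium of the assumed market: P = 15, Q = 70.
p0, q0 = 15.0, 70.0
eps_d, eps_s = 2.0 * p0 / q0, 4.0 * p0 / q0  # point elasticities at (p0, q0)

buyer_share, seller_share = incidence_shares(eps_d, eps_s)
paid_s, kept_s = prices_with_tax(3.0, "seller")
paid_b, kept_b = prices_with_tax(3.0, "buyer")

print(f"Eq. 25.1 buyer share: {buyer_share:.3f}")
print(f"tax remitted by seller: buyers pay {paid_s:.2f}, sellers keep {kept_s:.2f}")
print(f"tax remitted by buyer:  buyers pay {paid_b:.2f}, sellers keep {kept_b:.2f}")
```

In this assumed market the buyer's price rises by the same amount whichever side remits the tax, and the rise as a fraction of $t$ matches the share Eq. 25.1 predicts from the pre-tax elasticities.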

Intuition

Why it matters: A tax is a wedge driven between what the buyer pays and what the seller receives. Whoever can most easily change their behavior in response (stop buying, stop selling) escapes most of it, and the burden slides onto the side that is stuck. A tax on cigarettes falls mostly on smokers because smokers do not quit over a few cents; a tax on a luxury with close substitutes falls mostly on the firms that make it, because customers simply buy something else. The law names a payer; the market overrules it. Slide the elasticities on the figure and watch the burden split move while the statutory toggle changes nothing at all.

Raising revenue would be costless if behavior never changed. It is not costless precisely because behavior does change: the tax pushes the quantity traded below the level at which the good was worth more to its buyer than it cost its seller, and every one of those forgone trades was a gain that now goes unmade. The lost surplus is the tax's deadweight loss — the resources destroyed rather than transferred. Unlike the revenue, which moves from taxpayer to treasury, the deadweight loss benefits no one. It is the efficiency cost of taxation, and its size has a shape that governs everything that follows.

Deadweight loss (excess burden) of a tax. The surplus destroyed by a tax beyond the revenue it raises — the value of the mutually beneficial trades the tax prevents. Geometrically it is the Harberger triangle between the demand and supply curves over the range of suppressed quantity. It is the efficiency cost of raising public revenue, distinct from the equity question of who pays.
$$\text{EB} \approx \tfrac{1}{2}\,\varepsilon\left(\frac{t}{p}\right)^{2} p\,Q$$ (Eq. 25.2)

The excess burden of a small tax is approximately one-half the relevant compensated elasticity $\varepsilon$ times the squared proportional tax $(t/p)^2$ times the expenditure $pQ$. The decisive feature is the square: doubling the rate roughly quadruples the deadweight loss. The first units of taxation cost almost nothing — the triangle starts at a point — while each additional point of rate is more expensive than the last. The compensated (rather than uncompensated) elasticity appears because deadweight loss is the pure substitution distortion, with the income effect of the tax stripped out.
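The square law is easy to see with numbers. The following is a minimal sketch of Eq. 25.2; the elasticity of 0.5 and the expenditure of one million are illustrative assumptions, and `excess_burden` is a name chosen here for convenience.

```python
def excess_burden(elasticity, tax_rate, expenditure):
    """Harberger approximation of Eq. 25.2: 0.5 * eps * (t/p)^2 * p*Q."""
    return 0.5 * elasticity * tax_rate ** 2 * expenditure


EPS = 0.5              # assumed compensated elasticity
SPENDING = 1_000_000   # assumed pre-tax expenditure pQ

for rate in (0.05, 0.10, 0.20, 0.40):
    eb = excess_burden(EPS, rate, SPENDING)
    print(f"rate {rate:.0%}: excess burden ~ {eb:,.0f}")

# Doubling the rate from 10% to 20% should roughly quadruple the loss.
ratio = excess_burden(EPS, 0.20, SPENDING) / excess_burden(EPS, 0.10, SPENDING)
print(f"doubling 10% -> 20% multiplies the loss by about {ratio:.1f}")
```

Because the formula is a small-tax approximation, the figures for the higher rates should be read as indicative of the shape rather than as precise magnitudes.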

Intuition

Why it matters: The cost of a tax grows much faster than the tax itself. A small tax sits in a corner where the trades it kills were barely worth making, so it destroys almost nothing while raising real money, which is why small broad taxes are the workhorse of public finance. But the cost climbs with the square of the rate: a rate twice as high does roughly four times the damage. That single fact is why economists distrust very high tax rates on anything elastic, and why the optimal-tax problem two sections from now is worth solving at all. On the figure, drag the rate slider and watch the triangle bloom: when you double the rate, the area readout roughly quadruples.

Figure 25.1. Tax incidence and the Harberger triangle. The tax wedge splits the price; the burden falls more heavily on the inelastic side, independent of which side the law charges. The shaded triangle is the deadweight loss; raising the rate slider grows it with the square of the rate. Drag sliders to explore.

The equity-efficiency tradeoff. The recurring tension that policies improving the distribution of welfare (equity) typically reduce the total size of the economy or the surplus available (efficiency), and vice versa. It arises because the instruments that move resources between people — taxes and transfers — also change the incentives to produce them. Every chapter result that follows is, at bottom, a way of locating the best point on this tradeoff.

These two facts — that the burden lands on the inelastic and that the cost rises with the square of the rate — already set up the field's organizing tension. A government that cared only about efficiency would tax the most inelastic things it could find, since those taxes change behavior least and so destroy the least surplus. But the most inelastic things are often necessities, bought in similar quantities by rich and poor alike, so taxing them heavily falls hardest on the poor. Efficiency points one way and equity the other. The remaining sections are the discipline's attempts to find, for each kind of public decision, the least-bad compromise. Raising the revenue, it turns out, costs something real; the next question is what the state should buy with it.

A government's first decision — who really pays?

The new government of Meridia needs revenue and reaches first for what looks easy: a tax on bread, which everyone buys and no one stops buying. The finance ministry is pleased — demand for bread is almost perfectly inelastic, so the tax raises money with almost no lost trades, a small Harberger triangle. Then the social-affairs ministry objects. Precisely because bread is inelastic, Meridian households cannot escape the tax by buying less; and because poor households spend a larger share of income on bread, the burden lands hardest on them. The same inelasticity that makes the tax efficient makes it regressive. Meridia has met the equity-efficiency tradeoff on its first day, and has not yet decided anything.

25.2 What the State Should Provide: Public Goods, Merit Goods, Clubs

Should the government build the lighthouse, or subsidize a private firm to build it? The question only has bite because of a property of the lighthouse the market-failures chapter named: its light, once lit, shines for every ship at once, and one ship's use of it subtracts nothing from another's. A good with that property — non-rival in consumption, and hard to withhold from non-payers — cannot be sold the way bread is sold, because the price mechanism that allocates bread has nothing to grip. Each ship would rather let another captain pay for the light and sail by it free. The market under-provides the lighthouse not because anyone is irrational but because rational free-riding is the equilibrium. The provision question begins where the market-failure diagnosis ends.

Public good. A good that is non-rival (one person's consumption does not reduce another's) and non-excludable (it is impractical to prevent non-payers from consuming it). Examples: national defense, a clean atmosphere, a lighthouse, basic research. Non-rivalry is the property that changes how efficient provision is calculated; non-excludability is the property that makes the market fail to provide it.

For an ordinary private good, the market finds the efficient quantity by adding up demand horizontally: at a given price, each person buys until their marginal benefit equals that price, and the quantities sum. A public good inverts this. Because everyone consumes the same single quantity at once, the relevant question is not how many units to allocate across people but how much benefit the whole population draws from one more unit of the shared good. So the marginal benefits are added vertically — every household's marginal valuation of the same extra unit, stacked — and the efficient level is where that vertical sum meets the marginal cost of providing it. This is the Samuelson condition, and the switch from horizontal to vertical summation is the entire conceptual content of public provision.

Samuelson condition. The efficient quantity of a public good is where the sum of all consumers' marginal rates of substitution (their marginal willingness to pay) equals the marginal rate of transformation (the marginal cost of producing it): $\sum_i \text{MRS}_i = \text{MRT}$. The marginal benefits are summed vertically because all consumers enjoy the same quantity simultaneously — in contrast to a private good, where quantities are summed horizontally at a common price.
$$\sum_{i=1}^{n} \text{MRS}_i = \text{MRT}$$ (Eq. 25.4)

Here $\text{MRS}_i$ is household $i$'s marginal rate of substitution between the public good and a private numeraire — its marginal willingness to pay for one more unit — and $\text{MRT}$ is the marginal rate of transformation, the units of the private good given up to produce one more unit of the public good (its marginal cost). The marginal rate of transformation is introduced here for the first time; it is the production-side counterpart to the marginal rate of substitution and measures the slope of the economy's production-possibility frontier. The free-riding equilibrium instead has each household contributing only until its own $\text{MRS}_i = \text{MRT}$, ignoring the benefit its contribution confers on everyone else, which lands provision far below the Samuelson level.
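Vertical summation and the free-rider shortfall can be sketched for three households. Everything specific below is an assumption made for illustration: linear marginal-benefit schedules $\text{MB}_i(G) = \max(0, a_i - b_i G)$, a constant marginal cost, and quasi-linear preferences, under which voluntary provision settles where the keenest household's own marginal benefit equals marginal cost.

```python
def vertical_sum(G, households):
    """Sum of marginal benefits at quantity G; households is a list of (a_i, b_i)."""
    return sum(max(0.0, a - b * G) for a, b in households)


def efficient_level(households, mc, hi=1_000.0, tol=1e-9):
    """Samuelson level: bisect for the G where the vertical sum equals marginal cost."""
    lo = 0.0
    if vertical_sum(lo, households) <= mc:
        return 0.0  # even the first unit is not worth its cost
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if vertical_sum(mid, households) > mc:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def free_rider_level(households, mc):
    """Voluntary provision under the quasi-linear assumption: the largest G at
    which some single household's own marginal benefit still covers marginal cost."""
    return max(0.0, max((a - mc) / b for a, b in households))


households = [(10.0, 1.0), (8.0, 1.0), (6.0, 1.0)]  # assumed (a_i, b_i) pairs
MC = 9.0                                            # assumed constant marginal cost

g_star = efficient_level(households, MC)
g_free = free_rider_level(households, MC)
print(f"Samuelson level: {g_star:.2f}, free-rider level: {g_free:.2f}")
```

With these assumed schedules the stacked benefit covers the cost well past the point where any one household would stop contributing on its own.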

Intuition

Why it matters: For a loaf of bread, you ask how many loaves to make and parcel them out one to a buyer. For a streetlight, everyone on the street gets the same light at the same time, so the question is not how to divide it but how bright to make it. The answer is to add up how much that extra brightness is worth to every resident at once, and keep adding light until that total stops being worth the cost. That vertical stacking of everyone's small benefit is why a streetlight can be worth building even though no single resident would pay for it alone. Voluntary contributions fall short because each person counts only their own slice of the benefit. On the figure, toggle between vertical and horizontal summation and watch the efficient level jump; the free-rider marker sits well below it.

Figure 25.2. Public-good provision by vertical summation. Three households' marginal-benefit schedules are stacked vertically; the efficient level is where the vertical sum crosses marginal cost. The free-rider level — each contributing only to its own benefit — sits below it. Toggle to horizontal summation to see how a private good is priced instead. Drag sliders to explore.

The Samuelson condition tells the government how much of a pure public good to provide, but most of what governments actually supply is not pure. Two intermediate categories matter. A merit good is one the state provides or subsidizes not because the market would fail to supply it — schooling and basic healthcare can be sold privately — but because, left alone, people are judged to buy too little of it for their own good or for reasons of paternalistic or behavioral concern. The argument is no longer about non-rivalry; it is about whether individuals' own choices track their own welfare. That rationale leans on the behavioral apparatus — present bias, mispredicted future tastes, framing — developed in the behavioral chapter, and is invoked here without re-deriving it.

Merit good. A good the state provides or subsidizes on the judgment that individuals, left to the market, consume too little of it relative to their own true welfare — typically because of present bias, imperfect information, or mispredicted future preferences. The rationale is paternalistic or behavioral rather than a public-good or externality argument; education, preventive health care, and retirement saving are standard examples.
Club good. A good that is non-rival up to a congestion point but excludable, so that access can be priced and membership controlled — a toll road, a swimming pool, a satellite TV signal. The provision problem is to choose the optimal club size: members share the fixed cost, which favors a larger club, but congestion degrades the good, which favors a smaller one. James Buchanan's club theory locates the size where the marginal cost-sharing benefit of one more member equals the marginal congestion cost.
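The club-size balance can be illustrated with one simple functional form. This is a sketch under a strong assumption that is not in the text: each member's total cost is a share of the fixed cost, `fixed_cost / n`, plus a congestion cost that rises linearly with membership, `congestion_per_member * n`. Under that assumption the per-member cost is minimized at $n^{*} = \sqrt{F/k}$, where the saving from spreading the fixed cost over one more member equals the crowding that member adds.

```python
import math


def per_member_cost(n, fixed_cost, congestion_per_member):
    """Assumed cost per member: fixed-cost share plus linear congestion."""
    return fixed_cost / n + congestion_per_member * n


def optimal_club_size(fixed_cost, congestion_per_member):
    """Minimizer of the assumed per-member cost: sqrt(F / k)."""
    return math.sqrt(fixed_cost / congestion_per_member)


F, K = 10_000.0, 4.0  # illustrative fixed cost and congestion parameter
n_star = optimal_club_size(F, K)
print(f"optimal membership: {n_star:.0f}")
for n in (25, 50, 100):
    print(f"n = {n:3d}: cost per member = {per_member_cost(n, F, K):.2f}")
```

A different congestion technology would move the optimum, but the logic of trading cost-sharing against crowding stays the same.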

Between the pure public good (provide it publicly, finance it by tax) and the pure private good (let the market price it) sit the club goods — non-rival until they congest, but excludable, so a price can be charged at the gate. A toll bridge with light traffic serves an extra car at no cost to the others, like a public good; once it jams, each car slows the rest, like a private one. Buchanan's club-size problem balances these: a larger club spreads the fixed cost over more members but worsens congestion, and the optimal membership is where the saving from one more member's dues just offsets the crowding that member adds. The provision menu, then, runs from market to club to public, and the state's first design decision is to place each good on it correctly. Having decided what to provide, the government must raise the revenue to pay for it — and since taxes distort, the schedule it chooses is the hardest problem in the field.

Is housing a market good or a welfare good?

Apparatus stop. The walkthrough argues over whether the state should let a market house people or provide housing itself. The provision menu here — market, merit, public — is the tool that decision needs.

What the model says

Whether to treat housing as something a market should price or something the state should provide is a placement question on the provision menu of this section. Housing is mostly rival and excludable — closer to a private good than a lighthouse — so the pure-public-good case is weak. The live arguments run instead through the merit-good channel (do people under-provide for their own housing security in ways a paternalistic state should correct?) and through externalities and distribution (homelessness imposes costs on others; secure housing has been argued to raise health and schooling outcomes). The Samuelson condition does not settle it; the merit-good and distributional rationales do the work.

The verdict (at this level)

This section gives the walkthrough its vocabulary — public, merit, club, market — and the discipline of asking which rationale actually applies before reaching for direct provision. The contested call (is housing's under-consumption a genuine merit-good case, and is direct public provision better than a housing subsidy that lets a market clear?) is the walkthrough's to argue, not the chapter's. The apparatus is presented straight; the live position lives in the walkthrough.

Apparatus stop — the provision menu (market / merit / public)

What did behavioral economics actually change?

Applied-field stop. The walkthrough tracks where behavioral findings reshaped real fields. The merit-good rationale for public provision is one of those places.

What the model says

The merit-good case for state provision is the cleanest place public economics absorbed behavioral economics. The classical justification for provision is non-rivalry or an externality; the merit-good justification is that people, judged against their own long-run welfare, under-consume schooling, preventive care, or retirement saving because of present bias and mispredicted future tastes. That is a behavioral claim doing public-finance work — it converts a paternalism that the rational-agent model could not license into a defensible design argument.

The verdict (at this level)

This section names the behavioral rationale and uses it to justify merit-good provision, but it does not re-derive prospect theory or present bias — that apparatus lives in the behavioral chapter and is cross-linked, not rebuilt. The walkthrough's job is to argue how far the paternalistic license should run; the chapter's job is to show where it legitimately enters the provision decision.

Applied-field stop — merit goods as behavioral public economics

Meridia decides what to provide

Meridia's coast needs a lighthouse and its towns need preschools. The lighthouse is easy once the ministry sees the structure: no shipping firm will pay for a light every other firm sails by free, and the light costs nothing extra per ship, so Meridia provides it directly and funds it from general revenue — the textbook public good. Preschools are harder. A private market for them could exist, so this is not a free-rider failure; the case for public funding rests on the claim that parents, left alone, under-invest in early education relative to their children's long-run gain. Meridia treats preschool as a merit good and subsidizes it, while leaving the toll bridge across the river as a club good — priced at the gate, sized so the dues collected from one more daily user just offset the congestion that user adds.

25.3 Optimal Taxation: Ramsey, Mirrlees, Diamond-Saez

Suppose the government must raise a fixed sum and can tax several goods. If taxes were costless it would not matter how the burden was split, but the square law of §25.1 makes it matter intensely: a tax's deadweight loss rises with the square of the rate and with how much the taxed quantity responds. The efficient way to raise a given revenue, then, is to lean on the goods whose quantities respond least (the inelastic ones), because taxing them changes behavior, and so destroys surplus, the least. This is the Ramsey rule. In its simplest form it says to set the tax rates so that the proportional reduction in compensated demand is the same for every good, which under the standard assumptions means taxing each good roughly in inverse proportion to its elasticity.

Ramsey optimal-commodity-tax rule. To raise a fixed revenue with the least total deadweight loss, set commodity tax rates so that the compensated quantity of each taxed good falls by the same proportion. When demands are independent (no cross-price effects, the separable case), this reduces to the inverse-elasticity rule: tax each good roughly in inverse proportion to its compensated demand elasticity, taxing inelastic goods more heavily and elastic goods less.
$$\frac{t_k}{p_k} \;\propto\; \frac{1}{\varepsilon_k} \qquad \text{(inverse-elasticity form, separable case)}$$ (Eq. 25.5)

In the general Ramsey problem the planner minimizes total excess burden subject to a revenue constraint, and the first-order conditions equalize the proportional compensated quantity reduction across goods. Under the simplifying assumption that demands are independent (no cross-price effects), this collapses to the inverse-elasticity rule shown: the proportional tax on good $k$ varies inversely with its own compensated elasticity $\varepsilon_k$. The result is purely an efficiency prescription — it minimizes the distortion of raising the revenue and says nothing about who is made poorer.
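The inverse-elasticity rule can be checked on a two-good example. The sketch below leans on simplifications that should be read as assumptions: excess burden per good follows the small-tax approximation of Eq. 25.2, revenue is approximated as rate times pre-tax expenditure, and demands are independent. Minimizing total excess burden subject to the revenue target then gives rates proportional to $1/\varepsilon_k$. The goods, elasticities, and target are illustrative.

```python
def ramsey_rates(elasticities, expenditures, revenue_target):
    """Proportional rates t_k/p_k minimizing total excess burden (small-tax approximation)."""
    scale = revenue_target / sum(E / eps for eps, E in zip(elasticities, expenditures))
    return [scale / eps for eps in elasticities]


def total_excess_burden(rates, elasticities, expenditures):
    """Sum of Eq. 25.2 across goods."""
    return sum(0.5 * eps * r ** 2 * E
               for r, eps, E in zip(rates, elasticities, expenditures))


def revenue(rates, expenditures):
    """First-order revenue: rate times pre-tax expenditure, summed over goods."""
    return sum(r * E for r, E in zip(rates, expenditures))


elasticities = [0.2, 1.0]        # assumed: an inelastic necessity, an elastic good
expenditures = [1000.0, 1000.0]  # assumed pre-tax spending on each good
target = 120.0

ramsey = ramsey_rates(elasticities, expenditures, target)
uniform = [target / sum(expenditures)] * len(expenditures)

eb_ramsey = total_excess_burden(ramsey, elasticities, expenditures)
eb_uniform = total_excess_burden(uniform, elasticities, expenditures)

print(f"Ramsey rates:  {[round(r, 3) for r in ramsey]}, excess burden {eb_ramsey:.2f}")
print(f"uniform rates: {[round(r, 3) for r in uniform]}, excess burden {eb_uniform:.2f}")
```

Both schedules raise the same first-order revenue, but the Ramsey schedule puts the heavier rate on the inelastic good and destroys less surplus, which is exactly the source of its regressivity.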

Intuition

Why it matters: If you must collect a fixed sum and every tax destroys some trades, collect it where the fewest trades will die: tax what people will keep buying anyway. That is the whole efficiency logic. And it has a sting the government cannot avoid: the things people keep buying regardless of price are the necessities, and taxing necessities heavily is exactly what falls hardest on the poor. The most efficient commodity-tax system is regressive almost by construction. On the figure, vary the split between two goods and watch the total excess burden bottom out where the inelastic good carries the heavier rate, then notice that the inelastic good is the necessity.

Figure 25.3. Ramsey inverse-elasticity tax assignment. Total excess burden as a function of how a fixed revenue target is split between an inelastic good and an elastic good; it is minimized when more revenue is drawn from the inelastic good. The regressivity annotation fires when the necessity carries the heavier rate. Drag sliders to explore.

The commodity-tax problem treats the regressivity as a side effect to be regretted. The income tax confronts it directly, because the income tax is the instrument equity actually runs on. James Mirrlees posed the problem in its modern form: a government wants to redistribute from high earners to low earners but cannot observe ability, only income. If it taxes high incomes heavily, the able have less reason to earn — and worse, a high earner can always choose to work less and look like a low earner, so any redistributive schedule must be designed so that no type prefers to mimic a lower one. This is a screening problem, the same incentive-compatibility logic the mechanism-design chapter develops, applied to the whole population's earnings. The optimal income tax is the schedule that buys the most redistribution net of the output it discourages.

Optimal income tax (Mirrlees). The income-tax schedule that maximizes social welfare when the government can observe income but not innate earning ability. Because a high-ability worker can imitate a low-ability one by working less, the schedule must be incentive-compatible — no type prefers to mimic a lower type. The solution trades the redistributive value of taxing high incomes against the output lost when taxation blunts the incentive to earn.
Elasticity of taxable income. The percentage change in reported taxable income in response to a one-percent change in the net-of-tax rate, denoted $e$. It is the single summary statistic for how much behavior — hours, effort, avoidance, reporting — responds to taxation, and it is the central, contested input to the optimal top-rate formula. (Distinct from the commodity-market price elasticities $\varepsilon_d, \varepsilon_s$ of §25.1.)

For the very top of the distribution the problem has a clean and famous answer. Peter Diamond and Emmanuel Saez showed that the optimal top marginal rate balances three forces. The first pushes the rate up: a high earner has a large income above any threshold, so each extra point of rate raises a lot of revenue from that inframarginal base. The second pushes it down: a higher rate makes top earners report less income — work less, avoid more — and how strongly they respond is the elasticity of taxable income, $e$. The third sets the floor: how much society values an extra dollar in a top earner's hands rather than redistributed, the social welfare weight on the rich. When that weight is essentially zero — when the government cares only about the revenue raised from the top to spend on everyone else — the formula reduces to a single expression in the elasticity and the shape of the top tail.

$$\tau^{*} = \frac{1}{1 + a\,e}$$ (Eq. 25.7)

In the revenue-maximizing (zero top welfare weight) limit, the optimal top marginal rate $\tau^{*}$ depends on the elasticity of taxable income $e$ and the Pareto-tail parameter $a$ describing the thickness of the top of the income distribution (empirically $a \approx 1.5$–$2$). The rate falls monotonically as $e$ rises: the more top earners respond to taxation, the lower the revenue-maximizing rate. With a positive welfare weight on top earners the optimum is lower still. The whole policy disagreement about top rates is, formally, a disagreement about the value of $e$ — which is why the next paragraph treats it as a contested band rather than a number.
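A few values make the sensitivity to $e$ concrete. The sketch below evaluates Eq. 25.7 and, as an extension not written out in the text, the standard form of the result with a welfare weight $g$ on top earners, $\tau^{*} = (1-g)/(1-g+a\,e)$, which reduces to Eq. 25.7 at $g = 0$. The elasticity values and the weight of 0.2 are illustrative assumptions, not estimates.

```python
def top_rate(e, a=1.5, g=0.0):
    """Optimal top marginal rate; g = 0 gives the revenue-maximizing case of Eq. 25.7."""
    if e < 0 or a <= 0 or not 0.0 <= g < 1.0:
        raise ValueError("need e >= 0, a > 0 and 0 <= g < 1")
    return (1.0 - g) / (1.0 - g + a * e)


for e in (0.10, 0.25, 0.50, 1.00):
    print(f"e = {e:.2f}: revenue-maximizing rate = {top_rate(e):.1%}, "
          f"with g = 0.2: {top_rate(e, g=0.2):.1%}")
```

Moving $e$ across a plausible range shifts the implied rate by tens of percentage points, which is why the disagreement over the elasticity carries the whole policy argument.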

Intuition

Why it matters: The best top tax rate is a tug-of-war between three pulls. Pulling the rate up: a top earner has a huge income above the threshold, so every extra point of tax raises real money with no further behavioral loss on the dollars they keep earning anyway. Pulling it down: tax them harder and they report less (they work less, they shelter more, they leave), and how hard they pull back is the one number, the elasticity, that nobody has pinned down. Setting the floor: how much you think a dollar is worth in a billionaire's pocket versus a poor family's. The rate where the rope settles is the optimum. You do not need the formula to see it move: on the figure, raise the elasticity and watch the optimal rate fall, because behavior fighting back is exactly what makes high rates self-defeating.

Figure 25.4. The Diamond-Saez optimal top marginal rate as a function of the elasticity of taxable income. The optimal rate falls as the elasticity rises. The shaded band marks the contested mainstream range of the elasticity — the disagreement over the right top rate is, at bottom, this band. Drag sliders to explore.

Where the mainstream lands on the top rate is genuinely unsettled, and honesty requires saying where it leans rather than averaging the disagreement away. The decisive input is the elasticity of taxable income. Saez and collaborators have argued for relatively low elasticities for ordinary earnings — implying optimal top rates in the region of fifty to seventy percent — on the grounds that much of the apparent response to high rates is avoidance and income-shifting that better tax design can curb rather than a real reduction in productive effort. Others read the same evidence as showing larger behavioral responses, which pushes the optimal rate well below that range. The mainstream center of gravity sits at a substantial but not confiscatory top rate, with a frank acknowledgment that the number rests on an elasticity that the profession has not pinned down. The figure shows the disagreement honestly: not a point, but a band.

Should we tax wealth or income?

Apparatus home. The walkthrough argues over taxing stocks of wealth versus flows of income. The incidence, deadweight-loss, and optimal-income-tax machinery it leans on is taught here at full depth.

What the model says

The wealth-versus-income argument runs on three tools this chapter owns. Incidence (§25.1) settles who actually bears a tax once markets adjust — a wealth tax on capital can fall on workers if capital is mobile. The deadweight-loss square law (§25.1) governs how costly each instrument is at the margin. And the Diamond-Saez framework (§25.3) reframes the top-rate question as a balance of the inframarginal base, the behavioral elasticity, and the social weight — the same logic extends, with a different and higher elasticity, to taxing the return on wealth. The walkthrough depth-defers to this section rather than re-deriving any of it.

The verdict (at this level)

The live disagreement — whether a wealth tax's higher behavioral elasticity and administrative leakage make it worse than a well-designed capital-income tax, or whether wealth concentration is a target income taxation cannot reach — is the walkthrough's to argue. This chapter supplies the elasticity-band discipline and the optimal-tax frame; it names where the mainstream leans on the top income rate and stops there. The wealth-tax verdict is downstream, in the walkthrough.

Apparatus home — incidence, deadweight loss, and the optimal income tax

Meridia picks a tax schedule

Meridia's commodity taxes were efficient and regressive, so the government turns to the income tax to do the redistributive work. Its first instinct — a very high rate on top earners — runs straight into the screening problem: the most able can always earn less and look ordinary, so the schedule has to be designed around the prospect of mimicry. Meridia's analysts compute the revenue-maximizing top rate and find it pivots entirely on one number, how much top earners' reported incomes shrink when taxed. They cannot pin it down, so they report a range rather than a point and set a substantial top rate near the upper end of the elasticity-implied band, with the explicit caveat that if the behavioral response proves larger than assumed, the rate is too high. Honesty about the unknown elasticity, not false precision, is the deliverable.

25.4 Social Insurance and the Welfare State

A worker wants insurance against losing a job, a household against catastrophic illness, an old person against outliving their savings. In principle a private market could sell each of these. In practice the markets thin out or vanish, and the reason is the same one the market-failures chapter named: adverse selection. The people who most want unemployment insurance are those most likely to be laid off; the people who most want health insurance are those who expect to fall ill. An insurer who cannot tell them apart must price for the worst risks, which drives the good risks out, which raises the price further, until the market unravels. The state has one lever no private insurer has — compulsion. By requiring everyone into the pool, it stops the good risks from leaving, and the pool survives. This is the distinctive economic case for social insurance, and it is a case the unaided market genuinely cannot meet.

Social insurance. Government-provided insurance against risks — unemployment, illness, disability, longevity — financed by compulsory contributions. Its core economic rationale is that compulsion defeats the adverse-selection unraveling (Rothschild-Stiglitz) that thins or destroys private insurance markets: by forcing good and bad risks into a common pool, the state sustains coverage that voluntary markets cannot. It is distinct from, though often bundled with, redistribution.
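The unraveling can be traced round by round. The sketch below is a toy model under stated assumptions: each person has a known expected claim cost, the insurer cannot observe it and charges the average cost of whoever is currently in the pool, and a person stays only if the price is at most `risk_premium` times their own expected cost. The function name, the willingness-to-pay rule, and the cost figures are hypothetical.

```python
def voluntary_pool(expected_costs, risk_premium=1.2, max_rounds=100):
    """Iterate break-even pricing until the set of buyers stops shrinking.

    Returns (remaining_buyers, price). A buyer stays only if
    risk_premium * own_expected_cost >= price (assumed rule).
    """
    buyers = list(expected_costs)
    price = None
    for _ in range(max_rounds):
        if not buyers:
            return [], None
        price = sum(buyers) / len(buyers)   # insurer breaks even on the current pool
        stay = [c for c in buyers if risk_premium * c >= price]
        if len(stay) == len(buyers):        # nobody left this round: pool is stable
            return buyers, price
        buyers = stay                       # good risks exit, so the price rises next round
    return buyers, price


costs = [100 * k for k in range(1, 11)]      # ten people, expected costs 100..1000
buyers, price = voluntary_pool(costs)
print("voluntary pool:", buyers, "price:", price)
print("compulsory pool price:", sum(costs) / len(costs), "covering all", len(costs))
```

In this example the voluntary market shrinks to the three highest risks at a price far above the population average, while the compulsory pool covers everyone at the average cost. The numbers carry no empirical weight; only the direction of the spiral matters.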

Compulsion solves adverse selection but reintroduces the other classic insurance problem: moral hazard. An insured worker searches less hard for a new job; an insured patient consumes more care; an annuitant is freed from the discipline of self-funded retirement. The more generous the insurance, the more it smooths consumption when the bad event strikes — its benefit — and the more it dulls the incentive to avoid or shorten the bad event — its cost. The optimal generosity sits where these balance. Martin Baily and Raj Chetty formalized this for unemployment insurance: the optimal replacement rate is the one at which the marginal consumption-smoothing benefit of a more generous benefit just equals its marginal moral-hazard cost.

Baily-Chetty optimal replacement rate. The level of social-insurance generosity (the replacement rate $R$, the fraction of lost income the benefit replaces) that balances the marginal consumption-smoothing benefit against the marginal moral-hazard cost. The benefit is larger the more risk-averse the worker and the deeper the consumption drop the bad event would otherwise cause; the cost is larger the more the insured behavior (job-search duration, care use) responds to the benefit. Higher behavioral response means a lower optimal replacement rate.
$$\underbrace{\gamma \cdot \tfrac{\Delta c}{c}}_{\text{consumption-smoothing benefit}} \;=\; \underbrace{\frac{d \log D}{d \log R}}_{\text{moral-hazard cost}}$$ (Eq. 25.8)

In the Baily-Chetty condition the optimal replacement rate equates a benefit term — proportional to the coefficient of relative risk aversion $\gamma$ times the proportional consumption drop $\Delta c/c$ that the benefit prevents — to a cost term, the elasticity of the insured behavior (here the duration of unemployment $D$) with respect to the benefit. The condition is a sufficient-statistics result: it requires only the observable consumption drop, a measure of risk aversion, and the behavioral elasticity, not a fully specified structural model. As the behavioral elasticity rises, the cost term grows and the optimal replacement rate falls.
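A small numerical sketch shows how Eq. 25.8 pins down a replacement rate once a shape is assumed for the consumption drop. The assumption here, which is not part of the Baily-Chetty result itself, is that the proportional drop shrinks linearly with generosity, $\Delta c/c = d_0\,(1-R)$, where $d_0$ is the drop with no insurance. Setting $\gamma\,d_0\,(1-R)$ equal to the elasticity then gives $R^* = 1 - \varepsilon/(\gamma\,d_0)$. All parameter values are hypothetical.

```python
def optimal_replacement_rate(gamma, drop_uninsured, elasticity):
    """Solve gamma * d0 * (1 - R) = elasticity for R, clamped to [0, 1].

    gamma          -- coefficient of relative risk aversion
    drop_uninsured -- proportional consumption drop d0 when R = 0 (assumed linear in R)
    elasticity     -- d log D / d log R, the moral-hazard term of Eq. 25.8
    """
    r_star = 1.0 - elasticity / (gamma * drop_uninsured)
    return min(1.0, max(0.0, r_star))


# Hypothetical inputs: gamma = 3, uninsured drop of 30 percent.
for eps in (0.30, 0.45, 0.60, 0.90):
    r = optimal_replacement_rate(gamma=3.0, drop_uninsured=0.30, elasticity=eps)
    print(f"elasticity = {eps:.2f} -> optimal replacement rate = {r:.2f}")
```

The loop reproduces the comparative static in the text: holding risk aversion and the consumption drop fixed, a larger behavioral elasticity lowers the replacement rate, and a large enough one drives it to zero.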

Intuition

Why it matters: Insurance is a seesaw. On one side is the relief it buys: a laid-off worker who would otherwise see their living standard collapse instead keeps the lights on, and that smoothing is worth a great deal when the fall would be steep and the worker fears it. On the other side is the slack it induces: paid most of their wage to be unemployed, a worker takes longer to find the next job. The right replacement rate is where the seesaw balances: insure against the suffering, but not so generously that you are paying people to stay out of work. And the balance point moves: the more strongly behavior responds to the benefit, the further the seesaw tips toward less insurance. On the figure, raise the moral-hazard elasticity and watch the optimal replacement rate slide down.

Figure 25.5. The Baily-Chetty insurance-incentive balance. The marginal consumption-smoothing benefit (falling in the replacement rate) and the marginal moral-hazard cost (rising in it) cross at the optimal replacement rate. Raising the moral-hazard elasticity steepens the cost curve and slides the crossing point — the optimal rate — down. Drag sliders to explore.

The welfare state's three largest programs are this same apparatus applied to three risks. Unemployment insurance is the Baily-Chetty problem in its native form, and the empirical work centers on the duration elasticity — how much longer the insured stay unemployed — which is the moral-hazard cost the optimal replacement rate trades against. Public pensions answer a different failure: the private annuity market is thin because of adverse selection (those who buy annuities expect to live long), so the state pools longevity risk compulsorily and lets people smooth consumption across a life they cannot predict the length of. Public health insurance answers adverse selection in its sharpest form — the sick most want coverage — and the case for compulsory or single-payer pooling is precisely that it defeats the unraveling a voluntary health-insurance market suffers. Each program is one structural insight, not a tour of benefit rules.

Pay-as-you-go vs. funded pensions. A funded pension invests each cohort's contributions and pays them their own accumulated savings; a pay-as-you-go (PAYG) pension taxes today's workers to pay today's retirees, transferring across generations rather than across a lifetime. PAYG embeds an intergenerational transfer and is exposed to demographic change; funded systems are exposed to asset returns. The choice between them is partly an insurance question (PAYG insures against poor lifetime asset returns) and partly a redistribution question (across cohorts).

That pension example exposes a distinction that runs through the entire welfare state and is easy to miss. The same transfer can be doing two completely different things. When a pension takes from a worker's earning years and pays them in retirement, the recipient is the contributor's own future self, and the transfer is insurance — smoothing one person's consumption across time and across the risk of an unknown lifespan. When a pension takes more from high lifetime earners and pays more to low lifetime earners, the recipient is a different household, and the transfer is redistribution — moving resources across the distribution. Real programs mix both, and which one is first-order differs by program. A flat contributory pension is mostly insurance; a means-tested old-age benefit is mostly redistribution. Seeing the welfare state as two programs sharing one budget line is the conceptual key to the redistribution section that follows.

Figure 25.6. The same pension, decomposed. A stylized pay-as-you-go pension's total transfer split into its across-time insurance component (the worker's own smoothed life-cycle consumption) and its across-household redistribution component (from high to low lifetime earners). As the benefit formula is made more progressive, the redistribution slice grows and the insurance slice shrinks. Toggle the progressivity to see the mix shift.

What is the welfare state for?

Apparatus home for two of the three rationales. The walkthrough separates insurance, redistribution, and macro stabilization. The insurance machinery — and the insurance-vs-redistribution distinction — is taught here.

What the model says

The walkthrough's argument turns on keeping three jobs of the welfare state distinct. Two of them are this chapter's: the insurance rationale (compulsory pooling defeats adverse selection; the Baily-Chetty replacement rate sets its generosity, §25.4) and the redistribution rationale (the social welfare function and the leaky bucket, §25.5). The insurance-vs-redistribution decomposition, in which the same transfer reads as one or the other depending on who the recipient is, is the spine the walkthrough draws on here.

The verdict (at this level)

The third rationale — that the welfare state acts as an automatic macroeconomic stabilizer, propping up demand in downturns through transfers that rise as incomes fall — is not taught here. That is the multiplier and automatic-stabilizer apparatus of the intro-macro and monetary-fiscal chapters, and the walkthrough routes its stabilization beat there, not to this chapter. This section owns the microeconomics of insuring and redistributing; the macro stabilization role stays where it is built.

Apparatus home — social insurance and the insurance/redistribution split

Meridia insures its citizens

Meridia's private unemployment insurers had collapsed — only workers who feared layoffs bought policies, prices rose, the rest left, the market unraveled. The government makes coverage compulsory, restoring the pool, then has to choose how generous to be. Its analysts measure the consumption drop that unemployed Meridians suffer (large, which argues for generosity) and the degree to which benefits lengthen the time people stay unemployed (the duration elasticity, which argues for restraint), and set the replacement rate where the two balance. For pensions they pool longevity risk compulsorily, since no Meridian knows how long they will live; for healthcare they mandate a single pool, since the sick would otherwise be the only buyers. When a critic complains that the pension also takes more from the rich than it returns to them, Meridia's analysts agree — and note that this part of the pension is not insurance at all but redistribution, the subject of the next decision.

25.5 Redistribution: The Social Welfare Function and the Leaky Bucket

How much should a government redistribute? The question has two parts that are easy to run together and must be kept apart. The first is a value question: how much does society care about an extra dollar in a poor household's hands relative to a rich one's? The second is a positive question: how much of a dollar taken from the rich actually arrives in the poor household's hands, and how much is lost on the way? Economics has no answer to the first — it is a judgment about social priorities — but it gives that judgment a precise form, the social welfare function, and it has a great deal to say about the second.

Social welfare function. A function $W = W(u_1, \ldots, u_n)$ that aggregates individual utilities into a single social objective, encoding how society trades one person's welfare against another's. The utilitarian form $W = \sum_i u_i$ weights everyone's utility equally and so favors redistribution only through diminishing marginal utility; the Rawlsian form $W = \min_i u_i$ cares only about the worst-off and so favors redistribution maximally. Real social objectives sit on a dial between these poles.
$$W = W(u_1, \ldots, u_n), \qquad W_{\text{util}} = \sum_i u_i, \qquad W_{\text{Rawls}} = \min_i u_i$$ (Eq. 25.9)

The social welfare function $W$ maps the profile of individual utilities into a social ordering. Two special cases anchor the dial: the utilitarian $\sum_i u_i$, which is indifferent to the distribution of utility itself and redistributes only because money has diminishing marginal utility, and the Rawlsian $\min_i u_i$, which evaluates society by its worst-off member and so redistributes until the bottom can be raised no further. A constant-elasticity form $W = \sum_i u_i^{1-\rho}/(1-\rho)$ spans the two as the inequality-aversion parameter $\rho$ runs from $0$ (utilitarian) toward $\infty$ (Rawlsian).
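The dial can be made concrete by ranking two utility profiles under different values of $\rho$. The sketch below is illustrative only: the two profiles are hypothetical, utilities must be positive, and the logarithmic branch at $\rho = 1$ is the usual limiting case, included so the function is defined everywhere on the dial.

```python
import math


def swf(utilities, rho):
    """Constant-elasticity social welfare: sum of u**(1 - rho) / (1 - rho).

    rho = 0 gives the utilitarian sum; larger rho weights low utilities more.
    At rho = 1 the formula is undefined, so the log form (its limit) is used.
    """
    if rho == 1:
        return sum(math.log(u) for u in utilities)
    return sum(u ** (1 - rho) / (1 - rho) for u in utilities)


equal = [4.0, 4.0]      # hypothetical: less total utility, evenly spread
unequal = [1.0, 9.0]    # hypothetical: more total utility, worst-off is lower

for rho in (0, 0.5, 1, 2, 5):
    better = "equal" if swf(equal, rho) > swf(unequal, rho) else "unequal"
    print(f"rho = {rho}: prefers the {better} profile")
print("Rawlsian (min):", "equal" if min(equal) > min(unequal) else "unequal")
```

At $\rho = 0$ the larger total wins; as $\rho$ rises the ranking switches to the profile with the better-off bottom, which is the profile the Rawlsian criterion picks directly.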

The positive part of the question is Arthur Okun's leaky bucket. Carrying water from the rich to the poor, the bucket leaks: some of every transferred dollar is lost to administrative cost and, more importantly, to the behavioral distortions the earlier sections measured — the taxes that fund the transfer deaden effort, and the transfer itself can blunt the recipient's incentive to earn. So a dollar taken from the rich arrives as less than a dollar. Whether redistribution is worth it depends jointly on how much society weights the recipient (the social welfare function) and how leaky the bucket is (the empirical magnitude of the distortions). With a small leak and a strong concern for the poor, almost any transfer is worth making; with a large leak and weak concern, even a modest transfer fails the test. The verdict can flip with either input.

Leaky bucket (Okun). Arthur Okun's metaphor for the efficiency cost of redistribution: a dollar taken from a high-income household arrives at a low-income household as less than a dollar, the difference lost to administrative cost and to the behavioral distortions (reduced work, reduced saving) the funding tax and the transfer induce. The size of the leak — the share of each dollar lost in transit — is the empirical crux of the redistribution debate.
$$dW = \underbrace{(g_{\text{poor}} - g_{\text{rich}})}_{\text{distributional gain}} \;-\; \underbrace{L}_{\text{leak: admin + behavioral DWL}} \quad \text{per dollar transferred}$$ (Eq. 25.10)

The social-welfare change from a marginal dollar of redistribution is the distributional gain — the difference between the social marginal value $g$ of a dollar to the recipient and to the donor, which is large under a Rawlsian $W$ and small under a near-utilitarian $W$ once incomes are close — minus the leak $L$, the per-dollar loss to administration and to the behavioral deadweight loss of the funding tax and the transfer's own incentive effects. Redistribution is worth expanding while $dW > 0$ and stops at $dW = 0$, which is the equity-efficiency frontier for transfers. Both terms are needed: the sign of $dW$ can be reversed by changing either $W$ or $L$.
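Eq. 25.10 is simple enough to evaluate directly, and doing so shows how the verdict depends on both inputs at once. The sketch below takes the equation as written; the social marginal values assigned to the two outlooks and the leak sizes are hypothetical numbers chosen for illustration.

```python
def net_welfare_change(g_poor, g_rich, leak):
    """Eq. 25.10: distributional gain minus the leak, per dollar transferred."""
    return (g_poor - g_rich) - leak


def break_even_leak(g_poor, g_rich):
    """Largest leak at which the marginal transfer still does not lower W."""
    return g_poor - g_rich


# Hypothetical social marginal values of a dollar: (recipient, donor).
outlooks = {
    "Rawlsian-leaning": (1.00, 0.10),
    "near-utilitarian": (0.70, 0.35),
}

for name, (gp, gr) in outlooks.items():
    print(f"{name}: break-even leak = {break_even_leak(gp, gr):.2f}")
    for leak in (0.10, 0.30, 0.50):
        dw = net_welfare_change(gp, gr, leak)
        print(f"  leak = {leak:.2f} -> dW = {dw:+.2f}")
```

With these assumed weights the Rawlsian-leaning outlook approves the transfer at every leak shown, while the near-utilitarian outlook approves it at a leak of 0.10, barely at 0.30, and rejects it at 0.50. Changing either the weights or the leak reverses the sign.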

Intuition

Why it matters: Every redistributed dollar arrives as less than a dollar: some leaks out as paperwork, but most as the work and effort that the funding tax and the handout together discourage. Whether the transfer is worth it comes down to two things, and reasonable people who agree on all the economics can still disagree because they disagree on these two. The first is how much you weight a dollar in a poor family's pocket against a dollar in a rich one's, which is a value, not a fact. The second is how leaky the bucket actually is, which is a fact, but a contested one. On the figure, set a small leak and a Rawlsian concern for the worst-off and the transfer is plainly worth it; crank the leak up or flatten your concern toward pure utilitarianism with incomes already close, and the same transfer turns net-negative. The verdict flips.

Figure 25.7. The leaky bucket and the equity-efficiency frontier. The net social-welfare change from a dollar of redistribution, as the leak (administrative plus behavioral loss) and the social welfare function vary. The curve traces net welfare against leak size; the verdict turns positive or negative as the SWF and the leak change. Drag sliders and toggle the SWF to flip the verdict.

Where does the mainstream land on the size of the leak? As with the top tax rate, the honest answer is a calibrated position rather than a settled number. The mainstream view is that the leak is real but, for moderate redistribution, smaller than the strongest efficiency-pessimist claims — most empirical estimates of the behavioral response to the taxes and transfers that fund a moderate welfare state are large enough to matter but not large enough to make redistribution self-defeating. The crucial caveat is that the leak does not stay constant: because deadweight loss rises with the square of the rate, the bucket leaks much faster at high rates of taxation and high transfer-withdrawal rates. A moderate welfare state carries a modest leak; pushing redistribution toward the limits of the schedule runs into the square law, and there the efficiency cost climbs steeply. Naming that shape — modest in the middle, steep at the extremes — is the calibrated position the chapter takes.
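The shape named above, modest in the middle and steep at the extremes, follows from the square law under a simple assumed response. Suppose the tax base shrinks linearly with the rate, $B(t) = B_0(1 - \beta t)$. Then deadweight loss is $\tfrac{1}{2}B_0\beta t^2$, revenue is $tB_0(1-\beta t)$, and the deadweight loss per extra dollar of revenue is $\beta t/(1 - 2\beta t)$. The linear response, the parameter `response` (playing the role of $\beta$), and the rates tried are assumptions of this sketch, which covers only the behavioral leak of the funding tax, not administrative cost.

```python
def marginal_leak(rate, response=1.0):
    """Deadweight loss per extra dollar of revenue under a linear base response.

    Assumes base(t) = B0 * (1 - response * t), so
      DWL(t)     = 0.5 * B0 * response * t**2
      revenue(t) = t * B0 * (1 - response * t)
    and dDWL/dRevenue = response * t / (1 - 2 * response * t).
    Valid only below the revenue-maximizing rate 1 / (2 * response).
    """
    if response <= 0 or not (0 <= rate < 1 / (2 * response)):
        raise ValueError("need response > 0 and 0 <= rate < 1 / (2 * response)")
    return response * rate / (1 - 2 * response * rate)


for t in (0.10, 0.20, 0.30, 0.40):
    print(f"rate = {t:.2f} -> marginal leak per dollar raised = {marginal_leak(t):.2f}")
```

Under these assumptions the marginal leak grows far faster than the rate itself as the rate approaches its revenue-maximizing level, which is the steepening the calibrated position warns about.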

One honest qualification belongs here before the chapter moves to instrument design. Every optimum in this chapter — the optimal tax, the optimal level of provision, the optimal replacement rate, the optimal transfer — has assumed a government that wants to maximize social welfare and is competent to do it. That is a strong assumption. A rival tradition asks whether real governments behave that way at all, or whether they are captured by organized interests, distorted by the incentives of officials and voters, and prone to a government failure that is the mirror image of the market failure the chapter began with. The apparatus here takes the benevolent-planner posture because that is what lets the design problem be posed; it does not establish that actual states achieve the optima it computes. That counter-argument is a tradition in its own right, treated where the history of economic thought takes it up, and the reader should carry the caveat forward rather than mistake the optima for descriptions of what governments do.

Meridia weighs how much to redistribute

With insurance settled, Meridia confronts redistribution proper. Its cabinet splits not over the economics but over the value question: some weight the worst-off heavily, near the Rawlsian pole, and want large transfers; others, closer to utilitarian once incomes are not far apart, want little. The analysts cannot settle that, since it is a judgment, but they can price the leak. They estimate that for the moderate transfers under discussion, each redistributed dollar arrives as roughly seventy cents, the remaining thirty lost to administration and to dulled work incentives. At that leak, the Rawlsian faction's transfers clear the bar and the near-utilitarian faction's barely do; both sides learn that their disagreement is about values plus the contested leak, not about whether redistribution "works." And both accept the analysts' warning that if they push the rates much higher, the square law means the bucket will leak far more than thirty cents on the dollar.

25.6 Transfer Design: Means-Tested vs. Universal, NIT and the EITC

Once a government decides to redistribute, it must choose how. The first choice is between targeting and universalism. A means-tested transfer goes only to those below an income threshold, concentrating resources where need is greatest and so delivering more equity per dollar spent. But targeting has three costs. It is administratively expensive to verify eligibility; it carries stigma that suppresses take-up among the eligible; and, most importantly, withdrawing the benefit as income rises imposes an implicit marginal tax rate on the recipient. A dollar earned reduces the benefit, so the recipient keeps far less than a dollar — the phase-out is a tax in everything but name, and a steep one. A universal transfer avoids these by going to everyone regardless of income, but it spends heavily on households that do not need it and must be funded by higher general taxation.

Means-tested vs. universal transfers. A means-tested transfer is paid only to households below an income or asset threshold and withdrawn as income rises; it targets resources efficiently but imposes a high implicit marginal tax rate (the phase-out) and incurs administrative and take-up costs. A universal (categorical) transfer is paid to everyone in a category regardless of income; it avoids the phase-out distortion and the stigma but spends on the non-needy and requires higher funding taxes.
Implicit (effective) marginal tax rate. The total reduction in net income a household experiences when it earns one more dollar, combining the statutory income-tax rate with the rate at which benefits are withdrawn (the phase-out). A means-tested transfer with a steep phase-out can push the effective marginal tax rate on a low earner above that faced by a high earner, blunting the incentive to work.

Milton Friedman's negative income tax was the clean idea that cuts through the targeting-versus-universalism dilemma: a single integrated schedule that pays a guaranteed minimum to those with no income and taxes earnings at a constant rate, so that the transfer phases out smoothly rather than at a benefit cliff. Its practical descendant is the earned income tax credit, which goes one step further by making the transfer conditional on work. The EITC has three segments. In the phase-in, the credit grows with earnings — the government adds to each dollar earned, so the effective marginal tax rate is negative, and the program pays people to work. In the plateau, the credit is flat. In the phase-out, the credit is withdrawn as income rises, imposing a positive implicit marginal tax rate. The contrast with a means-tested transfer is exact and instructive: means-testing taxes the first dollar a poor household earns at a high implicit rate; the EITC subsidizes those first dollars and only claws back later, gently.

Negative income tax and the EITC. A negative income tax (Friedman) integrates transfers and taxes into one schedule: a guaranteed minimum at zero income, taxed away at a constant rate as earnings rise, eliminating benefit cliffs. The earned income tax credit conditions the transfer on work through a phase-in (credit rises with earnings, a negative effective marginal rate that subsidizes work), a plateau (flat credit), and a phase-out (credit withdrawn, a positive implicit marginal rate). It is the canonical work-conditioned transfer.
$$\text{effective MTR} = \tau_{\text{statutory}} + \phi_{\text{phase-out}}$$ (Eq. 25.11)
$$\text{credit}(w) = \begin{cases} s\,w & 0 \le w \le w_1 \;\text{(phase-in)} \\ s\,w_1 & w_1 < w \le w_2 \;\text{(plateau)} \\ \max\{0,\; s\,w_1 - \phi\,(w - w_2)\} & w > w_2 \;\text{(phase-out)} \end{cases}$$ (Eq. 25.12)

The effective marginal tax rate (Eq. 25.11) adds the statutory income-tax rate $\tau$ to the benefit phase-out rate $\phi$, so a benefit withdrawn at rate $\phi$ taxes earnings exactly as a statutory tax of $\phi$ would. The EITC schedule (Eq. 25.12) pays a subsidy at rate $s$ per dollar in the phase-in (where the effective marginal rate is $\tau - s$, which reduces to $-s$ for a worker who owes no statutory tax and is a net subsidy to work whenever $s > \tau$), holds the credit flat across the plateau (effective marginal rate equal to the statutory rate alone), and withdraws it at rate $\phi$ in the phase-out (effective marginal rate $\tau + \phi$). A pure means-tested transfer is the special case with no phase-in: the full phase-out rate applies from the first dollar earned.
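The three segments can be written out as a short function and walked across an earnings range. The sketch follows Eq. 25.12, with one addition that is an assumption of the sketch: the credit is clamped at zero once the phase-out has withdrawn it entirely. The parameter values (`s`, `w1`, `w2`, `phi`, `tau`) are hypothetical and do not correspond to any actual credit schedule.

```python
def eitc_credit(w, s, w1, w2, phi):
    """Credit at earnings w: phase-in at rate s, plateau, phase-out at rate phi."""
    if w <= w1:
        return s * w                                  # phase-in
    if w <= w2:
        return s * w1                                 # plateau
    return max(0.0, s * w1 - phi * (w - w2))          # phase-out, floored at zero (assumed)


def effective_mtr(w, tau, s, w1, w2, phi):
    """Statutory rate plus the credit's marginal effect (undefined exactly at kinks)."""
    w3 = w2 + s * w1 / phi                            # earnings where the credit hits zero
    if w < w1:
        return tau - s                                # subsidy lowers the effective rate
    if w < w2:
        return tau
    if w < w3:
        return tau + phi                              # withdrawal acts like an extra tax
    return tau


# Hypothetical schedule and a flat statutory rate of 10 percent.
params = dict(s=0.40, w1=10_000, w2=20_000, phi=0.20)
for w in (5_000, 15_000, 30_000, 50_000):
    c = eitc_credit(w, **params)
    m = effective_mtr(w, tau=0.10, **params)
    print(f"earnings {w:>6}: credit {c:>7.0f}, effective MTR {m:+.2f}")

# A means-tested benefit withdrawn at 50 percent taxes the first dollar at tau + 0.50.
print("means-tested first-dollar rate:", 0.10 + 0.50)
```

With these assumed numbers the effective rate is negative in the phase-in, equals the statutory rate on the plateau, rises by the phase-out rate while the credit is withdrawn, and returns to the statutory rate once the credit is exhausted.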

Intuition

Why it matters: Targeting the poor by means-testing has a hidden trap: because the benefit shrinks as you earn, every dollar a poor person earns is partly clawed back, so they can face a steeper effective tax on their first dollar than a rich person faces on their last. The earned income tax credit turns this upside down at the bottom: it pays you more for working more, so the first dollars earned are subsidized, not taxed, and only further up the scale does it withdraw the credit gently. Means-testing punishes the first dollar earned; the credit rewards it, then withdraws softly. On the figure, walk a worker's earnings from zero upward and watch the effective tax rate dive negative across the EITC's phase-in, sit flat on the plateau, and climb in the phase-out; then overlay a means-tested transfer and see it start high from the very first dollar.

Figure 25.8. The effective marginal tax rate across an EITC schedule. Negative in the phase-in (work is subsidized), equal to the statutory rate alone on the plateau, positive in the phase-out (the credit claws back). Overlay a means-tested transfer to see its high implicit rate apply from the first dollar earned. Drag sliders and toggle the overlay.

Meridia designs the transfer

Meridia's redistribution could go to households below a poverty line, sharply targeted. But the analysts show the cabinet the trap: withdrawing that benefit as recipients earn would tax their first earned dollar at over fifty percent, and Meridians on the program would face a steeper effective rate on a dollar of work than the country's top earners face on theirs. So Meridia instead builds a work-conditioned credit on Friedman's logic — it tops up low earnings, so the first dollars a poor Meridian earns are subsidized rather than clawed back, and only at higher earnings does the credit withdraw gently. The cabinet accepts that this spends some money on households just above the bottom, the price of not punishing work. Which level of government should run it is the last question.

25.7 Fiscal Federalism (Brief): Who Should Tax and Provide

Every decision so far has assumed a single government. Most countries have several layers — national, regional, local — and the last question is which layer should do which job. This is the assignment problem, and it has a clean efficiency logic with one sharp catch. The efficiency case for decentralization is that local governments can match public goods to local tastes. A coastal town wants harbors; a mountain town wants ski-patrol roads; a single national bundle would over-provide one and under-provide the other. If households can move, they will sort themselves into the jurisdiction whose bundle of public goods and taxes best fits their preferences — voting with their feet. Charles Tiebout showed that this mobility can, in principle, reveal preferences for local public goods that voting in a single jurisdiction cannot, and discipline local governments through the threat of exit.

Fiscal federalism and the assignment problem. The study of which level of government — national, regional, local — should tax, spend on, and provide which public functions. The assignment problem is to allocate each function to the level that handles it best: local provision matches goods to local preferences; central provision internalizes spillovers and sustains redistribution. The decentralization tradeoff weighs preference-matching and inter-jurisdictional competition against spillovers, scale economies, and the mobility that undermines local redistribution.
Tiebout model. Charles Tiebout's model in which mobile households "vote with their feet," sorting into the jurisdiction whose bundle of local public goods and taxes best matches their preferences. Mobility can reveal preferences for local public goods and discipline local governments through the threat of exit — but the same mobility lets high-income households flee local redistributive taxes, which is why redistribution cannot be sustained at the local level.

The catch is that the same mobility that makes decentralized provision efficient makes decentralized redistribution impossible. If a town raises a local tax to fund transfers to its poor, its rich households can simply move to the next town, taking their tax base with them, while poor households move in to claim the benefit. The redistributive tax erodes its own base. Redistribution can only be sustained at a level no one can easily exit, the national government, because exit is what disciplines local provision and what destroys local redistribution at the same time. The decentralization theorem, Wallace Oates's result, captures the resulting assignment: provide a public good locally when preferences differ across jurisdictions and spillovers are small, centrally when preferences are uniform or benefits spill across borders, and assign redistribution to the center regardless. The figure makes the redistribution catch concrete: switch on a local redistributive tax and watch the high-income households leave.
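The base erosion can be illustrated with a deliberately crude exit rule. In the sketch below each household has an income and a moving cost, pays a proportional local tax, and leaves when its tax bill exceeds its moving cost. The rule, the function name, and the household figures are hypothetical; the sketch also ignores the in-migration of poor households described above, which would make the local budget worse still.

```python
def local_revenue(households, tax_rate):
    """Return (revenue, stayers) for a local proportional tax.

    households -- list of (income, moving_cost) pairs
    A household exits when tax_rate * income > moving_cost (assumed rule).
    """
    stayers = [(y, m) for y, m in households if tax_rate * y <= m]
    revenue = sum(tax_rate * y for y, _ in stayers)
    return revenue, len(stayers)


# Hypothetical town: incomes from 30 to 400, everyone faces a moving cost of 6.
town = [(30, 6), (50, 6), (100, 6), (200, 6), (400, 6)]
for t in (0.01, 0.05, 0.10, 0.25):
    revenue, stayers = local_revenue(town, t)
    print(f"tax rate {t:.2f}: {stayers} households stay, revenue {revenue:.1f}")
```

In this example revenue first rises with the rate, then falls as the highest-income households leave, and reaches zero once the tax bill exceeds every household's moving cost. The highest earners exit first because their tax bill crosses the moving cost at the lowest rate.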

Intuition

Why it matters: Local government is good at one thing and hopeless at another, and it is the same fact that makes it both. People can move between towns, so towns must compete to offer the public goods residents want; that is what keeps local provision responsive. But that same ability to move means a town that tries to tax its rich to help its poor will simply watch its rich leave for the next town. You cannot redistribute at a level people can walk away from. So the rule falls out almost by itself: let towns provide the goods that local tastes differ over, and leave redistribution to the level no one can exit. On the figure, switch on a local tax meant to redistribute and watch the high-income households vanish to the untaxed jurisdiction, collapsing the very revenue the tax was supposed to raise.

Figure 25.9. Tiebout sorting and the redistribution catch. Households sort across jurisdictions by preference. When jurisdiction A imposes a local redistributive tax, its high-income households exit to untaxed jurisdictions, collapsing the local tax base the redistribution depended on. Toggle the local tax to see the exit.

Meridia divides the work between its governments

Meridia's coastal and mountain provinces want different things, so the national government devolves local public goods — harbors, ski roads, parks — to the provinces, which compete to attract residents by tailoring the mix. But when one prosperous province tries to fund generous transfers from a local tax on its wealthy residents, those residents quietly relocate to a neighboring province, and the redistribution collapses with the base. Meridia learns Oates's lesson directly: it keeps redistribution and the social-insurance pools national, where no household can opt out by moving, and leaves the preference-sensitive public goods to the provinces. The government's seven decisions are made — and each was a point chosen on the equity-efficiency tradeoff that the chapter opened with.

Historical Lens

The optimal-tax program, 1971–2011. The modern theory of optimal income taxation began with James Mirrlees's 1971 solution to the screening problem of an unobservable-ability population and reached its most usable form with Peter Diamond and Emmanuel Saez's 2011 statement of the top-rate formula in terms of estimable sufficient statistics. The intellectual descent of that program — from Pigou's welfare economics through Frank Ramsey's 1927 commodity-tax rule to Mirrlees and Diamond-Saez — belongs to the history of economic thought rather than to the apparatus, and is taken up there.

The welfare state was built, not derived. The compulsory social-insurance pools justified above as apparatus were constructed in waves — Bismarck's German social insurance in the 1880s, the Beveridge-era British settlement, and the broad postwar OECD expansion — a history narrated in the Economic History book's postwar golden age chapter rather than re-told here.

Summary

  1. Taxation. A tax's economic burden is set by relative elasticities, not by who legally pays it — the inelastic side bears more. The deadweight loss rises with the square of the rate, so small broad taxes are cheap and high rates expensive. The equity-efficiency tradeoff organizes the field: the inelastic things most efficient to tax are often the necessities it is least fair to tax.
  2. Provision. A public good is efficiently provided where the vertical sum of marginal benefits equals marginal cost (Samuelson), unlike the horizontal summation that prices a private good; free-riding under-provides it. Merit goods justify provision on behavioral grounds; club goods are priced and sized for the cost-sharing-versus-congestion balance.
  3. Optimal taxation. Ramsey says tax inelastic goods more, a rule that is efficient and regressive. Mirrlees poses the income tax as a screening problem over a population whose ability cannot be observed. The Diamond-Saez top rate balances inframarginal revenue, the behavioral elasticity, and the welfare weight, falling as the elasticity rises; where the mainstream lands on that elasticity, and hence on the top rate, is genuinely contested.
  4. Social insurance. Compulsory pooling defeats the adverse-selection unraveling that thins private insurance. The Baily-Chetty replacement rate balances consumption-smoothing against moral hazard, falling as the behavioral response rises. UI, pensions, and health insurance are this apparatus applied to three risks. The same transfer can be insurance (a payment to one's own future self) or redistribution (a payment to another household).
  5. Redistribution. The social welfare function (utilitarian to Rawlsian) encodes how much society values the poor; Okun's leaky bucket measures what is lost in transit. The optimal amount depends jointly on both, and the verdict can flip with either. The mainstream view: the leak is real but modest for moderate redistribution, rising steeply at high rates.
  6. Transfer design. Means-testing targets resources but imposes a high implicit marginal tax rate on the poor's first earned dollar. The negative income tax integrates transfer and tax into one smooth schedule; the EITC subsidizes the first dollars of work before withdrawing gently — the same redistribution with the opposite work incentive at the bottom.
  7. Fiscal federalism. Local provision matches public goods to local tastes (Tiebout), but mobility lets the rich flee local redistributive taxes, so redistribution must be assigned to the level no one can exit. The Oates decentralization theorem assigns each function accordingly.
  8. The benevolent-planner caveat. Every optimum here assumes a competent, welfare-maximizing government. The public-choice tradition asks whether real governments behave that way — a caveat to carry forward, not a result the apparatus overturns.
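Three of the quantitative claims above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the approximations in Eq. 25.1, Eq. 25.2, and Eq. 25.7 as listed in the key equations, and the function names and sample numbers are hypothetical, not taken from the chapter.

```python
def buyer_share(eps_supply, eps_demand):
    """Eq. 25.1: fraction of a tax borne by buyers."""
    return eps_supply / (eps_supply + eps_demand)


def excess_burden(eps, rate, price, quantity):
    """Eq. 25.2: approximate deadweight loss, with rate = t/p."""
    return 0.5 * eps * rate ** 2 * price * quantity


def top_rate(a, e):
    """Eq. 25.7: revenue-maximizing top marginal rate."""
    return 1.0 / (1.0 + a * e)


# Illustrative inputs (assumed for this sketch, not from the text).
# Demand is the less elastic side here, so buyers bear the larger share.
print("buyer share:", buyer_share(eps_supply=2.0, eps_demand=1.0))

# Doubling the rate should roughly quadruple the excess burden.
low = excess_burden(eps=1.0, rate=0.10, price=20.0, quantity=50.0)
high = excess_burden(eps=1.0, rate=0.20, price=20.0, quantity=50.0)
print("excess burden at 10% and 20%:", low, high, "ratio:", high / low)

# The top rate falls as the behavioral elasticity e rises.
for e in (0.25, 0.50, 1.00):
    print("e =", e, "-> top rate", round(top_rate(a=2.0, e=e), 3))
```

The ratio printed for the two excess burdens is the square-of-the-rate result in item 1, and the loop shows the direction of the elasticity effect in item 3; the levels depend entirely on the assumed inputs.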

Key Equations

| Label | Equation | Description |
|---|---|---|
| Eq. 25.1 | $\text{buyer's share} = \varepsilon_s/(\varepsilon_s+\varepsilon_d)$ | Tax incidence by elasticity (the inelastic side bears more) |
| Eq. 25.2 | $\text{EB} \approx \tfrac{1}{2}\varepsilon(t/p)^2 pQ$ | Excess burden: deadweight loss rises with the square of the rate |
| Eq. 25.4 | $\sum_i \text{MRS}_i = \text{MRT}$ | Samuelson condition for efficient public-good provision |
| Eq. 25.5 | $t_k/p_k \propto 1/\varepsilon_k$ | Ramsey inverse-elasticity commodity-tax rule |
| Eq. 25.7 | $\tau^* = 1/(1+ae)$ | Diamond-Saez top marginal rate (revenue-maximizing form) |
| Eq. 25.8 | $\gamma \cdot \Delta c/c = d\log D / d\log R$ | Baily-Chetty optimal replacement rate |
| Eq. 25.9 | $W = W(u_1,\ldots,u_n)$; $\sum u_i$ / $\min u_i$ | Social welfare function (utilitarian / Rawlsian) |
| Eq. 25.10 | $dW = (g_{\text{poor}}-g_{\text{rich}}) - L$ | Leaky-bucket net welfare of a marginal transfer |
| Eq. 25.11 | $\text{effective MTR} = \tau + \phi$ | Effective marginal tax rate of a means-tested transfer |
| Eq. 25.12 | $\text{credit}(w)$: piecewise phase-in / plateau / phase-out | EITC schedule |
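Eq. 25.12 is stated only in words, so a concrete version may help. The sketch below is one possible piecewise schedule, not the chapter's own: the earnings thresholds, the rates, and the function names are all assumptions chosen for illustration. It also recovers the effective marginal tax rate numerically, which should come out as $\tau - s$ in the phase-in, $\tau$ on the plateau, and $\tau + \phi$ in the phase-out (the last matching Eq. 25.11).

```python
def eitc_credit(w, s=0.30, phi=0.15,
                phase_in_end=10_000.0, phase_out_start=15_000.0):
    """Hypothetical EITC: subsidy rate s, then a plateau, then withdrawal at phi."""
    max_credit = s * phase_in_end
    if w <= phase_in_end:
        return s * w                      # phase-in: credit grows with earnings
    if w <= phase_out_start:
        return max_credit                 # plateau: credit is constant
    # phase-out: credit shrinks at rate phi and never goes below zero
    return max(0.0, max_credit - phi * (w - phase_out_start))


def net_income(w, tau=0.10, **eitc):
    """Earnings after a flat background tax tau, plus the credit."""
    return w - tau * w + eitc_credit(w, **eitc)


def effective_mtr(w, step=1.0, tau=0.10, **eitc):
    """Share of one extra unit of earnings that the worker does not keep."""
    gain = net_income(w + step, tau, **eitc) - net_income(w, tau, **eitc)
    return 1.0 - gain / step


# One sample point in each segment, plus one beyond the phase-out.
for w in (5_000.0, 12_000.0, 20_000.0, 40_000.0):
    print(w, round(eitc_credit(w), 2), round(effective_mtr(w), 3))
```

A negative effective rate in the phase-in segment means the worker keeps more than each extra unit earned, which is the work incentive at the bottom that distinguishes this design from a transfer withdrawn from the first dollar.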

Practice

  1. A per-unit tax of \$3 is levied on a good with demand elasticity $\varepsilon_d = 0.5$ and supply elasticity $\varepsilon_s = 1.5$. (a) Using Eq. 25.1, compute the share of the tax borne by buyers and by sellers. (b) Show that the answer is unchanged if the tax is instead levied statutorily on sellers. (c) Which side is more inelastic, and is that consistent with who bears more?
  2. A tax raises a good's price by 20% from an initial \$10 (so $t/p = 0.2$), reducing quantity from 100 to 90 units, with compensated elasticity $\varepsilon = 0.5$. (a) Estimate the deadweight loss using Eq. 25.2. (b) If the proportional tax doubles to 40%, what happens to the deadweight loss, and why?
  3. Classify each good on the rivalry × excludability matrix and name the right provision instrument: (a) a streetlight, (b) a loaf of bread, (c) a congested toll bridge, (d) an ocean fishery. For the public good, state the Samuelson condition; for the club good, state what determines its optimal size.

Application

  1. A government must raise revenue from two goods, a necessity with elasticity 0.4 and a luxury with elasticity 2.0. (a) Using the Ramsey inverse-elasticity rule (Eq. 25.5), which good should carry the higher tax rate, and in roughly what ratio? (b) Explain why the efficient commodity-tax structure is regressive. (c) Why does this push redistributive work onto the income tax rather than commodity taxes?
  2. Using the Diamond-Saez formula $\tau^* = 1/(1+ae)$ with Pareto parameter $a = 1.6$: (a) compute the revenue-maximizing top rate for $e = 0.2$, $e = 0.4$, and $e = 0.8$. (b) Describe in words why the optimal rate falls as $e$ rises. (c) Given the contested mainstream range of $e$, what range of top rates does the apparatus support, and why is a single number dishonest?
  3. A worker faces a 25% consumption drop at job loss, has a coefficient of relative risk aversion of 2, and the elasticity of unemployment duration with respect to the benefit is 0.5. (a) Using the Baily-Chetty balance, sketch the marginal benefit and cost of a higher replacement rate and find the optimal rate. (b) Recompute when the duration elasticity doubles to 1.0. (c) State the policy lesson.
  4. An EITC has a phase-in subsidy rate $s = 0.40$, a plateau, and a phase-out rate $\phi = 0.21$, with a background statutory rate $\tau = 0.10$. (a) Compute the effective marginal tax rate in each segment. (b) Compare to a means-tested transfer with a flat 50% phase-out from the first dollar. (c) Which design better preserves the incentive to take a low-wage job, and why?

Challenge

  1. Derive the Ramsey inverse-elasticity rule as the solution to minimizing total excess burden, where good $k$ contributes approximately $\tfrac{1}{2}\varepsilon_k t_k^2 B_k$, subject to a fixed revenue target $\sum_k t_k B_k = R$. Read $t_k$ as the proportional tax rate on good $k$ and $B_k$ as its tax base, consistent with Eq. 25.2. (a) Set up the Lagrangian and take first-order conditions. (b) Show the optimum equalizes the marginal excess burden per dollar of revenue across goods. (c) Reduce to the inverse-elasticity form under independent demands and discuss the assumption.
  2. Take a stylized pay-as-you-go pension that taxes workers and pays a benefit that is partly flat and partly earnings-related. (a) Decompose a high lifetime earner's and a low lifetime earner's net lifetime transfer into an across-time insurance component and an across-household redistribution component. (b) Show how making the benefit formula more progressive shifts the mix. (c) Explain why calling the whole program "insurance" or "redistribution" is misleading.
  3. A region considers funding local cash transfers to its poor with a local tax on high-income residents who can move to neighboring regions at low cost. (a) Explain, using the Tiebout logic, why the local redistributive tax erodes its own base. (b) Show why the same mobility that defeats local redistribution is what makes local provision efficient. (c) Conclude where redistribution must be assigned and state the Oates decentralization theorem that captures the result.

Sources

Ramsey (1927); Samuelson (1954); Buchanan (1965); Mirrlees (1971); Atkinson & Stiglitz (1976); Diamond & Saez (2011); Saez (2001); Baily (1978); Chetty (2006, 2008); Rothschild & Stiglitz (1976); Okun (1975); Friedman (1962); Tiebout (1956); Oates (1972); Musgrave (1959).