The Synthetic Control Method in Stata

Did California’s Proposition 99 tobacco tax cut smoking?

−19.0 ATT · packs per capita / yr
0.974 pre-treatment fit (R²)
0.026 in-space placebo p-value

Carlos Mendez

Nagoya University (GSID)

June 11, 2026

The Tension

Act I

Smoking was falling everywhere — so did the tax change anything?

In 1988, California voters passed Proposition 99: a 25-cent-per-pack cigarette tax paired with funded anti-smoking education, effective January 1989.

But national smoking was already falling. What would California have done without the law?

A raw comparison hints at an effect — but the comparator is crude

Cigarette sales per capita: California (solid blue) vs. the unweighted average of 38 control states (dashed grey), 1970–2000. Orange line marks Prop 99 (1989).

Where we’re going

  • The lab: a 39-state, 31-year panel and the ATT estimand
  • Build “synthetic California” with synth2 — a weighted blend of donor states
  • Three inference tools: in-space placebo, in-time placebo, leave-one-out
  • The lesson: a transparent counterfactual beats parallel-trends faith

The Investigation

Act II

The lab: 39 states × 31 years, one treated unit, the ATT

  • Outcome: cigsale, per-capita cigarette sales (packs)
  • Treatment: Prop 99, California only, from 1989
  • Predictors: log income, share aged 15–24, retail price, beer, plus cigsale in 1975/1980/1988

Strongly balanced: 1,209 observations, 19 pre-treatment years and 12 post-treatment years. The estimand is the ATT — the effect on California, the one treated unit — not the ATE.
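A quick way to confirm this panel structure before estimating is sketched below. It assumes the data are already in memory and that the panel identifiers are named state and year; those two names are assumptions, since only the outcome and predictor names appear in this deck.

```stata
* Assumption: panel identifiers are named state and year (not shown in the deck)
xtset state year          // declare the panel; Stata reports whether it is strongly balanced
xtdescribe                // pattern of observations per state

* 39 states x 31 years should give 1,209 rows
count
assert r(N) == 39 * 31

* 19 pre-treatment years (1970-1988) and 12 post-treatment years (1989-2000)
quietly tabulate year if year < 1989
assert r(r) == 19
quietly tabulate year if year >= 1989
assert r(r) == 12
```

If any assert fails, the panel in memory does not have the dimensions described above and should be inspected before estimation.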

SCM builds a counterfactual by matching pre-treatment predictors

\[\min_{W} \sum_{m=1}^{M} v_m \left( X_{1m} - \sum_{j=2}^{J+1} w_j X_{jm} \right)^2\]

Pick donor weights \(w_j \ge 0\) that sum to one and match California’s predictors \(X_{1m}\).

The weights \(v_m\) set how much each covariate matters.

The synthetic control is a convex combination of real states — no extrapolation beyond the donor pool.
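Why a convex combination rules out extrapolation can be seen in one line. Because the weights are nonnegative and sum to one, every synthetic predictor value is bounded by the donor pool's own range:

\[\min_{2 \le j \le J+1} X_{jm} \;\le\; \sum_{j=2}^{J+1} w_j X_{jm} \;\le\; \max_{2 \le j \le J+1} X_{jm}, \qquad w_j \ge 0,\ \sum_{j=2}^{J+1} w_j = 1\]

If California lay outside that range on some predictor, no admissible weights could match it exactly on that predictor, and the mismatch would remain visible instead of being papered over by extrapolation.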

The treatment effect is the gap between actual and synthetic California

\[\hat{\tau}_t = Y_{1t} - \sum_{j=2}^{J+1} w_j^* Y_{jt}\]

The effect in year \(t\) is California’s actual sales minus the synthetic’s prediction. A negative \(\hat{\tau}_t\) means Prop 99 lowered sales.
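Averaging the yearly gaps over the post-treatment window gives a single summary number. Assuming the headline ATT in this deck is the simple mean of the 12 yearly gaps for 1989–2000, it can be written as:

\[\widehat{\mathrm{ATT}} = \frac{1}{12} \sum_{t=1989}^{2000} \hat{\tau}_t\]

A summary of this kind hides the time profile, so it is best read alongside the year-by-year gaps.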

Identification rests on good pre-treatment fit, no interference, and no anticipation — assumptions you can largely inspect, not just assume.

One synth2 call fits the baseline synthetic control

synth2 cigsale lnincome age15to24 retprice beer ///
    cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) xperiod(1980(1)1988) nested allopt

trunit(3) = California · trperiod(1989) = treatment year · xperiod(1980(1)1988) = years over which the predictors are averaged · nested = outer search over V, inner solve for W · allopt = multiple starts to dodge local optima.
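The nested option can be read as a two-level problem. The sketch below uses the standard formulation from the synthetic control literature, which is assumed (not confirmed by this deck) to be what synth2 solves; \(T_0\) denotes the set of pre-treatment years, a symbol introduced here for illustration.

\[W^*(V) = \arg\min_{W} \sum_{m=1}^{M} v_m \left( X_{1m} - \sum_{j=2}^{J+1} w_j X_{jm} \right)^2 \quad \text{s.t. } w_j \ge 0,\ \sum_{j=2}^{J+1} w_j = 1\]

\[V^* = \arg\min_{V} \sum_{t \in T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j^*(V)\, Y_{jt} \right)^2\]

The inner problem finds donor weights for a given V; the outer problem picks the V whose implied weights track California's pre-treatment sales best. The outer problem is in general not convex, which is why extra starting points can help.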

Synthetic California reproduces 97.4% of the pre-1989 path

California actual vs. synthetic California, 1970–2000: near-indistinguishable before 1989, then a widening gap.

Two predictors carry the match: age 15–24 and 1975 sales

Predictor (V-matrix) weights: how much each covariate drives the SCM optimization.

Synthetic California is just five states — one-third Utah

Donor weights: the five states that compose synthetic California (33 others get exactly zero).

By 2000, real California sold 38% fewer packs than its synthetic twin

−19.0

average ATT, packs per capita / yr (1989–2000); −7.6 in 1989 deepening to −26.4 by 1999

The gap deepens steadily through the 1990s

Treatment effect (actual minus synthetic California) over time; the negative gap widens after 1989.

The Resolution

Act III

California’s signal-to-noise ratio is the most extreme of all 39 states

State        Pre MSPE   Post/Pre MSPE
California   3.17       123.5
Georgia      1.46       80.0
Virginia     2.78       79.0
Missouri     1.20       70.9

The post/pre MSPE ratio asks: how much worse does fit get after 1989 than before? California’s is highest by far.
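Written out, the ratio for state \(j\) compares the mean squared gap after 1989 with the mean squared gap before it. In this sketch \(\hat{\tau}_{jt}\) is the gap obtained when state \(j\) is treated as if it had passed the law, and \(T_0 = 19\), \(T_1 = 12\) are the pre- and post-treatment year counts (notation introduced here):

\[\text{ratio}_j = \frac{\frac{1}{T_1} \sum_{t \ge 1989} \hat{\tau}_{jt}^2}{\frac{1}{T_0} \sum_{t < 1989} \hat{\tau}_{jt}^2}\]

As a rough check using the rounded values reported for California:

\[\text{post MSPE} \approx 3.17 \times 123.5 \approx 391.5, \qquad \sqrt{391.5} \approx 19.8, \qquad \sqrt{3.17} \approx 1.78\]

So California's typical gap grows from under 2 packs before 1989 to roughly 20 packs after, which is in line with the average ATT of −19.0.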

Run the placebo on every state — California is the lone outlier

Treatment effects for all states: California (bold) plunges away from the tight grey band of placebo gaps near zero.

Across all 39 states, a gap this large appears 2.6% of the time

0.026

in-space placebo p-value, all controls (1/39); p = 0.05 after the cut(2) fit filter (1/20)
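A call along the following lines would produce an in-space placebo with the fit filter. This is a sketch: the placebo(unit cut(2)) spelling is assumed from synth2's help file and should be checked with help synth2, and the filter is assumed to drop placebo states whose pre-treatment MSPE is more than twice California's.

```stata
* Sketch: in-space placebo with a pre-treatment-fit filter.
* Assumption: placebo(unit cut(2)) is spelled as in synth2's help file; check -help synth2-.
synth2 cigsale lnincome age15to24 retprice beer ///
    cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) xperiod(1980(1)1988) nested allopt ///
    placebo(unit cut(2))

* The p-value is California's rank in the ratio distribution divided by the number of units
display %5.3f 1/39    // all 39 states: 0.026
display %5.3f 1/20    // 20 states left after the fit filter: 0.050
```

With only 20 units remaining, 0.05 is the smallest p-value the filtered test can return, so the filter trades noise from badly fitted placebos against resolution.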

Significance holds in most post-treatment years

Left-sided Fisher exact p-values by year (left-sided because the hypothesized effect is negative): p = 0.05 in most post-treatment years.

A fake 1985 treatment produces no comparable gap

In-time placebo: California actual vs. synthetic with a fake treatment at 1985. Lines stay close through 1988, then split after the real 1989 policy.

The effect snaps on at the real date, not the fake one

In-time placebo effect: small gaps during the fake 1985–1988 window, large gaps after the real 1989 treatment.
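The in-time placebo can be requested with a call like the one below. It is a sketch that assumes the placebo(period(1985)) spelling from synth2's help file. How synth2 handles predictors dated after the fake year, such as cigsale(1988), is not stated in this deck and is worth checking in the help file before relying on the result.

```stata
* Sketch: in-time placebo with a fake treatment in 1985.
* Assumption: placebo(period(1985)) is spelled as in synth2's help file; check -help synth2-.
synth2 cigsale lnincome age15to24 retprice beer ///
    cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) xperiod(1980(1)1988) nested allopt ///
    placebo(period(1985))
```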

No single donor drives the result — leave-one-out stays negative

Leave-one-out: synthetic California’s prediction stays similar whichever weighted donor state is dropped.
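A leave-one-out run can be sketched as follows, assuming the loo option behaves as described in synth2's help file, that is, re-estimating the synthetic control once for each positively weighted donor with that donor excluded.

```stata
* Sketch: leave-one-out robustness check.
* Assumption: loo is spelled as in synth2's help file; check -help synth2-.
synth2 cigsale lnincome age15to24 retprice beer ///
    cigsale(1988) cigsale(1980) cigsale(1975), ///
    trunit(3) trperiod(1989) xperiod(1980(1)1988) nested allopt ///
    loo
```

Since only five donors carry positive weight here, this amounts to five extra fits, each of which can be compared against the baseline gap.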

Does choosing controls by fit make this causal? Not by itself

Objection. A flexible, data-driven counterfactual still can’t manufacture identification.

Response. Correct. The ATT is credible only under no interference, no anticipation, and good pre-treatment fit — SCM makes the last one visible and disciplines the comparator, but it cannot rule out cross-border shopping or an idiosyncratic donor (e.g. Utah’s distinct smoking norms). It also gives no standard errors; inference rides entirely on the placebos.