Evaluating place-based policy in Ethiopia with staggered difference-in-differences
Nagoya University (GSID)
June 12, 2026
Act I
An industrial park: serviced land, power, one-stop customs — rented to garment and leather factories. Ethiopia opened 20+ parks across 18 districts, 2008–2021.
The promise: jobs, a wage economy, a rural region pulled forward. The fear: a bright enclave behind a fence while the surrounding districts see nothing.
A park could raise satellite luminosity yet leave living standards flat. It could add jobs on average — yet only for men.
So we ask both: do parks raise local activity, and who inside the district actually benefits?
The “for whom” turns out to carry the headline.
Parks were sited near cities and roads — districts that were already growing faster. So a naive treated-vs-control gap confounds the park with the place.
We need a design that nets out pre-existing differences and handles a staggered rollout (parks opened in different years). That design is difference-in-differences.
A note on the data. The data are synthetic, calibrated to the signs and magnitudes in Huang, Wang & Xu (2026). Learn the methods, not facts about Ethiopia.
We want the Average Treatment effect on the Treated:
\[\text{ATT} = E[\,Y_i(1) - Y_i(0) \mid D_i = 1\,]\]
The effect on the 17 districts that got a park — not on a random district — identified under parallel trends.
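To make the estimand concrete, here is a toy calculation. The potential outcomes below are hypothetical numbers invented for illustration (in real data only one of Y(1) and Y(0) is ever observed per district). The point is only that the ATT averages over treated units, so it can differ from the effect on a random district.

```python
import numpy as np

# Hypothetical potential outcomes for five districts (illustration only).
y0 = np.array([1.0, 1.0, 2.0, 2.0, 3.0])          # outcome without a park
y1 = y0 + np.array([0.4, 0.2, 0.0, 0.0, 0.0])     # outcome with a park
d = np.array([True, True, False, False, False])    # which districts got a park

unit_effect = y1 - y0
att = unit_effect[d].mean()   # average effect among treated districts, about 0.30
ate = unit_effect.mean()      # average effect over all districts, about 0.12

print(f"ATT = {att:.2f}, ATE = {ate:.2f}")
```

In this toy setup the parks went where they work, so the ATT exceeds the all-district average.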
Three data streams · one design · an escalating ladder of estimators.
Ethiopia’s industrial parks (red dots), regional state capitals (blue stars), and the paved and primary road network.
Source: Appendix Figure A2 in Huang, Wang & Xu (2026). Real park locations from the paper; this tutorial uses synthetic calibrated data.
Act II
Only 17 treated woredas — the recurring source of statistical caution.
Baseline-normalized group-mean IHS light: treated (orange) and control (blue) overlap before 2008, then the treated series climbs while controls stay flat.
Cohort staircase: each opening-year cohort turns up at its own park-opening date against a flat never-treated baseline.
1 woreda in 2008, then 2–3 per year across 2014–2020 — 17 in total.
Treatment map: the 17 treated woredas (orange) cluster spatially among the 122 matched controls (blue).
Near things are more related than distant things — their shocks are not independent draws.
The simplest estimate collapses the design at the median opening year and takes a difference of differences:
\[\widehat{\text{DiD}} = \big(\bar{Y}_{\text{treat, post}} - \bar{Y}_{\text{treat, pre}}\big) - \big(\bar{Y}_{\text{ctrl, post}} - \bar{Y}_{\text{ctrl, pre}}\big)\]
Naive 2×2 ATT: +0.2011 (SE 0.0885, p = 0.023).
The effect ramps over ~5 years, so averaging small early years with large late ones pulls the 2×2 estimate below the long-run effect.
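The 2×2 formula is just four group means. A minimal sketch, assuming one row per woreda-period with a treated flag and a post flag; the function name `did_2x2` and the toy numbers are hypothetical and are not the tutorial's data.

```python
import numpy as np

def did_2x2(y, treated, post):
    """(treated post - treated pre) minus (control post - control pre)."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)

    def cell_mean(mask):
        return y[mask].mean()

    change_treated = cell_mean(treated & post) - cell_mean(treated & ~post)
    change_control = cell_mean(~treated & post) - cell_mean(~treated & ~post)
    return change_treated - change_control

# Toy data: treated rows first, then controls (hypothetical values).
y       = [1.0, 1.2, 1.1, 1.6, 2.0, 2.1, 2.0, 2.2]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
post    = [0, 0, 1, 1, 0, 0, 1, 1]

est = did_2x2(y, treated, post)
print(f"2x2 DiD = {est:.2f}")
```

Here the treated mean rises by 0.25 and the control mean by 0.05, so the estimate is about 0.20. The control change is what nets out the common trend.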
The static TWFE specification, for woreda \(d\) in year \(t\):
\[Y_{dt} = \beta \, D_{dt} + \alpha_d + \gamma_{r(d),t} + \varepsilon_{dt}\]
\(\alpha_d\) absorbs each woreda’s fixed level (the already-bright base); \(\gamma_{r(d),t}\) absorbs region-by-year shocks; \(\beta\) is read as the ATT.
With baseline-trend interactions: \(\hat\beta = +0.2152\) (SE 0.0833, \(t = 2.58\), sig. at 1%) — a ~21% rise in luminosity.
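A minimal sketch of how the fixed effects work mechanically, under simplifying assumptions that are not the slide's exact specification: a balanced panel, plain year effects instead of region-by-year effects, and no baseline-trend interactions. The data are simulated with a constant effect of 0.2 and no noise, so the within estimator should return 0.2. `twfe_beta` is a hypothetical helper name.

```python
import numpy as np

def twfe_beta(Y, D):
    """Two-way within estimator on balanced (units x periods) arrays."""
    def demean(X):
        # Remove unit means and period means, add back the grand mean.
        return (X - X.mean(axis=1, keepdims=True)
                  - X.mean(axis=0, keepdims=True) + X.mean())
    Y_w, D_w = demean(Y), demean(D)
    return (D_w * Y_w).sum() / (D_w ** 2).sum()

rng = np.random.default_rng(0)
n_units, n_periods = 6, 8
alpha = rng.normal(size=(n_units, 1))      # unit fixed effects
gamma = rng.normal(size=(1, n_periods))    # period fixed effects

# Staggered opening periods; np.inf marks never-treated units.
start = np.array([3, 5, 5, np.inf, np.inf, np.inf])
D = (np.arange(n_periods)[None, :] >= start[:, None]).astype(float)

Y = alpha + gamma + 0.2 * D                # constant effect, no noise
beta = twfe_beta(Y, D)
print(f"beta = {beta:.4f}")
```

With a constant effect the demeaning removes both sets of fixed effects exactly. The recovery is no longer guaranteed once effects vary across cohorts or over time.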
Table 1 forest: a positive park ATT across IHS light, raw light, and the impervious-surface ratio, no-trends vs with-trends.
Event study: the four pre-opening leads hug zero, then the effect turns up at k = 0 and climbs to a +0.48 plateau by k = 4–5.
Pre-trend flat (largest \(|t| = 2.17\)) → parallel trends credible. Effect builds, not jumps.
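An event study replaces the single treatment dummy with one dummy per year relative to opening. The sketch below builds those columns. Its choices are assumptions, not details taken from the slides: k = -1 is the omitted reference period, the window runs from 5 years before to 5 years after, event times outside the window are binned into the endpoints, and never-treated rows (opening year NaN) get all zeros.

```python
import numpy as np

def event_time_dummies(year, open_year, leads=5, lags=5):
    """Return (event times, dummy matrix), omitting k = -1 as the reference."""
    year = np.asarray(year, dtype=float)
    open_year = np.asarray(open_year, dtype=float)
    k = np.clip(year - open_year, -leads, lags)   # NaN stays NaN for never-treated
    ks = [j for j in range(-leads, lags + 1) if j != -1]
    X = np.column_stack([(k == j).astype(float) for j in ks])
    return ks, X

# Hypothetical rows: four observations of a 2014 opener, one never-treated.
years = [2009, 2013, 2014, 2020, 2014]
opens = [2014, 2014, 2014, 2014, np.nan]
ks, X = event_time_dummies(years, opens)
print(ks)
print(X.astype(int))
```

The 2013 row is the reference period and the never-treated row is a pure control, so both are all zeros. The 2020 row falls six years after opening and lands in the k = 5 bin.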
The worry. Under staggered timing TWFE makes “forbidden comparisons”: already-treated woredas serve as controls for later-treated ones. When effects grow over time, those already-treated “controls” are still trending up, so the comparisons can enter with negative weights and bias, or even flip the sign of, the estimate.
The fix. Sun-Abraham, Borusyak/Gardner, and Callaway-Sant’Anna only ever compare treated cohorts to clean (never- or not-yet-treated) controls. Each targets the same ATT — if they agree with TWFE, the bias is not biting.
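A minimal sketch of the clean-comparison idea, in the spirit of Callaway-Sant’Anna but not their full estimator (no covariates, no inference, never-treated controls only, base period g - 1). The panel is simulated without noise and with an effect that grows by 0.1 per year since opening. All names and numbers are hypothetical.

```python
import numpy as np

def att_gt(Y, start, g, t):
    """Cohort-time ATT: cohort g at period t versus never-treated units."""
    cohort = start == g
    never = np.isinf(start)
    change_cohort = Y[cohort, t].mean() - Y[cohort, g - 1].mean()
    change_never = Y[never, t].mean() - Y[never, g - 1].mean()
    return change_cohort - change_never

n_periods = 8
periods = np.arange(n_periods)
start = np.array([3, 3, 5, 5, np.inf, np.inf, np.inf])   # np.inf = never treated

rng = np.random.default_rng(1)
alpha = rng.normal(size=(len(start), 1))
gamma = rng.normal(size=(1, n_periods))

# Years of exposure: 1 in the opening period, 0 before and for never-treated.
exposure = np.clip(periods[None, :] - start[:, None] + 1, 0, None)
Y = alpha + gamma + 0.1 * exposure

cells = [(g, t) for g in (3, 5) for t in range(g, n_periods)]
estimates = {(g, t): att_gt(Y, start, g, t) for g, t in cells}
truth = {(g, t): 0.1 * (t - g + 1) for g, t in cells}
overall = np.mean(list(estimates.values()))   # equal-sized cohorts, simple average

for cell in cells:
    print(cell, round(estimates[cell], 3), round(truth[cell], 3))
```

Because every comparison uses never-treated units, each cell recovers its own growing effect. No already-treated woreda ever serves as a control.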
Four estimators compared: TWFE +0.270, Sun-Abraham +0.299, Borusyak/Gardner +0.302, Callaway-Sant’Anna +0.256 — all in a tight band, each significant at 1%.
| Estimator | ATT | Sig. |
|---|---|---|
| TWFE | +0.2699 | *** |
| Sun-Abraham | +0.2991 | *** |
| Borusyak/Gardner | +0.3022 | *** |
| Callaway-Sant’Anna | +0.2561 | *** |
Goodman-Bacon decomposition: the clean treated-vs-never 2×2 comparisons carry nearly all the weight; the forbidden later-vs-earlier comparisons carry almost none.
| Comparison type | Weight | Avg estimate |
|---|---|---|
| Treated vs never | 95.42% | +0.2708 |
| Earlier vs later | 3.38% | +0.3370 |
| Later vs earlier (forbidden) | 1.21% | +0.0135 |
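As an arithmetic check, the weighted average of the three comparison types should reproduce the static TWFE coefficient. The sketch uses only the numbers in the table above. The renormalization is there because the rounded weights sum to 100.01%.

```python
# Goodman-Bacon weights and average estimates, copied from the table.
weights = [0.9542, 0.0338, 0.0121]
estimates = [0.2708, 0.3370, 0.0135]

total = sum(weights)
twfe_implied = sum(w * e for w, e in zip(weights, estimates)) / total
forbidden_share = weights[2] * estimates[2] / total

print(f"implied TWFE = {twfe_implied:.4f}")
print(f"contribution of forbidden comparisons = {forbidden_share:.4f}")
```

The implied coefficient comes out at about +0.270, in line with the TWFE row of the estimator table. The forbidden comparisons contribute well under 0.001 of it.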
Heterogeneity: the implied park effect fades the farther a woreda lies from Addis, its state capital, or the nearest city.
Distance to nearest city \(-0.0335\) (\(t = -4.90\)) · paved roads \(+0.6695\) (\(t = 2.08\)). Place is first-order.
Spillover test: treatment lifts the host woreda strongly (+0.27), but the effect on control neighbours within 10 km is about zero.
nearby \(= +0.0648\) (\(t = 1.06\)), insignificant, so the host’s gain looks net-new rather than displaced from neighbours, and the no-spillover assumption (SUTVA) is not rejected.
Table 5 forest: households near a park gain durable goods, housing quality, and wealth, with or without controls.
| Outcome | ATT (with controls) | Sig. |
|---|---|---|
| Durable goods p.c. | +0.2286 (~74%) | *** |
| Housing quality | +0.2480 | *** |
| Wealth index | +0.3825 SD | *** |
Household durables RCS event study: flat, insignificant pre-phases, then a jump at park opening (phase 0).
Phase \(-3\): \(-0.020\) (ns) · phase \(-2\): \(+0.024\) (ns) · phase \(0\): \(+0.261\) (p < 0.001).
The gender story: employment is null overall but large for women; women’s decision power and savings rise while acceptance of domestic violence falls.
| Non-ag employment | ATT | \(t\) |
|---|---|---|
| Full sample | +0.0911 (ns) | +1.57 |
| Women | +0.1404*** | +3.00 |
| Men | +0.0176 (ns) | +0.19 |
+0.140
women’s non-ag employment ATT (p < 0.01) — and the empowerment cascade that follows
Decision power +0.110*** · savings account +0.315*** · accepts domestic violence −0.210*** (women only).
Act III
Treated woredas cluster in space, so a regional shock hits several at once — the naive SE assumes independence and is too small. The fix is a Conley spatial-HAC standard error; the point estimate never moves.
| With-trends light ATT | Estimate | Naive HC0 | Conley-HAC | \(t\)(HAC) |
|---|---|---|---|---|
| 2008–2020 | +0.2152 | 0.0329 | 0.0799 | +2.69 |
The point estimate stays at +0.2152; only the SE changes. It inflates 2.43×, yet the effect is still significant at 1%.
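A minimal sketch of the spatial-HAC sandwich, under assumptions that go beyond the slide: a single cross-section (a panel version would also sum over time), planar coordinates in km, and a product of Bartlett windows in each coordinate as the kernel, which is one common choice. `conley_se`, the 30 km cutoff, and the simulated data are all hypothetical.

```python
import numpy as np

def conley_se(X, resid, coords_km, cutoff_km):
    """Spatial-HAC standard errors for OLS coefficients (cross-section)."""
    X = np.asarray(X, dtype=float)
    e = np.asarray(resid, dtype=float)
    c = np.asarray(coords_km, dtype=float)
    # Absolute gap between every pair of units, per coordinate: (n, n, 2).
    gap = np.abs(c[:, None, :] - c[None, :, :])
    # Weight falls linearly to zero at the cutoff in each coordinate.
    K = np.clip(1.0 - gap / cutoff_km, 0.0, None).prod(axis=2)
    score = X * e[:, None]
    meat = score.T @ K @ score          # cross-products of nearby units' scores
    bread = np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(bread @ meat @ bread))

rng = np.random.default_rng(0)
n = 60
coords = rng.uniform(0, 100, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # point estimate is plain OLS
resid = y - X @ beta

se_hc0 = conley_se(X, resid, coords, cutoff_km=1e-9)   # no neighbours: HC0
se_hac = conley_se(X, resid, coords, cutoff_km=30.0)
print("beta:", beta.round(3))
print("HC0 :", se_hc0.round(3))
print("HAC :", se_hac.round(3))
```

With a vanishing cutoff only each unit's own score enters, and the formula collapses to the naive HC0 error. Widening the cutoff changes only the middle of the sandwich, which is why the point estimate never moves.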
Triangulation across methods — not a single regression — is what makes the claim credible.
The lesson is not “build parks everywhere.” It is that where and for whom decide whether place-based policy works.
Objection. The data are synthetic, there are only 17 treated woredas, and this is observational — point estimates fragile, identification on faith.
Response.
| Number | Value |
|---|---|
| Light ATT (with trends) | +0.2152*** (~21%) |
| Four-estimator spread | 0.046 IHS units |
| Clean Bacon weight | 95.4% |
| Female employment ATT | +0.140*** (vs +0.091 ns) |
| Light SE: naive → Conley-HAC | 0.0329 → 0.0799 |
And five lessons: let evolving effects evolve · triangulate estimators · disaggregate by sex · place is first-order · honest inference, honest caveats.