<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>event study | Carlos Mendez</title><link>https://carlos-mendez.org/tag/event-study/</link><atom:link href="https://carlos-mendez.org/tag/event-study/index.xml" rel="self" type="application/rss+xml"/><description>event study</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>© 2018–2026 Carlos Mendez. All rights reserved.</copyright><lastBuildDate>Sun, 07 Jun 2026 00:00:00 +0000</lastBuildDate><image><url>https://carlos-mendez.org/media/icon_huedfae549300b4ca5d201a9bd09a3ecd5_79625_512x512_fill_lanczos_center_3.png</url><title>event study</title><link>https://carlos-mendez.org/tag/event-study/</link></image><item><title>Staggered Synthetic Difference-in-Differences (SDID) in Stata: Gender Quotas and Women in Parliament</title><link>https://carlos-mendez.org/post/stata_sdid_staggered/</link><pubDate>Sun, 07 Jun 2026 00:00:00 +0000</pubDate><guid>https://carlos-mendez.org/post/stata_sdid_staggered/</guid><description>&lt;h2 id="1-overview">1. Overview&lt;/h2>
&lt;p>In a &lt;a href="https://carlos-mendez.org/post/stata_sdid/">previous tutorial&lt;/a>, one unit — California — adopted one policy — Proposition 99 — in one year — 1989. That &lt;strong>block design&lt;/strong> is the textbook setting for synthetic difference-in-differences (SDID). But most real policies do not arrive on a single clock. Parliamentary gender quotas, minimum-wage laws, carbon taxes, and clean-air regulations are adopted by &lt;strong>different units in different years&lt;/strong>. This is the &lt;strong>staggered adoption&lt;/strong> design, and it is where naive panel methods quietly break.&lt;/p>
&lt;p>This tutorial extends SDID to staggered adoption and applies it in Stata to a real question in political economy: &lt;strong>do parliamentary gender quotas raise the share of women in national parliaments?&lt;/strong> We use the &lt;code>quota_example&lt;/code> dataset that ships with the &lt;code>sdid&lt;/code> package — 119 countries observed annually from 1990 to 2015, in which 9 countries adopt a gender quota across 7 different cohorts (2000, 2002, 2003, 2005, 2010, 2012, and 2013).&lt;/p>
&lt;p>The headline is a story about heterogeneity. The overall effect of quotas is about &lt;strong>+8 percentage points&lt;/strong> of women in parliament, but the cohort-by-cohort effects swing from &lt;strong>−3.5 to +21.8 points&lt;/strong>. A single number hides that range — and, as we will see, the naive two-way fixed-effects regression that most people reach for first can hide even more.&lt;/p>
&lt;details>
&lt;summary>&lt;b>Why does staggered timing break the naive regression?&lt;/b> (click to expand)&lt;/summary>
&lt;p>The workhorse for panel policy evaluation is the &lt;strong>two-way fixed-effects (TWFE)&lt;/strong> regression — unit dummies, time dummies, and a treatment dummy. With one adoption date it estimates a clean difference-in-differences. With &lt;em>staggered&lt;/em> timing and &lt;em>heterogeneous&lt;/em> effects, the same regression implicitly uses &lt;strong>already-treated units as controls for later adopters&lt;/strong> (&amp;ldquo;forbidden comparisons&amp;rdquo;). The result is a variance-weighted average of every 2×2 comparison in the panel, and some of those weights can be &lt;strong>negative&lt;/strong> — so the estimate can even take the wrong sign (Goodman-Bacon, 2021; de Chaisemartin &amp;amp; D&amp;rsquo;Haultfœuille, 2020). Staggered SDID sidesteps this by estimating a &lt;strong>separate, clean&lt;/strong> SDID effect for each adoption cohort and aggregating with transparent, non-negative weights.&lt;/p>
&lt;/details>
&lt;pre>&lt;code class="language-mermaid">graph TD
subgraph &amp;quot;Block design — predecessor (Prop 99)&amp;quot;
B1[&amp;quot;California&amp;lt;br/&amp;gt;adopts 1989&amp;quot;] --&amp;gt; BATT[&amp;quot;one ATT&amp;quot;]
B2[&amp;quot;other states&amp;lt;br/&amp;gt;never treated&amp;quot;] --&amp;gt; BATT
end
subgraph &amp;quot;Staggered design — this post (gender quotas)&amp;quot;
S1[&amp;quot;cohort 2000&amp;quot;] --&amp;gt; SATT[&amp;quot;aggregate ATT&amp;quot;]
S2[&amp;quot;cohort 2002&amp;quot;] --&amp;gt; SATT
S3[&amp;quot;cohorts 2003 to 2013&amp;quot;] --&amp;gt; SATT
SC[&amp;quot;110 never-treated&amp;lt;br/&amp;gt;controls&amp;quot;] -.donor pool.-&amp;gt; SATT
end
style B1 fill:#d97757,stroke:#141413,color:#fff
style B2 fill:#6a9bcc,stroke:#141413,color:#fff
style BATT fill:#00d4c8,stroke:#141413,color:#141413
style S1 fill:#d97757,stroke:#141413,color:#fff
style S2 fill:#d97757,stroke:#141413,color:#fff
style S3 fill:#d97757,stroke:#141413,color:#fff
style SC fill:#6a9bcc,stroke:#141413,color:#fff
style SATT fill:#00d4c8,stroke:#141413,color:#141413
&lt;/code>&lt;/pre>
&lt;h3 id="11-learning-objectives">1.1 Learning objectives&lt;/h3>
&lt;p>By the end of this tutorial you will be able to:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Explain&lt;/strong> why staggered adoption breaks naive TWFE difference-in-differences, and how per-cohort SDID avoids the forbidden-comparison problem.&lt;/li>
&lt;li>&lt;strong>Derive&lt;/strong> the SDID estimator from first principles — unit weights $\omega$, time weights $\lambda$, and the weighted two-way fixed-effects objective — and the rule that aggregates cohort-specific effects $\hat{\tau}_a$ into one overall ATT.&lt;/li>
&lt;li>&lt;strong>Estimate&lt;/strong> the effect of gender quotas with &lt;code>sdid&lt;/code> on a staggered panel, add a covariate two different ways (&lt;code>optimized&lt;/code> vs &lt;code>projected&lt;/code>), and choose among bootstrap, jackknife, and placebo inference.&lt;/li>
&lt;li>&lt;strong>Read&lt;/strong> an SDID event-study plot produced by &lt;code>sdid_event&lt;/code>, distinguishing pre-trend placebo coefficients from post-period dynamic effects.&lt;/li>
&lt;/ul>
&lt;h2 id="2-key-concepts-at-a-glance">2. Key concepts at a glance&lt;/h2>
&lt;p>Each card gives a plain-language &lt;strong>definition&lt;/strong>, a concrete &lt;strong>example&lt;/strong> from this quota study, and an everyday &lt;strong>analogy&lt;/strong>. Open any term that is unfamiliar.&lt;/p>
&lt;details>
&lt;summary>&lt;b>1. ATT (average treatment effect on the treated)&lt;/b> — the question we actually answer.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> The effect of adopting a quota on the women-in-parliament share, &lt;em>in the countries that adopted one&lt;/em>, averaged over their post-adoption years. It is not the effect a quota would have everywhere — only where one was actually tried.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> Our headline ATT is &lt;strong>+8.0 percentage points&lt;/strong>: across the nine adopting countries, quotas raised women&amp;rsquo;s parliamentary share by about eight points relative to their no-quota counterfactual.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> Like asking &amp;ldquo;how much did the patients who &lt;em>took&lt;/em> the drug improve?&amp;rdquo; — not &amp;ldquo;how much would everyone improve?&amp;rdquo; You measure only the units that were actually treated.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>2. Synthetic control&lt;/b> — a made-to-order comparison country.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> A weighted blend of never-treated &amp;ldquo;donor&amp;rdquo; countries, built so its pre-adoption path mimics the treated cohort. It stands in for the unobservable counterfactual: what the cohort&amp;rsquo;s outcome &lt;em>would&lt;/em> have been without a quota.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> The 2002 cohort&amp;rsquo;s synthetic control mixes dozens of donors (Belgium, Paraguay, Cuba, …) so that, before 2002, the blend tracks the cohort&amp;rsquo;s trend — then keeps going as the cohort would have without the law.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> A stunt double cast to match the lead actor&amp;rsquo;s build and movement — close enough that, in the shots you cannot film the star, the double stands in convincingly.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>3. Unit weights (ω)&lt;/b> — how much each donor counts.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> Non-negative weights, one per donor country, summing to one, that build the synthetic control. Each cohort gets its own ω.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> In the 2000 cohort, 80 donors receive nonzero weight — Argentina ≈ 0.061, Guatemala ≈ 0.057, Austria ≈ 0.045 — a &lt;em>diffuse&lt;/em> blend rather than one or two stand-ins.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> A recipe calling for many ingredients in small, precise amounts: no single one dominates, so the dish survives a bad batch of any one ingredient.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>4. Time weights (λ)&lt;/b> — which "before" years matter.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> Non-negative weights on the pre-adoption years, summing to one, that decide which pre-periods define the baseline. They up-weight the years most like the post-period.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> For the 2002 cohort, λ concentrates on the late 1990s and 2001 rather than spreading evenly across 1990–2001 — the recent past is the relevant baseline.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> Forecasting tomorrow&amp;rsquo;s weather, you trust last week far more than the same date five years ago. Time weights formalize &amp;ldquo;recent and similar counts more.&amp;rdquo;&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>5. Adoption cohort (a)&lt;/b> — units that switch on together.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> The set of countries that first adopt a quota in the same calendar year. Staggered SDID runs one self-contained SDID per cohort, always against the never-treated controls.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> There are seven cohorts — 2000, 2002, 2003, 2005, 2010, 2012, 2013 — with two countries each in 2002 and 2003, and one in the rest.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> School graduating classes: the &amp;ldquo;class of 2002&amp;rdquo; and the &amp;ldquo;class of 2010&amp;rdquo; share a start date and are analyzed as groups, even though all attend the same school.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>6. Staggered adoption &amp;amp; the forbidden comparison&lt;/b> — why the naive regression breaks.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> Staggered adoption means units are treated at different times. The hazard: a two-way fixed-effects regression can use &lt;em>already-treated&lt;/em> units as controls for &lt;em>later&lt;/em> adopters — a &amp;ldquo;forbidden comparison&amp;rdquo; that places negative weights on some effects and can flip the sign.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> When the 2012 cohort adopts, a naive TWFE quietly treats the 2002 cohort — already treated, already changed — as part of its control group. Staggered SDID never does this: each cohort is compared only to the 110 never-treated countries.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> Timing a late runner against runners who already crossed the line and slowed to a walk — your &amp;ldquo;control&amp;rdquo; is contaminated because it has already run the race.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>7. Event time (relative period)&lt;/b> — every cohort on its own clock.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> Time measured relative to each cohort&amp;rsquo;s &lt;em>own&lt;/em> adoption year (… −2, −1, 0, +1 …), so cohorts that adopted in different calendar years can be lined up and averaged.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> Event time 0 is the year 2000 for the first cohort but 2013 for the last; re-centring lets us ask &amp;ldquo;what happens three years &lt;em>after&lt;/em> a quota?&amp;rdquo; across all cohorts at once.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> Comparing marathon runners by their own start gun, not the wall clock: a runner who started at 9:05 and one who started at 9:20 are both &amp;ldquo;at mile 10&amp;rdquo; measured from their own start.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>8. ATT aggregation&lt;/b> — from many cohort effects to one number.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> The overall ATT is a weighted average of the cohort effects, each weighted by its share of treated unit-by-post-period observations — earlier, longer-exposed, larger cohorts count more.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> The seven cohort effects span &lt;strong>−3.5 to +21.8&lt;/strong>; weighted by treated country-years they average to &lt;strong>+8.0&lt;/strong> (the plain unweighted mean would be ≈ 7.1).&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> A course grade that weights the final exam more than a pop quiz: the cohorts you observe for longer carry more of the final mark.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>9. Pre-trend placebo test&lt;/b> — the assumption you can see.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> Event-study coefficients for the &lt;em>pre-adoption&lt;/em> periods. If treated and synthetic-control countries moved in parallel before treatment, these sit near zero — a falsification check.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> For the 2002 cohort, all twelve pre-period placebos fall in &lt;strong>[−0.2, +0.8]&lt;/strong> points — flat, so we cannot reject parallel synthetic trends.&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> Checking a scale by weighing nothing first: if it does not read zero when empty, you distrust every later reading. Flat placebos are that &amp;ldquo;reads zero when empty&amp;rdquo; check.&lt;/p>
&lt;/details>
&lt;details>
&lt;summary>&lt;b>10. Bootstrap, jackknife, placebo&lt;/b> — three rulers for uncertainty.&lt;/summary>
&lt;p>&lt;strong>Definition.&lt;/strong> Three ways to attach a standard error to the ATT. With many treated units all three are available; they share one point estimate but report different spread.&lt;/p>
&lt;p>&lt;strong>Example.&lt;/strong> On the two-cohort subsample the ATT is &lt;strong>10.3&lt;/strong> for all three, but the SE is &lt;strong>4.7&lt;/strong> (bootstrap), &lt;strong>6.0&lt;/strong> (jackknife, most conservative), and &lt;strong>2.3&lt;/strong> (placebo, tightest).&lt;/p>
&lt;p>&lt;strong>Analogy.&lt;/strong> Measuring a table with a tape, a folding ruler, and a laser: they agree on the length but disagree on the error bars — the cautious carpenter reports the widest.&lt;/p>
&lt;/details>
&lt;h2 id="3-the-data-gender-quotas-across-119-countries">3. The data: gender quotas across 119 countries&lt;/h2>
&lt;p>We use &lt;code>quota_example.dta&lt;/code>, the balanced panel from Bhalotra, Clarke, Gomes &amp;amp; Venkataramani (2023) distributed with the &lt;code>sdid&lt;/code> package. The outcome is the percentage of seats held by women in the national parliament; the treatment is the adoption of a reserved-seat gender quota; the covariate is log GDP per capita.&lt;/p>
&lt;pre>&lt;code class="language-stata">webuse set www.damianclarke.net/stata/
webuse quota_example, clear
label variable quota &amp;quot;Parliamentary gender quota&amp;quot;
xtset country year
codebook country year quota womparl lngdp, compact
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Variable Obs Unique Mean Min Max Label
----------------------------------------------------------------------------
country 3094 119 . . . Country
year 3094 26 2002.5 1990 2015 Year
quota 3094 2 .0303814 0 1 =1 if country has a gender quota
womparl 3094 449 14.96531 0 63.8 Women in parliament
lngdp 2990 2956 9.154291 5.8701 11.61789 log(GDP)
----------------------------------------------------------------------------
&lt;/code>&lt;/pre>
&lt;p>The panel is &lt;strong>balanced&lt;/strong>: 119 countries times 26 years equals 3,094 observations, with no gaps in the outcome or treatment (&lt;code>lngdp&lt;/code> has 104 missing values, which will matter only when we add the covariate). The treatment indicator &lt;code>quota&lt;/code> equals one for just 3% of observations, a reminder that treated country-years are scarce. Crucially, &lt;code>quota&lt;/code> is &lt;strong>absorbing&lt;/strong> — once a country adopts a quota it stays treated — which SDID requires.&lt;/p>
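&lt;p>These properties can be checked rather than taken on trust. The lines below are a minimal sketch that assumes &lt;code>quota_example&lt;/code> is loaded as above; each &lt;code>assert&lt;/code> stops with an error if the stated property fails.&lt;/p>
&lt;pre>&lt;code class="language-stata">* balanced panel: every country appears in all 26 years
bysort country: assert _N == 26
* absorbing treatment: quota never switches back from 1 to 0
bysort country (year): assert quota &amp;gt;= quota[_n-1] if _n &amp;gt; 1
* treated country-years (about 3% of the 3,094 observations)
count if quota == 1
* covariate gaps that matter once lngdp enters the model
count if missing(lngdp)
&lt;/code>&lt;/pre>
&lt;p>Given the codebook above, the two counts should come out at 94 treated country-years and 104 missing values of &lt;code>lngdp&lt;/code>.&lt;/p>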
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Variable&lt;/th>
&lt;th>Role&lt;/th>
&lt;th>Symbol&lt;/th>
&lt;th>Description&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>country&lt;/code>&lt;/td>
&lt;td>unit&lt;/td>
&lt;td>$i$&lt;/td>
&lt;td>119 countries (9 ever-treated, 110 never-treated)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>year&lt;/code>&lt;/td>
&lt;td>time&lt;/td>
&lt;td>$t$&lt;/td>
&lt;td>1990–2015 (26 years)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>womparl&lt;/code>&lt;/td>
&lt;td>outcome&lt;/td>
&lt;td>$Y_{it}$&lt;/td>
&lt;td>% women in the national parliament&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>quota&lt;/code>&lt;/td>
&lt;td>treatment&lt;/td>
&lt;td>$W_{it}$&lt;/td>
&lt;td>1 once a country has a quota, 0 before / never&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>lngdp&lt;/code>&lt;/td>
&lt;td>covariate&lt;/td>
&lt;td>$X_{it}$&lt;/td>
&lt;td>log GDP per capita&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;strong>The estimand.&lt;/strong> Our target is the &lt;strong>average treatment effect on the treated (ATT)&lt;/strong>: the effect of adopting a quota on the women-in-parliament share &lt;em>in the countries that adopted one&lt;/em>, averaged over their post-adoption years. Formally,&lt;/p>
&lt;p>$$
\tau = \frac{1}{N_{tr}\, T_{post}} \sum_{i:\, W_i = 1}\ \sum_{t &amp;gt; T_{pre}} \left[\, Y_{it}(1) - Y_{it}(0) \,\right]
$$&lt;/p>
&lt;p>In words: for every treated country and every post-adoption year, take the gap between the share of women &lt;em>with&lt;/em> a quota, $Y_{it}(1)$, and the share that &lt;em>would have occurred without one&lt;/em>, $Y_{it}(0)$ — then average. The first term is observed; the second is the counterfactual that the synthetic control must impute, because we never see a quota-adopting country in the parallel world where it abstained.&lt;/p>
&lt;p>&lt;strong>An observational, not experimental, setting.&lt;/strong> Quotas are not randomly assigned. Countries that adopt them early may differ systematically — they may be wealthier, more democratic, or already on a rising trajectory of women&amp;rsquo;s representation. That is exactly why we need a method that builds a &lt;em>credible counterfactual&lt;/em> from comparison countries rather than assuming a simple before/after change would have held. Identification rests on assumptions we will keep visible: that treated and synthetic-control countries share a &lt;strong>common (synthetic) trend&lt;/strong> absent treatment, &lt;strong>no anticipation&lt;/strong> of the quota, &lt;strong>no spillovers&lt;/strong> across countries, and that adoption timing is not itself driven by the outcome&amp;rsquo;s future path.&lt;/p>
&lt;h3 id="31-the-staggered-structure">3.1 The staggered structure&lt;/h3>
&lt;p>Before modelling, let us see the timing directly. The adoption year is the first year a country is treated; we tabulate the cohorts.&lt;/p>
&lt;pre>&lt;code class="language-stata">bysort country (year): egen firsttreat = min(cond(quota==1, year, .))
preserve
keep country firsttreat
duplicates drop
tab firsttreat, missing
restore
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text"> firsttreat | Freq. Percent Cum.
------------+-----------------------------------
2000 | 1 0.84 0.84
2002 | 2 1.68 2.52
2003 | 2 1.68 4.20
2005 | 1 0.84 5.04
2010 | 1 0.84 5.88
2012 | 1 0.84 6.72
2013 | 1 0.84 7.56
. | 110 92.44 100.00
------------+-----------------------------------
Total | 119 100.00
&lt;/code>&lt;/pre>
&lt;p>Nine countries adopt a quota, spread across &lt;strong>seven cohorts&lt;/strong>; the 2002 and 2003 cohorts contain two countries each, the rest one. The remaining &lt;strong>110 countries are never treated&lt;/strong> — they form the donor pool from which every cohort&amp;rsquo;s synthetic control is built. This staircase of adoption dates is the defining feature of a staggered design, and the reason a single &amp;ldquo;post&amp;rdquo; dummy is too blunt.&lt;/p>
&lt;h2 id="4-exploratory-analysis-with-panelview">4. Exploratory analysis with &lt;code>panelview&lt;/code>&lt;/h2>
&lt;p>A staggered design is best understood by &lt;em>looking&lt;/em> at it. The &lt;code>panelview&lt;/code> command (Xu &amp;amp; Hua) draws two pictures we need: a heatmap of &lt;em>who is treated when&lt;/em>, and the raw outcome trajectories colored by treatment status.&lt;/p>
&lt;pre>&lt;code class="language-stata">ssc install panelview, replace
panelview womparl quota, i(country) t(year) type(treat) bytiming
panelview womparl quota, i(country) t(year) type(outcome)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_sdid_staggered_panelview_treat.png" alt="Treatment-timing heatmap: countries sorted by adoption year reveal the staggered staircase">&lt;/p>
&lt;p>The treatment heatmap (&lt;code>type(treat)&lt;/code>, sorted with &lt;code>bytiming&lt;/code>) makes the staggered structure unmistakable: the dark treated cells appear in the &lt;strong>top-right corner as a staircase&lt;/strong>, each step a different cohort switching on between 2000 and 2013, against a sea of never-treated controls. This is the visual opposite of a block design, where every treated cell would switch on in the same column.&lt;/p>
&lt;p>&lt;img src="stata_sdid_staggered_panelview_outcome.png" alt="Outcome trajectories: treated countries (orange) against the control spaghetti (blue)">&lt;/p>
&lt;p>The outcome plot (&lt;code>type(outcome)&lt;/code>) overlays all 119 women-in-parliament series, with the 9 treated countries in orange. Several treated countries start near the bottom of the distribution and climb steeply after their adoption year — a hint of a positive effect — but the climbs begin at different times, and a few treated countries barely move. No single &amp;ldquo;treated average&amp;rdquo; line could summarize this; we need cohort-specific counterfactuals.&lt;/p>
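&lt;pre>&lt;code class="language-stata">* flag ever-adopters with firsttreat from Section 3.1 (missing = never treated)
gen byte evertreat = !missing(firsttreat)
collapse (mean) womparl, by(evertreat year)
* ... reshape and plot ever- vs never-adopting means ...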
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_sdid_staggered_raw_trends.png" alt="Mean outcome: ever-adopting vs never-adopting countries">&lt;/p>
&lt;p>Collapsing to group means tells a cautionary tale. The ever-adopting countries (orange) start the 1990s &lt;strong>below&lt;/strong> the never-adopting countries (about 4% vs 10% women in parliament) and end &lt;strong>above&lt;/strong> them by 2015 (about 23% vs 22%). A naive eyeball difference-in-differences on these two lines would be badly confounded: the groups began at different levels and the &amp;ldquo;treated&amp;rdquo; line aggregates countries that switched on in seven different years. The raw means motivate the machinery to come — we must compare each cohort to a &lt;em>tailored&lt;/em> synthetic control, not to the grand average.&lt;/p>
&lt;h2 id="5-synthetic-difference-in-differences-from-first-principles">5. Synthetic difference-in-differences from first principles&lt;/h2>
&lt;p>Before tackling staggered timing, fix ideas with a single cohort. SDID (Arkhangelsky et al., 2021) is a &lt;strong>weighted two-way fixed-effects regression&lt;/strong>. It chooses an ATT, a constant, unit fixed effects, and time fixed effects to minimize a weighted sum of squared residuals:&lt;/p>
&lt;p>$$
\left(\hat{\tau}, \hat{\mu}, \hat{\alpha}, \hat{\beta}\right) = \arg\min_{\tau,\mu,\alpha,\beta} \sum_{i=1}^{N} \sum_{t=1}^{T} \left(Y_{it} - \mu - \alpha_i - \beta_t - W_{it}\,\tau\right)^{2}\, \hat{\omega}_i\, \hat{\lambda}_t
$$&lt;/p>
&lt;p>In words: run a difference-in-differences regression, but weight each observation by a &lt;strong>unit weight&lt;/strong> $\hat{\omega}_i$ times a &lt;strong>time weight&lt;/strong> $\hat{\lambda}_t$. Here $\alpha_i$ is a country fixed effect, $\beta_t$ a year fixed effect, $W_{it}$ the treatment dummy, and $\tau$ the ATT we want. Set all weights equal and you recover ordinary DiD; the weights are what make SDID special. They are not free parameters — each solves its own optimization.&lt;/p>
&lt;p>The &lt;strong>unit weights&lt;/strong> are chosen so that a weighted blend of control countries tracks the treated cohort across the pre-period:&lt;/p>
&lt;p>$$
\hat{\omega} = \arg\min_{\omega_0,\, \omega \ge 0} \sum_{t=1}^{T_{pre}} \left(\omega_0 + \sum_{i=1}^{N_{co}} \omega_i\, Y_{it} - \frac{1}{N_{tr}} \sum_{i=1}^{N_{tr}} Y_{it}\right)^{2} + \zeta^{2}\, T_{pre}\, \lVert \omega \rVert^{2}
$$&lt;/p>
&lt;p>The term in parentheses asks the synthetic control $\sum_i \omega_i Y_{it}$ (plus an intercept $\omega_0$) to match the treated average in every pre-adoption year; the donor weights are also constrained to sum to one, which the display leaves implicit. The intercept $\omega_0$ is the SDID twist: it lets the synthetic match the treated &lt;em>trend&lt;/em> without matching its &lt;em>level&lt;/em>, because any constant level gap is later absorbed by the unit fixed effect $\alpha_i$. The final term is a &lt;strong>ridge penalty&lt;/strong> with regularization strength $\zeta$; it spreads weight across many donors instead of concentrating it on a few, which stabilizes the estimate. (Synthetic control, by contrast, drops $\omega_0$ and the penalty and must match the level too.)&lt;/p>
&lt;p>The &lt;strong>time weights&lt;/strong> are the mirror image — they pick the pre-period years that best predict each control country&amp;rsquo;s post-period average:&lt;/p>
&lt;p>$$
\hat{\lambda} = \arg\min_{\lambda_0,\, \lambda \ge 0} \sum_{i=1}^{N_{co}} \left(\lambda_0 + \sum_{t=1}^{T_{pre}} \lambda_t\, Y_{it} - \frac{1}{T_{post}} \sum_{t=T_{pre}+1}^{T} Y_{it}\right)^{2} + \zeta_{\lambda}^{2}\, N_{co}\, \lVert \lambda \rVert^{2}
$$&lt;/p>
&lt;p>Years that look most like the post-period get the most weight, so the &amp;ldquo;before&amp;rdquo; comparison is built from the most relevant history rather than a flat average over possibly-irrelevant early years. The two weighting schemes together are what distinguish SDID from its cousins, as the table summarizes.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Method&lt;/th>
&lt;th>Unit weights $\omega$&lt;/th>
&lt;th>Time weights $\lambda$&lt;/th>
&lt;th>Unit FE $\alpha_i$&lt;/th>
&lt;th>Must match&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>DiD&lt;/strong>&lt;/td>
&lt;td>uniform&lt;/td>
&lt;td>uniform&lt;/td>
&lt;td>yes&lt;/td>
&lt;td>trend on &lt;em>all&lt;/em> controls&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Synthetic control&lt;/strong>&lt;/td>
&lt;td>optimized&lt;/td>
&lt;td>uniform&lt;/td>
&lt;td>&lt;strong>no&lt;/strong>&lt;/td>
&lt;td>level &lt;em>and&lt;/em> trend&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>SDID&lt;/strong>&lt;/td>
&lt;td>optimized&lt;/td>
&lt;td>optimized&lt;/td>
&lt;td>yes&lt;/td>
&lt;td>trend (level gap allowed)&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
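&lt;p>To make the weighted-regression view concrete, here is a toy sketch on made-up data (not the quota panel). The variable names and the hand-picked weights are ours, chosen only for illustration; real SDID weights come from the two optimizations above. One treated unit has a true effect of 3, and one control (unit 4) follows its own trend.&lt;/p>
&lt;pre>&lt;code class="language-stata">* toy panel: 4 units observed for 4 periods
clear
set obs 4
gen unit = _n
expand 4
bysort unit: gen time = _n
* unit 1 is treated from period 3 onward
gen w = (unit == 1) &amp;amp; (time &amp;gt; 2)
* true effect is 3; unit 4 is a poor control with an extra trend
gen y = unit + 0.5*time + 3*w + (unit == 4)*time

* uniform weights: ordinary DiD, pulled away from 3 by unit 4
gen omega  = 1
gen lambda = 1
gen wt = omega*lambda
regress y w i.unit i.time [aweight = wt]
scalar tau_did = _b[w]

* hand-picked (not optimized) unit weights that drop unit 4
replace omega = 0 if unit == 4
replace wt = omega*lambda
regress y w i.unit i.time [aweight = wt]
scalar tau_wtd = _b[w]

display tau_did
display tau_wtd
&lt;/code>&lt;/pre>
&lt;p>With uniform weights the regression returns the plain DiD of group means, 7/3, because the badly matched control inflates the comparison trend. Once that control gets zero weight the remaining units share a common trend and the coefficient equals the true effect of 3. SDID automates this choice of weights instead of leaving it to the analyst.&lt;/p>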
&lt;h2 id="6-the-staggered-extension-per-cohort-effects-and-their-aggregation">6. The staggered extension: per-cohort effects and their aggregation&lt;/h2>
&lt;p>Staggered SDID is a disarmingly simple idea: &lt;strong>do the single-cohort analysis once per adoption cohort, then average.&lt;/strong> For each cohort $a$, take only that cohort&amp;rsquo;s treated countries plus the pure never-treated controls, solve the SDID problem above on that sub-panel to get its own $\hat{\omega}_a$, $\hat{\lambda}_a$, and cohort effect $\hat{\tau}_a$. Because each cohort is compared &lt;strong>only to never-treated controls&lt;/strong>, an already-treated unit is never used as a control for a later adopter — precisely the contamination that breaks naive TWFE.&lt;/p>
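&lt;p>The recipe can be sketched by hand as a loop over cohorts. This is an illustration, not the package's internal code: it assumes &lt;code>firsttreat&lt;/code> from Section 3.1 is in memory, uses &lt;code>vce(noinference)&lt;/code> to skip standard errors, and reads the point estimate from &lt;code>e(ATT)&lt;/code>, which we take to be the scalar that &lt;code>sdid&lt;/code> stores (confirm with &lt;code>ereturn list&lt;/code>).&lt;/p>
&lt;pre>&lt;code class="language-stata">* firsttreat comes from Section 3.1; missing means never treated
levelsof firsttreat, local(cohorts)
foreach a of local cohorts {
    preserve
    * keep one cohort plus the never-treated donor pool
    keep if firsttreat == `a' | missing(firsttreat)
    * point estimate only; inference is skipped in this sketch
    sdid womparl country year quota, vce(noinference)
    display &amp;quot;cohort `a': tau = &amp;quot; e(ATT)
    restore
}
&lt;/code>&lt;/pre>
&lt;p>Each pass sees a block design with a single adoption year, so no already-treated country can enter a donor pool. The single staggered call in Section 7 wraps this procedure and then aggregates the cohort effects.&lt;/p>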
&lt;pre>&lt;code class="language-mermaid">graph LR
POOL[&amp;quot;110 never-treated&amp;lt;br/&amp;gt;controls (donor pool)&amp;quot;]
C1[&amp;quot;Cohort 2000&amp;lt;br/&amp;gt;+ controls&amp;quot;]
C2[&amp;quot;Cohort 2002&amp;lt;br/&amp;gt;+ controls&amp;quot;]
CD[&amp;quot;Cohorts 2003…2013&amp;lt;br/&amp;gt;+ controls&amp;quot;]
T1[&amp;quot;SDID &amp;amp;rarr; &amp;amp;tau;&amp;lt;sub&amp;gt;2000&amp;lt;/sub&amp;gt; = 8.4&amp;quot;]
T2[&amp;quot;SDID &amp;amp;rarr; &amp;amp;tau;&amp;lt;sub&amp;gt;2002&amp;lt;/sub&amp;gt; = 7.0&amp;quot;]
TD[&amp;quot;SDID &amp;amp;rarr; &amp;amp;tau;&amp;lt;sub&amp;gt;a&amp;lt;/sub&amp;gt;&amp;lt;br/&amp;gt;(&amp;amp;minus;3.5 … +21.8)&amp;quot;]
ATT[&amp;quot;Aggregate ATT = 8.0&amp;lt;br/&amp;gt;weighted by treated periods&amp;quot;]
POOL --&amp;gt; C1 --&amp;gt; T1 --&amp;gt; ATT
POOL --&amp;gt; C2 --&amp;gt; T2 --&amp;gt; ATT
POOL --&amp;gt; CD --&amp;gt; TD --&amp;gt; ATT
style POOL fill:#6a9bcc,stroke:#141413,color:#fff
style C1 fill:#d97757,stroke:#141413,color:#fff
style C2 fill:#d97757,stroke:#141413,color:#fff
style CD fill:#d97757,stroke:#141413,color:#fff
style T1 fill:#1f2b5e,stroke:#6a9bcc,color:#fff
style T2 fill:#1f2b5e,stroke:#6a9bcc,color:#fff
style TD fill:#1f2b5e,stroke:#6a9bcc,color:#fff
style ATT fill:#00d4c8,stroke:#141413,color:#141413
&lt;/code>&lt;/pre>
&lt;p>The overall ATT aggregates the cohort effects with &lt;strong>non-negative&lt;/strong> weights equal to each cohort&amp;rsquo;s share of treated unit-by-post-period observations:&lt;/p>
&lt;p>$$
\widehat{ATT} = \sum_{a \in \mathcal{A}} \frac{N_{tr}^{a}\, T_{post}^{a}}{\sum_{b \in \mathcal{A}} N_{tr}^{b}\, T_{post}^{b}}\ \hat{\tau}_a
$$&lt;/p>
&lt;p>In words: a cohort counts in proportion to how many treated country-years it contributes. The 2000 cohort, treated for 16 years (2000–2015), carries more weight than the 2013 cohort, treated for only 3. This is the staggered generalization of single-cohort SDID, and — unlike TWFE — every weight is positive and interpretable. (When each cohort has one treated unit, this reduces to the post-period share $T_{post}^{a}/T_{post}$ from Clarke et al., 2024.)&lt;/p>
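&lt;p>As a rough check on these weights, the sketch below counts treated country-years per cohort directly from the panel. It assumes &lt;code>firsttreat&lt;/code> from Section 3.1 is still in memory; the names &lt;code>cells&lt;/code>, &lt;code>allcells&lt;/code>, and &lt;code>weight&lt;/code> are introduced here only for illustration.&lt;/p>
&lt;pre>&lt;code class="language-stata">preserve
* keep only treated country-years
keep if quota == 1
* one row per cohort, with its number of treated country-years
contract firsttreat, freq(cells)
egen allcells = total(cells)
* cohort share in the aggregate ATT
gen weight = cells / allcells
list firsttreat cells weight, noobs
restore
&lt;/code>&lt;/pre>
&lt;p>Given the cohort table in Section 3.1 and a panel that ends in 2015, the 2000 cohort should contribute 16 of 94 treated country-years (about 17%) and the 2013 cohort 3 of 94 (about 3%).&lt;/p>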
&lt;h2 id="7-estimation-in-stata">7. Estimation in Stata&lt;/h2>
&lt;p>One command does the whole staggered procedure. We request bootstrap inference and a fixed seed for reproducibility.&lt;/p>
&lt;pre>&lt;code class="language-stata">sdid womparl country year quota, vce(bootstrap) seed(1213)
matrix list e(tau)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Synthetic Difference-in-Differences Estimator
-----------------------------------------------------------------------------
womparl | ATT Std. Err. t P&amp;gt;|t| [95% Conf. Interval]
-------------+---------------------------------------------------------------
quota | 8.03410 3.74040 2.15 0.032 0.70305 15.36516
-----------------------------------------------------------------------------
&lt;/code>&lt;/pre>
&lt;p>The overall &lt;strong>ATT is +8.03 percentage points&lt;/strong> (SE 3.74, $t=2.15$, $p=0.032$), with a 95% confidence interval of [0.70, 15.37] that excludes zero. Substantively: adopting a parliamentary gender quota raises the share of women in parliament by about &lt;strong>eight percentage points&lt;/strong> in the adopting countries — a large effect against a sample mean of 15%, and statistically distinguishable from no effect at the 5% level.&lt;/p>
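&lt;p>The test statistic, p-value, and interval can be reproduced from the point estimate and standard error alone. The sketch assumes a normal approximation, which the reported numbers are consistent with; the scalar names are ours.&lt;/p>
&lt;pre>&lt;code class="language-stata">* values copied from the sdid table above
scalar att   = 8.03410
scalar se    = 3.74040
scalar zstat = att/se
* two-sided p-value under the normal approximation
scalar pval  = 2*(1 - normal(abs(zstat)))
* 95% confidence limits
scalar ci_lo = att - invnormal(0.975)*se
scalar ci_hi = att + invnormal(0.975)*se
display zstat
display pval
display ci_lo
display ci_hi
&lt;/code>&lt;/pre>
&lt;p>The four values should come out near 2.15, 0.032, 0.70, and 15.37, matching the table up to rounding.&lt;/p>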
&lt;p>The single number, though, is the average of a very heterogeneous set of cohort effects, returned in &lt;code>e(tau)&lt;/code>:&lt;/p>
&lt;pre>&lt;code class="language-text">T[7,3]
Tau Std.Err. Time
r1 8.3888685 .68278345 2000
r2 6.9677465 .64102999 2002
r3 13.952256 9.1289943 2003
r4 -3.4505431 .75603453 2005
r5 2.7490355 .44799502 2010
r6 21.762716 .91589982 2012
r7 -.82032354 .83151601 2013
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_sdid_staggered_cohort_taus.png" alt="Cohort-specific SDID effects with 95% confidence intervals and the aggregate ATT">&lt;/p>
&lt;p>The cohort effects span an enormous range: from &lt;strong>−3.5 points&lt;/strong> (2005 cohort) to &lt;strong>+21.8 points&lt;/strong> (2012 cohort), with the 2003 cohort essentially uninformative (SE 9.13, a confidence interval that runs from −4 to +32). The teal line marks the aggregate ATT of 8.0. Notice that this aggregate is &lt;strong>not&lt;/strong> the simple average of the seven cohort effects, which would be about 7.1. It is the &lt;em>treated-period-weighted&lt;/em> average from the aggregation formula, which up-weights the earlier, longer-exposed 2000, 2002, and 2003 cohorts. The lesson of the figure is that &amp;ldquo;+8 points on average&amp;rdquo; is a summary of real heterogeneity, not a universal constant: some quotas were transformative, while others show effects that are small or even negative.&lt;/p>
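&lt;p>The gap between the weighted and the simple average is easy to verify by hand. The Mata sketch below copies the cohort effects from &lt;code>e(tau)&lt;/code>, takes the cohort sizes from Section 3.1, and counts post-adoption years through 2015; the object names are ours.&lt;/p>
&lt;pre>&lt;code class="language-stata">mata:
// cohort effects copied from e(tau), cohorts 2000 to 2013
tau   = (8.3888685 \ 6.9677465 \ 13.952256 \ -3.4505431 \ 2.7490355 \ 21.762716 \ -.82032354)
// treated countries per cohort
ntr   = (1 \ 2 \ 2 \ 1 \ 1 \ 1 \ 1)
// post-adoption years per cohort, adoption year through 2015
tpost = (16 \ 14 \ 13 \ 11 \ 6 \ 4 \ 3)
// treated country-years and the implied aggregation weights
cells = ntr :* tpost
w     = cells / sum(cells)
st_numscalar(&amp;quot;att_agg&amp;quot;, sum(w :* tau))
st_numscalar(&amp;quot;att_simple&amp;quot;, sum(tau) / rows(tau))
end
display att_agg
display att_simple
&lt;/code>&lt;/pre>
&lt;p>The weighted average comes out at about 8.03, the headline ATT, while the unweighted mean is about 7.08. The difference comes from the 2002 and 2003 cohorts, which together hold 54 of the 94 treated country-years.&lt;/p>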
&lt;p>To see the synthetic-control machinery underneath one cohort, the figure below plots the 2002 cohort against its synthetic control. Because SDID matches the pre-period &lt;em>trend&lt;/em> and lets the unit fixed effect absorb the &lt;em>level&lt;/em> gap, we anchor the synthetic to the treated cohort by its $\lambda$-weighted pre-period gap so the two align before adoption.&lt;/p>
&lt;p>&lt;img src="stata_sdid_staggered_cohort2002_path.png" alt="SDID counterfactual for the 2002 cohort (synthetic anchored to the treated pre-period)">&lt;/p>
&lt;p>The treated 2002 cohort (orange) and its anchored synthetic control (blue dashed) track each other closely &lt;strong>before 2002&lt;/strong> — the synthetic was built precisely to do so — and then diverge: the treated cohort climbs to roughly 15% women in parliament while the synthetic counterfactual reaches only about 9–10%. That post-2002 gap is the cohort effect, about +7 points, matching $\hat{\tau}_{2002}=6.97$ from &lt;code>e(tau)&lt;/code>.&lt;/p>
&lt;p>Which pre-period years anchor that comparison? The time weights $\hat{\lambda}_t$ for the 2002 cohort do not spread evenly over 1990–2001 — they concentrate on the years just before adoption.&lt;/p>
&lt;p>&lt;img src="stata_sdid_staggered_lambda.png" alt="SDID pre-period time weights (λ) for the 2002 cohort">&lt;/p>
&lt;p>The bars show SDID&amp;rsquo;s baseline for the 2002 cohort leaning on the late 1990s and 2001 — the pre-adoption years whose level most resembles the post-adoption period — rather than weighting all twelve pre-years equally as a plain difference-in-differences would. This is the time-weighting half of SDID at work: it builds the &amp;ldquo;before&amp;rdquo; from the most relevant history, which is also the baseline the event study below measures against.&lt;/p>
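&lt;p>To inspect these weights numerically rather than graphically, one option is to re-estimate on the 2002 cohort alone and list the stored weight matrix. This is a sketch under two assumptions: that the cohort variable is called &lt;code>quotaYear&lt;/code>, as in the event-study code later in this tutorial, and that &lt;code>sdid&lt;/code> leaves the time weights in &lt;code>e(lambda)&lt;/code>.&lt;/p>
&lt;pre>&lt;code class="language-stata">* keep the 2002 adopters plus the never-treated controls, then restore the full data
preserve
keep if quotaYear == 2002 | missing(quotaYear)
sdid womparl country year quota, vce(placebo) seed(1213)
* assumed: e(lambda) holds one weight per pre-adoption year
matrix list e(lambda)
restore
&lt;/code>&lt;/pre>
&lt;p>If the assumptions hold, the listed weights should be close to zero for the early 1990s and concentrated on the years just before 2002, mirroring the bars in the figure.&lt;/p>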
&lt;h2 id="8-adding-a-covariate-optimized-vs-projected">8. Adding a covariate: optimized vs projected&lt;/h2>
&lt;p>Does the quota effect simply reflect economic development, with countries that grow richer also electing more women? We can condition on log GDP per capita. The &lt;code>sdid&lt;/code> command offers two routes, and SDID needs a balanced panel, so we first drop the country-years with missing &lt;code>lngdp&lt;/code>.&lt;/p>
&lt;pre>&lt;code class="language-stata">drop if missing(lngdp)
sdid womparl country year quota, vce(bootstrap) seed(2022) covariates(lngdp, optimized)
sdid womparl country year quota, vce(bootstrap) seed(1213) covariates(lngdp, projected)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">SDID + lngdp (optimized) ATT = 8.0515 SE = 3.0466
SDID + lngdp (projected) ATT = 8.0593 SE = 3.1191
&lt;/code>&lt;/pre>
&lt;p>The two methods differ in &lt;em>how&lt;/em> they estimate the covariate&amp;rsquo;s coefficient. The &lt;strong>optimized&lt;/strong> method (Arkhangelsky et al., 2021) folds the covariate adjustment into the SDID optimization itself, estimating it jointly with the weights; it is flexible but computationally heavy. The &lt;strong>projected&lt;/strong> method (Kranz, 2022) instead regresses the outcome on the covariate among the &lt;em>untreated&lt;/em> observations first, then runs SDID on the residuals, which is much faster and numerically more stable. Reassuringly, here they agree to within 0.01 of a percentage point: &lt;strong>8.05 and 8.06&lt;/strong>, essentially unchanged from the no-covariate estimate of 8.03. Controlling for income does &lt;strong>not&lt;/strong> explain away the quota effect; the result is robust to the most obvious confounder.&lt;/p>
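&lt;p>The logic of the projected route can be mimicked by hand in two steps. The sketch below is our own approximation, not the package&amp;rsquo;s internal code: it assumes the first-stage regression includes country and year fixed effects and uses every observation with &lt;code>quota == 0&lt;/code>, and the variable names &lt;code>cid&lt;/code> and &lt;code>womparl_adj&lt;/code> are hypothetical. It should land near the projected estimate but need not match it exactly.&lt;/p>
&lt;pre>&lt;code class="language-stata">* assumes quota_example is in memory and missing lngdp has already been dropped
egen long cid = group(country)
* step 1: covariate slope from untreated observations only (assumed two-way fixed effects)
quietly areg womparl lngdp i.year if quota == 0, absorb(cid)
* step 2: strip the covariate contribution from every observation, then run plain SDID
generate double womparl_adj = womparl - _b[lngdp]*lngdp
sdid womparl_adj country year quota, vce(bootstrap) seed(1213)
&lt;/code>&lt;/pre>
&lt;p>Separating the two steps makes the design choice visible: the covariate coefficient is learned without touching treated observations, so it cannot absorb part of the treatment effect.&lt;/p>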
&lt;h2 id="9-the-event-study-with-sdid_event">9. The event study with &lt;code>sdid_event&lt;/code>&lt;/h2>
&lt;p>A single ATT — even per cohort — cannot tell us &lt;em>when&lt;/em> the effect appears, or whether treated and control countries were already diverging &lt;em>before&lt;/em> the quota. For that we need an &lt;strong>event study&lt;/strong>: the treatment effect traced out by years relative to adoption. The modern &lt;code>sdid_event&lt;/code> command (Ciccia, Clarke &amp;amp; Pailañir, 2024) computes exactly this for SDID, including pre-period &lt;strong>placebo&lt;/strong> estimates that serve as a parallel-trends test.&lt;/p>
&lt;p>The dynamic effect at event time $\ell$ is the treated-minus-synthetic gap in that period, &lt;em>net of the same gap at baseline&lt;/em>, where — characteristically for SDID — the baseline is the $\lambda$-weighted pre-period average rather than a single &amp;ldquo;year −1&amp;rdquo;:&lt;/p>
&lt;p>$$
\delta_{\ell} = \left(\bar{Y}_{\ell}^{tr} - \bar{Y}_{\ell}^{co}\right) - \left(\bar{Y}_{base}^{tr} - \bar{Y}_{base}^{co}\right), \qquad \bar{Y}_{base}^{g} = \sum_{t=1}^{T_{pre}} \hat{\lambda}_t\, \bar{Y}_t^{g}, \quad g \in \{tr, co\}
$$&lt;/p>
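&lt;p>A toy calculation makes the formula concrete. Every number below is hypothetical and chosen only for illustration: three pre-periods, time weights that load mostly on the last one, and a single post-adoption period in which the treated mean is 12 and the synthetic mean is 6.&lt;/p>
&lt;pre>&lt;code class="language-stata">mata:
// all numbers are hypothetical, chosen only to illustrate the formula
lambda  = (0, 0.2, 0.8)      // pre-period time weights, summing to one
ytr_pre = (5, 6, 7)          // treated mean in the three pre-periods
yco_pre = (4, 4.5, 5)        // synthetic-control mean in the same periods
base_tr = sum(lambda :* ytr_pre)    // lambda-weighted treated baseline
base_co = sum(lambda :* yco_pre)    // lambda-weighted synthetic baseline
// gap in the post-adoption period, net of the weighted baseline gap
delta = (12 - 6) - (base_tr - base_co)
delta
end
&lt;/code>&lt;/pre>
&lt;p>Here the weighted baselines are 6.8 and 4.9, so the baseline gap is 1.9 and the dynamic effect is 6 − 1.9 = 4.1. A plain difference-in-differences with equal weights would instead subtract the simple pre-period average gap of 1.5.&lt;/p>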
&lt;p>&lt;code>sdid_event&lt;/code> handles the full staggered panel directly, returning a cohort-aggregated ATT plus dynamic effects. To read the dynamics transparently we focus the &lt;em>plot&lt;/em> on the 2002 cohort — the package authors&amp;rsquo; own worked example — which gives a clean event-time axis; the full-panel call confirms the same aggregated ATT (≈ 8.06).&lt;/p>
&lt;pre>&lt;code class="language-stata">ssc install sdid_event, replace
* full staggered panel: aggregated ATT + cohort-aggregated dynamic effects
sdid_event womparl country year quota, vce(bootstrap) brep(100) effects(8) placebo(5) covariates(lngdp)
* clean event study on the 2002 cohort, with all placebos
keep if quotaYear==2002 | quotaYear==.
sdid_event womparl country year quota, vce(placebo) brep(100) placebo(all) covariates(lngdp)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text"> | Estimate SE LB CI UB CI Switchers
-------------+------------------------------------------------------
ATT | 6.853472 3.372744 .2428928 13.46405 2
Effect_1 | 4.086404 1.191517 1.75103 6.421778 2
Effect_2 | 9.164442 1.522799 6.179756 12.14913 2
Effect_3 | 7.938504 2.182572 3.660663 12.21635 2
... |
Placebo_1 | -.218417 .470226 -1.14006 .703227 2
Placebo_2 | .242148 .884557 -1.491584 1.975880 2
... |
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_sdid_staggered_event_study.png" alt="Event-study SDID for the 2002 cohort: flat placebos before adoption, rising effects after">&lt;/p>
&lt;p>This plot rewards careful reading, and there are three things to look for.&lt;/p>
&lt;p>&lt;strong>First, the baseline is $\lambda$-weighted, not &amp;ldquo;the year before.&amp;rdquo;&lt;/strong> Unlike a textbook event study that normalizes to $t=-1$, SDID measures everything against the optimally weighted pre-period average. That is why the zero line is a &lt;em>weighted&lt;/em> baseline; do not read it as the single pre-adoption year.&lt;/p>
&lt;p>&lt;strong>Second, the points to the &lt;em>left&lt;/em> of zero are placebo tests.&lt;/strong> Every pre-adoption coefficient (&lt;code>Placebo_1&lt;/code> through &lt;code>Placebo_12&lt;/code>, event times −1 to −12) sits within a whisker of zero, ranging only from about −0.2 to +0.8. Because the treated cohort and its synthetic control moved in parallel &lt;em>before&lt;/em> 2002, we cannot reject the parallel-(synthetic-)trends assumption. This is the identifying assumption made visible, and here it survives the test.&lt;/p>
&lt;p>&lt;strong>Third, the points to the &lt;em>right&lt;/em> of zero are the dynamic ATT.&lt;/strong> The effect appears immediately at adoption (&lt;code>Effect_1&lt;/code> = +4.1 points at event time 0), more than doubles a year later (&lt;code>Effect_2&lt;/code> = +9.2), and then settles in the +6 to +9 range for over a decade. Quotas do not just shift the level once; they sustain a higher share of women in parliament. Aggregated by the same treated-period logic as before, these dynamic effects reproduce the cohort&amp;rsquo;s overall ATT of about +7 points, but the plot shows the &lt;em>shape&lt;/em> the single number conceals.&lt;/p>
&lt;h2 id="10-inference-bootstrap-jackknife-and-placebo">10. Inference: bootstrap, jackknife, and placebo&lt;/h2>
&lt;p>With one treated unit (California), the previous tutorial could only use placebo/permutation inference. With &lt;strong>nine&lt;/strong> treated units here, all three of &lt;code>sdid&lt;/code>&amp;rsquo;s variance estimators are on the table. To keep the comparison clean — jackknife needs more than one treated unit &lt;em>per adoption period&lt;/em> — we follow Clarke et al. (2024) and restrict to the two-country 2002 and 2003 cohorts by dropping the five single-country cohorts.&lt;/p>
&lt;pre>&lt;code class="language-mermaid">graph TD
Q1{&amp;quot;How many&amp;lt;br/&amp;gt;treated units?&amp;quot;}
Q1 --&amp;gt;|&amp;quot;One (e.g. California)&amp;quot;| PL1[&amp;quot;Placebo only&amp;lt;br/&amp;gt;jackknife undefined&amp;quot;]
Q1 --&amp;gt;|&amp;quot;Many (e.g. 9 quota adopters)&amp;quot;| Q2{&amp;quot;More controls than treated?&amp;lt;br/&amp;gt;no singleton cohorts?&amp;quot;}
Q2 --&amp;gt;|&amp;quot;Yes&amp;quot;| ALL[&amp;quot;All three available&amp;quot;]
Q2 --&amp;gt;|&amp;quot;Singleton cohorts&amp;quot;| PL2[&amp;quot;Placebo / bootstrap&amp;lt;br/&amp;gt;jackknife drops out&amp;quot;]
ALL --&amp;gt; BOOT[&amp;quot;bootstrap&amp;lt;br/&amp;gt;SE 4.7 (default)&amp;quot;]
ALL --&amp;gt; JACK[&amp;quot;jackknife&amp;lt;br/&amp;gt;SE 6.0 (most conservative)&amp;quot;]
ALL --&amp;gt; PLAC[&amp;quot;placebo&amp;lt;br/&amp;gt;SE 2.3 (homoskedastic)&amp;quot;]
style Q1 fill:#141413,stroke:#6a9bcc,color:#fff
style Q2 fill:#141413,stroke:#6a9bcc,color:#fff
style PL1 fill:#d97757,stroke:#141413,color:#fff
style PL2 fill:#d97757,stroke:#141413,color:#fff
style ALL fill:#00d4c8,stroke:#141413,color:#141413
style BOOT fill:#6a9bcc,stroke:#141413,color:#fff
style JACK fill:#6a9bcc,stroke:#141413,color:#fff
style PLAC fill:#6a9bcc,stroke:#141413,color:#fff
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-stata">drop if inlist(country,&amp;quot;Algeria&amp;quot;,&amp;quot;Kenya&amp;quot;,&amp;quot;Samoa&amp;quot;,&amp;quot;Swaziland&amp;quot;,&amp;quot;Tanzania&amp;quot;)
sdid womparl country year quota, vce(bootstrap) seed(1213)
sdid womparl country year quota, vce(placebo) seed(1213)
sdid womparl country year quota, vce(jackknife)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">method att se ci_l ci_u
bootstrap 10.33066 4.7291 1.0618 19.5995
placebo 10.33066 2.3404 5.7436 14.9178
jackknife 10.33066 6.0056 -1.4401 22.1014
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_sdid_staggered_inference.png" alt="Same ATT, three variance estimators">&lt;/p>
&lt;p>The point estimate is &lt;strong>identical&lt;/strong> across all three methods (10.33 points on this subsample) because the inference procedure changes only the &lt;em>standard error&lt;/em>, never the estimate. But the standard errors differ by a factor of about 2.6: &lt;strong>jackknife is the most conservative&lt;/strong> (SE 6.01, a confidence interval that crosses zero), &lt;strong>placebo is the tightest&lt;/strong> (SE 2.34) but rests on a homoskedasticity assumption and requires more controls than treated units, and &lt;strong>bootstrap sits in between&lt;/strong> (SE 4.73) and is the default. The practical takeaway: with only a handful of treated units, report the bootstrap as your headline but cross-check it; a result that is &amp;ldquo;significant&amp;rdquo; under placebo but not under jackknife deserves caution. (The subsample ATT of 10.3 is larger than the full-sample 8.0 because only the 2002 and 2003 cohorts remain, with effects of 6.97 and 13.95; the dropped single-country cohorts include the negative 2005 and 2013 effects as well as the large 2012 effect.)&lt;/p>
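&lt;p>The subsample estimate of 10.33 can be checked by hand from the cohort effects reported earlier in &lt;code>e(tau)&lt;/code>. The sketch below rests on our reading of the treated-period weights, namely that each cohort is weighted by its number of treated country-years (countries times post-adoption years, with the panel ending in 2015); the scalar names are illustrative.&lt;/p>
&lt;pre>&lt;code class="language-stata">* treated country-years per cohort = countries x post-adoption years (panel ends in 2015)
scalar w2002 = 2 * (2015 - 2002 + 1)
scalar w2003 = 2 * (2015 - 2003 + 1)
* weighted average of the 2002 and 2003 cohort effects from e(tau)
scalar att_sub = (w2002*6.9677465 + w2003*13.952256) / (w2002 + w2003)
display %8.4f att_sub
&lt;/code>&lt;/pre>
&lt;p>With weights of 28 and 26 country-years, the weighted average comes out at about 10.33, matching the point estimate in the table above.&lt;/p>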
&lt;h2 id="11-robustness-and-discussion">11. Robustness and discussion&lt;/h2>
&lt;p>Three caveats keep the result honest. &lt;strong>Effect concentration:&lt;/strong> the +8 aggregate leans heavily on a few cohorts: the 2012 cohort alone contributes a +21.8 effect, and the early 2000/2002/2003 cohorts carry most of the aggregation weight. Drop the 2012 cohort and the average falls noticeably. &lt;strong>Fragile counterfactuals:&lt;/strong> even with 110 control countries, a cohort may contain as few as one treated country, so some cohort estimates are imprecise; the 2003 cohort&amp;rsquo;s standard error of 9.13 is the tell. &lt;strong>Identifying assumptions:&lt;/strong> SDID still requires no anticipation, an absorbing treatment, no cross-country spillovers, and that quota timing is not itself a response to the outcome&amp;rsquo;s trajectory; the flat event-study placebos support, but cannot prove, the parallel-trends part. Finally, &lt;code>quota_example&lt;/code> is a teaching subset of Bhalotra et al. (2023); these numbers illustrate the &lt;em>method&lt;/em>, not a final verdict on quota policy.&lt;/p>
&lt;h2 id="12-summary-and-key-takeaways">12. Summary and key takeaways&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Method.&lt;/strong> Staggered SDID estimates a &lt;em>separate, clean&lt;/em> synthetic difference-in-differences for each adoption cohort — comparing it only to never-treated controls — and aggregates the cohort effects $\hat{\tau}_a$ with non-negative, treated-period-share weights. This avoids the negative-weighting trap that contaminates naive two-way fixed-effects DiD under staggered timing.&lt;/li>
&lt;li>&lt;strong>Result.&lt;/strong> Gender quotas raise the share of women in parliament by an overall &lt;strong>ATT of +8.0 percentage points&lt;/strong> (SE 3.74, $p=0.032$), robust to a log-GDP control (8.05 optimized, 8.06 projected). Cohort effects range widely, from &lt;strong>−3.5 to +21.8 points&lt;/strong> — heterogeneity the single number hides.&lt;/li>
&lt;li>&lt;strong>Event study.&lt;/strong> The &lt;code>sdid_event&lt;/code> plot shows pre-adoption placebo coefficients near zero (parallel synthetic trends) and post-adoption effects that appear immediately and persist for over a decade — the dynamics behind the average.&lt;/li>
&lt;li>&lt;strong>Inference.&lt;/strong> With several treated units, bootstrap and placebo inference are available on the full sample, and jackknife becomes available once the single-country cohorts are dropped. On the two-cohort illustration all three share one point estimate (10.3) but report standard errors of 4.7 (bootstrap), 6.0 (jackknife), and 2.3 (placebo). Jackknife is the most conservative.&lt;/li>
&lt;li>&lt;strong>Bridge.&lt;/strong> The block design (Proposition 99, the &lt;a href="https://carlos-mendez.org/post/stata_sdid/">previous tutorial&lt;/a>) and the staggered design here are two faces of one estimator — the staggered version is just single-cohort SDID, done once per cohort and averaged.&lt;/li>
&lt;/ul>
&lt;h2 id="13-exercises">13. Exercises&lt;/h2>
&lt;ol>
&lt;li>&lt;strong>Re-aggregate by hand.&lt;/strong> Pull &lt;code>e(tau)&lt;/code> and each cohort&amp;rsquo;s treated unit-count and post-period length. Verify that the treated-period-weighted average of the seven $\hat{\tau}_a$ reproduces the overall ATT of 8.03, and show that it differs from the unweighted mean (≈ 7.1). Which cohorts move the aggregate the most?&lt;/li>
&lt;li>&lt;strong>Inference sensitivity.&lt;/strong> Re-run the full nine-country sample with &lt;code>vce(bootstrap)&lt;/code> and then &lt;code>vce(placebo)&lt;/code> at &lt;code>reps(500)&lt;/code>. How much do the standard error and confidence interval move, and which would you report given only nine treated units?&lt;/li>
&lt;li>&lt;strong>Drop the outlier cohort.&lt;/strong> Re-estimate the overall ATT excluding the 2012 cohort (the +21.8 outlier). How far does the aggregate fall, and what does that tell you about how concentrated the average effect is?&lt;/li>
&lt;/ol>
&lt;h2 id="14-references">14. References&lt;/h2>
&lt;ol>
&lt;li>Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., &amp;amp; Wager, S. (2021). &lt;a href="https://doi.org/10.1257/aer.20190159" target="_blank" rel="noopener">Synthetic Difference-in-Differences&lt;/a>. &lt;em>American Economic Review&lt;/em>, 111(12), 4088–4118.&lt;/li>
&lt;li>Clarke, D., Pailañir, D., Athey, S., &amp;amp; Imbens, G. (2024). &lt;a href="https://doi.org/10.1177/1536867X241297184" target="_blank" rel="noopener">On Synthetic Difference-in-Differences and Related Estimation Methods in Stata&lt;/a>. &lt;em>The Stata Journal&lt;/em>, 24(4). Package: &lt;code>ssc install sdid&lt;/code>.&lt;/li>
&lt;li>Ciccia, D. (2024). &lt;a href="https://arxiv.org/abs/2407.09565" target="_blank" rel="noopener">A Short Note on Event-Study Synthetic Difference-in-Differences Estimators&lt;/a>. Package: &lt;code>ssc install sdid_event&lt;/code>.&lt;/li>
&lt;li>Bhalotra, S., Clarke, D., Gomes, J. F., &amp;amp; Venkataramani, A. (2023). &lt;a href="https://doi.org/10.1093/jeea/jvad043" target="_blank" rel="noopener">Maternal Mortality and Women&amp;rsquo;s Political Power&lt;/a>. &lt;em>Journal of the European Economic Association&lt;/em>. (Source of the &lt;code>quota_example&lt;/code> data.)&lt;/li>
&lt;li>Goodman-Bacon, A. (2021). &lt;a href="https://doi.org/10.1016/j.jeconom.2021.03.014" target="_blank" rel="noopener">Difference-in-Differences with Variation in Treatment Timing&lt;/a>. &lt;em>Journal of Econometrics&lt;/em>, 225(2), 254–277.&lt;/li>
&lt;li>de Chaisemartin, C., &amp;amp; D&amp;rsquo;Haultfœuille, X. (2020). &lt;a href="https://doi.org/10.1257/aer.20181169" target="_blank" rel="noopener">Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects&lt;/a>. &lt;em>American Economic Review&lt;/em>, 110(9), 2964–2996.&lt;/li>
&lt;li>Xu, Y., &amp;amp; Hua, L. &lt;a href="https://yiqingxu.org/packages/panelview_stata/" target="_blank" rel="noopener">panelView: Visualizing Panel Data&lt;/a>. Package: &lt;code>ssc install panelview&lt;/code>.&lt;/li>
&lt;/ol>
&lt;p>&lt;em>Related tutorials on this site:&lt;/em> &lt;a href="https://carlos-mendez.org/post/stata_sdid/">Synthetic Difference-in-Differences (the block design)&lt;/a> · &lt;a href="https://carlos-mendez.org/post/stata_did/">Difference-in-Differences&lt;/a>.&lt;/p>
&lt;h2 id="15-acknowledgments">15. Acknowledgments&lt;/h2>
&lt;p>This tutorial uses the &lt;code>sdid&lt;/code> command (Clarke, Pailañir, Athey &amp;amp; Imbens), the &lt;code>sdid_event&lt;/code> command (Ciccia, Clarke &amp;amp; Pailañir), and &lt;code>panelview&lt;/code> (Xu &amp;amp; Hua). The data, &lt;code>quota_example&lt;/code>, is distributed with &lt;code>sdid&lt;/code> and draws on Bhalotra, Clarke, Gomes &amp;amp; Venkataramani (2023). All estimates were produced by the companion &lt;code>analysis.do&lt;/code> and verified against Clarke et al. (2024). AI tools (Claude Code) assisted with drafting and figure preparation; all code was executed and every number checked by the author.&lt;/p>
</description></item></channel></rss>