<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>pysyncon | Carlos Mendez</title><link>https://carlos-mendez.org/tag/pysyncon/</link><atom:link href="https://carlos-mendez.org/tag/pysyncon/index.xml" rel="self" type="application/rss+xml"/><description>pysyncon</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>Carlos Mendez</copyright><lastBuildDate>Fri, 15 May 2026 00:00:00 +0000</lastBuildDate><image><url>https://carlos-mendez.org/media/icon_huedfae549300b4ca5d201a9bd09a3ecd5_79625_512x512_fill_lanczos_center_3.png</url><title>pysyncon</title><link>https://carlos-mendez.org/tag/pysyncon/</link></image><item><title>Carbon Taxes and CO2 Emissions: A Synthetic-Control Analysis in Python</title><link>https://carlos-mendez.org/post/python_sc_co2tax/</link><pubDate>Fri, 15 May 2026 00:00:00 +0000</pubDate><guid>https://carlos-mendez.org/post/python_sc_co2tax/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>In 1991, Sweden put a price on carbon dioxide. It was one of the first countries in the world to do so. The reform sat on top of two earlier taxes on transport fuel — a value-added tax (VAT, added in 1990) and an older energy tax. Together, these taxes pushed Sweden&amp;rsquo;s retail gasoline price well above the wholesale price set by world oil markets.&lt;/p>
&lt;p>Three decades later, two questions matter for policy:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Did the carbon tax actually reduce CO2 emissions from transport?&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Did it cost Sweden any economic growth?&lt;/strong>&lt;/li>
&lt;/ol>
&lt;p>This post answers both questions using the same data Andersson (2019) used. We replicate his analysis step by step in Python.&lt;/p>
&lt;h3 id="what-is-causal-inference-and-why-is-it-hard">What is causal inference, and why is it hard?&lt;/h3>
&lt;p>A causal claim says &lt;em>&amp;ldquo;X caused Y to change.&amp;rdquo;&lt;/em> A correlation says &lt;em>&amp;ldquo;X and Y moved together.&amp;rdquo;&lt;/em> The two are very different. Sweden&amp;rsquo;s transport emissions flattened after the reform, but lots of other things also changed: oil prices, car technology, recessions, EU policy. To say the carbon tax &lt;em>caused&lt;/em> that change, we need to compare Sweden&amp;rsquo;s actual emissions to a Sweden that &lt;strong>did not have the carbon tax&lt;/strong>: a &lt;em>counterfactual&lt;/em> Sweden.&lt;/p>
&lt;p>We can never observe that counterfactual directly. The whole craft of causal inference is about building a plausible one from real data. This post walks through three increasingly serious attempts:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Single-country before/after.&lt;/strong> Naive — confounded by everything else that changed.&lt;/li>
&lt;li>&lt;strong>Difference-in-differences (DiD).&lt;/strong> Better — uses another country (or group of countries) as a benchmark.&lt;/li>
&lt;li>&lt;strong>Synthetic control (SCM).&lt;/strong> Best for this case — builds a weighted blend of donor countries that mimics Sweden &lt;em>before&lt;/em> the reform.&lt;/li>
&lt;/ul>
&lt;p>Each step replaces a strong assumption of the previous method with a weaker, more plausible one. We then validate the synthetic-control result with three different placebo tests, and finally turn to regression analysis to study &lt;em>how&lt;/em> the reform worked at the consumer level.&lt;/p>
&lt;h3 id="acknowledgement-of-sources">Acknowledgement of sources&lt;/h3>
&lt;p>This post is &lt;strong>inspired by&lt;/strong> the &lt;a href="https://github.com/TheresaGraefe/RTutorCarbonTaxesAndCO2Emissions" target="_blank" rel="noopener">RTutor problem set &amp;ldquo;Carbon Taxes and CO2 Emissions&amp;rdquo;&lt;/a> by &lt;a href="https://github.com/TheresaGraefe" target="_blank" rel="noopener">Theresa Graefe&lt;/a> (2020), which in turn replicates &lt;a href="https://doi.org/10.1257/pol.20170144" target="_blank" rel="noopener">Andersson (2019), &lt;em>&amp;ldquo;Carbon Taxes and CO2 Emissions: Sweden as a Case Study&amp;rdquo;&lt;/em>&lt;/a>, &lt;em>AEJ: Economic Policy&lt;/em> 11(4). All empirical results — datasets, donor pool, synthetic-control design, and OLS/IV specifications — are Andersson&amp;rsquo;s; the exercise sequence is Graefe&amp;rsquo;s. Our contribution is a Python version using &lt;a href="https://sdfordham.github.io/pysyncon/synth.html" target="_blank" rel="noopener">&lt;code>pysyncon&lt;/code>&lt;/a> for synthetic control and &lt;a href="https://pyfixest.org/" target="_blank" rel="noopener">&lt;code>pyfixest&lt;/code>&lt;/a> for regressions.&lt;/p>
&lt;h3 id="learning-objectives">Learning objectives&lt;/h3>
&lt;p>By the end of this post you will be able to:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Explain&lt;/strong> why a single-country before/after comparison is not enough to claim a causal effect, and how difference-in-differences (DiD) and synthetic control improve on it.&lt;/li>
&lt;li>&lt;strong>Build a synthetic control&lt;/strong> in Python with &lt;code>pysyncon&lt;/code>: pick a donor pool, choose predictors, fit weights, and read the path / gap plots.&lt;/li>
&lt;li>&lt;strong>Validate&lt;/strong> a synthetic-control estimate with three placebo tests: in-time (fake date), in-space (fake country), and leave-one-out (drop a donor).&lt;/li>
&lt;li>&lt;strong>Estimate&lt;/strong> price and tax elasticities of gasoline demand using OLS and instrumental variables (2SLS) with &lt;code>pyfixest&lt;/code>.&lt;/li>
&lt;li>&lt;strong>Decompose&lt;/strong> the reform&amp;rsquo;s CO2 reduction into the part caused by the carbon tax and the part caused by the VAT.&lt;/li>
&lt;/ul>
&lt;h3 id="key-concepts-you-will-meet">Key concepts you will meet&lt;/h3>
&lt;p>These terms will recur. Each one is defined again in plain English when it first appears in the analysis.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Counterfactual&lt;/strong> — what &lt;em>would have happened&lt;/em> without the policy. Never observed; always estimated.&lt;/li>
&lt;li>&lt;strong>Treatment effect&lt;/strong> — the difference between the observed outcome and the counterfactual outcome.&lt;/li>
&lt;li>&lt;strong>Donor pool&lt;/strong> — the set of untreated countries we use to build the counterfactual.&lt;/li>
&lt;li>&lt;strong>Parallel trends&lt;/strong> — the assumption (in DiD) that treated and control units would have moved in step without treatment.&lt;/li>
&lt;li>&lt;strong>Endogeneity&lt;/strong> — when an explanatory variable is correlated with the error term, biasing the OLS estimate.&lt;/li>
&lt;li>&lt;strong>Instrumental variable (IV)&lt;/strong> — an external variable that shifts the endogenous regressor but does not affect the outcome directly.&lt;/li>
&lt;li>&lt;strong>Semi-elasticity&lt;/strong> — in a log-level model, the percent change in &lt;em>y&lt;/em> when &lt;em>x&lt;/em> rises by one unit.&lt;/li>
&lt;/ul>
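&lt;p>As a rough illustration of the semi-elasticity definition, the sketch below converts a log-level coefficient into a percent change in &lt;em>y&lt;/em>. The two values are the coefficients quoted in the roadmap diagram of this post (-0.060 and -0.186); reading them as effects of a one-unit change in &lt;em>x&lt;/em> is an assumption made here for illustration, and the helper name &lt;code>pct_change_from_semi_elasticity&lt;/code> is hypothetical.&lt;/p>

```python
import math


def pct_change_from_semi_elasticity(beta, delta_x=1.0):
    """Percent change in y implied by log(y) = a + beta * x when x rises by delta_x."""
    return 100.0 * (math.exp(beta * delta_x) - 1.0)


# Coefficients as quoted in the roadmap diagram (illustrative use only).
beta_price = -0.060
beta_tax = -0.186

exact_price = pct_change_from_semi_elasticity(beta_price)
exact_tax = pct_change_from_semi_elasticity(beta_tax)

# For small coefficients, 100 * beta is a close approximation to the exact value.
approx_price = 100.0 * beta_price

print(f"price: exact {exact_price:.2f}%, approx {approx_price:.2f}%")
print(f"tax:   exact {exact_tax:.2f}%")
```

&lt;p>The approximation and the exact conversion drift apart as the coefficient grows in absolute value, which is why the exact formula is the safer default for the larger tax coefficient.&lt;/p>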
&lt;h3 id="a-roadmap-of-the-analysis">A roadmap of the analysis&lt;/h3>
&lt;pre>&lt;code class="language-mermaid">graph TD
A[&amp;quot;OECD panel 1960-2005&amp;lt;br/&amp;gt;15 countries, transport CO2&amp;quot;] --&amp;gt; B[&amp;quot;Naive Sweden time-difference&amp;lt;br/&amp;gt;+0.55 t/cap — confounded&amp;quot;]
A --&amp;gt; C[&amp;quot;DiD vs Denmark&amp;lt;br/&amp;gt;-0.140 t/cap&amp;quot;]
A --&amp;gt; D[&amp;quot;DiD vs OECD pool&amp;lt;br/&amp;gt;-0.214 t/cap, p=0.02&amp;quot;]
A --&amp;gt; E[&amp;quot;Synthetic Sweden&amp;lt;br/&amp;gt;pysyncon.Synth&amp;quot;]
E --&amp;gt; F[&amp;quot;Path &amp;amp;amp; gap plots&amp;lt;br/&amp;gt;-11.3% avg 1990-2005&amp;quot;]
E --&amp;gt; G[&amp;quot;In-time placebo&amp;lt;br/&amp;gt;backdate to 1980&amp;quot;]
E --&amp;gt; H[&amp;quot;In-space placebos&amp;lt;br/&amp;gt;p=0.067&amp;quot;]
E --&amp;gt; I[&amp;quot;Leave-one-out&amp;lt;br/&amp;gt;range 8.8%-13%&amp;quot;]
A --&amp;gt; J[&amp;quot;Synthetic GDP&amp;lt;br/&amp;gt;no growth penalty&amp;quot;]
K[&amp;quot;Tax-incidence + OLS/IV&amp;lt;br/&amp;gt;regression_data.Rds&amp;quot;] --&amp;gt; L[&amp;quot;Pass-through ~1.0&amp;quot;]
K --&amp;gt; M[&amp;quot;OLS4: beta_price=-0.060&amp;lt;br/&amp;gt;beta_tax=-0.186&amp;quot;]
K --&amp;gt; N[&amp;quot;IV oil: beta_tax=-0.186&amp;lt;br/&amp;gt;tax response ~3x price response&amp;quot;]
O[&amp;quot;disentangling_data.dta&amp;quot;] --&amp;gt; P[&amp;quot;Carbon-tax-only&amp;lt;br/&amp;gt;contribution&amp;quot;]
style E fill:#6a9bcc,stroke:#141413,color:#fff
style H fill:#d97757,stroke:#141413,color:#fff
style N fill:#00d4c8,stroke:#141413,color:#000
&lt;/code>&lt;/pre>
&lt;p>Read the diagram top-to-bottom and left-to-right. It mirrors the structure of the post. We start from the raw OECD panel (&lt;code>A&lt;/code>). We try the naive Sweden-only comparison (&lt;code>B&lt;/code>) and find it is confounded. We try DiD (&lt;code>C&lt;/code>, &lt;code>D&lt;/code>), which is better but still flawed. We move to Synthetic Sweden (&lt;code>E&lt;/code>) and read off the headline effect (&lt;code>F&lt;/code>). We validate that effect with three placebo tests (&lt;code>G&lt;/code>, &lt;code>H&lt;/code>, &lt;code>I&lt;/code>). We then check that the reform did not depress GDP (&lt;code>J&lt;/code>). Finally, regression analysis on the demand side (&lt;code>K&lt;/code>–&lt;code>N&lt;/code>) explains &lt;em>why&lt;/em> consumers responded, and the disentangling exercise (&lt;code>O&lt;/code>, &lt;code>P&lt;/code>) separates the carbon tax from the VAT.&lt;/p>
&lt;h2 id="setup-and-imports">Setup and imports&lt;/h2>
&lt;p>We use four specialised packages on top of pandas, numpy, and matplotlib.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://sdfordham.github.io/pysyncon/" target="_blank" rel="noopener">&lt;code>pysyncon&lt;/code>&lt;/a> builds the synthetic control. It uses two main objects: &lt;code>Dataprep&lt;/code> (organises the panel) and &lt;code>Synth&lt;/code> (runs the optimisation that picks the donor weights).&lt;/li>
&lt;li>&lt;a href="https://pyfixest.org/" target="_blank" rel="noopener">&lt;code>pyfixest&lt;/code>&lt;/a> runs OLS and instrumental-variable regressions with a familiar formula syntax. It also offers robust standard errors out of the box.&lt;/li>
&lt;li>&lt;a href="https://www.statsmodels.org/" target="_blank" rel="noopener">&lt;code>statsmodels&lt;/code>&lt;/a> gives us Newey–West HAC standard errors with a chosen lag length. We use this because the original paper computes them in Stata (&lt;code>newey ... lag(16)&lt;/code>).&lt;/li>
&lt;li>&lt;a href="https://github.com/ofajardo/pyreadr" target="_blank" rel="noopener">&lt;code>pyreadr&lt;/code>&lt;/a> reads R &lt;code>.Rds&lt;/code> files in Python — useful because some of the original data ships in that format. The other files are Stata &lt;code>.dta&lt;/code> files, which &lt;code>pandas.read_stata&lt;/code> handles directly.&lt;/li>
&lt;/ul>
&lt;pre>&lt;code class="language-python">from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pyreadr
import statsmodels.api as sm
import pyfixest as pf
from pysyncon import Dataprep, Synth
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
# Site palette (dark theme)
DARK_NAVY = &amp;quot;#0f1729&amp;quot;; GRID_LINE = &amp;quot;#1f2b5e&amp;quot;
LIGHT_TEXT = &amp;quot;#c8d0e0&amp;quot;; WHITE_TEXT = &amp;quot;#e8ecf2&amp;quot;
STEEL_BLUE = &amp;quot;#6a9bcc&amp;quot;; WARM_ORANGE = &amp;quot;#d97757&amp;quot;; TEAL = &amp;quot;#00d4c8&amp;quot;
plt.rcParams.update({
    &amp;quot;figure.facecolor&amp;quot;: DARK_NAVY,
    &amp;quot;axes.facecolor&amp;quot;: DARK_NAVY,
    # ... remaining settings elided here; see script.py
})
&lt;/code>&lt;/pre>
&lt;p>We set the dark-theme plot configuration once, at the top of the script. Every figure inherits it automatically. The dark background also makes small treatment gaps easier to see than a white background does.&lt;/p>
&lt;h2 id="loading-the-data">Loading the data&lt;/h2>
&lt;p>We work with six datasets. Each one answers a different part of the post.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Dataset&lt;/th>
&lt;th>What it holds&lt;/th>
&lt;th>Used for&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>carbontax_data.dta&lt;/code>&lt;/td>
&lt;td>OECD panel, 15 countries × 46 years&lt;/td>
&lt;td>DiD and Synthetic Sweden&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>descr_Sweden.Rds&lt;/code>&lt;/td>
&lt;td>Sweden time series: prices, taxes, CO2, GDP&lt;/td>
&lt;td>Descriptive plots and GDP-gap analysis&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>GDP_data.Rds&lt;/code>&lt;/td>
&lt;td>13-country GDP panel&lt;/td>
&lt;td>Synthetic-GDP exercise&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>regression_data.Rds&lt;/code>&lt;/td>
&lt;td>Sweden time series for elasticity model&lt;/td>
&lt;td>OLS and IV regressions&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>leave_one_out_data.dta&lt;/code>&lt;/td>
&lt;td>Pre-computed leave-one-out series&lt;/td>
&lt;td>Robustness plot&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>disentangling_data.dta&lt;/code>&lt;/td>
&lt;td>Three counterfactual emission paths&lt;/td>
&lt;td>Carbon-tax-vs-VAT decomposition&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>A &lt;strong>panel dataset&lt;/strong> has multiple units (here, countries) observed across multiple time periods (here, years). The outcome of interest is the same throughout: per-capita CO2 emissions from transport in metric tons.&lt;/p>
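&lt;pre>&lt;code class="language-python">DATA_DIR = Path(&amp;quot;data&amp;quot;)  # assumption: folder with the replication files; adjust to your layout
panel = pd.read_stata(DATA_DIR / &amp;quot;carbontax_data.dta&amp;quot;)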
descr_sweden = pyreadr.read_r(DATA_DIR / &amp;quot;descr_Sweden.Rds&amp;quot;)[None].reset_index(drop=True)
gdp_data = pyreadr.read_r(DATA_DIR / &amp;quot;GDP_data.Rds&amp;quot;)[None].reset_index(drop=True)
reg_data = pyreadr.read_r(DATA_DIR / &amp;quot;regression_data.Rds&amp;quot;)[None].reset_index(drop=True)
loo = pd.read_stata(DATA_DIR / &amp;quot;leave_one_out_data.dta&amp;quot;)
disent = pd.read_stata(DATA_DIR / &amp;quot;disentangling_data.dta&amp;quot;)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">panel (carbontax_data.dta): (690, 9), countries=15, years=1960-2005
descr_Sweden.Rds: (46, 14)
GDP_data.Rds: (468, 8), countries=13
regression_data.Rds: (46, 17), years=1970-2015
disentangling_data.dta: (46, 6)
leave_one_out_data.dta: (46, 9)
&lt;/code>&lt;/pre>
&lt;p>The 15 countries in the OECD panel are Australia, Belgium, Canada, Denmark, France, Greece, Iceland, Japan, New Zealand, Poland, Portugal, Spain, Sweden, Switzerland, and the United States. They are all advanced economies with comparable data.&lt;/p>
&lt;p>Three numbers about the time window matter for everything that follows:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>46 years total&lt;/strong> (1960–2005), giving us a long history.&lt;/li>
&lt;li>&lt;strong>30 pre-treatment years&lt;/strong> (1960–1989) to build the counterfactual.&lt;/li>
&lt;li>&lt;strong>16 post-treatment years&lt;/strong> (1990–2005) to measure the effect.&lt;/li>
&lt;/ul>
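&lt;p>The window arithmetic can be checked in a few lines. This is only a sanity-check sketch: the variable names are made up here, and the row count assumes a balanced panel with one row per country-year.&lt;/p>

```python
first_year, reform_year, last_year = 1960, 1990, 2005
n_countries = 15

# range() excludes its upper bound, so range(1960, 1990) stops at 1989.
pre_years = list(range(first_year, reform_year))
post_years = list(range(reform_year, last_year + 1))

n_years = len(pre_years) + len(post_years)
n_rows = n_countries * n_years  # one row per country-year in a balanced panel

print(len(pre_years), len(post_years), n_years, n_rows)
```

&lt;p>The half-open behaviour of &lt;code>range&lt;/code> is worth remembering: any pre-treatment window written as &lt;code>range(1960, 1990)&lt;/code> ends in 1989, not 1990.&lt;/p>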
&lt;p>A long pre-treatment window is a structural advantage of this case study. It lets us check whether our counterfactual model fits well &lt;em>before&lt;/em> the policy. If it does, we are more confident that any post-treatment gap reflects the policy and not random noise.&lt;/p>
&lt;h2 id="descriptive-overview">Descriptive overview&lt;/h2>
&lt;p>A good causal study always starts with descriptive plots. We look at the policy variable (taxes), the outcome variable (CO2 emissions), and the most obvious mechanism between them (fuel consumption) before we run any model. The goal is to &lt;em>see&lt;/em> the data so the later modelling choices feel obvious.&lt;/p>
&lt;h3 id="decomposing-swedens-gasoline-price">Decomposing Sweden&amp;rsquo;s gasoline price&lt;/h3>
&lt;p>What did the 1991 reform actually do to prices at the pump? The retail gasoline price is the sum of four parts:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Wholesale price&lt;/strong> — set by world oil markets, not by Sweden.&lt;/li>
&lt;li>&lt;strong>Energy tax&lt;/strong> — a long-standing per-litre tax, in place before the reform.&lt;/li>
&lt;li>&lt;strong>Carbon tax&lt;/strong> — new in 1991, scaled to the CO2 content of the fuel.&lt;/li>
&lt;li>&lt;strong>VAT&lt;/strong> — new on transport fuel in 1990.&lt;/li>
&lt;/ol>
&lt;p>The first chart layers all four components.&lt;/p>
&lt;pre>&lt;code class="language-python">ds = descr_sweden.copy()
fig, ax = plt.subplots(figsize=(9, 5.4))
ax.plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;pw_real&amp;quot;], color=STEEL_BLUE, lw=2.2, label=&amp;quot;Real wholesale price&amp;quot;)
ax.plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;en_tax&amp;quot;], color=WARM_ORANGE, lw=2.0, label=&amp;quot;Energy tax&amp;quot;)
ax.plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;CO2_tax&amp;quot;], color=TEAL, lw=2.0, label=&amp;quot;Carbon tax&amp;quot;)
ax.plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;VAT&amp;quot;], color=&amp;quot;#c179c8&amp;quot;, lw=1.8, label=&amp;quot;VAT&amp;quot;)
ax.axvline(1990, color=LIGHT_TEXT, lw=0.8, ls=&amp;quot;:&amp;quot;)
ax.set_xlabel(&amp;quot;Year&amp;quot;); ax.set_ylabel(&amp;quot;Real price components (SEK / litre)&amp;quot;)
plt.savefig(&amp;quot;python_sc_co2tax_gasoline_price_components.png&amp;quot;, dpi=300, bbox_inches=&amp;quot;tight&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_gasoline_price_components.png" alt="Sweden gasoline price decomposition 1960-2005">&lt;/p>
&lt;p>Three things stand out:&lt;/p>
&lt;ul>
&lt;li>The carbon tax (teal) is brand new in 1991. It grows steadily after that.&lt;/li>
&lt;li>The energy tax (orange) actually &lt;em>drops&lt;/em> at the same time. The reform was partly a &lt;strong>tax swap&lt;/strong>, not a pure tax hike.&lt;/li>
&lt;li>The wholesale price (blue) is dominated by the 1970s and 1980s oil shocks, not by Swedish policy.&lt;/li>
&lt;/ul>
&lt;p>By 2005 the carbon tax has roughly the same magnitude as the energy tax. Keep this in mind: when we later disentangle &amp;ldquo;carbon tax&amp;rdquo; from &amp;ldquo;VAT&amp;rdquo;, we are looking at a meaningful slice of the total fuel-tax burden, not a rounding error.&lt;/p>
&lt;h3 id="gasoline-consumption-and-co2-emissions">Gasoline consumption and CO2 emissions&lt;/h3>
&lt;p>The reform aimed to cut CO2 emissions. To see whether that happened, we plot two things side by side:&lt;/p>
&lt;ol>
&lt;li>Sweden&amp;rsquo;s CO2 emissions from transport vs the OECD mean.&lt;/li>
&lt;li>Sweden&amp;rsquo;s per-capita gasoline and diesel consumption.&lt;/li>
&lt;/ol>
&lt;p>The second plot matters because reductions in CO2 could come from two different channels:&lt;/p>
&lt;ul>
&lt;li>People &lt;em>drive less&lt;/em> (a pure consumption drop).&lt;/li>
&lt;li>People &lt;em>switch fuels&lt;/em> (from gasoline to more efficient diesel).&lt;/li>
&lt;/ul>
&lt;p>We want to know which of these is happening.&lt;/p>
&lt;pre>&lt;code class="language-python">fig, axes = plt.subplots(1, 2, figsize=(12, 4.8))
axes[0].plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;CO2_Sweden&amp;quot;], color=WARM_ORANGE, lw=2.2)
axes[0].plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;CO2_OECD&amp;quot;], color=STEEL_BLUE, lw=2.0, ls=&amp;quot;--&amp;quot;)
axes[0].axvline(1990, color=LIGHT_TEXT, lw=0.8, ls=&amp;quot;:&amp;quot;)
axes[1].plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;gas_cons&amp;quot;], color=TEAL, lw=2.2, label=&amp;quot;Gasoline&amp;quot;)
axes[1].plot(ds[&amp;quot;year&amp;quot;], ds[&amp;quot;diesel_cons&amp;quot;], color=WARM_ORANGE, lw=2.0, label=&amp;quot;Diesel&amp;quot;)
plt.savefig(&amp;quot;python_sc_co2tax_co2_vs_consumption.png&amp;quot;, dpi=300, bbox_inches=&amp;quot;tight&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_co2_vs_consumption.png" alt="CO2 emissions and fuel consumption in Sweden">&lt;/p>
&lt;p>Two patterns emerge:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Before 1990:&lt;/strong> Sweden&amp;rsquo;s CO2 path moves in step with the OECD mean.&lt;/li>
&lt;li>&lt;strong>After 1990:&lt;/strong> Sweden plateaus while the OECD keeps climbing. That divergence is the first visible sign of an effect.&lt;/li>
&lt;/ul>
&lt;p>On the consumption side, gasoline use peaks in the late 1980s and then declines. Diesel grows steadily throughout. Diesel cars are more fuel-efficient than gasoline cars, so part of the CO2 reduction we will estimate is not &amp;ldquo;less driving&amp;rdquo; but &amp;ldquo;the same driving with less-emitting fuel&amp;rdquo;. This is good to know in advance — it shapes how we interpret the headline number later.&lt;/p>
&lt;p>Next, we plot the same CO2 outcome for &lt;em>every&lt;/em> country in the donor pool. The small-multiples view shows where Sweden sits relative to its potential counterfactuals.&lt;/p>
&lt;pre>&lt;code class="language-python">countries = sorted(panel[&amp;quot;country&amp;quot;].unique())
fig, axes = plt.subplots(3, 5, figsize=(15, 8.5), sharex=True, sharey=True)
for ax, country in zip(axes.ravel(), countries):
    sub = panel[panel[&amp;quot;country&amp;quot;] == country].sort_values(&amp;quot;year&amp;quot;)
    color = WARM_ORANGE if country == &amp;quot;Sweden&amp;quot; else STEEL_BLUE
    ax.plot(sub[&amp;quot;year&amp;quot;], sub[&amp;quot;CO2_transport_capita&amp;quot;], color=color, lw=2.4 if country == &amp;quot;Sweden&amp;quot; else 1.4)
    ax.axvline(1990, color=LIGHT_TEXT, lw=0.6, ls=&amp;quot;:&amp;quot;)
    ax.set_title(country)
plt.savefig(&amp;quot;python_sc_co2tax_co2_donor_pool.png&amp;quot;, dpi=300, bbox_inches=&amp;quot;tight&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_co2_donor_pool.png" alt="CO2 trajectories OECD donor pool">&lt;/p>
&lt;p>Across the fifteen panels, Sweden (orange) sits squarely in the middle of the distribution before 1990. It is neither the highest emitter (US, Canada) nor the lowest (Portugal, Poland). Many donors are reasonable matches for Sweden&amp;rsquo;s pre-1990 level.&lt;/p>
&lt;p>After 1990, several donors keep climbing while Sweden flattens. This is the visual hint that something causal might be happening — and motivates the formal analysis below.&lt;/p>
&lt;h2 id="estimating-causal-effects">Estimating causal effects&lt;/h2>
&lt;p>We now move from looking at the data to estimating the policy&amp;rsquo;s effect. We will try three estimators, ordered from worst to best. Each one fixes a problem with the previous one.&lt;/p>
&lt;h3 id="why-a-single-unit-time-comparison-fails">Why a single-unit time comparison fails&lt;/h3>
&lt;p>The simplest possible analysis: compare Sweden&amp;rsquo;s average CO2 &lt;em>after&lt;/em> 1990 to its average &lt;em>before&lt;/em>. In equation form:&lt;/p>
&lt;p>$$\text{CO2}_{\text{Sweden},t} = \alpha + \delta \cdot \mathbf{1}\{t \geq 1990\} + \varepsilon_t.$$&lt;/p>
&lt;p>Here:&lt;/p>
&lt;ul>
&lt;li>$\alpha$ is the average emission level in the pre-1990 period.&lt;/li>
&lt;li>$\delta$ is the change after 1990 — the coefficient we are estimating.&lt;/li>
&lt;li>$\mathbf{1}\{t \geq 1990\}$ is a 0/1 indicator that turns on starting in 1990.&lt;/li>
&lt;/ul>
&lt;p>In plain English, the regression asks: &amp;ldquo;is Sweden&amp;rsquo;s average post-1990 CO2 higher or lower than its average pre-1990 CO2?&amp;rdquo; This is the wrong question. It treats every other thing that changed in Sweden between 1960 and 2005 (population, income, vehicle stock, EU integration) as if it were part of the carbon tax&amp;rsquo;s effect. The estimand here is just a &lt;em>time difference inside one country&lt;/em>, not a causal effect.&lt;/p>
&lt;pre>&lt;code class="language-python">sw = panel[panel[&amp;quot;country&amp;quot;] == &amp;quot;Sweden&amp;quot;].copy()
sw[&amp;quot;delta&amp;quot;] = (sw[&amp;quot;year&amp;quot;] &amp;gt;= 1990).astype(int)
m_time = pf.feols(&amp;quot;CO2_transport_capita ~ delta&amp;quot;, data=sw, vcov=&amp;quot;HC1&amp;quot;)
print(m_time.tidy().round(4))
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text"> Estimate Std. Error t value Pr(&amp;gt;|t|)
Intercept 1.7937 0.0766 23.4181 0.0
delta 0.5522 0.0790 6.9908 0.0
&lt;/code>&lt;/pre>
&lt;p>The estimate is &lt;strong>+0.55 t CO2 per capita&lt;/strong> (t = 7.0). Taken at face value, Sweden emitted &lt;em>more&lt;/em> per capita after 1990, not less.&lt;/p>
&lt;p>This number is correct as arithmetic but useless as a causal answer. It is positive because the post-1990 window runs to 2005, and Sweden&amp;rsquo;s economy grew over those years. The naive comparison cannot tell us how much of that increase would have happened anyway. We need a &lt;em>control&lt;/em>: another country or group of countries whose path captures everything that would have happened to Sweden without the reform.&lt;/p>
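&lt;p>To see how a shared trend can hide a negative effect, here is a small deterministic sketch. All numbers are made up (a common drift of 0.03 per year and a true effect of -0.20); none of them come from the Swedish data, and the function names are hypothetical.&lt;/p>

```python
from statistics import mean

pre_years = list(range(1960, 1990))
post_years = list(range(1990, 2006))
TRUE_EFFECT = -0.20   # made-up policy effect (t CO2 per capita)
TREND = 0.03          # made-up upward drift per year, shared by both units


def path(year, treated):
    """Hypothetical emissions: shared trend plus an effect only for the treated unit."""
    level = 1.0 + TREND * (year - 1960)
    if treated and year >= 1990:
        level += TRUE_EFFECT
    return level


def post_minus_pre(treated):
    """Average post-1990 level minus average pre-1990 level for one unit."""
    pre = mean(path(y, treated) for y in pre_years)
    post = mean(path(y, treated) for y in post_years)
    return post - pre


naive = post_minus_pre(treated=True)          # before/after inside one unit
did = naive - post_minus_pre(treated=False)   # subtract the control unit's change

print(f"naive before/after: {naive:+.2f}")
print(f"with a control:     {did:+.2f}")
```

&lt;p>In this toy setup the before/after difference is positive even though the built-in effect is negative, because the trend dominates. Subtracting the control unit&amp;rsquo;s change removes the trend and recovers the built-in effect.&lt;/p>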
&lt;h3 id="difference-in-differences-sweden-vs-denmark-sweden-vs-oecd">Difference-in-differences: Sweden vs Denmark, Sweden vs OECD&lt;/h3>
&lt;p>Difference-in-differences (DiD) is the first real attempt at a counterfactual. The idea is simple: take a control country (or group), compute its pre-vs-post change, and subtract it from Sweden&amp;rsquo;s pre-vs-post change. Whatever is left is the &lt;em>extra&lt;/em> change in Sweden, which we hope reflects the treatment.&lt;/p>
&lt;p>For DiD to be valid, we need one big assumption: &lt;strong>parallel trends&lt;/strong>. In the absence of the reform, Sweden and the control would have moved in lockstep. This is untestable in the post-period (we cannot see Sweden&amp;rsquo;s counterfactual). But we can eyeball the pre-period: if Sweden and the control already moved together before 1990, parallel trends is plausible.&lt;/p>
&lt;p>The estimand we are after is the &lt;strong>average treatment effect on the treated (ATT)&lt;/strong>: how much lower were Swedish emissions on average over 1990–2005 compared to a no-reform Sweden? Formally, the DiD regression is:&lt;/p>
&lt;p>$$y_{jt} = \beta_0 + \beta_1 \cdot T_j + \beta_2 \cdot P_t + \beta_3 \cdot (T_j \cdot P_t) + \varepsilon_{jt},$$&lt;/p>
&lt;p>where:&lt;/p>
&lt;ul>
&lt;li>$T_j = 1$ if country $j$ is Sweden (the &lt;strong>treated&lt;/strong> unit), 0 otherwise.&lt;/li>
&lt;li>$P_t = 1$ if year $t \geq 1990$ (the &lt;strong>post&lt;/strong> period), 0 otherwise.&lt;/li>
&lt;li>$\beta_3$ is the DiD coefficient on the &lt;strong>interaction&lt;/strong> $T_j \cdot P_t$ — the only term that is non-zero exclusively for Sweden in the post-period. That is our treatment-effect estimate.&lt;/li>
&lt;/ul>
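&lt;p>Because the specification is saturated (two groups, two periods, four coefficients), OLS reproduces the four cell means, and $\beta_3$ equals the difference of the two before/after differences. The sketch below shows that mapping with made-up numbers; it does not use the post&amp;rsquo;s data.&lt;/p>

```python
from statistics import mean

# Made-up outcomes for the four cells of a 2x2 design (not the post's data).
cells = {
    ("control", "pre"): [1.9, 2.1, 2.0],
    ("control", "post"): [2.6, 2.8, 2.7],
    ("treated", "pre"): [1.7, 1.9, 1.8],
    ("treated", "post"): [2.3, 2.4, 2.2],
}
m = {key: mean(values) for key, values in cells.items()}

beta0 = m[("control", "pre")]            # baseline level
beta1 = m[("treated", "pre")] - beta0    # treated-vs-control gap before treatment
beta2 = m[("control", "post")] - beta0   # common change over time
# Difference-in-differences: treated change minus control change.
beta3 = (m[("treated", "post")] - m[("treated", "pre")]) - (
    m[("control", "post")] - m[("control", "pre")]
)

# The four coefficients add up to the treated/post cell mean.
fitted_treated_post = beta0 + beta1 + beta2 + beta3

print(round(beta3, 3), round(fitted_treated_post, 3))
```

&lt;p>Reading the regression this way makes clear what the control group contributes: its before/after change is the stand-in for what the treated unit would have done without treatment.&lt;/p>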
&lt;p>In code, the variables are named &lt;code>treated&lt;/code>, &lt;code>post&lt;/code>, and &lt;code>Sweden_post&lt;/code> (the interaction).&lt;/p>
&lt;pre>&lt;code class="language-python">panel[&amp;quot;post&amp;quot;] = (panel[&amp;quot;year&amp;quot;] &amp;gt;= 1990).astype(int)
panel[&amp;quot;treated&amp;quot;] = (panel[&amp;quot;country&amp;quot;] == &amp;quot;Sweden&amp;quot;).astype(int)
panel[&amp;quot;Sweden_post&amp;quot;] = panel[&amp;quot;treated&amp;quot;] * panel[&amp;quot;post&amp;quot;]
two = panel[panel[&amp;quot;country&amp;quot;].isin([&amp;quot;Sweden&amp;quot;, &amp;quot;Denmark&amp;quot;])]
m_did2 = pf.feols(&amp;quot;CO2_transport_capita ~ treated + post + Sweden_post&amp;quot;, data=two, vcov=&amp;quot;HC1&amp;quot;)
m_did_oecd = pf.feols(&amp;quot;CO2_transport_capita ~ treated + post + Sweden_post&amp;quot;,
                      data=panel, vcov={&amp;quot;CRV1&amp;quot;: &amp;quot;country&amp;quot;})
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Sweden vs Denmark (HC1):
Estimate Std. Error t value Pr(&amp;gt;|t|)
Sweden_post -0.1399 0.1157 -1.2095 0.2297
Sweden vs OECD pool (cluster SE by country):
Estimate Std. Error t value Pr(&amp;gt;|t|)
Sweden_post -0.2137 0.0825 -2.5907 0.0214
&lt;/code>&lt;/pre>
&lt;p>Differencing against Denmark &lt;strong>flips the sign&lt;/strong> of the naive estimate. Sweden now shows a &lt;strong>−0.14 t/capita&lt;/strong> reduction in an average post-1990 year. This matches the number reported by Andersson and by the RTutor problem set exactly.&lt;/p>
&lt;p>But the two-country comparison is &lt;strong>underpowered&lt;/strong> — p = 0.23 means we cannot reject the null of no effect with only one control. So we expand the control to all 14 other OECD countries and use &lt;strong>cluster-robust standard errors&lt;/strong> (which account for serial correlation within each country). The estimate tightens to &lt;strong>−0.21 t/capita (p = 0.02)&lt;/strong>, which is statistically significant at the 5% level.&lt;/p>
&lt;p>Both estimates are economically large: roughly 8% and 12% of Sweden&amp;rsquo;s pre-1990 average of 1.79 t per capita. But there is a problem. The donor-pool DiD plot shows that Sweden and the OECD average were &lt;em>not&lt;/em> on parallel trends in the late 1980s. So our key assumption is questionable. This motivates the next step: synthetic control.&lt;/p>
&lt;p>&lt;img src="python_sc_co2tax_did_sweden_denmark.png" alt="DiD: Sweden vs Denmark">&lt;/p>
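&lt;p>The t statistics and relative magnitudes can be recomputed from the printed output. This is a back-of-the-envelope sketch: the estimates and standard errors are copied from the tables above, and using the pre-1990 Swedish mean (the intercept of the time-difference regression, 1.7937) as the base for percentages is an assumption made here.&lt;/p>

```python
# Estimates and standard errors copied from the regression output above.
results = {
    "Sweden vs Denmark": (-0.1399, 0.1157),
    "Sweden vs OECD pool": (-0.2137, 0.0825),
}
# Assumption: percentages are taken relative to the pre-1990 Swedish mean,
# i.e. the intercept of the time-difference regression.
pre_1990_mean = 1.7937

summary = {}
for label, (estimate, std_error) in results.items():
    t_stat = estimate / std_error
    pct_of_base = 100.0 * estimate / pre_1990_mean
    summary[label] = (round(t_stat, 2), round(pct_of_base, 1))
    print(f"{label}: t = {t_stat:.2f}, {pct_of_base:.1f}% of pre-1990 mean")
```

&lt;p>Small differences from the printed t values in the third decimal are expected, because the inputs here are already rounded to four decimals.&lt;/p>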
&lt;h3 id="building-synthetic-sweden">Building Synthetic Sweden&lt;/h3>
&lt;h4 id="the-core-idea">The core idea&lt;/h4>
&lt;p>DiD picked Denmark (or the unweighted OECD average) as the counterfactual. That is rigid. What if no single country looks like Sweden, but a &lt;em>blend&lt;/em> — say, 30% Denmark + 27% Belgium + 15% New Zealand + &amp;hellip; — does?&lt;/p>
&lt;p>That blend is &lt;strong>Synthetic Sweden&lt;/strong>. The synthetic-control method (SCM) chooses the blend weights to make the synthetic version match Sweden as closely as possible &lt;em>before&lt;/em> the reform. After the reform, the difference between Sweden and its synthetic twin is the estimated treatment effect.&lt;/p>
&lt;p>Three things make this powerful:&lt;/p>
&lt;ol>
&lt;li>The weights are &lt;strong>chosen by data&lt;/strong>, not by judgment.&lt;/li>
&lt;li>Each weight is constrained to be &lt;strong>non-negative&lt;/strong>, and they &lt;strong>sum to one&lt;/strong> — so the synthetic is a real convex combination of countries, not an extrapolation.&lt;/li>
&lt;li>The method does not require parallel trends. It just requires a good pre-period fit.&lt;/li>
&lt;/ol>
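&lt;p>A minimal sketch of what &amp;ldquo;a blend of donors&amp;rdquo; means in practice, using three hypothetical donors, made-up weights, and a made-up treated series. The first two periods play the role of the pre-reform fit and the third the post-reform gap. Nothing here is estimated; the weights are simply given.&lt;/p>

```python
import math

# Hypothetical outcome paths: two pre-reform years, then one post-reform year.
donor_paths = {
    "Donor A": [2.0, 2.1, 2.3],
    "Donor B": [1.5, 1.6, 1.8],
    "Donor C": [2.5, 2.6, 2.9],
}
weights = {"Donor A": 0.5, "Donor B": 0.3, "Donor C": 0.2}  # made-up blend
treated_path = [1.95, 2.05, 2.10]                            # made-up treated unit

# A convex combination: every weight is non-negative and they sum to one.
is_convex = all(w >= 0 for w in weights.values()) and math.isclose(
    sum(weights.values()), 1.0
)

n_periods = len(treated_path)
# Synthetic unit: weighted average of the donors, period by period.
synthetic_path = [
    sum(weights[name] * donor_paths[name][t] for name in donor_paths)
    for t in range(n_periods)
]
# Gap: treated minus synthetic. Near zero before the reform, the effect after.
gap = [treated_path[t] - synthetic_path[t] for t in range(n_periods)]

print(is_convex)
print([round(s, 3) for s in synthetic_path])
print([round(g, 3) for g in gap])
```

&lt;p>Because the weights are convex, the synthetic value in each period always lies between the lowest and highest donor value. That is the sense in which the method interpolates rather than extrapolates.&lt;/p>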
&lt;h4 id="the-math-briefly">The math, briefly&lt;/h4>
&lt;p>Let $X_1$ be the vector of pre-treatment predictor values for Sweden (GDP per capita, vehicles per capita, gasoline consumption, urbanisation, plus three lagged CO2 levels). Let $X_0$ be the same predictors for the donor countries, one column per donor.&lt;/p>
&lt;p>Synthetic control picks donor weights $w$ to minimise:&lt;/p>
&lt;p>$$w^* = \arg\min_{w} (X_1 - X_0 w)^\top V (X_1 - X_0 w) \quad \text{s.t.} \quad w_j \geq 0, \; \sum_j w_j = 1.$$&lt;/p>
&lt;p>The matrix $V$ tells the optimiser how much weight to give each predictor. It is also chosen automatically, to minimise the pre-treatment mean squared prediction error (MSPE) of the &lt;em>outcome&lt;/em>, CO2 emissions. So there are two nested optimisations: pick $w$ given $V$, and pick $V$ to make the resulting fit best on CO2.&lt;/p>
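&lt;p>The inner problem (pick $w$ given $V$) can be sketched in plain Python. This is &lt;em>not&lt;/em> how &lt;code>pysyncon&lt;/code> solves it; it is a projected-gradient illustration under simplifying assumptions: $V$ is fixed to the identity, the predictor values are made up, and the helper names &lt;code>project_to_simplex&lt;/code> and &lt;code>fit_weights&lt;/code> are hypothetical. The treated unit is constructed as a known blend of the donors so that the result can be checked.&lt;/p>

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    shift = 0.0
    running_sum = 0.0
    for k, value in enumerate(sorted(v, reverse=True), start=1):
        running_sum += value
        candidate = (running_sum - 1.0) / k
        if value - candidate > 0:
            shift = candidate
    return [max(x - shift, 0.0) for x in v]


def fit_weights(x1, X0, v_diag, steps=5000, step_size=0.02):
    """Minimise sum_i v_i * (x1_i - (X0 w)_i)^2 over the simplex."""
    n_pred, n_donors = len(X0), len(X0[0])
    w = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(steps):
        resid = [x1[i] - sum(X0[i][j] * w[j] for j in range(n_donors))
                 for i in range(n_pred)]
        grad = [-2.0 * sum(v_diag[i] * resid[i] * X0[i][j] for i in range(n_pred))
                for j in range(n_donors)]
        # Gradient step, then project back onto the constraint set.
        w = project_to_simplex([w[j] - step_size * grad[j] for j in range(n_donors)])
    return w


# Toy predictors: rows are predictors, columns are three hypothetical donors.
X0 = [[1.0, 0.0, 3.0],
      [0.0, 1.0, 3.0],
      [2.0, 1.0, 0.0]]
# Treated unit built as 0.6 * donor 1 + 0.4 * donor 2, so the answer is known.
x1 = [0.6, 0.4, 1.6]

w_star = fit_weights(x1, X0, v_diag=[1.0, 1.0, 1.0])
print([round(w, 3) for w in w_star])
```

&lt;p>The fitted weights should land close to 0.6, 0.4, and 0, with the third donor driven to the boundary of the constraint set. In the full method, an outer loop would additionally search over $V$ to minimise the pre-treatment MSPE of the outcome.&lt;/p>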
&lt;h4 id="the-code">The code&lt;/h4>
&lt;p>&lt;code>pysyncon&lt;/code> hides the nested optimisation behind two objects:&lt;/p>
&lt;ul>
&lt;li>&lt;code>Dataprep(...)&lt;/code> — pack the panel into the matrices the optimiser needs.&lt;/li>
&lt;li>&lt;code>Synth().fit(...)&lt;/code> — run the optimisation and store weights and loss.&lt;/li>
&lt;/ul>
&lt;pre>&lt;code class="language-python">controls = [c for c in countries if c != &amp;quot;Sweden&amp;quot;]
dataprep = Dataprep(
    foo=panel,
    predictors=[&amp;quot;GDP_per_capita&amp;quot;, &amp;quot;vehicles_capita&amp;quot;, &amp;quot;gas_cons_capita&amp;quot;, &amp;quot;urban_pop&amp;quot;],
    predictors_op=&amp;quot;mean&amp;quot;,
    time_predictors_prior=range(1980, 1990),
    special_predictors=[
        (&amp;quot;CO2_transport_capita&amp;quot;, [1989], &amp;quot;mean&amp;quot;),
        (&amp;quot;CO2_transport_capita&amp;quot;, [1980], &amp;quot;mean&amp;quot;),
        (&amp;quot;CO2_transport_capita&amp;quot;, [1970], &amp;quot;mean&amp;quot;),
    ],
    dependent=&amp;quot;CO2_transport_capita&amp;quot;,
    unit_variable=&amp;quot;country&amp;quot;, time_variable=&amp;quot;year&amp;quot;,
    treatment_identifier=&amp;quot;Sweden&amp;quot;, controls_identifier=controls,
    time_optimize_ssr=range(1960, 1990),
)
synth = Synth()
synth.fit(dataprep=dataprep, optim_method=&amp;quot;Nelder-Mead&amp;quot;, optim_initial=&amp;quot;equal&amp;quot;)
print(synth.weights().sort_values(ascending=False).head(6).round(3))
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Denmark 0.289
Belgium 0.269
New Zealand 0.146
Greece 0.114
United States 0.101
Switzerland 0.079
(weights sum to 1.000)
&lt;/code>&lt;/pre>
&lt;p>&lt;code>pysyncon&lt;/code> picks &lt;strong>exactly the same six donors&lt;/strong> as Andersson&amp;rsquo;s R code: Denmark, Belgium, New Zealand, Greece, United States, Switzerland. Together they account for essentially all of the weight (the six printed values sum to 0.998). The other eight countries in the 14-country donor pool receive essentially zero weight.&lt;/p>
&lt;p>Why these six? Each contributes a different similarity to Sweden:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Denmark&lt;/strong> and &lt;strong>Belgium&lt;/strong> dominate (over half the weight) — small, advanced European economies with similar income, urbanisation, and energy mix.&lt;/li>
&lt;li>&lt;strong>New Zealand&lt;/strong> brings a comparable urbanisation profile.&lt;/li>
&lt;li>&lt;strong>Greece&lt;/strong>, the &lt;strong>US&lt;/strong>, and &lt;strong>Switzerland&lt;/strong> fill in the rest.&lt;/li>
&lt;/ul>
&lt;p>You may notice the exact percentages differ slightly from Andersson&amp;rsquo;s R results (where Denmark is 38% and Belgium 19%). This is because &lt;code>pysyncon&lt;/code> and R&amp;rsquo;s &lt;code>Synth&lt;/code> package use different numerical optimisers under the hood (&lt;code>scipy&lt;/code>&amp;rsquo;s Nelder–Mead vs &lt;code>kernlab&lt;/code>&amp;rsquo;s interior-point solver). Both reach the same family of solutions; the headline gap below is essentially identical.&lt;/p>
&lt;p>&lt;img src="python_sc_co2tax_synth_weights.png" alt="Synthetic Sweden — donor weights">&lt;/p>
&lt;p>The bar chart shows the donor structure at a glance. Concentrated weights on a handful of donors — like here — usually mean the optimiser found a tight fit. Spread-out weights across many countries would have been a red flag, suggesting that no good counterfactual exists in the donor pool.&lt;/p>
&lt;h3 id="the-path-plot-and-the-treatment-gap">The path plot and the treatment gap&lt;/h3>
&lt;p>We now use the donor weights to construct Synthetic Sweden&amp;rsquo;s CO2 path over the whole 1960–2005 window. The construction is simple arithmetic: in each year, multiply each donor&amp;rsquo;s emission level by its weight and add them up.&lt;/p>
&lt;p>The &lt;strong>treatment gap&lt;/strong> is the year-by-year difference between Sweden&amp;rsquo;s actual emissions and Synthetic Sweden&amp;rsquo;s emissions. Pre-treatment, this gap should be near zero (otherwise the fit is bad). Post-treatment, the gap is our estimate of the effect.&lt;/p>
&lt;pre>&lt;code class="language-python">years = np.arange(1960, 2006)
panel_wide = panel.pivot(index=&amp;quot;year&amp;quot;, columns=&amp;quot;country&amp;quot;, values=&amp;quot;CO2_transport_capita&amp;quot;)
w_sorted = synth.weights().sort_values(ascending=False)
y_sweden = panel_wide.loc[years, &amp;quot;Sweden&amp;quot;]
y_synth = panel_wide.loc[years, controls] @ w_sorted.reindex(controls).fillna(0)
gap = y_sweden.values - y_synth.values
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_synth_sweden_fit.png" alt="Sweden vs Synthetic Sweden">&lt;/p>
&lt;p>&lt;img src="python_sc_co2tax_synth_gap.png" alt="Treatment gap">&lt;/p>
&lt;p>Two things to notice in the path plot:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Before 1990&lt;/strong> the two lines overlap almost perfectly. The pre-treatment MSPE is tiny. The optimiser found a synthetic version of Sweden that mimics both the &lt;em>level&lt;/em> and the &lt;em>trend&lt;/em> of real Swedish emissions.&lt;/li>
&lt;li>&lt;strong>After 1990&lt;/strong> the lines split apart. Sweden plateaus and slowly declines. Synthetic Sweden keeps climbing — that is what real Sweden &lt;em>would have&lt;/em> done without the reform.&lt;/li>
&lt;/ul>
&lt;p>The numbers:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>2005 gap:&lt;/strong> −0.36 t CO2 per capita (or −15% relative to the synthetic level).&lt;/li>
&lt;li>&lt;strong>Average post-treatment gap (1990–2005):&lt;/strong> −0.27 t/capita per year, or −11.3% per year.&lt;/li>
&lt;/ul>
&lt;p>Both numbers are close to Andersson&amp;rsquo;s reported range and the R tutor&amp;rsquo;s −10.9%. In plain headline terms, the carbon tax (plus the VAT) is associated with roughly &lt;strong>one ton of avoided per-capita transport CO2 every 3.7 years&lt;/strong>, sustained across the entire post-treatment window.&lt;/p>
&lt;p>But could these numbers just be noise? That is what placebo tests are for.&lt;/p>
&lt;h3 id="placebo-tests--is-this-just-noise">Placebo tests — is this just noise?&lt;/h3>
&lt;p>The post-treatment gap looks impressive on the path plot. But the synthetic-control optimiser only matches Sweden to its donors in the &lt;em>pre&lt;/em>-period; by itself it says nothing about how large a post-period gap can arise by chance or from an overfitted donor mix. We need to check that the gap is not an artefact of the method itself.&lt;/p>
&lt;p>The standard approach is to apply the SCM in settings where we &lt;em>know&lt;/em> the answer should be zero. If the method still produces a gap, we should doubt the original result. If it correctly returns nothing, we are more confident.&lt;/p>
&lt;p>There are three classic falsification tests. We run all three.&lt;/p>
&lt;h4 id="1-in-time-placebo--pretend-the-reform-happened-earlier">1. In-time placebo — pretend the reform happened earlier&lt;/h4>
&lt;p>We fit a synthetic control as if the reform had been in &lt;strong>1980&lt;/strong>, ten years earlier, using only pre-1980 data. Since no reform actually happened in 1980, the gap between Sweden and Synthetic Sweden between 1980 and 1989 should be small. If it is large, the SCM is producing spurious gaps, and we should distrust the post-1990 gap too.&lt;/p>
&lt;pre>&lt;code class="language-python">dp_time = Dataprep(... time_optimize_ssr=range(1960, 1980),
time_predictors_prior=range(1970, 1980),
special_predictors=[(&amp;quot;CO2_transport_capita&amp;quot;, [1979], &amp;quot;mean&amp;quot;),
(&amp;quot;CO2_transport_capita&amp;quot;, [1970], &amp;quot;mean&amp;quot;),
(&amp;quot;CO2_transport_capita&amp;quot;, [1965], &amp;quot;mean&amp;quot;)])
synth_time = Synth(); synth_time.fit(dataprep=dp_time, optim_method=&amp;quot;BFGS&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_placebo_in_time.png" alt="In-time placebo">&lt;/p>
&lt;p>Sweden and Synthetic Sweden track each other through 1990 with no divergence at the placebo treatment year. This is exactly what we want — the SCM does not invent gaps when no policy was implemented. ✓&lt;/p>
&lt;h4 id="2-in-space-placebos--pretend-each-donor-was-treated">2. In-space placebos — pretend each donor was treated&lt;/h4>
&lt;p>We re-run the entire SCM &lt;strong>fifteen times&lt;/strong>, once for each country in the panel. Each time, we pretend that country was treated in 1990 and use the others as donors. Then we collect all fifteen gap series.&lt;/p>
&lt;p>If Sweden&amp;rsquo;s actual gap is much larger than the placebo gaps from the other countries, the effect is unlikely to be noise. This is a non-parametric significance test: it asks &amp;ldquo;what fraction of random units would have produced a gap as big as Sweden&amp;rsquo;s?&amp;rdquo;. That fraction is the &lt;strong>permutation p-value&lt;/strong>.&lt;/p>
&lt;p>To compare gaps across countries fairly, we use the &lt;strong>post-/pre-treatment MSPE ratio&lt;/strong>. The numerator is how much each unit deviates from its synthetic counterpart &lt;em>after&lt;/em> 1990. The denominator is how badly the SCM fits the unit &lt;em>before&lt;/em> 1990. Dividing by the pre-period MSPE penalises units whose synthetic version was a poor fit to begin with — those gaps are not credible.&lt;/p>
&lt;pre>&lt;code class="language-python">def run_placebo(treated_country):
co = [c for c in countries if c != treated_country]
dp = Dataprep(..., treatment_identifier=treated_country, controls_identifier=co)
sy = Synth(); sy.fit(dataprep=dp, optim_method=&amp;quot;BFGS&amp;quot;)
# compute pre/post MSPE and the gap series
...
placebo_results = [run_placebo(c) for c in countries]
sweden_res = next(r for r in placebo_results if r[&amp;quot;country&amp;quot;] == &amp;quot;Sweden&amp;quot;)
p_val = np.mean([r[&amp;quot;ratio&amp;quot;] &amp;gt;= sweden_res[&amp;quot;ratio&amp;quot;] for r in placebo_results])
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Permutation p-value for Sweden = 0.0667
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_placebo_in_space.png" alt="In-space placebos">&lt;/p>
&lt;p>&lt;img src="python_sc_co2tax_placebo_mspe_ratio.png" alt="Permutation MSPE ratio">&lt;/p>
&lt;p>Sweden&amp;rsquo;s gap (the bold orange line) stands clearly outside the bundle of grey placebo gaps in the post-1990 period. Quantifying this, Sweden has the &lt;strong>highest post/pre-MSPE ratio of any unit&lt;/strong>. The permutation p-value is &lt;strong>0.067&lt;/strong>.&lt;/p>
&lt;p>What does p = 0.067 mean here? If we randomly re-assigned the treatment to any of the 15 countries, only one in fifteen would have produced a post/pre-MSPE ratio as extreme as Sweden&amp;rsquo;s. With only 15 units (Sweden plus 14 donors), the smallest attainable p-value is exactly 1/15 ≈ 0.067, and that is what we hit. With a bigger donor pool, the p-value could in principle be smaller. ✓&lt;/p>
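&lt;p>The &lt;code>run_placebo&lt;/code> snippet above elides the MSPE arithmetic. A minimal sketch of that step, on invented gap series, could look like this. The helper names &lt;code>mspe_ratio&lt;/code> and &lt;code>permutation_p_value&lt;/code> and the three toy units are hypothetical; they are not part of &lt;code>pysyncon&lt;/code> or of the original script.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def mspe_ratio(gap, years, treat_year=1990):
    '''Post/pre MSPE ratio for one unit, given its gap series.'''
    gap = np.asarray(gap, dtype=float)
    post = np.greater_equal(np.asarray(years), treat_year)  # True from the treatment year on
    pre_mspe = np.mean(gap[~post] ** 2)
    post_mspe = np.mean(gap[post] ** 2)
    return post_mspe / pre_mspe

def permutation_p_value(ratios, treated):
    '''Share of units whose ratio is at least as large as the treated one.'''
    values = np.array(list(ratios.values()))
    return float(np.mean(np.greater_equal(values, ratios[treated])))

years = np.arange(1985, 1995)
# Invented gaps: unit A diverges after 1990, units B and C do not.
gaps = {
    'A': [0.1, -0.1, 0.1, -0.1, 0.1, -1.0, -1.0, -1.0, -1.0, -1.0],
    'B': [0.1, -0.1, 0.1, -0.1, 0.1, 0.2, -0.2, 0.2, -0.2, 0.2],
    'C': [0.2, -0.2, 0.2, -0.2, 0.2, 0.2, -0.2, 0.2, -0.2, 0.2],
}
ratios = {unit: mspe_ratio(g, years) for unit, g in gaps.items()}
p_val = permutation_p_value(ratios, treated='A')
print({u: round(r, 1) for u, r in ratios.items()}, round(p_val, 3))
&lt;/code>&lt;/pre>
&lt;p>The treated unit is counted in its own reference distribution, so with three units the smallest attainable p-value in this toy is 1/3, mirroring the 1/15 floor in the Swedish application.&lt;/p>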
&lt;h4 id="3-leave-one-out--drop-one-big-donor-at-a-time">3. Leave-one-out — drop one big donor at a time&lt;/h4>
&lt;p>Maybe Sweden&amp;rsquo;s gap is driven entirely by one quirky donor (say, Denmark) and would vanish without it. To check, we re-fit Synthetic Sweden &lt;strong>six times&lt;/strong>, each time excluding one of the six high-weight donors (Denmark, Belgium, New Zealand, Greece, US, Switzerland). If the result is robust, no single exclusion should erase the gap.&lt;/p>
&lt;pre>&lt;code class="language-python">fig, ax = plt.subplots(figsize=(9, 5.4))
for col in [c for c in loo.columns if c.startswith(&amp;quot;excl_&amp;quot;)]:
ax.plot(loo[&amp;quot;Year&amp;quot;], loo[col], color=LIGHT_TEXT, lw=1.1, alpha=0.7)
ax.plot(loo[&amp;quot;Year&amp;quot;], loo[&amp;quot;synth_sweden&amp;quot;], color=WARM_ORANGE, lw=2.4)
ax.plot(loo[&amp;quot;Year&amp;quot;], loo[&amp;quot;sweden&amp;quot;], color=TEAL, lw=2.2)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_placebo_leave_one_out.png" alt="Leave-one-out robustness">&lt;/p>
&lt;p>Dropping each high-weight donor barely moves Synthetic Sweden. The resulting range of estimated reductions is &lt;strong>8.8% (without Switzerland) to 13% (without Denmark)&lt;/strong>. All six versions are firmly negative, and the range brackets the headline 11.3%. Even the most conservative single-donor exclusion gives a bigger effect than the unweighted DiD&amp;rsquo;s 8.3%. So the SCM result is not driven by any one country. ✓&lt;/p>
&lt;p>&lt;strong>All three falsification tests pass.&lt;/strong> The −11.3% reduction is unlikely to be an artefact of the method.&lt;/p>
&lt;h2 id="was-gdp-a-confounder">Was GDP a confounder?&lt;/h2>
&lt;p>A &lt;strong>confounder&lt;/strong> is a variable that affects both the treatment and the outcome, so it looks like the treatment is doing something when really the confounder is. The most common objection to the carbon-tax-reduces-CO2 story is exactly this kind of worry: maybe Sweden&amp;rsquo;s emissions fell for completely separate economic reasons — a recession, a structural decline of heavy industry, anything that quietly depressed driving in the early 1990s. If so, our −11.3% number would be a confounded measure, not a causal effect.&lt;/p>
&lt;p>We rule this out in two steps:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Look at GDP and CO2 gaps side by side.&lt;/strong> If a recession caused the CO2 drop, the CO2 gap should follow the GDP gap, rising back when GDP recovers. If they decouple, the recession story does not hold.&lt;/li>
&lt;li>&lt;strong>Build a second Synthetic Sweden with GDP as the outcome.&lt;/strong> If the carbon tax really depressed Swedish growth, the actual GDP path should fall &lt;em>below&lt;/em> the synthetic GDP path after 1990. If they overlap, no growth penalty.&lt;/li>
&lt;/ol>
&lt;pre>&lt;code class="language-python">fig, axes = plt.subplots(1, 2, figsize=(12, 4.8))
for ax, var, color in [(axes[0], &amp;quot;gap_GDP&amp;quot;, STEEL_BLUE), (axes[1], &amp;quot;gap_CO2&amp;quot;, WARM_ORANGE)]:
ax.axvspan(1976, 1978, color=GRID_LINE, alpha=0.55) # recession 1
ax.axvspan(1991, 1993, color=GRID_LINE, alpha=0.55) # recession 2
ax.plot(ds[&amp;quot;year&amp;quot;], ds[var], color=color, lw=2.2)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_gdp_co2_gaps.png" alt="GDP vs CO2 gaps with recessions shaded">&lt;/p>
&lt;p>The shaded bands mark the two recessions Sweden faced in this period: 1976–78 and 1991–93. The left panel (GDP gap) shows deep negative dips during both recessions, as expected.&lt;/p>
&lt;p>If recessions drove the CO2 reduction, the right panel (CO2 gap) should mirror the left panel: dip during the recession, then rebound when GDP recovers. That is not what we see. The CO2 gap dips during the 1991–93 recession, but &lt;strong>never rebounds&lt;/strong> — even though Swedish GDP fully recovered after 1993. This asymmetry is the smoking gun: emissions did not snap back when growth did, so it was not the recession that suppressed them.&lt;/p>
&lt;p>For an even cleaner test, we now build a &lt;em>second&lt;/em> synthetic control — this time with GDP per capita as the outcome variable, not CO2.&lt;/p>
&lt;pre>&lt;code class="language-python">gdp = gdp_data.copy()
dp_gdp = Dataprep(
foo=gdp,
predictors=[&amp;quot;investrate&amp;quot;, &amp;quot;trade&amp;quot;, &amp;quot;infrate&amp;quot;],
predictors_op=&amp;quot;mean&amp;quot;,
time_predictors_prior=range(1980, 1990),
special_predictors=[(&amp;quot;gdp_cap&amp;quot;, [1975], &amp;quot;mean&amp;quot;), (&amp;quot;gdp_cap&amp;quot;, [1980], &amp;quot;mean&amp;quot;),
(&amp;quot;gdp_cap&amp;quot;, [1989], &amp;quot;mean&amp;quot;),
(&amp;quot;schooling&amp;quot;, [1975, 1980, 1985], &amp;quot;mean&amp;quot;)],
dependent=&amp;quot;gdp_cap&amp;quot;, unit_variable=&amp;quot;country&amp;quot;, time_variable=&amp;quot;year&amp;quot;,
treatment_identifier=&amp;quot;Sweden&amp;quot;,
controls_identifier=sorted([c for c in gdp[&amp;quot;country&amp;quot;].unique() if c != &amp;quot;Sweden&amp;quot;]),
time_optimize_ssr=range(1970, 1990),
)
synth_gdp = Synth(); synth_gdp.fit(dataprep=dp_gdp, optim_method=&amp;quot;BFGS&amp;quot;)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Synthetic-GDP donor weights (non-zero):
Denmark 0.6131
Norway 0.2007
Finland 0.0972
USA 0.0890
GDP 2005 — Sweden actual: $32,591 vs Synthetic: $32,358
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_gdp_synth.png" alt="Synthetic Sweden — GDP">&lt;/p>
&lt;p>The Synthetic-GDP Sweden is dominated by Nordic peers (Denmark 61%, Norway 20%, Finland 10%) plus the US (9%). Its post-1990 path tracks Sweden&amp;rsquo;s actual GDP closely, ending within &lt;strong>\$233 per capita in 2005&lt;/strong>, about 0.7% of the level.&lt;/p>
&lt;p>In other words, Sweden&amp;rsquo;s economy ended up almost exactly where a synthetic Nordic-plus-US counterfactual predicted. There is no measurable growth penalty from the carbon tax. Combined with the gap-plot evidence above, this makes GDP (and recessions more broadly) an unlikely confounder of the CO2 result. The policy worked &lt;em>and&lt;/em> the economy was fine.&lt;/p>
&lt;h2 id="tax-incidence-ols-and-iv">Tax incidence, OLS, and IV&lt;/h2>
&lt;p>So far the analysis has been aggregate. Synthetic control tells us &lt;em>how much&lt;/em> emissions fell, but not &lt;em>why&lt;/em> consumers changed their behaviour. This final block of analysis zooms into the demand side.&lt;/p>
&lt;p>We will answer three questions in turn:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Tax incidence:&lt;/strong> when the government raises the fuel tax, who actually pays — consumers (at the pump) or oil companies (out of margins)?&lt;/li>
&lt;li>&lt;strong>Price vs tax elasticity:&lt;/strong> by how much do Swedes cut gasoline consumption per extra SEK on the price vs per extra SEK on the tax?&lt;/li>
&lt;li>&lt;strong>Disentangling:&lt;/strong> how much of the emission reduction came from the carbon tax alone, and how much from the bundled VAT?&lt;/li>
&lt;/ol>
&lt;h3 id="did-consumers-really-pay-the-tax">Did consumers really pay the tax?&lt;/h3>
&lt;p>We need to know whether the carbon tax shows up in the retail price. If oil companies absorb it (their profits drop), the price signal never reaches the consumer, and the behavioural channel disappears. If they pass it through fully, the tax actually changes prices at the pump.&lt;/p>
&lt;p>Andersson estimates &lt;strong>pass-through&lt;/strong> by regressing first-differences of the retail price on first-differences of the oil price and the total tax:&lt;/p>
&lt;p>$$\Delta p^*_t = \beta_0 + \beta_1 \, \Delta \Theta_t + \beta_2 \, \Delta T_t + \varepsilon_t.$$&lt;/p>
&lt;p>Here:&lt;/p>
&lt;ul>
&lt;li>$\Delta p^*_t$ is the year-on-year change in the nominal retail gasoline price.&lt;/li>
&lt;li>$\Delta \Theta_t$ is the year-on-year change in the oil price (the wholesale cost).&lt;/li>
&lt;li>$\Delta T_t$ is the year-on-year change in the energy + carbon tax.&lt;/li>
&lt;li>$\beta_2$ is the &lt;strong>pass-through coefficient&lt;/strong> — the share of the tax change consumers actually pay.&lt;/li>
&lt;/ul>
&lt;p>If $\beta_2 = 1$, consumers paid the full tax. If $\beta_2 = 0.5$, oil companies absorbed half. Working in changes (the $\Delta$ operator) rather than levels removes any time-invariant level effects and isolates how prices respond to &lt;em>new&lt;/em> tax movements.&lt;/p>
&lt;pre>&lt;code class="language-python">tax_sub = reg[[&amp;quot;year&amp;quot;,&amp;quot;p_nom&amp;quot;,&amp;quot;en_tax&amp;quot;,&amp;quot;CO2_tax&amp;quot;,&amp;quot;oil_p&amp;quot;,&amp;quot;en_CO2_tax&amp;quot;]].copy()
tax_sub[&amp;quot;delta_p&amp;quot;] = tax_sub[&amp;quot;p_nom&amp;quot;].diff()
tax_sub[&amp;quot;delta_oil_p&amp;quot;] = tax_sub[&amp;quot;oil_p&amp;quot;].diff()
tax_sub[&amp;quot;delta_tax&amp;quot;] = tax_sub[&amp;quot;en_CO2_tax&amp;quot;].diff()
m_incid = pf.feols(&amp;quot;delta_p ~ delta_oil_p + delta_tax&amp;quot;, data=tax_sub.dropna(), vcov=&amp;quot;HC1&amp;quot;)
print(m_incid.tidy().round(4))
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text"> Estimate Std. Error t value Pr(&amp;gt;|t|)
delta_tax 1.1473 0.1513 7.5823 0.0000
&lt;/code>&lt;/pre>
&lt;p>The pass-through coefficient is &lt;strong>1.15&lt;/strong> with a standard error of 0.15. The 95% confidence interval (estimate ± 1.96 standard errors) is roughly [0.85, 1.44], which contains 1.0. We cannot reject the hypothesis that pass-through is exactly one.&lt;/p>
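&lt;p>The interval and the test against full pass-through can be reproduced by hand from the two printed numbers. The sketch below assumes a normal approximation (critical value 1.96); an exact small-sample interval would use a $t$ critical value and be slightly wider. The variable names are illustrative only.&lt;/p>
&lt;pre>&lt;code class="language-python">beta, se = 1.1473, 0.1513        # delta_tax estimate and HC1 standard error from the output
z_crit = 1.96                    # normal approximation for a 95% interval

ci_low = beta - z_crit * se
ci_high = beta + z_crit * se
t_vs_one = (beta - 1.0) / se     # test statistic for H0: pass-through equals 1

print(round(ci_low, 2), round(ci_high, 2), round(t_vs_one, 2))
&lt;/code>&lt;/pre>
&lt;p>The statistic for the null of full pass-through is about 0.97, well below conventional critical values, which is why the hypothesis cannot be rejected.&lt;/p>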
&lt;p>&lt;strong>Consumers paid the whole tax.&lt;/strong> This matters for everything below: when we estimate how much gasoline consumption fell in response to the tax, that response is to a real change in the pump price, not to a hidden absorption by refiners.&lt;/p>
&lt;h3 id="ols-gasoline-consumption-regressions-4-specifications-neweywest-hac-ses">OLS gasoline-consumption regressions (4 specifications, Newey–West HAC SEs)&lt;/h3>
&lt;p>We now estimate how strongly Swedish gasoline demand responds to two things:&lt;/p>
&lt;ul>
&lt;li>A change in the &lt;strong>price excluding the carbon tax&lt;/strong> ($pv_t$).&lt;/li>
&lt;li>A change in the &lt;strong>carbon tax (including VAT)&lt;/strong> ($ct_t$).&lt;/li>
&lt;/ul>
&lt;p>If consumers are rational and only care about the total price they pay, the two responses should be equal. If they react differently to &lt;em>taxes&lt;/em> than to &lt;em>prices&lt;/em> of the same size, that tells us something about how policy works in practice.&lt;/p>
&lt;p>Andersson uses a log-level model (log on the left, levels on the right):&lt;/p>
&lt;p>$$\ln y_t = \beta_0 + \beta_1 \, pv_t + \beta_2 \, ct_t + \beta_3 \, D_t + \beta_4 \, X_t + \varepsilon_t.$$&lt;/p>
&lt;p>Reading the equation:&lt;/p>
&lt;ul>
&lt;li>$\ln y_t$ is the &lt;strong>logarithm&lt;/strong> of per-capita gasoline consumption in year $t$.&lt;/li>
&lt;li>$pv_t$ is the carbon-tax-exclusive real retail price.&lt;/li>
&lt;li>$ct_t$ is the real carbon tax including VAT.&lt;/li>
&lt;li>$D_t$ is a 0/1 dummy that equals 1 in years from 1990 onward.&lt;/li>
&lt;li>$X_t$ is a (possibly empty) vector of controls: GDP per capita, urban population share, unemployment.&lt;/li>
&lt;/ul>
&lt;p>Because the outcome is in logs and the regressors are in levels, the coefficients are &lt;strong>semi-elasticities&lt;/strong>. A useful rule of thumb: a unit increase in $x$ is associated with a $100 \cdot \beta\,\%$ change in $y$. So $\beta_2 = -0.10$ would mean &amp;ldquo;one extra SEK/litre of carbon tax cuts gasoline use by 10%&amp;rdquo;.&lt;/p>
&lt;p>We estimate &lt;strong>four nested specifications&lt;/strong>, adding one control at a time:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>OLS1:&lt;/strong> no controls (only the linear time trend &lt;code>t&lt;/code>, which enters every specification).&lt;/li>
&lt;li>&lt;strong>OLS2:&lt;/strong> + GDP per capita.&lt;/li>
&lt;li>&lt;strong>OLS3:&lt;/strong> + urbanisation.&lt;/li>
&lt;li>&lt;strong>OLS4:&lt;/strong> + unemployment (the full specification Andersson highlights).&lt;/li>
&lt;/ul>
&lt;p>The point of nesting is to see whether the price and tax coefficients are &lt;strong>stable&lt;/strong> when controls are added. If they swing wildly, we should worry about confounding. If they barely move, we are on firmer ground.&lt;/p>
&lt;p>We use two flavours of standard error. &lt;strong>HC1&lt;/strong> corrects for heteroskedasticity (cross-section). &lt;strong>Newey–West HAC with 16 lags&lt;/strong> corrects for both heteroskedasticity &lt;em>and&lt;/em> autocorrelation — the right choice for time-series data, and what Andersson uses in Stata.&lt;/p>
&lt;pre>&lt;code class="language-python">ols_specs = {
&amp;quot;OLS1&amp;quot;: &amp;quot;log_gas_cons ~ p_real_vat + real_CO2_tax_vat + d_CO2_tax + t&amp;quot;,
&amp;quot;OLS2&amp;quot;: &amp;quot;log_gas_cons ~ p_real_vat + real_CO2_tax_vat + d_CO2_tax + t + gdp_cap&amp;quot;,
&amp;quot;OLS3&amp;quot;: &amp;quot;log_gas_cons ~ p_real_vat + real_CO2_tax_vat + d_CO2_tax + t + gdp_cap + urban_pop&amp;quot;,
&amp;quot;OLS4&amp;quot;: &amp;quot;log_gas_cons ~ p_real_vat + real_CO2_tax_vat + d_CO2_tax + t + gdp_cap + urban_pop + unempl&amp;quot;,
}
ols_fits = {name: pf.feols(f, data=reg, vcov=&amp;quot;HC1&amp;quot;) for name, f in ols_specs.items()}
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">OLS4 (HC1):
Estimate Std. Error t value Pr(&amp;gt;|t|)
p_real_vat -0.0603 0.0135 -4.4568 0.0001
real_CO2_tax_vat -0.1856 0.0450 -4.1217 0.0002
OLS4 (Newey-West HAC, 16 lags):
coef se_nw16 t p
p_real_vat -0.0603 0.0106 -5.7160 0.0000
real_CO2_tax_vat -0.1856 0.0383 -4.8520 0.0000
&lt;/code>&lt;/pre>
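&lt;p>The &lt;code>pf.feols&lt;/code> call above only requests HC1 errors; the code behind the Newey–West rows is not shown. As a rough illustration of what a HAC estimator does, the sketch below computes OLS with a Bartlett-kernel Newey–West covariance in plain &lt;code>numpy&lt;/code> on simulated data. The function &lt;code>ols_with_se&lt;/code> and the simulated series are assumptions for illustration, and the sketch applies no small-sample correction, so it would not reproduce the printed standard errors exactly.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def ols_with_se(X, y, lags=0):
    '''OLS with White (lags=0) or Newey-West (lags above 0) standard errors.'''
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    scores = X * resid[:, None]              # x_t * u_t, one row per period
    S = scores.T @ scores                    # lag-0 (heteroskedasticity) term
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)    # Bartlett kernel weight
        G = scores[lag:].T @ scores[:-lag]   # cross-product of scores `lag` periods apart
        S = S + weight * (G + G.T)
    cov = XtX_inv @ S @ XtX_inv              # sandwich formula
    return beta, np.sqrt(np.diag(cov))

# Simulated annual series with AR(1) errors (all numbers invented)
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
err = np.zeros(n)
for t in range(1, n):
    err[t] = 0.6 * err[t - 1] + rng.normal(scale=0.5)
y = 1.0 - 0.2 * x + err
X = np.column_stack([np.ones(n), x])

beta, se_white = ols_with_se(X, y, lags=0)
beta_nw, se_nw16 = ols_with_se(X, y, lags=16)
print(beta.round(3), se_white.round(3), se_nw16.round(3))
&lt;/code>&lt;/pre>
&lt;p>The point estimates are identical in both calls; only the standard errors change with the number of lags.&lt;/p>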
&lt;p>The OLS4 numbers are the headline of this section:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Price semi-elasticity:&lt;/strong> −0.060. A 1 SEK/litre rise in the carbon-tax-exclusive real price is associated with a &lt;strong>6% lower&lt;/strong> per-capita gasoline consumption.&lt;/li>
&lt;li>&lt;strong>Tax semi-elasticity:&lt;/strong> −0.186. A 1 SEK/litre rise in the real carbon tax (including VAT) is associated with an &lt;strong>18.6% lower&lt;/strong> per-capita gasoline consumption.&lt;/li>
&lt;/ul>
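&lt;p>The percentages in the two bullets use the $100 \cdot \beta$ approximation. For a log-level model the exact change implied by a one-unit increase is $100 \cdot (e^{\beta} - 1)$, and the difference is noticeable for the larger tax coefficient. A short check, with &lt;code>pct_change&lt;/code> as a hypothetical helper name:&lt;/p>
&lt;pre>&lt;code class="language-python">import math

def pct_change(beta, dx=1.0):
    '''Exact percent change in y implied by a log-level coefficient.'''
    return 100.0 * (math.exp(beta * dx) - 1.0)

coefs = {'price': -0.0603, 'carbon tax': -0.1856}   # OLS4 estimates from the output
exact = {name: pct_change(b) for name, b in coefs.items()}
for name, b in coefs.items():
    # approximate (100 * beta) next to the exact figure
    print(name, round(100 * b, 1), round(exact[name], 1))
&lt;/code>&lt;/pre>
&lt;p>The exact figures are about 5.9% and 16.9%, so the rule of thumb slightly overstates the tax response, while the ratio between the two responses stays close to three.&lt;/p>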
&lt;p>The tax response is roughly &lt;strong>three times the price response&lt;/strong>, and this gap is stable across OLS1 through OLS4. Both numbers are significant under both HC1 and Newey–West SEs. The Newey–West SEs are slightly tighter here, which is unusual: positive residual autocorrelation normally widens HAC standard errors, so the tightening more likely reflects negative autocorrelation or the small-sample imprecision of a 16-lag estimator on a short annual series.&lt;/p>
&lt;p>The 3-to-1 ratio is the most interesting finding of this section. We will interpret &lt;em>why&lt;/em> it shows up below — but first, we need to check that the OLS estimates are not biased by &lt;strong>endogeneity&lt;/strong>.&lt;/p>
&lt;h3 id="instrumental-variables--addressing-endogeneity">Instrumental variables — addressing endogeneity&lt;/h3>
&lt;h4 id="why-we-need-an-instrument">Why we need an instrument&lt;/h4>
&lt;p>OLS gives the &lt;strong>right answer only if&lt;/strong> the explanatory variables are uncorrelated with the regression&amp;rsquo;s error term. If they are correlated, the OLS coefficient is &lt;strong>biased&lt;/strong> and the bias does not shrink with more data — it is &lt;em>systematic&lt;/em>. This problem is called &lt;strong>endogeneity&lt;/strong>.&lt;/p>
&lt;p>In our setting, the carbon-tax-exclusive price $pv_t$ might be endogenous. Imagine an unobserved demand shock — say, a sudden push for electric vehicles, or a tightening of EU fuel-economy regulation. That shock would lower gasoline demand &lt;em>and&lt;/em> could simultaneously change the carbon-tax-exclusive price (because lower demand may push oil markets to react). The two are correlated through the demand shock, and OLS would mis-attribute the demand-shock effect to the price coefficient.&lt;/p>
&lt;h4 id="two-stage-least-squares-2sls">Two-stage least squares (2SLS)&lt;/h4>
&lt;p>The fix is &lt;strong>instrumental variables (IV)&lt;/strong>, usually implemented as &lt;strong>two-stage least squares (2SLS)&lt;/strong>. We find an external variable $z$ — the &lt;strong>instrument&lt;/strong> — that satisfies two conditions:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Relevance:&lt;/strong> $z$ is correlated with the endogenous regressor $pv_t$. (Without this, there is no signal to use.)&lt;/li>
&lt;li>&lt;strong>Exogeneity:&lt;/strong> $z$ is &lt;em>not&lt;/em> correlated with the error term. It affects the outcome only &lt;em>through&lt;/em> $pv_t$. (Without this, we just trade one bias for another.)&lt;/li>
&lt;/ol>
&lt;p>If both hold, the IV estimator is consistent: it converges to the true coefficient as the sample grows, although it can still be biased in small samples.&lt;/p>
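&lt;p>A small simulation shows the mechanics of the two stages. Everything here is invented for illustration: the data-generating process, the variable names, and the true price coefficient of −0.5. An unobserved shock drives both the price and the outcome, so plain OLS is biased, while the instrument &lt;code>z&lt;/code> shifts the price without touching the shock.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

rng = np.random.default_rng(42)
n = 5000
z = rng.normal(size=n)                         # instrument: moves price, not demand
shock = rng.normal(size=n)                     # unobserved demand shock
price = z + 0.8 * shock + rng.normal(size=n)   # endogenous regressor
y = -0.5 * price + shock + rng.normal(size=n)  # true price effect is -0.5

def fit(X, target):
    '''Least-squares coefficients of target on the columns of X.'''
    return np.linalg.lstsq(X, target, rcond=None)[0]

const = np.ones(n)
b_ols = fit(np.column_stack([const, price]), y)[1]

# Stage 1: project the endogenous price on the instrument
Z = np.column_stack([const, z])
price_hat = Z @ fit(Z, price)
# Stage 2: regress the outcome on the fitted price
b_iv = fit(np.column_stack([const, price_hat]), y)[1]

print(round(b_ols, 3), round(b_iv, 3))
&lt;/code>&lt;/pre>
&lt;p>In this design OLS is pulled toward zero (about −0.2 in large samples), while the two-stage estimate should land near −0.5. Standard errors taken from a manual second stage are not valid, which is one reason to use a dedicated IV routine in practice.&lt;/p>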
&lt;p>Andersson proposes two instruments for the price:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Real crude oil price&lt;/strong> — exogenous because Sweden is too small to move world oil prices.&lt;/li>
&lt;li>&lt;strong>Real energy tax&lt;/strong> — exogenous because it is set by policy on long lead times, not by short-run demand shocks.&lt;/li>
&lt;/ul>
&lt;pre>&lt;code class="language-python">iv_data = reg[(reg[&amp;quot;year&amp;quot;] &amp;gt;= 1970) &amp;amp; (reg[&amp;quot;year&amp;quot;] &amp;lt;= 2011)].copy()
iv2 = pf.feols(
&amp;quot;log_gas_cons ~ real_CO2_tax_vat + d_CO2_tax + t + gdp_cap + urban_pop + unempl &amp;quot;
&amp;quot;| p_real_vat ~ oil_p_real&amp;quot;, # IV: oil price instrument
data=iv_data, vcov=&amp;quot;HC1&amp;quot;,
)
iv1 = pf.feols(
&amp;quot;log_gas_cons ~ real_CO2_tax_vat + d_CO2_tax + t + gdp_cap + urban_pop + unempl &amp;quot;
&amp;quot;| p_real_vat ~ real_en_tax_vat&amp;quot;, # IV: energy-tax instrument
data=iv_data, vcov=&amp;quot;HC1&amp;quot;,
)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text"> model beta_p_real_vat beta_real_CO2_tax_vat
OLS4 -0.0603 -0.1856
IV (energy tax) -0.0620 -0.1857
IV (oil price) -0.0641 -0.1857
IV (both) -0.0638 -0.1857
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_iv_vs_ols_coefs.png" alt="OLS vs IV: price and tax semi-elasticities">&lt;/p>
&lt;p>Across all three IV specifications, the tax semi-elasticity is pinned to &lt;strong>−0.186&lt;/strong>, matching OLS4 to three decimal places (−0.1857 vs −0.1856). The price semi-elasticity moves only slightly, from −0.060 (OLS) to −0.064 (IV with oil price).&lt;/p>
&lt;p>This near-identical agreement is itself informative. If the OLS price coefficient had been badly biased, the IV would have moved it noticeably. Andersson&amp;rsquo;s Wu–Hausman test (which checks exactly this) cannot reject the null that the price is exogenous. So we treat the OLS4 coefficients as causal estimates of the price and tax elasticities of gasoline demand.&lt;/p>
&lt;h4 id="why-a-3-tax-vs-price-asymmetry">Why a 3× tax-vs-price asymmetry?&lt;/h4>
&lt;p>The headline finding survives all sensitivity checks: consumers respond to a 1-SEK/litre tax increase &lt;strong>three times more strongly&lt;/strong> than to a 1-SEK/litre market price increase. Why? The economics literature points to two channels:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Salience.&lt;/strong> A tax increase is &lt;em>announced&lt;/em>. It appears in the news. It is debated in Parliament. A market price increase is just a slow drift on the petrol-station billboard.&lt;/li>
&lt;li>&lt;strong>Permanence.&lt;/strong> A tax increase is &lt;em>persistent&lt;/em>. Once enacted, it rarely reverses. Market prices fluctuate. A consumer who sees the pump price spike one week may rationally wait it out. A consumer who sees a tax come into force will adjust longer-term decisions — vehicle purchases, commute distance, transport-mode choice.&lt;/li>
&lt;/ul>
&lt;p>The policy implication is large: revenue-neutral tax swaps (raise the carbon tax, cut something else) can produce real emission reductions even when the average consumer&amp;rsquo;s total tax burden is unchanged.&lt;/p>
&lt;h3 id="disentangling-carbon-tax-from-vat">Disentangling carbon tax from VAT&lt;/h3>
&lt;p>The 1990/91 reform was a &lt;strong>bundle&lt;/strong>: a new carbon tax, a new VAT on transport fuel, and a small reduction in the pre-existing energy tax. The synthetic-control number above measures the &lt;em>total&lt;/em> effect of the bundle. But what fraction of that total is the carbon tax alone?&lt;/p>
&lt;p>Andersson answers this by simulating the demand model under three pricing scenarios, the actual one and two counterfactuals:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Scenario&lt;/th>
&lt;th>What&amp;rsquo;s switched on&lt;/th>
&lt;th>What&amp;rsquo;s switched off&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>CarbonTaxandVAT&lt;/code> (actual)&lt;/td>
&lt;td>All three components&lt;/td>
&lt;td>Nothing&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>NoCarbonTaxWithVAT&lt;/code>&lt;/td>
&lt;td>VAT + energy tax&lt;/td>
&lt;td>Carbon tax&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>NoCarbonTaxNoVAT&lt;/code>&lt;/td>
&lt;td>Energy tax only&lt;/td>
&lt;td>Carbon tax + VAT&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The vertical distance between two curves measures the contribution of the component switched between them. We focus on the wedge between &lt;code>CarbonTaxandVAT&lt;/code> and &lt;code>NoCarbonTaxWithVAT&lt;/code> — that is the &lt;strong>carbon-tax-only&lt;/strong> contribution.&lt;/p>
&lt;pre>&lt;code class="language-python">dis = disent[(disent[&amp;quot;year&amp;quot;] &amp;gt;= 1970) &amp;amp; (disent[&amp;quot;year&amp;quot;] &amp;lt;= 2005)].copy()
fig, ax = plt.subplots(figsize=(9, 5.4))
ax.plot(dis[&amp;quot;year&amp;quot;], dis[&amp;quot;NoCarbonTaxNoVAT&amp;quot;], color=TEAL, lw=2.2, ls=&amp;quot;:&amp;quot;,
label=&amp;quot;No carbon tax, no VAT&amp;quot;)
ax.plot(dis[&amp;quot;year&amp;quot;], dis[&amp;quot;NoCarbonTaxWithVAT&amp;quot;], color=STEEL_BLUE, lw=2.2, ls=&amp;quot;--&amp;quot;,
label=&amp;quot;No carbon tax, with VAT&amp;quot;)
ax.plot(dis[&amp;quot;year&amp;quot;], dis[&amp;quot;CarbonTaxandVAT&amp;quot;], color=WARM_ORANGE, lw=2.4,
label=&amp;quot;Carbon tax + VAT (actual)&amp;quot;)
ax.axvline(1990, color=LIGHT_TEXT, lw=0.8, ls=&amp;quot;:&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="python_sc_co2tax_disentangling.png" alt="Disentangling carbon tax and VAT">&lt;/p>
&lt;pre>&lt;code class="language-text"> year CarbonTaxandVAT NoCarbonTaxWithVAT NoCarbonTaxNoVAT
2000 2.3986 2.5747 2.7640
2005 2.2923 2.8601 3.0495
Mean post-1990 carbon-tax-attributable reduction (rel. to no-carbon-tax-with-VAT): 9.50%
&lt;/code>&lt;/pre>
&lt;p>Reading the three lines:&lt;/p>
&lt;ul>
&lt;li>The &lt;strong>orange&lt;/strong> line is what actually happened (carbon tax + VAT + energy tax all active).&lt;/li>
&lt;li>The &lt;strong>blue dashed&lt;/strong> line shows where emissions &lt;em>would&lt;/em> have been if the carbon tax had been removed but the VAT had stayed.&lt;/li>
&lt;li>The &lt;strong>teal dotted&lt;/strong> line shows where emissions &lt;em>would&lt;/em> have been if both the carbon tax and VAT had been removed.&lt;/li>
&lt;/ul>
&lt;p>The vertical gap between orange and blue is the carbon-tax-only wedge. The vertical gap between blue and teal-dotted is the VAT-only wedge.&lt;/p>
&lt;p>Three numbers from the simulation:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>2005 carbon-tax-only effect:&lt;/strong> −0.57 t/capita, roughly &lt;strong>75% of the total reform wedge&lt;/strong> in that year.&lt;/li>
&lt;li>&lt;strong>Average post-1990 carbon-tax-only effect:&lt;/strong> 9.5% of the no-carbon-tax-with-VAT baseline.&lt;/li>
&lt;li>&lt;strong>Andersson&amp;rsquo;s headline number&lt;/strong> (same wedge, but measured against the Synthetic-Sweden baseline): 6.3%.&lt;/li>
&lt;/ul>
&lt;p>The two percentages look different but describe the same physical wedge (~0.17 t/capita on average). They differ only in the denominator used to normalise. The carbon tax does most of the work after 2000, when the rate is ratcheted up sharply.&lt;/p>
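&lt;p>To make the arithmetic concrete, here is a minimal sketch that recomputes the wedges from the two rows printed above (2000 and 2005). The &lt;code>decompose&lt;/code> helper and the dictionary layout are our own illustration, not part of the original script, and two rows cannot reproduce the 9.5% post-1990 average. The sketch also divides the same carbon-tax wedge by two different baselines to show how the denominator alone changes the percentage. The Synthetic-Sweden baseline is not reproduced here.&lt;/p>

```python
# Emissions (t CO2 per capita) copied from the two printed rows above.
rows = {
    2000: {"CarbonTaxandVAT": 2.3986, "NoCarbonTaxWithVAT": 2.5747, "NoCarbonTaxNoVAT": 2.7640},
    2005: {"CarbonTaxandVAT": 2.2923, "NoCarbonTaxWithVAT": 2.8601, "NoCarbonTaxNoVAT": 3.0495},
}


def decompose(row):
    """Split the total reform wedge into carbon-tax-only and VAT-only parts."""
    carbon = row["NoCarbonTaxWithVAT"] - row["CarbonTaxandVAT"]
    vat = row["NoCarbonTaxNoVAT"] - row["NoCarbonTaxWithVAT"]
    total = carbon + vat
    return {
        "carbon_wedge": carbon,
        "vat_wedge": vat,
        "carbon_share": carbon / total,
        # Same wedge, two different denominators (illustrative baselines).
        "pct_of_with_vat": carbon / row["NoCarbonTaxWithVAT"],
        "pct_of_no_vat": carbon / row["NoCarbonTaxNoVAT"],
    }


results = {year: decompose(row) for year, row in rows.items()}
for year, parts in results.items():
    print(year, {name: round(value, 4) for name, value in parts.items()})
```

&lt;p>In 2005 the carbon-tax wedge is about 0.57 t/capita, or roughly 75% of the total wedge, matching the figure quoted above. In 2000 the share is closer to half, which is consistent with the carbon tax doing most of the work only after 2000.&lt;/p>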
&lt;h2 id="discussion">Discussion&lt;/h2>
&lt;h3 id="what-we-found">What we found&lt;/h3>
&lt;p>Five claims emerge from the analysis, each built on a different piece of evidence:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>The carbon tax cut Swedish transport CO2.&lt;/strong> The synthetic-control point estimate is an 11.3% average annual reduction over 1990–2005.&lt;/li>
&lt;li>&lt;strong>The result is robust.&lt;/strong> Three robustness checks support it: in-time (no false-positive gap when treatment is backdated), in-space (Sweden&amp;rsquo;s gap is the most extreme of the 15 units ranked, p = 1/15 ≈ 0.067), and leave-one-out (the gap is between 8.8% and 13% regardless of which donor we drop).&lt;/li>
&lt;li>&lt;strong>No growth penalty.&lt;/strong> A separately built Synthetic-Sweden(GDP) tracks Sweden&amp;rsquo;s actual GDP within \$233 per capita by 2005, ruling out the recession story.&lt;/li>
&lt;li>&lt;strong>Pass-through was complete.&lt;/strong> The retail price absorbed the entire tax change (β ≈ 1.15), so consumers really did face the higher price.&lt;/li>
&lt;li>&lt;strong>Consumers responded ~3× more strongly to taxes than to prices&lt;/strong> of the same magnitude. The carbon-tax-only contribution explains roughly 75% of the total reform wedge by 2005.&lt;/li>
&lt;/ol>
&lt;h3 id="what-it-means-for-policy">What it means for policy&lt;/h3>
&lt;p>For a policymaker weighing carbon pricing today, three concrete takeaways follow:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Modest carbon taxes work&lt;/strong> — &lt;em>if&lt;/em> they are salient, persistent, and fully passed through. Sweden&amp;rsquo;s reform was all three.&lt;/li>
&lt;li>&lt;strong>The &amp;ldquo;growth penalty&amp;rdquo; fear is empirically unsupported&lt;/strong> in this case study. The post-reform data through 2005 show no measurable GDP cost.&lt;/li>
&lt;li>&lt;strong>Revenue-neutral tax swaps&lt;/strong> (raise carbon tax, cut another tax) can deliver real emission reductions even when the average household&amp;rsquo;s total tax burden does not rise — because the &lt;em>composition&lt;/em> of taxes carries more behavioural weight than the level.&lt;/li>
&lt;/ul>
&lt;h3 id="limitations">Limitations&lt;/h3>
&lt;p>Three honest caveats keep the result in perspective:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Single-country case.&lt;/strong> Sweden is one observation. External validity to, say, a developing economy or a much larger emitter is not guaranteed.&lt;/li>
&lt;li>&lt;strong>Donor-pool size caps the p-value.&lt;/strong> With 15 countries, the smallest possible permutation p-value is 1/15 ≈ 0.067. A larger donor pool would allow a smaller attainable p-value and therefore sharper inference.&lt;/li>
&lt;li>&lt;strong>No post-2005 data.&lt;/strong> The analysis stops in 2005, before the surge in electric vehicles and broader EU climate policy. Re-running with newer data would test whether the relationship still holds.&lt;/li>
&lt;/ul>
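&lt;p>The p-value cap can be sketched in a few lines. The version below assumes the test statistic is a post/pre RMSPE ratio, a common choice in the synthetic-control literature. The statistic used earlier in this post may differ, and the placebo values are made up purely for illustration. Both function names are hypothetical helpers, not &lt;code>pysyncon&lt;/code> API.&lt;/p>

```python
def permutation_p_value(treated_ratio, placebo_ratios):
    """Share of all units (treated included) whose statistic is at least
    as extreme as the treated unit's."""
    at_least_as_extreme = sum(1 for r in placebo_ratios if r >= treated_ratio)
    return (1 + at_least_as_extreme) / (1 + len(placebo_ratios))


def smallest_p_value(n_units):
    """Best attainable p-value when the treated unit ranks first of n_units."""
    return 1 / n_units


# Hypothetical ratios for 14 placebo units: NOT the post's results.
placebos = [1.2, 0.8, 2.5, 3.1, 1.9, 0.6, 2.2, 1.4, 4.0, 2.8, 1.1, 3.5, 0.9, 1.7]
treated = 9.0  # hypothetical ratio for the treated unit

p = permutation_p_value(treated, placebos)
print(round(p, 3))                      # treated unit ranks first of 15
print(round(smallest_p_value(15), 3))   # floor with 15 units in the ranking
print(round(smallest_p_value(40), 3))   # floor with a hypothetical 40 units
```

&lt;p>With 15 units in the ranking, even a treated unit that beats every placebo cannot go below 1/15 ≈ 0.067. Only a larger pool lowers that floor.&lt;/p>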
&lt;h2 id="summary-and-next-steps">Summary and next steps&lt;/h2>
&lt;h3 id="five-numbers-to-remember">Five numbers to remember&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Quantity&lt;/th>
&lt;th>Value&lt;/th>
&lt;th>What it means&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Synthetic Sweden — average gap&lt;/td>
&lt;td>&lt;strong>−11.3%&lt;/strong> per year (1990–2005)&lt;/td>
&lt;td>On average over 16 years, the carbon tax cut transport CO2 by about a tenth per year&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Permutation p-value&lt;/td>
&lt;td>&lt;strong>0.067&lt;/strong>&lt;/td>
&lt;td>Sweden&amp;rsquo;s gap is the most extreme of the 15 units in the permutation ranking&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Leave-one-out range&lt;/td>
&lt;td>&lt;strong>8.8% to 13%&lt;/strong>&lt;/td>
&lt;td>The result survives dropping any single high-weight donor&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Tax-vs-price asymmetry&lt;/td>
&lt;td>&lt;strong>3×&lt;/strong>&lt;/td>
&lt;td>Consumers cut consumption 3× harder per SEK of tax than per SEK of price&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Synthetic GDP gap&lt;/td>
&lt;td>&lt;strong>&amp;lt; \$233 / capita&lt;/strong>&lt;/td>
&lt;td>No detectable growth penalty from the carbon tax&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="methods-recap-in-plain-language">Methods recap, in plain language&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Naive pre/post&lt;/strong> confuses the policy with everything else over time. Use it only as a strawman.&lt;/li>
&lt;li>&lt;strong>DiD&lt;/strong> introduces a control unit but assumes parallel trends — testable only in the pre-period.&lt;/li>
&lt;li>&lt;strong>Synthetic control&lt;/strong> builds a data-driven weighted blend of donors. It relaxes parallel trends and gives a transparent counterfactual.&lt;/li>
&lt;li>&lt;strong>Placebo tests&lt;/strong> are the price of admission for any synthetic-control claim. Without them, the gap is just a number.&lt;/li>
&lt;li>&lt;strong>OLS&lt;/strong> is the workhorse, but &lt;strong>IV (2SLS)&lt;/strong> is the insurance policy against endogeneity.&lt;/li>
&lt;/ul>
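&lt;p>The difference between the first two bullets is easy to see with toy numbers. The values below are hypothetical and chosen only to make the arithmetic exact. They are not Sweden&amp;rsquo;s data, and the variable names are our own.&lt;/p>

```python
# Hypothetical average emissions (t CO2 per capita): NOT the post's data.
treated = {"pre": 2.0, "post": 2.25}
control = {"pre": 1.5, "post": 2.0}

# Naive pre/post: attributes the whole change over time to the policy.
naive = treated["post"] - treated["pre"]

# DiD: subtract the control unit's change, assuming parallel trends.
control_trend = control["post"] - control["pre"]
did = naive - control_trend

print("naive pre/post:", naive)
print("control trend: ", control_trend)
print("DiD estimate:  ", did)
```

&lt;p>In this toy case the naive comparison says emissions rose by 0.25, while DiD says they fell by 0.25 relative to the control trend. Synthetic control replaces the single control unit with a weighted blend of donors.&lt;/p>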
&lt;h3 id="things-to-try-next">Things to try next&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Augmented synthetic control.&lt;/strong> Re-fit with &lt;code>pysyncon.AugSynth&lt;/code> (Ben-Michael, Feller, Rothstein 2021), which allows negative weights via ridge regularisation. Does the headline gap move?&lt;/li>
&lt;li>&lt;strong>Extend the panel through 2020.&lt;/strong> Recent OECD data would let you test whether the relationship persists after the electric-vehicle boom.&lt;/li>
&lt;li>&lt;strong>Wild cluster bootstrap.&lt;/strong> Replace the Newey–West HAC SEs with &lt;code>pyfixest&lt;/code>&amp;rsquo;s wild-cluster bootstrap to check inference under small-sample concerns.&lt;/li>
&lt;/ul>
&lt;h2 id="exercises">Exercises&lt;/h2>
&lt;ol>
&lt;li>&lt;strong>Sensitivity to the donor pool.&lt;/strong> Drop Denmark from the donor list before fitting &lt;code>pysyncon.Synth&lt;/code>. Does the post-1990 gap shrink, stay the same, or grow? Compare numerically to the leave-one-out plot.&lt;/li>
&lt;li>&lt;strong>Alternative predictors.&lt;/strong> Re-fit Synthetic Sweden with only the four economic predictors and &lt;em>no&lt;/em> lagged CO2 levels in &lt;code>special_predictors&lt;/code>. Does the pre-treatment fit deteriorate? By how much does the donor composition shift?&lt;/li>
&lt;li>&lt;strong>Augmented synthetic control.&lt;/strong> Replace &lt;code>Synth()&lt;/code> with &lt;code>pysyncon.AugSynth()&lt;/code> (which permits negative weights via ridge regularisation). Compare the headline post-treatment gap and donor weights to the constrained-Synth solution.&lt;/li>
&lt;/ol>
&lt;h2 id="references">References&lt;/h2>
&lt;ol>
&lt;li>Andersson, J. J. (2019). &lt;em>Carbon Taxes and CO2 Emissions: Sweden as a Case Study&lt;/em>. American Economic Journal: Economic Policy, 11(4), 1–30. &lt;a href="https://www.aeaweb.org/articles?id=10.1257/pol.20170144" target="_blank" rel="noopener">https://www.aeaweb.org/articles?id=10.1257/pol.20170144&lt;/a>&lt;/li>
&lt;li>Abadie, A., Diamond, A., &amp;amp; Hainmueller, J. (2010). &lt;em>Synthetic Control Methods for Comparative Case Studies&lt;/em>. Journal of the American Statistical Association, 105(490), 493–505.&lt;/li>
&lt;li>Abadie, A., Diamond, A., &amp;amp; Hainmueller, J. (2015). &lt;em>Comparative Politics and the Synthetic Control Method&lt;/em>. American Journal of Political Science, 59(2), 495–510.&lt;/li>
&lt;li>Graefe, T. (2020). &lt;em>RTutor Carbon Taxes and CO2 Emissions&lt;/em> — the R tutor problem set this post replicates. &lt;a href="https://github.com/TheresaGraefe/RTutorCarbonTaxesAndCO2Emissions" target="_blank" rel="noopener">https://github.com/TheresaGraefe/RTutorCarbonTaxesAndCO2Emissions&lt;/a>&lt;/li>
&lt;li>&lt;code>pysyncon&lt;/code> documentation — &lt;a href="https://sdfordham.github.io/pysyncon/" target="_blank" rel="noopener">https://sdfordham.github.io/pysyncon/&lt;/a>&lt;/li>
&lt;li>&lt;code>pyfixest&lt;/code> documentation — &lt;a href="https://pyfixest.org/" target="_blank" rel="noopener">https://pyfixest.org/&lt;/a>&lt;/li>
&lt;li>Newey, W. K., &amp;amp; West, K. D. (1987). &lt;em>A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix&lt;/em>. Econometrica, 55(3), 703–708.&lt;/li>
&lt;/ol></description></item></channel></rss>