<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>heterogeneous treatment effects | Carlos Mendez</title><link>https://carlos-mendez.org/tag/heterogeneous-treatment-effects/</link><atom:link href="https://carlos-mendez.org/tag/heterogeneous-treatment-effects/index.xml" rel="self" type="application/rss+xml"/><description>heterogeneous treatment effects</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>Carlos Mendez</copyright><lastBuildDate>Fri, 01 May 2026 00:00:00 +0000</lastBuildDate><image><url>https://carlos-mendez.org/media/icon_huedfae549300b4ca5d201a9bd09a3ecd5_79625_512x512_fill_lanczos_center_3.png</url><title>heterogeneous treatment effects</title><link>https://carlos-mendez.org/tag/heterogeneous-treatment-effects/</link></image><item><title>Conditional Average Treatment Effects (CATE) with Stata 19</title><link>https://carlos-mendez.org/post/stata_cate/</link><pubDate>Fri, 01 May 2026 00:00:00 +0000</pubDate><guid>https://carlos-mendez.org/post/stata_cate/</guid><description>&lt;h2 id="1-overview">1. Overview&lt;/h2>
&lt;p>The textbook causal-inference workflow ends with a single number — the &lt;strong>Average Treatment Effect (ATE)&lt;/strong>. But policy makers, doctors, and managers rarely care only about the average. They want to know &lt;em>for whom&lt;/em> the program works best, &lt;em>for whom&lt;/em> it does little, and &lt;em>whether&lt;/em> the gains are worth the cost in any particular subgroup. This question — how the treatment effect varies across the covariates — is captured by the &lt;strong>Conditional Average Treatment Effect (CATE)&lt;/strong>, also written $\tau(x) = E\{y(1) - y(0) \mid x = x\}$.&lt;/p>
&lt;p>Until very recently, estimating CATE in Stata required hand-rolled &lt;code>forvalues&lt;/code> loops, careful interactions, and uncomfortably ad hoc inference. Stata 19 changed that with the new &lt;code>cate&lt;/code> command, which builds on the doubly robust scores of Athey, Tibshirani &amp;amp; Wager (2019) and the partialing-out workflow of Chernozhukov et al. (2018). With one command, Stata 19 now runs cross-fitted lasso for the nuisance functions, a generalized random forest for the individual-effect function $\tau(x)$, and an honest-tree bootstrap for confidence intervals. Postestimation tools (&lt;code>estat heterogeneity&lt;/code>, &lt;code>estat projection&lt;/code>, &lt;code>categraph gateplot&lt;/code>, &lt;code>estat classification&lt;/code>, &lt;code>estat series&lt;/code>) turn the resulting object into tables and pictures that beginners can read directly.&lt;/p>
&lt;p>This tutorial walks through the full &lt;code>cate&lt;/code> workflow on the canonical 401(k) eligibility study (&lt;code>webuse assets3&lt;/code>, 9,913 households). We start with a single ATE, show that it hides a wide fan of household-level effects, and then peel back the heterogeneity in five complementary ways: a histogram of individual effects, an IATE-by-covariate plot, a GATE on prespecified income groups, GATES on data-driven quartiles, and a smooth nonparametric series fit. The result is a complete picture of &lt;em>who benefits&lt;/em> — and a reusable template you can drop into your own observational data.&lt;/p>
&lt;blockquote>
&lt;p>&lt;strong>Prerequisite.&lt;/strong> This post requires &lt;strong>Stata 19 or later&lt;/strong>. The &lt;code>cate&lt;/code> command does not exist in Stata 18. The do-file aborts on startup if it detects an older Stata.&lt;/p>
&lt;/blockquote>
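&lt;p>A minimal sketch of such a startup guard, assuming it sits at the top of the do-file. The error message and the return code are illustrative choices, not taken from the original do-file:&lt;/p>
&lt;pre>&lt;code class="language-stata">* Hypothetical guard: stop early when the running Stata predates release 19
if c(stata_version) &amp;lt; 19 {
    display as error &amp;quot;This do-file needs Stata 19 or later because it calls cate.&amp;quot;
    exit 198
}
&lt;/code>&lt;/pre>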
&lt;h3 id="11-learning-objectives">1.1 Learning objectives&lt;/h3>
&lt;p>By the end of this tutorial you should be able to:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Understand&lt;/strong> why the ATE alone can mislead and how the CATE function $\tau(x)$ describes treatment effect heterogeneity.&lt;/li>
&lt;li>&lt;strong>Implement&lt;/strong> Stata 19&amp;rsquo;s &lt;code>cate&lt;/code> command using both the partialing-out (PO) and the augmented inverse-probability weighting (AIPW) estimators on observational data.&lt;/li>
&lt;li>&lt;strong>Estimate&lt;/strong> group-level effects (GATE) on prespecified groups and data-driven quartiles (GATES) of the predicted effect.&lt;/li>
&lt;li>&lt;strong>Diagnose&lt;/strong> treatment-effect heterogeneity with &lt;code>estat heterogeneity&lt;/code>, summarize who responds with &lt;code>estat projection&lt;/code> and &lt;code>estat classification&lt;/code>, and visualize how the effect varies along a covariate with &lt;code>estat series&lt;/code>.&lt;/li>
&lt;li>&lt;strong>Compare&lt;/strong> doubly robust ML estimates (PO, AIPW) to a parametric &lt;code>teffects aipw&lt;/code> benchmark and judge whether the average is hiding important variation.&lt;/li>
&lt;/ul>
&lt;h3 id="12-methodological-overview">1.2 Methodological overview&lt;/h3>
&lt;p>The diagram below shows the two routes through the &lt;code>cate&lt;/code> command and the postestimation tools that probe the resulting CATE object.&lt;/p>
&lt;pre>&lt;code class="language-mermaid">flowchart TB
A[&amp;quot;assets3 dataset&amp;lt;br/&amp;gt;9,913 households&amp;lt;br/&amp;gt;e401k -&amp;gt; assets&amp;quot;]:::data
A --&amp;gt; B{cate command}:::main
B --&amp;gt;|&amp;quot;&amp;lt;b&amp;gt;cate po&amp;lt;/b&amp;gt;&amp;lt;br/&amp;gt;Partial-linear model&amp;lt;br/&amp;gt;Robust to small propensities&amp;quot;| C[&amp;quot;PO estimator&amp;lt;br/&amp;gt;Cross-fit lasso + causal forest&amp;quot;]:::po
B --&amp;gt;|&amp;quot;&amp;lt;b&amp;gt;cate aipw&amp;lt;/b&amp;gt;&amp;lt;br/&amp;gt;Fully interactive model&amp;lt;br/&amp;gt;Doubly robust, more efficient&amp;quot;| D[&amp;quot;AIPW estimator&amp;lt;br/&amp;gt;Cross-fit lasso + causal forest&amp;quot;]:::aipw
C --&amp;gt; E[&amp;quot;IATE function&amp;lt;br/&amp;gt;tau-hat(x_i) per household&amp;quot;]:::iate
D --&amp;gt; E
E --&amp;gt; F1[&amp;quot;categraph histogram&amp;lt;br/&amp;gt;distribution of effects&amp;quot;]:::post
E --&amp;gt; F2[&amp;quot;categraph iateplot&amp;lt;br/&amp;gt;tau vs covariate&amp;quot;]:::post
E --&amp;gt; F3[&amp;quot;estat heterogeneity&amp;lt;br/&amp;gt;H0: tau(x) constant&amp;quot;]:::post
E --&amp;gt; F4[&amp;quot;estat projection&amp;lt;br/&amp;gt;linear summary of who&amp;quot;]:::post
E --&amp;gt; F5[&amp;quot;GATE / GATES&amp;lt;br/&amp;gt;group-level effects&amp;quot;]:::post
E --&amp;gt; F6[&amp;quot;estat classification&amp;lt;br/&amp;gt;top vs bottom profile&amp;quot;]:::post
E --&amp;gt; F7[&amp;quot;estat series&amp;lt;br/&amp;gt;smooth derivative&amp;quot;]:::post
classDef data fill:#6a9bcc,stroke:#141413,color:#fff
classDef main fill:#141413,stroke:#141413,color:#fff
classDef po fill:#6a9bcc,stroke:#141413,color:#fff
classDef aipw fill:#d97757,stroke:#141413,color:#fff
classDef iate fill:#00d4c8,stroke:#141413,color:#141413
classDef post fill:#f5f5f5,stroke:#141413,color:#141413
&lt;/code>&lt;/pre>
&lt;p>The two branches (PO and AIPW) make different model assumptions but produce the same kind of object: a function $\hat{\tau}(x_i)$ that returns a predicted treatment effect for every household. Postestimation commands then summarize that function in different ways — as a distribution (histogram), a function of one covariate (&lt;code>iateplot&lt;/code>), a test (&lt;code>estat heterogeneity&lt;/code>), a regression summary (&lt;code>estat projection&lt;/code>), or a group-level table (GATE / GATES). All seven postestimation views answer slightly different questions, and the last three sections of this post show why a beginner should look at all of them rather than picking one favorite.&lt;/p>
&lt;hr>
&lt;h2 id="2-the-dataset-401k-eligibility-and-household-assets">2. The dataset: 401(k) eligibility and household assets&lt;/h2>
&lt;p>We use &lt;code>assets3&lt;/code>, an excerpt from Chernozhukov &amp;amp; Hansen (2004) shipped with Stata 19. Each row is one household. The outcome is total net financial assets in dollars; the treatment is whether the household head&amp;rsquo;s employer offers a 401(k) plan (i.e. eligibility, not actual participation). The economic question is whether eligibility on its own — independent of contribution choices — increases retirement wealth, and the standard concern is that eligible workers differ systematically from ineligible workers (they earn more, are older, work for larger employers).&lt;/p>
&lt;p>We load the data, declare which variables describe the heterogeneity we care about, and inspect the basic descriptive stats:&lt;/p>
&lt;pre>&lt;code class="language-stata">webuse assets3, clear
* Define the heterogeneity-of-interest covariates and (for this tutorial)
* the same set as nuisance controls.
global catecovars age educ i.incomecat i.pension i.married i.twoearn i.ira i.ownhome
global controls age educ i.incomecat i.pension i.married i.twoearn i.ira i.ownhome
global rseed 12345671
describe asset e401k age educ income incomecat pension married twoearn ira ownhome
summarize asset e401k age educ income, detail
tab e401k, missing
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Variable Storage Display Value
name type format label Variable label
-------------------------------------------------------------
assets float %9.0g Net total financial assets
e401k byte %12.0g lbe401 401(k) eligibility
age byte %9.0g Age
educ byte %9.0g Years of education
income float %9.0g Household income
incomecat byte %9.0g Income category
pension byte %16.0g lbpen Pension benefits
married byte %11.0g lbmar Marital status
twoearn byte %9.0g lbyes Two-earner household
ira byte %9.0g lbyes IRA participation
ownhome byte %9.0g lbyes Homeowner
401(k) |
eligibility | Freq. Percent Cum.
-------------+-----------------------------------
Not eligible | 6,231 62.86 62.86
Eligible | 3,682 37.14 100.00
Total | 9,913 100.00
&lt;/code>&lt;/pre>
&lt;p>The dataset contains &lt;strong>9,913 households&lt;/strong>, of which &lt;strong>3,682 (37.1%) are eligible&lt;/strong> for a 401(k) and &lt;strong>6,231 (62.9%) are not&lt;/strong>. The asset distribution is extraordinarily right-skewed: the mean is \$18,054 against a median of just \$1,499, with a maximum of \$1.5 million and a minimum of −\$502,302 (households with negative net financial assets). Income, age, and education show much milder skew. Three key features matter for what follows: both treatment arms are well populated (37% vs 63%), the outcome has heavy tails (so the treatment effect plausibly varies across the distribution), and we have a rich set of demographic covariates to condition on.&lt;/p>
&lt;hr>
&lt;h2 id="3-the-naive-view-and-why-it-fails">3. The naive view (and why it fails)&lt;/h2>
&lt;p>Before reaching for any causal estimator it is healthy to look at the raw mean difference. If &lt;code>e401k&lt;/code> were randomly assigned the comparison would be the ATE. It isn&amp;rsquo;t — eligibility is a function of who chooses what employer — so the raw difference is biased. Showing this gap explicitly motivates everything that follows.&lt;/p>
&lt;pre>&lt;code class="language-stata">* Raw means by eligibility
tabstat asset, by(e401k) statistics(mean sd n)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Summary for variables: assets
Group variable: e401k (401(k) eligibility)
e401k | Mean SD N
-------------+------------------------------
Not eligible | 10789.9 54527.02 6231
Eligible | 30347.39 74800.21 3682
-------------+------------------------------
Total | 18054.17 63528.63 9913
&lt;/code>&lt;/pre>
&lt;p>Eligible households hold an average of &lt;strong>\$30,347&lt;/strong> in net financial assets versus &lt;strong>\$10,790&lt;/strong> for ineligible ones — a raw gap of &lt;strong>\$19,557&lt;/strong>. If we believed in random assignment we would call that the average effect of eligibility. But eligible workers are systematically different: they tend to be older, more educated, and earn substantially more. Some of that \$19,557 is causal, but a meaningful share is just selection. The next section pins down how much of the gap is causal once we adjust for those covariates.&lt;/p>
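&lt;p>The same raw gap can be recovered as the slope in a regression of assets on the eligibility indicator, which also attaches a standard error to it. This is a descriptive check only, not a causal estimate, and the robust-variance and unequal-variance options are our choices here:&lt;/p>
&lt;pre>&lt;code class="language-stata">webuse assets3, clear

* The coefficient on 1.e401k equals the difference in group means
regress assets i.e401k, vce(robust)

* Same comparison as a two-sample t test allowing unequal variances
ttest assets, by(e401k) unequal
&lt;/code>&lt;/pre>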
&lt;hr>
&lt;h2 id="4-a-first-ate-parametric-teffects-aipw">4. A first ATE: parametric &lt;code>teffects aipw&lt;/code>&lt;/h2>
&lt;p>Stata&amp;rsquo;s mature &lt;code>teffects&lt;/code> suite already supports doubly robust ATE estimation with parametric models. We use it here as a familiar, fast benchmark before introducing the new &lt;code>cate&lt;/code> command. The estimand is&lt;/p>
&lt;p>$$\text{ATE} = E\{y(1) - y(0)\}$$&lt;/p>
&lt;p>In words, this is the &lt;em>average&lt;/em> treatment effect across all households in the population. The augmented inverse-probability weighting (AIPW) estimator is doubly robust: it returns the right ATE if either the outcome model or the propensity score model is correctly specified — we don&amp;rsquo;t need both.&lt;/p>
&lt;pre>&lt;code class="language-stata">teffects aipw ///
(asset c.age c.educ i.incomecat i.pension i.married i.twoearn i.ira i.ownhome) ///
(e401k c.age c.educ i.incomecat i.pension i.married i.twoearn i.ira i.ownhome)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Treatment-effects estimation Number of obs = 9,913
Estimator : augmented IPW
Outcome model : linear by ML
Treatment model: logit
------------------------------------------------------------------------------
ATE |
e401k |
(Eligible |
vs |
Not elig..) | 8019.463 1152.038 6.96 0.000 5761.51 10277.42
-------------+----------------------------------------------------------------
POmean |
e401k |
Not eligi.. | 13930.46 817.613 17.04 0.000 12327.97 15532.96
------------------------------------------------------------------------------
&lt;/code>&lt;/pre>
&lt;p>The doubly robust ATE is &lt;strong>\$8,019&lt;/strong> with a 95% confidence interval of &lt;strong>[\$5,762, \$10,277]&lt;/strong> — about 58% above the average baseline assets of ineligible households (\$13,930). The naive raw gap (\$19,557) was therefore inflated by a factor of 2.4: roughly 60% of the observed asset gap between eligible and ineligible households is selection — they would have held more assets even without the program — and only 40% is the causal effect of eligibility. That said, \$8,019 is still just &lt;em>one&lt;/em> number. The cross-tabulation of mean assets by income category and eligibility (which we computed but suppressed for length here, see &lt;code>analysis.log&lt;/code>) shows differences ranging from \$5,011 in the lowest income category to \$20,949 in the highest — a 4× spread that the ATE flattens out. That spread is the CATE we now estimate properly.&lt;/p>
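&lt;p>The percentages quoted above follow from three reported numbers. A quick arithmetic check, with the values copied by hand from the &lt;code>tabstat&lt;/code> and &lt;code>teffects aipw&lt;/code> outputs earlier in this post (the scalar names are ours):&lt;/p>
&lt;pre>&lt;code class="language-stata">* Values copied from the outputs above, not recomputed from the data
scalar raw_gap = 30347.39 - 10789.9
scalar ate     = 8019.463
scalar pomean0 = 13930.46

display &amp;quot;Raw gap (dollars):        &amp;quot; %9.2f raw_gap
display &amp;quot;Raw gap / ATE:            &amp;quot; %5.2f raw_gap/ate
display &amp;quot;ATE share of raw gap:     &amp;quot; %5.2f ate/raw_gap
display &amp;quot;ATE / untreated PO mean:  &amp;quot; %5.2f ate/pomean0
&lt;/code>&lt;/pre>
&lt;p>The three ratios come out near 2.4, 0.41, and 0.58, which correspond to the inflation factor, the roughly 40% causal share, and the 58% lift over the untreated mean.&lt;/p>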
&lt;hr>
&lt;h2 id="5-the-cate-definition-model-and-the-cate-command">5. The CATE: definition, model, and the &lt;code>cate&lt;/code> command&lt;/h2>
&lt;p>The Conditional Average Treatment Effect at covariate value $x$ is defined as&lt;/p>
&lt;p>$$\tau(\mathbf{x}) = E\{y(1) - y(0) \mid \mathbf{x} = \mathbf{x}\}$$&lt;/p>
&lt;p>In words, this says: among all households whose covariates are $\mathbf{x}$, what is their &lt;em>average&lt;/em> treatment effect? The CATE is a &lt;em>function&lt;/em> of covariates, not a single number. If $\tau(\mathbf{x})$ happened to be constant, we&amp;rsquo;d be back at the ATE. Whenever it varies, the ATE is an average of these subgroup effects weighted by how common each $\mathbf{x}$ is in the data.&lt;/p>
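&lt;p>That last statement is the law of iterated expectations. Written out for covariates that take finitely many values $\mathbf{x}_0$ (a simplifying assumption made here only to keep the notation light):&lt;/p>
&lt;p>$$\text{ATE} = E\{\tau(\mathbf{x})\} = \sum_{\mathbf{x}_0} \tau(\mathbf{x}_0) \, \Pr(\mathbf{x} = \mathbf{x}_0)$$&lt;/p>
&lt;p>As a made-up illustration, with two equally likely covariate cells whose effects are \$2,000 and \$14,000, the ATE is \$8,000 even though neither cell has that average effect.&lt;/p>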
&lt;p>To estimate $\tau(\mathbf{x})$ Stata 19 offers two model specifications:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Partial-linear (PO) model.&lt;/strong> Assumes the outcome can be written as&lt;/li>
&lt;/ol>
&lt;p>$$y = d \cdot \tau(\mathbf{x}) + g(\mathbf{x}, \mathbf{w}) + \epsilon, \qquad d = f(\mathbf{x}, \mathbf{w}) + u$$&lt;/p>
&lt;p>In words, the outcome is the treatment $d$ times the per-household effect $\tau(\mathbf{x})$, plus a flexible function $g$ of all covariates, plus noise; and the treatment itself is a flexible function $f$ of those covariates plus its own noise. PO partials out $g$ and $f$ using out-of-sample predictions (cross-fitting), then fits a generalized random forest on the residuals to recover $\tau(\mathbf{x})$. PO is the more robust choice when propensity scores can get close to 0 or 1.&lt;/p>
&lt;ol start="2">
&lt;li>&lt;strong>Fully interactive (AIPW) model.&lt;/strong> Assumes $y(1) = g_1(\mathbf{x}, \mathbf{w}) + \epsilon_1$ and $y(0) = g_0(\mathbf{x}, \mathbf{w}) + \epsilon_0$ — separate outcome models for treated and untreated households — and combines them with the propensity score to form the doubly-robust AIPW score (Section 9). AIPW is more efficient (narrower CIs) when both models are well-specified, but more sensitive to extreme propensities.&lt;/li>
&lt;/ol>
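&lt;p>To build intuition for the partialing-out route in model 1, here is a deliberately simplified sketch. It swaps the cross-fitted lasso for plain in-sample OLS and logit fits, and it assumes a constant $\tau$ instead of a forest, so it is not what &lt;code>cate po&lt;/code> does internally. The variable names &lt;code>y_res&lt;/code>, &lt;code>d_hat&lt;/code>, and &lt;code>d_res&lt;/code> are ours:&lt;/p>
&lt;pre>&lt;code class="language-stata">webuse assets3, clear
global controls age educ i.incomecat i.pension i.married i.twoearn i.ira i.ownhome

* Step 1: residualize the outcome on the controls (simplified outcome nuisance model)
regress assets $controls
predict double y_res, residuals

* Step 2: residualize the treatment on the controls (simplified treatment nuisance model)
logit e401k $controls
predict double d_hat, pr
generate double d_res = e401k - d_hat

* Step 3: residual-on-residual regression; the slope is a constant-effect estimate
regress y_res d_res, vce(robust)
&lt;/code>&lt;/pre>
&lt;p>The forest in &lt;code>cate po&lt;/code> replaces step 3 with a fit that lets the slope change with the covariates, which is what turns one number into a function.&lt;/p>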
&lt;p>We start with PO. The variables in &lt;code>$catecovars&lt;/code> are the inputs to $\tau(\mathbf{x})$ — the dimensions on which we want to see heterogeneity — and the &lt;code>controls&lt;/code> (left at the default, which equals &lt;code>catecovars&lt;/code>) are passed to the nuisance models $g$ and $f$. The &lt;code>rseed()&lt;/code> option fixes the cross-fitting and random-forest internals so the run is reproducible.&lt;/p>
&lt;pre>&lt;code class="language-stata">cate po (asset $catecovars) (e401k), rseed($rseed)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Conditional average treatment effects Number of observations = 9,913
Estimator: Partialing out Number of folds in cross-fit = 10
Outcome model: Linear lasso Number of outcome controls = 17
Treatment model: Logit lasso Number of treatment controls = 17
CATE model: Random forest Number of CATE variables = 17
------------------------------------------------------------------------------
| Robust
assets | Coefficient std. err. z P&amp;gt;|z| [95% conf. interval]
-------------+----------------------------------------------------------------
ATE |
e401k |
(Eligible |
vs |
Not elig..) | 7937.182 1153.017 6.88 0.000 5677.309 10197.05
------------------------------------------------------------------------------
&lt;/code>&lt;/pre>
&lt;p>The PO ATE, averaged over the estimated $\hat{\tau}(\mathbf{x}_i)$ across the sample, is &lt;strong>\$7,937&lt;/strong> with a 95% CI of &lt;strong>[\$5,677, \$10,197]&lt;/strong>. That&amp;rsquo;s within about \$82 of the parametric &lt;code>teffects aipw&lt;/code> ATE in the previous section (\$8,019), even though &lt;code>cate po&lt;/code> is doing something fundamentally different under the hood (cross-fit lasso for the nuisance models, causal forest for the IATE). When two very different estimators agree on the average, that average is unlikely to be an artifact of one modeling choice, and you can move on to looking at the heterogeneity.&lt;/p>
&lt;h3 id="51-is-there-heterogeneity-at-all-estat-heterogeneity">5.1 Is there heterogeneity at all? &lt;code>estat heterogeneity&lt;/code>&lt;/h3>
&lt;p>Before exploring how $\tau(\mathbf{x})$ varies, it is worth asking whether it varies. The &lt;code>estat heterogeneity&lt;/code> command tests the null hypothesis that $\tau(\mathbf{x})$ is constant — that there is, in fact, no heterogeneity to study.&lt;/p>
&lt;pre>&lt;code class="language-stata">estat heterogeneity
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Treatment-effects heterogeneity test
H0: Treatment effects are homogeneous
chi2(1) = 4.11
Prob &amp;gt; chi2 = 0.0427
&lt;/code>&lt;/pre>
&lt;p>The test rejects homogeneity at the 5% level: &lt;strong>χ²(1) = 4.11, p = 0.043&lt;/strong>. In plain English: the data have enough information to distinguish the estimated CATE function $\hat{\tau}(\mathbf{x})$ from a constant. The rest of this tutorial is therefore not a hunt for noise — there is real heterogeneity, and the next sections describe what shape it takes.&lt;/p>
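&lt;p>The reported p-value can be reproduced from the test statistic with Stata&amp;rsquo;s &lt;code>chi2tail()&lt;/code> function, which returns the upper-tail probability of a chi-squared distribution. Because the statistic is typed in rounded to two decimals, expect agreement with the printed 0.0427 only to about three decimal places:&lt;/p>
&lt;pre>&lt;code class="language-stata">* Upper-tail probability of a chi2(1) variate at the rounded statistic 4.11
display %6.4f chi2tail(1, 4.11)
&lt;/code>&lt;/pre>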
&lt;h3 id="52-who-responds-most-estat-projection">5.2 Who responds most? &lt;code>estat projection&lt;/code>&lt;/h3>
&lt;p>A causal forest fits $\hat{\tau}(\mathbf{x})$ flexibly, but a flexible function is hard to summarize in a paragraph. &lt;code>estat projection&lt;/code> regresses $\hat{\tau}_i$ on the covariates linearly. The coefficients are not causal (they&amp;rsquo;re a &lt;em>projection&lt;/em> of an already-estimated nonlinear function onto a linear basis), but they answer the practical question &amp;ldquo;which variables shift the predicted effect, and by how much?&amp;rdquo;.&lt;/p>
&lt;pre>&lt;code class="language-stata">estat projection $catecovars
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Treatment-effects linear projection Number of obs = 9,913
F(11, 9901) = 4.90
Prob &amp;gt; F = 0.0000
age | 205.12 117.98 1.74 0.082 -26.15 436.39
educ | -442.46 488.47 -0.91 0.365 -1399.96 515.05
incomecat 1 | -2439.22 2013.52 -1.21 0.226 -6386.14 1507.69
incomecat 2 | 1874.82 2295.16 0.82 0.414 -2624.15 6373.79
incomecat 3 | 5707.69 3298.34 1.73 0.084 -757.73 12173.11
incomecat 4 | 18194.60 5398.39 3.37 0.001 7612.65 28776.54
pension Y | 3817.36 2454.44 1.56 0.120 -993.84 8628.55
ownhome Y | 3162.65 1669.59 1.89 0.058 -110.08 6435.38
&lt;/code>&lt;/pre>
&lt;p>The single dominant signal is income. Relative to households in the lowest income category, those in the highest income category have a predicted effect that is &lt;strong>\$18,195 higher&lt;/strong> (p = 0.001) — the only coefficient significant at the 1% level. Homeownership lifts the predicted effect by another \$3,163 (p = 0.058) and each additional year of age adds \$205 (p = 0.082); both are borderline. Education, marriage, two-earner status, and IRA participation are essentially flat. The R² of 0.0045 is not a critique — it tells us most of the heterogeneity is genuinely nonlinear (curvature that the random forest captures and a linear projection cannot). The rest of the post zooms into where that nonlinearity lives.&lt;/p>
&lt;hr>
&lt;h2 id="6-the-shape-of-individual-level-heterogeneity">6. The shape of individual-level heterogeneity&lt;/h2>
&lt;p>Before slicing the CATE by groups, it helps to look at the distribution of household-level effects. &lt;code>categraph histogram&lt;/code> plots the predicted $\hat{\tau}_i$ for every household in the sample.&lt;/p>
&lt;pre>&lt;code class="language-stata">categraph histogram, ///
title(&amp;quot;Distribution of individual treatment effects (PO)&amp;quot;) ///
xtitle(&amp;quot;Estimated tau_hat_i (dollars)&amp;quot;) ///
note(&amp;quot;Source: assets3, Stata 19 cate po&amp;quot;)
graph export &amp;quot;stata_cate_iate_histogram_po.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_iate_histogram_po.png" alt="Histogram of PO-estimated individual treatment effects across 9,913 households">&lt;/p>
&lt;p>The distribution is &lt;strong>strongly right-skewed&lt;/strong>. Most households cluster around a modest positive effect (the bulk of the mass sits near \$5,000–\$10,000), but a long right tail extends to \$80,000 and beyond. A small left tail dips into negative territory: a meaningful minority of households are estimated to gain little or nothing from 401(k) eligibility. This is the visual answer to &amp;ldquo;is the average hiding something?&amp;rdquo; — the average of \$7,937 is genuinely close to the median, but the spread on either side is huge. The next two views — IATE plots and GATE — describe &lt;em>who&lt;/em> sits in the right tail.&lt;/p>
&lt;h3 id="61-how-does-the-effect-vary-with-one-covariate-iate-plots">6.1 How does the effect vary with one covariate? IATE plots&lt;/h3>
&lt;p>The &lt;code>categraph iateplot&lt;/code> command holds all covariates except one fixed at sample-mean (continuous) or base (factor) values, and varies the one covariate of interest. The result is a slice through the multi-dimensional CATE function with confidence bands.&lt;/p>
&lt;pre>&lt;code class="language-stata">categraph iateplot age, ///
title(&amp;quot;Estimated CATE by age&amp;quot;) ///
ytitle(&amp;quot;tau_hat (dollars)&amp;quot;) xtitle(&amp;quot;Age (years)&amp;quot;)
graph export &amp;quot;stata_cate_iateplot_age.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_iateplot_age.png" alt="Estimated CATE by age, holding all other covariates at means or base values">&lt;/p>
&lt;p>The age slice is broadly increasing. Younger workers (mid-20s to early 30s) have small or even slightly negative predicted effects; the line crosses into clearly positive territory around age 35–40 and continues climbing through the 50s. The intuition is straightforward: 401(k) eligibility is most valuable to workers with the financial slack and the planning horizon to take advantage of tax-deferred saving. Confidence bands narrow in the middle of the age range where most of the data lives and widen at the extremes.&lt;/p>
&lt;p>The same exercise with education looks rather different:&lt;/p>
&lt;pre>&lt;code class="language-stata">categraph iateplot educ, ///
title(&amp;quot;Estimated CATE by years of education&amp;quot;) ///
ytitle(&amp;quot;tau_hat (dollars)&amp;quot;) xtitle(&amp;quot;Education (years)&amp;quot;)
graph export &amp;quot;stata_cate_iateplot_educ.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_iateplot_educ.png" alt="Estimated CATE by years of education">&lt;/p>
&lt;p>The education slice is &lt;strong>broadly flat&lt;/strong> at around \$1,000–\$3,000 across the entire range from 8 to 18 years of schooling. This is consistent with the linear projection (where the education coefficient was small and not significant). It is also a useful negative finding — once you condition on income, education adds little to the predicted effect.&lt;/p>
&lt;hr>
&lt;h2 id="7-group-level-effects-gate-on-prespecified-groups">7. Group-level effects: GATE on prespecified groups&lt;/h2>
&lt;p>Individual-level $\hat{\tau}_i$ is informative but noisy. A common practice is to summarize them by &lt;em>groups&lt;/em> — either prespecified (income category, region, education tier) or data-driven (top vs bottom quartile of predicted effect). Stata&amp;rsquo;s GATE and GATES estimators are the formal versions of these two strategies.&lt;/p>
&lt;p>The Group ATE (GATE) on a prespecified group $g$ is&lt;/p>
&lt;p>$$\tau(g) = E\{\Gamma_i \mid G_i = g\}$$&lt;/p>
&lt;p>In words, this says: the average AIPW orthogonal score $\Gamma_i$ within group $g$ — i.e., the doubly robust per-household effect score, averaged over households assigned to that group. We compute it on the income categories &lt;code>incomecat&lt;/code>. The clever bit is &lt;code>reestimate&lt;/code>: after running &lt;code>cate po&lt;/code> once, we tell Stata to recycle the fitted IATE function and just recompute group means, saving a slow second causal-forest fit.&lt;/p>
&lt;pre>&lt;code class="language-stata">cate, group(incomecat) reestimate
estat gatetest
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">GATE | Coefficient Std. err. z P&amp;gt;|z| [95% conf. interval]
incomecat |
0 | 4087.014 987.7124 4.14 0.000 2151.13 6022.90
1 | 1399.398 1663.193 0.84 0.400 -1860.40 4659.20
2 | 5154.329 1349.842 3.82 0.000 2508.69 7799.97
3 | 8532.238 2287.664 3.73 0.000 4048.50 13015.98
4 | 20510.94 4723.741 4.34 0.000 11252.58 29769.30
Group treatment-effects heterogeneity test
H0: Group average treatment effects are homogeneous
chi2(4) = 18.44
Prob &amp;gt; chi2 = 0.0010
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-stata">categraph gateplot, ///
title(&amp;quot;GATE by income category&amp;quot;) ///
ytitle(&amp;quot;tau_hat (dollars)&amp;quot;) xtitle(&amp;quot;Income category (0 = low, 4 = high)&amp;quot;)
graph export &amp;quot;stata_cate_gate_incomecat.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_gate_incomecat.png" alt="GATE by income category, with 95% confidence bands">&lt;/p>
&lt;p>The five group-level effects span an order of magnitude: \$4,087 (category 0, the lowest income), \$1,399 (category 1, not significant at p = 0.40), \$5,154, \$8,532, and &lt;strong>\$20,511&lt;/strong> in the highest income category, which is roughly five times the lowest-category effect and about 2.6 times the overall ATE. The joint test of equality (&lt;code>estat gatetest&lt;/code>) rejects strongly: &lt;strong>χ²(4) = 18.44, p = 0.001&lt;/strong>. There is one mild departure from monotonicity at category 1, which is interesting but plausibly within sampling variability (its CI includes zero and overlaps the CI for category 0). Two important policy facts emerge: the average household in the top income category gains about \$20,500 from 401(k) eligibility, and the average household in the bottom category gains about \$4,000, while the estimate for the middle-low group (category 1) is small and not statistically distinguishable from zero.&lt;/p>
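&lt;p>To see what &amp;ldquo;averaging the score within groups&amp;rdquo; amounts to mechanically, here is a toy sketch on simulated data. The variables &lt;code>g&lt;/code> and &lt;code>gamma&lt;/code> are made up for illustration and stand in for the group indicator and the per-household score; nothing here reads from the fitted &lt;code>cate&lt;/code> object:&lt;/p>
&lt;pre>&lt;code class="language-stata">clear
set seed 12345
set obs 1000

* Hypothetical group label (0 to 4) and a noisy per-unit score
generate byte g = mod(_n, 5)
generate double gamma = 1000*g + rnormal(0, 500)

* Group means of the score with robust standard errors:
* each coefficient is the within-group average of gamma
regress gamma ibn.g, noconstant vce(robust)

* The same group means as a plain summary table
tabstat gamma, by(g) statistics(mean semean n)
&lt;/code>&lt;/pre>
&lt;p>With no constant and a full set of group indicators, each regression coefficient is exactly the sample mean of the score in that group, which is the GATE formula applied to fake data.&lt;/p>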
&lt;hr>
&lt;h2 id="8-data-driven-groups-gates-on-quartiles-of-hattau">8. Data-driven groups: GATES on quartiles of $\hat{\tau}$&lt;/h2>
&lt;p>GATE on prespecified groups is principled but presupposes that the analyst already knows which groups matter. &lt;strong>GATES&lt;/strong> (&amp;ldquo;Sorted Group Average Treatment Effects&amp;rdquo;) flips this around: it lets the data sort households by their predicted effect, bins them into quantiles, and reports the mean effect within each bin. Cross-fitting guards against overfitting: each unit&amp;rsquo;s bin is determined by an out-of-sample prediction, so observations cannot leak their own outcomes into their bin assignment.&lt;/p>
&lt;pre>&lt;code class="language-stata">cate po (asset $catecovars) (e401k), rseed($rseed) group(4)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">GATES | Coefficient Std. err. z P&amp;gt;|z| [95% conf. interval]
rank |
1 | 17278.94 3440.125 5.02 0.000 10536.42 24021.46
2 | 8121.04 1691.008 4.80 0.000 4806.73 11435.35
3 | 3443.83 1437.640 2.40 0.017 626.11 6261.56
4 | 2919.20 2110.320 1.38 0.167 -1216.96 7055.35
ATE | 7938.21 1152.994 6.88 0.000 5678.38 10198.04
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-stata">categraph gateplot, ///
title(&amp;quot;GATES by data-driven quartile of estimated effect&amp;quot;) ///
ytitle(&amp;quot;tau_hat (dollars)&amp;quot;) xtitle(&amp;quot;Quartile (1 = highest tau_hat, 4 = lowest)&amp;quot;)
graph export &amp;quot;stata_cate_gates_quartiles.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_gates_quartiles.png" alt="GATES by data-driven quartile of the estimated treatment effect">&lt;/p>
&lt;p>The data-driven ladder is &lt;strong>clean and monotonic&lt;/strong>: the top quartile gains an average of &lt;strong>\$17,279&lt;/strong> (CI \$10,536–\$24,021), the second \$8,121, the third \$3,444, and the bottom &lt;strong>\$2,919&lt;/strong> — and the bottom quartile is &lt;em>not&lt;/em> statistically distinguishable from zero (p = 0.167). The top-to-bottom ratio is &lt;strong>5.9×&lt;/strong>. This is the single most informative summary of heterogeneity in the dataset because the bins are constructed by the data itself rather than by a researcher choice. Roughly one in four households in the sample appears to gain little or nothing from 401(k) eligibility, while another quarter gains over twice the average effect.&lt;/p>
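&lt;p>The ratios in this paragraph are simple arithmetic on the GATES table. A short check, with the values copied by hand from that table (the scalar names are ours):&lt;/p>
&lt;pre>&lt;code class="language-stata">* Values copied from the GATES output above, not recomputed from the data
scalar q1  = 17278.94
scalar q2  = 8121.04
scalar q3  = 3443.83
scalar q4  = 2919.20
scalar ate = 7938.21

display &amp;quot;Top / bottom quartile:   &amp;quot; %5.2f q1/q4
display &amp;quot;Top quartile / ATE:      &amp;quot; %5.2f q1/ate
display &amp;quot;Mean of the four GATES:  &amp;quot; %9.2f (q1 + q2 + q3 + q4)/4
&lt;/code>&lt;/pre>
&lt;p>The simple mean of the four quartile effects (about \$7,941) sits close to the reported ATE of \$7,938, as expected when the bins are of nearly equal size.&lt;/p>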
&lt;h3 id="81-who-is-in-the-top-vs-the-bottom-quartile-estat-classification">8.1 Who is in the top vs the bottom quartile? &lt;code>estat classification&lt;/code>&lt;/h3>
&lt;p>The data sorted itself; now we can ask what makes the top quartile different. &lt;code>estat classification&lt;/code> runs a two-sample t-test for one variable at a time, comparing its mean in the top-effect rank group against its mean in the bottom-effect rank group.&lt;/p>
&lt;pre>&lt;code class="language-stata">estat classification age
estat classification educ
estat classification income
&lt;/code>&lt;/pre>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Variable&lt;/th>
&lt;th style="text-align:right">Top quartile (n=2,480)&lt;/th>
&lt;th style="text-align:right">Bottom quartile (n=2,471)&lt;/th>
&lt;th style="text-align:right">Difference&lt;/th>
&lt;th style="text-align:right">t-statistic&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Age (years)&lt;/td>
&lt;td style="text-align:right">45.15&lt;/td>
&lt;td style="text-align:right">34.98&lt;/td>
&lt;td style="text-align:right">10.17&lt;/td>
&lt;td style="text-align:right">35.67&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Education (years)&lt;/td>
&lt;td style="text-align:right">14.02&lt;/td>
&lt;td style="text-align:right">12.65&lt;/td>
&lt;td style="text-align:right">1.37&lt;/td>
&lt;td style="text-align:right">18.62&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Income (\$)&lt;/td>
&lt;td style="text-align:right">62,739&lt;/td>
&lt;td style="text-align:right">26,861&lt;/td>
&lt;td style="text-align:right">35,878&lt;/td>
&lt;td style="text-align:right">56.22&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The high-effect quartile is sharply different from the low-effect quartile on every dimension: about &lt;strong>10 years older&lt;/strong> on average (45.1 vs 35.0), with &lt;strong>1.4 more years of education&lt;/strong> (14.0 vs 12.7), and &lt;strong>\$35,878 higher household income&lt;/strong> (\$62,739 vs \$26,861). All three differences are very large in t-statistic terms (about 36 for age, 19 for education, and 56 for income). Income is the dominant marker, exactly what the linear projection and the GATE-by-income picture already suggested. A plausible story behind the numbers: 401(k) eligibility helps people who already have the financial slack and time horizon to actually use it, and a substantial minority of the population has neither.&lt;/p>
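&lt;p>A t-statistic mixes the size of a gap with the precision of its estimate. As a rough check, dividing each difference in the table by its t-statistic recovers the implied standard error of that difference. This is arithmetic on the reported values only, not a re-run of &lt;code>estat classification&lt;/code>.&lt;/p>
&lt;pre>&lt;code class="language-stata">* Implied standard error of each top-minus-bottom difference,
* computed as difference / t-statistic from the table above
display 10.17 / 35.67     // age, in years
display 1.37 / 18.62      // education, in years
display 35878 / 56.22     // income, in dollars
&lt;/code>&lt;/pre>
&lt;p>With roughly 2,500 households in each group the standard errors are small (under a third of a year for age, for example), which is part of why the t-statistics are so large.&lt;/p>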
&lt;hr>
&lt;h2 id="9-aipw-a-doubly-robust-contrast">9. AIPW: a doubly-robust contrast&lt;/h2>
&lt;p>So far we have used the partialing-out estimator. The fully interactive AIPW estimator fits separate outcome models for treated and untreated households and combines them with the propensity score via the AIPW orthogonal score:&lt;/p>
&lt;p>$$\Gamma_i = \left[\hat{y}(1)_i + \frac{d_i \, \{y_i - \hat{y}(1)_i\}}{\hat{f}_i}\right] - \left[\hat{y}(0)_i + \frac{(1-d_i) \, \{y_i - \hat{y}(0)_i\}}{1-\hat{f}_i}\right]$$&lt;/p>
&lt;p>In words, this says: the doubly robust per-household effect score is the predicted treated outcome minus the predicted untreated outcome, each corrected by an inverse-propensity weighted residual. It is &amp;ldquo;doubly robust&amp;rdquo; because it stays consistent if &lt;em>either&lt;/em> the outcome models OR the propensity-score model is correct — you only need to get one of them right. The cost is sensitivity to extreme propensities: if some households have $\hat{f}_i$ close to 0 or 1 the inverse weights blow up.&lt;/p>
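&lt;p>To make the formula concrete, here is a minimal hand-rolled sketch of the score using plain parametric models. It assumes &lt;code>e401k&lt;/code> is coded 0/1 and that the global &lt;code>$catecovars&lt;/code> holds the covariate list. The variable names &lt;code>y1hat&lt;/code>, &lt;code>y0hat&lt;/code>, &lt;code>fhat&lt;/code>, and &lt;code>gamma&lt;/code> are illustrative. There is no lasso and no cross-fitting here, so the result will not match &lt;code>cate aipw&lt;/code>; the point is only to show how the pieces of the score fit together.&lt;/p>
&lt;pre>&lt;code class="language-stata">* Hand-rolled AIPW score with parametric nuisance models.
* Assumptions: e401k is coded 0/1 and $catecovars holds the covariate list.
* No cross-fitting and no lasso, so this will not reproduce cate aipw.
quietly regress asset $catecovars if e401k == 1
predict double y1hat, xb              // predicted outcome if eligible
quietly regress asset $catecovars if e401k == 0
predict double y0hat, xb              // predicted outcome if not eligible
quietly logit e401k $catecovars
predict double fhat, pr               // estimated propensity score

* Doubly robust score: outcome-model contrast plus weighted residuals
generate double gamma = (y1hat + e401k * (asset - y1hat) / fhat) ///
    - (y0hat + (1 - e401k) * (asset - y0hat) / (1 - fhat))

mean gamma                            // sample mean of the scores estimates the ATE
summarize fhat, detail                // look for propensities near 0 or 1
&lt;/code>&lt;/pre>
&lt;p>The last line is a quick look at the overlap problem described above: very small or very large values of &lt;code>fhat&lt;/code> are the ones that inflate the weighted residual terms.&lt;/p>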
&lt;pre>&lt;code class="language-stata">cate aipw (asset $catecovars) (e401k), rseed($rseed)
estat heterogeneity
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">ATE |
e401k |
(Eligible |
vs |
Not elig..) | 8120.264 1160.538 7.00 0.000 5845.652 10394.88
Treatment-effects heterogeneity test
H0: Treatment effects are homogeneous
chi2(1) = 5.54
Prob &amp;gt; chi2 = 0.0186
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-stata">categraph histogram, ///
title(&amp;quot;Distribution of individual treatment effects (AIPW)&amp;quot;) ///
xtitle(&amp;quot;Estimated tau_hat_i (dollars)&amp;quot;) ///
note(&amp;quot;Source: assets3, Stata 19 cate aipw&amp;quot;)
graph export &amp;quot;stata_cate_iate_histogram_aipw.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_iate_histogram_aipw.png" alt="Histogram of AIPW-estimated individual treatment effects">&lt;/p>
&lt;pre>&lt;code class="language-stata">categraph iateplot educ, ///
title(&amp;quot;Estimated CATE by education (AIPW)&amp;quot;) ///
ytitle(&amp;quot;tau_hat (dollars)&amp;quot;) xtitle(&amp;quot;Education (years)&amp;quot;)
graph export &amp;quot;stata_cate_iateplot_educ_aipw.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_iateplot_educ_aipw.png" alt="AIPW-estimated CATE by education">&lt;/p>
&lt;p>The AIPW ATE is &lt;strong>\$8,120&lt;/strong> — within \$200 of both the parametric &lt;code>teffects aipw&lt;/code> ATE (\$8,019) and the PO ATE (\$7,937). The heterogeneity test now rejects more strongly (&lt;strong>χ²(1) = 5.54, p = 0.019&lt;/strong>) than under PO (p = 0.043), consistent with AIPW&amp;rsquo;s higher efficiency when both nuisance models are well-specified. The AIPW IATE histogram (Figure 6) has the same right-skewed shape as the PO histogram but a slightly wider support — AIPW puts more mass in the tails because of the inverse-propensity correction, which is the visual signature of the overlap-sensitivity warning above. The AIPW education slice (Figure 7) is essentially identical in shape to the PO version: a broadly flat profile around the average. Across estimators, the substantive story does not change.&lt;/p>
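&lt;p>The reported p-values and confidence interval follow directly from the test statistics and the standard error, which can be confirmed with built-in Stata functions. The sketch uses only numbers printed in the output above and in the PO results.&lt;/p>
&lt;pre>&lt;code class="language-stata">* p-values implied by the two chi-squared(1) heterogeneity statistics
display chi2tail(1, 5.54)     // AIPW, reported as 0.0186
display chi2tail(1, 4.11)     // PO, reported as 0.043

* Normal-based 95% interval for the AIPW ATE from its estimate and SE
display 8120.264 - invnormal(0.975) * 1160.538
display 8120.264 + invnormal(0.975) * 1160.538
&lt;/code>&lt;/pre>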
&lt;hr>
&lt;h2 id="10-the-smooth-income-gradient-estat-series">10. The smooth income gradient: &lt;code>estat series&lt;/code>&lt;/h2>
&lt;p>&lt;code>categraph iateplot&lt;/code> showed the CATE as a function of one variable with the others fixed at reference values. &lt;code>estat series&lt;/code> is a complementary view — it fits a flexible smoother (cubic B-spline by default) of the predicted effect against one continuous covariate, marginalizing over the joint distribution of the others. For continuous variables like income this gives the cleanest &amp;ldquo;dose-response&amp;rdquo; picture.&lt;/p>
&lt;pre>&lt;code class="language-stata">estat series income if income &amp;lt;= 150000, graph knots(5)
graph export &amp;quot;stata_cate_series_income.png&amp;quot;, replace width(1200)
&lt;/code>&lt;/pre>
&lt;pre>&lt;code class="language-text">Nonparametric series regression for IATE
Cubic B-spline estimation Number of obs = 9,884
Number of knots = 5
------------------------------------------------------------------------------
| Robust
| Effect std. err. z P&amp;gt;|z| [95% conf. interval]
-------------+----------------------------------------------------------------
income | .2131162 .0502993 4.24 0.000 .1145313 .311701
------------------------------------------------------------------------------
Note: Effect estimates are averages of derivatives.
&lt;/code>&lt;/pre>
&lt;p>&lt;img src="stata_cate_series_income.png" alt="Cubic B-spline of estimated CATE against household income">&lt;/p>
&lt;p>The reported &amp;ldquo;Effect&amp;rdquo; is the &lt;strong>average derivative&lt;/strong> of the predicted treatment effect with respect to income: &lt;strong>0.213&lt;/strong> (SE 0.050, p &amp;lt; 0.001, 95% CI [0.115, 0.312]). Translated into dollars: &lt;strong>each additional \$1,000 of household income raises the predicted 401(k) treatment effect by about \$213 on average&lt;/strong>. The B-spline fit (Figure 8) reveals that this derivative is not constant — the slope is steepest in the middle of the income distribution and flatter at both ends — which is why a single linear-projection coefficient (\$18,195 for the highest income category) only partially captured the gradient. The series view smooths over the binning entirely.&lt;/p>
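&lt;p>The conversion from the average derivative to dollars is a simple rescaling, shown here for the point estimate and both ends of the 95% interval. The inputs are the values printed in the series output; nothing is re-estimated.&lt;/p>
&lt;pre>&lt;code class="language-stata">* Average derivative and its 95% bounds, rescaled to dollars of
* treatment effect per extra 1,000 dollars of household income
display 1000 * .2131162
display 1000 * .1145313
display 1000 * .311701
&lt;/code>&lt;/pre>
&lt;p>Because the spline slope is not constant, this rescaled figure is an average over the sample, not the slope at any particular income level.&lt;/p>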
&lt;hr>
&lt;h2 id="11-putting-it-all-together-comparison-table">11. Putting it all together: comparison table&lt;/h2>
&lt;p>The three adjusted estimators we ran agree closely on the ATE and differ only modestly on the heterogeneity p-value; the naive raw difference is included as a benchmark:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Estimator&lt;/th>
&lt;th style="text-align:right">ATE&lt;/th>
&lt;th style="text-align:center">95% CI&lt;/th>
&lt;th style="text-align:center">Heterogeneity test&lt;/th>
&lt;th>Notes&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Naive raw difference&lt;/td>
&lt;td style="text-align:right">19,557&lt;/td>
&lt;td style="text-align:center">n/a&lt;/td>
&lt;td style="text-align:center">n/a&lt;/td>
&lt;td>Raw mean gap; mostly selection&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>teffects aipw&lt;/code> (parametric)&lt;/td>
&lt;td style="text-align:right">8,019&lt;/td>
&lt;td style="text-align:center">[5,762, 10,277]&lt;/td>
&lt;td style="text-align:center">—&lt;/td>
&lt;td>Mature, fast benchmark&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>cate po&lt;/code> (lasso + causal forest)&lt;/td>
&lt;td style="text-align:right">7,937&lt;/td>
&lt;td style="text-align:center">[5,677, 10,197]&lt;/td>
&lt;td style="text-align:center">χ²(1) = 4.11, p = 0.043&lt;/td>
&lt;td>Robust to extreme propensities&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>cate aipw&lt;/code> (lasso + causal forest, doubly robust)&lt;/td>
&lt;td style="text-align:right">8,120&lt;/td>
&lt;td style="text-align:center">[5,846, 10,395]&lt;/td>
&lt;td style="text-align:center">χ²(1) = 5.54, p = 0.019&lt;/td>
&lt;td>Most efficient; uses AIPW score&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The three adjusted estimators, one parametric and two ML-based, place the ATE within a \$183 range (\$7,937 to \$8,120). Both ML estimators reject homogeneity at the 5% level. The naive raw difference of \$19,557 is inflated by a factor of 2.4 relative to these estimates: about \$11,500 of it is selection.&lt;/p>
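&lt;p>The summary figures in this paragraph are simple arithmetic on the table. The sketch below reproduces them, using the parametric &lt;code>teffects aipw&lt;/code> estimate as the benchmark for the naive gap (that choice of benchmark is an assumption; either ML estimate gives a similar answer).&lt;/p>
&lt;pre>&lt;code class="language-stata">* Spread across the three adjusted ATE estimates
display 8120 - 7937           // cate aipw minus cate po

* Naive gap relative to the parametric AIPW benchmark
display 19557 / 8019          // inflation factor, about 2.4
display 19557 - 8019          // dollars of the raw gap attributed to selection
&lt;/code>&lt;/pre>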
&lt;hr>
&lt;h2 id="12-discussion-answering-the-question">12. Discussion: answering the question&lt;/h2>
&lt;p>We opened with the question, &lt;em>for whom&lt;/em> does 401(k) eligibility increase financial assets? The eight figures and four estimators in this post answer it concretely:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>The average household gains about \$8,000.&lt;/strong> Across three estimators of the ATE that span very different model assumptions, the estimates fall in a narrow range, from \$7,937 to \$8,120. The naive raw gap of \$19,557 overstated the causal effect by 2.4×.&lt;/li>
&lt;li>&lt;strong>But the average hides substantial heterogeneity.&lt;/strong> Both &lt;code>estat heterogeneity&lt;/code> tests reject a constant CATE at the 5% level; the GATE joint test rejects equality across income groups at p = 0.001; and the GATES quartile ladder spans \$2,919 (bottom quartile, not significant) to \$17,279 (top quartile) — a factor of 5.9.&lt;/li>
&lt;li>&lt;strong>Income is the dominant moderator.&lt;/strong> The linear projection coefficient on the highest income category is \$18,195 (p = 0.001). The smooth B-spline says each extra \$1,000 of income raises the effect by \$213 on average. The classification analysis says households in the top-effect quartile earn \$35,878 more on average than households in the bottom-effect quartile.&lt;/li>
&lt;li>&lt;strong>About a quarter of households gain little or nothing.&lt;/strong> For the bottom GATES quartile we cannot reject a zero effect (p = 0.167), and a small left tail in both IATE histograms shows households with predicted effects close to zero or even slightly negative.&lt;/li>
&lt;li>&lt;strong>Age and homeownership matter at the margin.&lt;/strong> Older workers and homeowners gain more, but the effects are smaller and more uncertain than the income effect. Education and marital status are essentially flat once income is controlled for.&lt;/li>
&lt;/ul>
&lt;p>The &amp;ldquo;so what?&amp;rdquo; for policy: a 401(k) eligibility expansion targeted at low-income workers will have a much smaller per-capita asset effect than one targeted at high-income workers. Even so, the lowest-income households still gain a real, statistically significant \$4,000 on average, suggesting the program is not pointless for them. An evaluation that reports only the average effect would systematically understate the gains to high earners and overstate the gains to households in the second income decile.&lt;/p>
&lt;hr>
&lt;h2 id="13-summary-and-next-steps">13. Summary and next steps&lt;/h2>
&lt;p>&lt;strong>Method takeaways.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Stata 19&amp;rsquo;s &lt;code>cate&lt;/code> command unifies cross-fit ML nuisance estimation, doubly robust scores, causal-forest IATE estimation, and honest-tree inference into a single workflow. Two estimators (PO and AIPW) and seven postestimation views cover almost all practical heterogeneity questions.&lt;/li>
&lt;li>&lt;strong>PO&lt;/strong> (partialing-out) is more robust to extreme propensity scores; &lt;strong>AIPW&lt;/strong> is more efficient when both nuisance models are well-specified. They agree on the ATE in this dataset (\$7,937 vs \$8,120, a difference of 2.3%), which is a reassuring robustness check, although agreement between two estimators that rely on the same unconfoundedness assumption cannot validate that assumption.&lt;/li>
&lt;li>The four heterogeneity views — &lt;code>estat heterogeneity&lt;/code>, &lt;code>estat projection&lt;/code>, GATE/GATES, and &lt;code>estat series&lt;/code> — answer different questions. A beginner should look at all of them rather than picking a favorite.&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Data takeaways.&lt;/strong>&lt;/p>
&lt;ul>
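&lt;li>The 401(k) eligibility ATE on the assets3 sample is \$8,019, with a standard error of about \$1,150 (95% CI \$5,762 to \$10,277, from the parametric &lt;code>teffects aipw&lt;/code> benchmark).&lt;/li>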
&lt;li>The CATE varies from \$1,399 (income category 1) to \$20,511 (highest income category) — a 15× spread.&lt;/li>
&lt;li>One in four households shows essentially no effect; one in four shows over twice the average.&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Limitations.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>The CATE is identified under unconfoundedness (no unmeasured confounders) given the rich set of demographic covariates. If, for instance, employer match rates differ systematically across the income distribution and we don&amp;rsquo;t observe match rates, that would bias the income gradient.&lt;/li>
&lt;li>The bootstrap-of-little-bags inference behind the IATE confidence bands assumes honest random forests. With the default &lt;code>xfolds(10)&lt;/code> and the default forest settings, runtime is ≈9 minutes on Stata SE 19; StataNow MP cuts this by roughly 3×.&lt;/li>
&lt;li>We did not formally check propensity overlap in this post. As a follow-up, run &lt;code>teffects overlap&lt;/code> after the parametric AIPW or check &lt;code>estat osample&lt;/code> after the &lt;code>cate&lt;/code> command.&lt;/li>
&lt;/ul>
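&lt;p>As a starting point for the overlap check mentioned in the last limitation, a simple parametric diagnostic can be sketched as follows. It assumes &lt;code>e401k&lt;/code> is coded 0/1 and that the global &lt;code>$catecovars&lt;/code> holds the covariate list. The variable name &lt;code>phat&lt;/code> and the 5% cutoffs are illustrative choices, and a logit propensity model is only a rough stand-in for the lasso-based one that &lt;code>cate&lt;/code> fits.&lt;/p>
&lt;pre>&lt;code class="language-stata">* Parametric propensity model as a quick overlap diagnostic.
* Assumptions: e401k is coded 0/1 and $catecovars holds the covariate list.
quietly logit e401k $catecovars
predict double phat, pr
summarize phat, detail

* Count households with propensities near the boundaries (5% cutoffs are arbitrary)
count if (phat &amp;lt; 0.05 | phat &amp;gt; 0.95) &amp;amp; !missing(phat)

* Density of the estimated propensities by eligibility status
twoway (kdensity phat if e401k == 1) (kdensity phat if e401k == 0), ///
    legend(order(1 &amp;quot;Eligible&amp;quot; 2 &amp;quot;Not eligible&amp;quot;)) ///
    xtitle(&amp;quot;Estimated propensity score&amp;quot;)
&lt;/code>&lt;/pre>
&lt;p>If the two densities cover a similar range and few households fall near 0 or 1, the inverse-propensity weights in the AIPW score are unlikely to be dominated by a handful of observations.&lt;/p>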
&lt;p>&lt;strong>Next steps.&lt;/strong> Try &lt;code>cate aipw ..., omethod(rforest) tmethod(rforest) oob&lt;/code> for a fully nonparametric specification with out-of-bag inference (faster and more flexible than the lasso default). Or move to the &lt;code>lung&lt;/code> dataset shipped with Stata 19 and explore &lt;code>estat policyeval&lt;/code> to compare expected outcomes under hypothetical assignment policies (e.g., &amp;ldquo;treat only households with predicted positive effect&amp;rdquo;).&lt;/p>
&lt;hr>
&lt;h2 id="14-exercises">14. Exercises&lt;/h2>
&lt;ol>
&lt;li>&lt;strong>Compare specifications.&lt;/strong> Re-run &lt;code>cate po&lt;/code> with &lt;code>omethod(rforest) tmethod(rforest)&lt;/code> (random-forest nuisance instead of lasso). How much do the GATE-by-income estimates change? Use &lt;code>oob&lt;/code> to speed up the run.&lt;/li>
&lt;li>&lt;strong>Build a custom group.&lt;/strong> Create a &amp;ldquo;high-effect candidate&amp;rdquo; indicator that is 1 if &lt;code>age &amp;gt; 40 &amp;amp; income &amp;gt; 50000 &amp;amp; ownhome == 1&lt;/code>, 0 otherwise. Run &lt;code>cate, group(high_eff_candidate) reestimate&lt;/code> and compare the two GATEs to the GATES top vs bottom quartile in this post.&lt;/li>
&lt;li>&lt;strong>Explore another moderator.&lt;/strong> Use &lt;code>categraph iateplot&lt;/code> to plot the predicted CATE against &lt;code>pension&lt;/code>, &lt;code>married&lt;/code>, &lt;code>twoearn&lt;/code>, and &lt;code>ira&lt;/code>. Which one shows the biggest difference between its categories?&lt;/li>
&lt;/ol>
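&lt;p>For exercise 2, the group indicator can be built along these lines. This is a sketch that assumes &lt;code>ownhome&lt;/code> is coded 0/1, as the condition in the exercise implies. Stata treats missing numeric values as larger than any number, so the &lt;code>if !missing()&lt;/code> guard keeps households with missing inputs out of the high-effect group.&lt;/p>
&lt;pre>&lt;code class="language-stata">* Indicator for the &amp;quot;high-effect candidate&amp;quot; group in exercise 2;
* left missing when any of the three inputs is missing
generate byte high_eff_candidate = ///
    (age &amp;gt; 40 &amp;amp; income &amp;gt; 50000 &amp;amp; ownhome == 1) ///
    if !missing(age, income, ownhome)
label variable high_eff_candidate &amp;quot;Age &amp;gt; 40, income &amp;gt; 50,000, homeowner&amp;quot;

* Check the group sizes before estimating group effects
tabulate high_eff_candidate, missing
&lt;/code>&lt;/pre>
&lt;p>With the indicator in place, the &lt;code>cate, group(high_eff_candidate) reestimate&lt;/code> call quoted in the exercise can be run on the stored results.&lt;/p>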
&lt;hr>
&lt;h2 id="15-references">15. References&lt;/h2>
&lt;ol>
&lt;li>&lt;a href="https://doi.org/10.1214/18-AOS1709" target="_blank" rel="noopener">Athey, S., Tibshirani, J., &amp;amp; Wager, S. (2019). Generalized Random Forests. &lt;em>Annals of Statistics&lt;/em>, 47(2), 1148–1178.&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://doi.org/10.1111/ectj.12097" target="_blank" rel="noopener">Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., &amp;amp; Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. &lt;em>The Econometrics Journal&lt;/em>, 21(1), C1–C68.&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://doi.org/10.1162/0034653041811734" target="_blank" rel="noopener">Chernozhukov, V., &amp;amp; Hansen, C. (2004). The effects of 401(k) participation on the wealth distribution: an instrumental quantile regression analysis. &lt;em>Review of Economics and Statistics&lt;/em>, 86(3), 735–751.&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://doi.org/10.1093/ectj/utac015" target="_blank" rel="noopener">Knaus, M. C. (2022). Double machine learning-based programme evaluation under unconfoundedness. &lt;em>The Econometrics Journal&lt;/em>, 25(3), 602–627.&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://doi.org/10.2307/1912705" target="_blank" rel="noopener">Robinson, P. M. (1988). Root-N-consistent semiparametric regression. &lt;em>Econometrica&lt;/em>, 56(4), 931–954.&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.stata.com/manuals/causal.pdf" target="_blank" rel="noopener">StataCorp. (2025). &lt;em>Stata 19 Causal Inference and Treatment-Effects Reference Manual: cate&lt;/em>.&lt;/a>&lt;/li>
&lt;/ol></description></item></channel></rss>