Interactive data dictionary

Spatial Dynamic Panels with Common Factors: Credit Risk in US Banking

The Kripfganz–Sarafidis (2025) replication panel — 350 US commercial banks, quarterly 2006–2014.

350 banks · 11 variables · 2006–2014, quarterly · 12,600 obs

Downloads

Each dataset is available as a labeled Stata .dta and its source file.

Download all data (ZIP) · stata_codebook.do

| Dataset | Grain | Rows × columns | Stata | Source |
| --- | --- | --- | --- | --- |
| v113i06 | bank-quarter | 12,600 × 11 | v113i06.dta | v113i06.dta |

Run stata_codebook.do in Stata once to attach long-form per-variable notes to the .dta files.

Load directly in code

Every file loads straight from GitHub (raw URLs). Swap the file name to load any dataset.

Stata

* Stata 14+ : `use` reads an https URL directly
global BASE "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_spxtivdfreg/data/"
use "${BASE}v113i06.dta", clear
describe
notes

Python

# Notebook-only shell escape (Colab/Jupyter); in a terminal run: pip install pyreadstat
!pip install -q pyreadstat
import pandas as pd
BASE = "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_spxtivdfreg/data/"
df = pd.read_stata(BASE + "v113i06.dta")

# load every dataset at once
files = ["v113i06"]
data = {f: pd.read_stata(BASE + f + ".dta") for f in files}

# pyreadstat (richest metadata) reads LOCAL files -> download first
import pyreadstat, urllib.request
urllib.request.urlretrieve(BASE + "v113i06.dta", "v113i06.dta")
df, meta = pyreadstat.read_dta("v113i06.dta")

Copy and paste this snippet into a Google Colab notebook: https://colab.research.google.com/notebooks/empty.ipynb

R

# R : haven::read_dta auto-downloads an https URL
library(haven)
BASE <- "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/stata_spxtivdfreg/data/"
df <- read_dta(paste0(BASE, "v113i06.dta"))

Overview & sources

Companion data for a Stata tutorial that replicates the empirical application of Kripfganz & Sarafidis (2025, Journal of Statistical Software 113(6)) with the spxtivdfreg package — a defactored instrumental-variables estimator for spatial dynamic panels with unobserved common factors. The panel models the non-performing-loan (NPL) ratio of 350 US commercial banks observed quarterly over 2006:Q1–2014:Q4 (36 quarters; 12,600 observations, 12,250 in the effective estimation sample), spanning the entire Global Financial Crisis. Credit risk is modelled through four simultaneous endogeneity channels — a spatial lag of NPL (ψ), temporal persistence (ρ), an endogenous regressor (INEFF, instrumented by INTEREST), and latent common factors — using a 350×350 economic-distance spatial weight matrix built from Spearman rank correlations of bank debt ratios. This page documents the bank-quarter panel (v113i06.dta); the weight matrix W.csv is a separate file (see note below).

One file documented here. v113i06 is a strongly balanced bank-quarter panel — one row per bank × quarter, 350 banks × 36 quarters = 12,600 rows. ID is the bank identifier and TIME the quarterly index (1–36, mapping 2006:Q1–2014:Q4); xtset ID TIME declares the panel. The companion spatial weight matrix W.csv — a 350×350, row-standardized, economic-distance matrix (6,300 nonzero entries, ~18 neighbours per bank) — is a bare numeric matrix with no variable columns and is therefore not a documented variable table here; spxtivdfreg loads it via spmatrix("W.csv", import).
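
The properties stated for W.csv (row-standardized, about 18 nonzero neighbours per bank) and the spatial lag it produces can be sketched in plain Python. This is a minimal illustration on a made-up 3×3 matrix, not the real 350×350 file; the function names and the toy values are assumptions introduced here.

```python
# Toy 3x3 weight matrix standing in for the 350x350 W.csv (values are made up).
W = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]


def is_row_standardized(w, tol=1e-9):
    """True if every row of w sums to 1 (within tol)."""
    return all(abs(sum(row) - 1.0) <= tol for row in w)


def mean_neighbours(w):
    """Average number of nonzero entries per row."""
    nonzero = sum(1 for row in w for x in row if x != 0)
    return nonzero / len(w)


def spatial_lag(w, y):
    """Element i of the result is sum_j w[i][j] * y[j]."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


npl = [1.0, 3.0, 5.0]  # made-up NPL ratios for three banks in one quarter

print(is_row_standardized(W))   # True
print(spatial_lag(W, npl))      # [4.0, 1.0, 2.0]
print(6300 / 350)               # 18.0 neighbours per bank, as stated above
```

With a row-standardized matrix, each bank's spatial lag is a weighted average of its neighbours' NPL ratios in the same quarter, which is the quantity that ψ multiplies.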

Data sources

| Source | Provides | Reference / URL |
| --- | --- | --- |
| Kripfganz & Sarafidis (2025) | Replicated study; the v113i06.dta bank-quarter panel and the W.csv weight matrix (JSS replication package) | Kripfganz, S., & Sarafidis, V. (2025). Estimating spatial dynamic panel data models with unobserved common factors in Stata. Journal of Statistical Software, 113(6). https://doi.org/10.18637/jss.v113.i06 |
| Method references | Estimator and concepts (defactored IV, common factors, IV panels) | Kripfganz & Sarafidis (2021), Stata Journal 21(3); Sarafidis & Wansbeek (2012), Econometric Reviews 31(5); Pesaran (2006), Econometrica 74(4). |
| Spatial-panel context | Comparator packages and spatial-panel framework | Belotti, Hughes & Mortari (2017), Stata Journal 17(1); Elhorst (2014), Spatial Econometrics, Springer. |

Cite this data

Please cite this dataset as follows.

APA

Mendez, C. (2026). Spatial Dynamic Panels with Common Factors in Stata: Credit Risk in US Banking [Data set]. https://carlos-mendez.org/post/stata_spxtivdfreg/

Kripfganz, S., & Sarafidis, V. (2025). Estimating spatial dynamic panel data models with unobserved common factors in Stata. Journal of Statistical Software, 113(6). https://doi.org/10.18637/jss.v113.i06

BibTeX

@misc{mendez2026stataspxtivdfreg,
  author       = {Mendez, Carlos},
  title        = {Spatial Dynamic Panels with Common Factors in Stata: Credit Risk in US Banking},
  year         = {2026},
  howpublished = {\url{https://carlos-mendez.org/post/stata_spxtivdfreg/}},
  note         = {Data set}
}

@article{kripfganz2025spatial,
  author  = {Kripfganz, Sebastian and Sarafidis, Vasilis},
  title   = {Estimating Spatial Dynamic Panel Data Models with Unobserved Common Factors in {Stata}},
  journal = {Journal of Statistical Software},
  volume  = {113}, number = {6}, year = {2025},
  doi     = {10.18637/jss.v113.i06}
}

Variable explorer: all 11 variables


| Variable | Type | Distribution | Label | Definition | Units | In files | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BUFFER | continuous | min -6.49, median 1.81, max 48 | Capital buffer (leverage ratio minus 8%) | Capital buffer above the regulatory minimum (leverage ratio minus the 8% threshold). | percentage points | v113i06 | Kripfganz & Sarafidis (2025) |
| CAR | continuous | min 2.56, median 14.5, max 193 | Capital adequacy ratio (%) | Regulatory capital adequacy ratio of the bank. | % | v113i06 | Kripfganz & Sarafidis (2025) |
| ID | identifier | | Bank identifier | Anonymized US commercial-bank identifier (1-350); the panel cross-section unit. | 1-350 | v113i06 | Kripfganz & Sarafidis (2025) |
| INEFF | continuous | min 0.0438, median 0.266, max 0.946 | Operational inefficiency (endogenous) | Bank operational inefficiency; treated as the endogenous regressor in the NPL equation. | ratio | v113i06 | Kripfganz & Sarafidis (2025) |
| INTEREST | continuous | min -5.16, median -1.96, max 2.52 | Interest expenses / deposits (instrument) | Interest expenses relative to deposits; the excluded instrument for the endogenous INEFF. | ratio | v113i06 | Kripfganz & Sarafidis (2025) |
| LIQUIDITY | continuous | min 0.0122, median 0.78, max 2.32 | Loan-to-deposit ratio | Loans relative to deposits; the covariate with the largest effect on NPL in the full model. | ratio | v113i06 | Kripfganz & Sarafidis (2025) |
| NPL | continuous | min 0, median 1.1, max 23 | Non-performing loan ratio (%) | Non-performing loans as a share of total loans, in percentage points; the dependent variable (credit risk). | % (percentage points) | v113i06 | Kripfganz & Sarafidis (2025) |
| PROFIT | continuous | min -190, median 8.54, max 217 | Profitability (return on equity, %) | Bank profitability, annualized return on equity. | % (annualized ROE) | v113i06 | Kripfganz & Sarafidis (2025) |
| QUALITY | continuous | min -4.95, median 0.126, max 27.9 | Loan quality (loan-loss provisions / assets, %) | Loan loss provisions as a share of assets; a flow indicator of asset quality. | % | v113i06 | Kripfganz & Sarafidis (2025) |
| SIZE | continuous | min 9.19, median 11.8, max 19.3 | Bank size, ln(total assets) | Natural log of total assets; a proxy for bank scale and systemic exposure. | log (ln assets) | v113i06 | Kripfganz & Sarafidis (2025) |
| TIME | identifier | | Quarterly time index | Quarter counter 1-36, mapping 2006:Q1 (=1) to 2014:Q4 (=36). | 1-36 (quarters) | v113i06 | Kripfganz & Sarafidis (2025) |
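
The TIME definition (1 maps to 2006:Q1, 36 maps to 2014:Q4) can be turned into calendar labels with a little integer arithmetic. The helper below is a minimal sketch; the function name `time_to_quarter` and the `YYYY:Qq` label format are choices made here for illustration.

```python
def time_to_quarter(t, start_year=2006):
    """Map the TIME index (1-36) to a 'YYYY:Qq' label, with TIME=1 as 2006:Q1."""
    if not 1 <= t <= 36:
        raise ValueError("TIME must be between 1 and 36")
    year = start_year + (t - 1) // 4   # four quarters per year
    quarter = (t - 1) % 4 + 1          # 1..4 within the year
    return f"{year}:Q{quarter}"


print(time_to_quarter(1))    # 2006:Q1
print(time_to_quarter(36))   # 2014:Q4
```

In a pandas workflow the same function could be applied to the TIME column, for example `df["TIME"].map(time_to_quarter)`.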

Cross-file variable index

Which file each variable appears in (● = present).

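| Variable | v113i06 |
| --- | --- |
| BUFFER | ● |
| CAR | ● |
| ID | ● |
| INEFF | ● |
| INTEREST | ● |
| LIQUIDITY | ● |
| NPL | ● |
| PROFIT | ● |
| QUALITY | ● |
| SIZE | ● |
| TIME | ● |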

Construction & formulas

The model is a spatial dynamic panel with interactive (factor) fixed effects, estimated by defactored IV and specified for bank i in quarter t.

The covariates x_it are the bank financial ratios INEFF, CAR, SIZE, BUFFER, PROFIT, QUALITY, and LIQUIDITY; INTEREST serves only as an excluded instrument for the endogenous INEFF.
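
A sketch of the model equation, consistent with the symbols used on this page (ψ for the spatial lag of NPL, ρ for temporal persistence, x_it for the covariates, and latent common factors), could be written as follows. The symbols β, λ_i, f_t, ε_it, and N = 350 are notation assumed here, and the published specification may contain further terms that this sketch omits.

```latex
\mathrm{NPL}_{it}
  = \psi \sum_{j=1}^{N} w_{ij}\,\mathrm{NPL}_{jt}
  + \rho\,\mathrm{NPL}_{i,t-1}
  + \mathbf{x}_{it}'\boldsymbol{\beta}
  + \boldsymbol{\lambda}_i'\mathbf{f}_t
  + \varepsilon_{it}
```

Here w_ij is the (i, j) entry of the row-standardized weight matrix W, and λ_i' f_t is the interactive (factor) fixed-effects term.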

The datasets

Each dataset is shown with its full variable dictionary and a statistics table with distribution summaries and data coverage.


bank-quarter · 12,600 × 11 · 2006:Q1-2014:Q4 (TIME 1-36) · 350 US commercial banks (strongly balanced)

Panel key: ID × TIME · Use: estimate the spatial dynamic panel model of credit risk with common factors (spxtivdfreg).
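
The panel key and the "strongly balanced" claim can be checked mechanically: every (ID, TIME) pair should appear exactly once, for all 350 × 36 combinations. The sketch below runs on synthetic keys with the same shape as the real panel; `is_strongly_balanced` is a name introduced here. The last line reproduces the 12,250 effective-sample count under the assumption that each bank loses its first quarter to the lagged NPL term, which the page does not state explicitly.

```python
from collections import Counter
from itertools import product


def is_strongly_balanced(keys, n_units, n_periods):
    """keys: iterable of (ID, TIME) pairs, one per row of the panel."""
    counts = Counter(keys)
    no_duplicates = all(c == 1 for c in counts.values())
    expected = set(product(range(1, n_units + 1), range(1, n_periods + 1)))
    return no_duplicates and set(counts) == expected


# Synthetic keys shaped like the real panel: 350 banks x 36 quarters.
keys = list(product(range(1, 351), range(1, 37)))

print(len(keys))                                   # 12600
print(is_strongly_balanced(keys, 350, 36))         # True
print(is_strongly_balanced(keys[:-1], 350, 36))    # False (one row missing)
print(350 * (36 - 1))                              # 12250 (assumed: one lag per bank)
```

With the real data loaded in pandas, the keys could be supplied as `zip(df["ID"], df["TIME"])`.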

Variable dictionary

| Variable | Label | Definition | Construction | Units | Source | Coverage |
| --- | --- | --- | --- | --- | --- | --- |
| ID (identifier) | Bank identifier | Anonymized US commercial-bank identifier (1-350); the panel cross-section unit. | Integer bank code from the replication package; declared with xtset ID TIME. | 1-350 | Kripfganz & Sarafidis (2025) | 350 banks |
| TIME (identifier) | Quarterly time index | Quarter counter 1-36, mapping 2006:Q1 (=1) to 2014:Q4 (=36). | Sequential quarter index from the replication package; the panel time variable. | 1-36 (quarters) | Kripfganz & Sarafidis (2025) | 36 quarters |
| NPL (continuous) | Non-performing loan ratio (%) | Non-performing loans as a share of total loans, in percentage points; the dependent variable (credit risk). | Bank-quarter NPL/total-loans from the replication package; modelled with spatial and temporal lags. | % (percentage points) | Kripfganz & Sarafidis (2025) | bank-quarter |
| INEFF (continuous) | Operational inefficiency (endogenous) | Bank operational inefficiency; treated as the endogenous regressor in the NPL equation. | Bank-quarter inefficiency measure; instrumented by INTEREST and lagged exogenous regressors. | ratio | Kripfganz & Sarafidis (2025) | bank-quarter |
| CAR (continuous) | Capital adequacy ratio (%) | Regulatory capital adequacy ratio of the bank. | Bank-quarter CAR from the replication package; an exogenous covariate. | % | Kripfganz & Sarafidis (2025) | bank-quarter |
| SIZE (continuous) | Bank size, ln(total assets) | Natural log of total assets; a proxy for bank scale and systemic exposure. | Log of bank total assets, bank-quarter; an exogenous covariate. | log (ln assets) | Kripfganz & Sarafidis (2025) | bank-quarter |
| BUFFER (continuous) | Capital buffer (leverage ratio minus 8%) | Capital buffer above the regulatory minimum (leverage ratio minus the 8% threshold). | Leverage ratio minus 8, bank-quarter; an exogenous covariate (protective: enters NPL negatively). | percentage points | Kripfganz & Sarafidis (2025) | bank-quarter |
| PROFIT (continuous) | Profitability (return on equity, %) | Bank profitability, annualized return on equity. | Bank-quarter ROE from the replication package; an exogenous covariate. | % (annualized ROE) | Kripfganz & Sarafidis (2025) | bank-quarter |
| QUALITY (continuous) | Loan quality (loan-loss provisions / assets, %) | Loan loss provisions as a share of assets; a flow indicator of asset quality. | Bank-quarter provisions/assets from the replication package; an exogenous covariate. | % | Kripfganz & Sarafidis (2025) | bank-quarter |
| LIQUIDITY (continuous) | Loan-to-deposit ratio | Loans relative to deposits; the covariate with the largest effect on NPL in the full model. | Bank-quarter loan-to-deposit ratio from the replication package; an exogenous covariate. | ratio | Kripfganz & Sarafidis (2025) | bank-quarter |
| INTEREST (continuous) | Interest expenses / deposits (instrument) | Interest expenses relative to deposits; the excluded instrument for the endogenous INEFF. | Bank-quarter interest-expense/deposits from the replication package; enters only the iv() instrument set. | ratio | Kripfganz & Sarafidis (2025) | bank-quarter |
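
Two of the constructions in the dictionary are explicit formulas: SIZE is the natural log of total assets, and BUFFER is the leverage ratio minus 8. A minimal sketch of both follows. The raw inputs (`total_assets`, `leverage_ratio_pct`) are hypothetical, since the panel ships only the constructed variables, and the units of total assets are not stated on this page.

```python
import math


def bank_size(total_assets):
    """SIZE: natural log of total assets (hypothetical raw input)."""
    return math.log(total_assets)


def capital_buffer(leverage_ratio_pct, threshold_pct=8.0):
    """BUFFER: leverage ratio minus the 8% threshold, in percentage points."""
    return leverage_ratio_pct - threshold_pct


# A leverage ratio of 9.81% gives the median BUFFER reported in the table (1.81).
print(round(capital_buffer(9.81), 2))   # 1.81
# A negative buffer means the leverage ratio sits below the 8% threshold.
print(capital_buffer(6.0))              # -2.0
```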

Distribution & statistics

| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ID | | 100% | 12,600 | 350 | | | | | |
| TIME | | 100% | 12,600 | 36 | | | | | |
| NPL | min 0, median 1.1, max 23 | 100% | 12,600 | 11,742 | 0 | 1.73 | 1.10 | 23.04 | 2.11 |
| INEFF | min 0.0438, median 0.266, max 0.946 | 100% | 12,600 | 12,593 | 0.044 | 0.289 | 0.266 | 0.946 | 0.120 |
| CAR | min 2.56, median 14.5, max 193 | 100% | 12,600 | 12,546 | 2.56 | 17.68 | 14.52 | 193.0 | 10.31 |
| SIZE | min 9.19, median 11.8, max 19.3 | 100% | 12,600 | 12,327 | 9.19 | 11.98 | 11.85 | 19.25 | 1.26 |
| BUFFER | min -6.49, median 1.81, max 48 | 100% | 12,600 | 12,595 | -6.49 | 2.86 | 1.81 | 48.00 | 3.82 |
| PROFIT | min -190, median 8.54, max 217 | 100% | 12,600 | 12,582 | -189.7 | 8.59 | 8.54 | 217.4 | 10.38 |
| QUALITY | min -4.95, median 0.126, max 27.9 | 100% | 12,600 | 8,517 | -4.95 | 0.283 | 0.126 | 27.87 | 0.625 |
| LIQUIDITY | min 0.0122, median 0.78, max 2.32 | 100% | 12,600 | 12,586 | 0.012 | 0.770 | 0.780 | 2.32 | 0.222 |
| INTEREST | min -5.16, median -1.96, max 2.52 | 100% | 12,600 | 12,065 | -5.16 | -1.91 | -1.96 | 2.52 | 0.933 |
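
The columns of the statistics table can be recomputed for any variable with the standard library. The sketch below is one way to do it, under two assumptions that the page does not confirm: Coverage is the share of non-missing values, and SD is the sample standard deviation (n - 1 denominator). The function name `summarize` and the toy values are illustrative only.

```python
import statistics


def summarize(values):
    """Return the table's statistics for one variable; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "coverage": len(present) / len(values),   # share of non-missing rows
        "n": len(present),
        "distinct": len(set(present)),
        "min": min(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "max": max(present),
        "sd": statistics.stdev(present),          # sample SD (assumed convention)
    }


toy = [0.0, 1.0, 1.0, 2.0, 6.0]   # made-up values, not taken from the panel
stats = summarize(toy)
print(stats["n"], stats["distinct"], stats["mean"], stats["median"])   # 5 4 2.0 1.0
```

For a pandas column, `summarize(df["NPL"].tolist())` would produce the same fields, provided missing values are first converted to None.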

Known limitations & caveats