Reference
This page documents every public function and method exposed by the
csdid package along with the meaning of every argument. Use it to
look up where an option goes (constructor, fit, aggte, etc.) and
what the default behavior is.
The package is organized around the ATTgt class. A typical workflow
is:
1. Build an estimator object with csdid.att_gt.ATTgt: pass in the data and identifier columns.
2. Call csdid.att_gt.ATTgt.fit(): choose the 2x2 DiD estimator and the base period.
3. Inspect the disaggregated group-time effects with csdid.att_gt.ATTgt.summ_attgt() or plot them with csdid.att_gt.ATTgt.plot_attgt().
4. Aggregate to a smaller number of parameters with csdid.att_gt.ATTgt.aggte() and plot with csdid.att_gt.ATTgt.plot_aggte().
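The four steps above can be sketched end to end as follows. This is a minimal sketch, not an official example: the toy data, the column names (id, period, G, Y) and the helper names make_toy_panel and run_workflow are hypothetical, while the calls and argument names are the ones documented on this page. The csdid and pandas imports sit inside run_workflow so that the block can be loaded even where the package is not installed.

```python
import random


def make_toy_panel(n_units=60, periods=(1, 2, 3, 4), seed=0):
    """Build a hypothetical long-format panel as a list of row dicts.

    Columns: id, period, G (first treated period, 0 = never treated), Y.
    """
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        g = rng.choice([0, 3, 4])  # 0 marks never-treated units
        unit_effect = rng.gauss(0, 1)
        for t in periods:
            treated = g != 0 and t >= g
            y = unit_effect + 0.5 * t + (1.0 if treated else 0.0) + rng.gauss(0, 1)
            rows.append({"id": unit, "period": t, "G": g, "Y": y})
    return rows


def run_workflow(rows):
    """Run the documented workflow on the toy rows (requires csdid and pandas)."""
    import pandas as pd
    from csdid.att_gt import ATTgt

    data = pd.DataFrame(rows)

    # Step 1: build the estimator from the data and identifier columns.
    est = ATTgt(yname="Y", tname="period", idname="id", gname="G",
                data=data, control_group="nevertreated")

    # Step 2: choose the 2x2 DiD estimator and the base period.
    est.fit(est_method="dr", base_period="varying")

    # Step 3: build the ATT(g,t) table; it is stored on est.summary2.
    est.summ_attgt(n=4)

    # Step 4: aggregate by length of exposure and plot the result.
    est.aggte(typec="dynamic")
    est.plot_aggte()
    return est
```

With the package installed, `est = run_workflow(make_toy_panel())` runs all four steps on the toy panel.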
csdid.att_gt.ATTgt
- class ATTgt(yname, tname, idname, gname, data, control_group=['nevertreated', 'notyettreated'], xformla=None, panel=True, allow_unbalanced_panel=True, clustervar=None, weights_name=None, anticipation=0, cband=False, biters=1000, alp=0.05)
Group-time average treatment effects estimator. Sets up the design from the long-format panel (or repeated cross-section), runs the pre-processing step, and stores everything needed for fit.
- Parameters:
yname (str) – Name of the outcome column in data.
tname (str) – Name of the time column in data.
idname (str) – Name of the unit identifier column. Required for panel data; ignored when panel=False.
gname (str) – Name of the group column. This is the period in which a unit is first treated. Never-treated units should have gname = 0.
data (pandas.DataFrame) – The long-format dataset.
control_group (str or list of str) – Which units to use as controls. Either "nevertreated" (default) or "notyettreated". The not-yet-treated group is at least as large as the never-treated group and changes across time. A list ['nevertreated', 'notyettreated'] resolves to the first entry.
xformla (str) – Patsy-style formula for the covariates, e.g. "lemp ~ lpop". The left-hand side is ignored; only the right-hand side is used to build the design matrix. None (or "y~1") requests unconditional parallel trends.
panel (bool) – True (default) treats the data as a panel. False requests repeated cross-section mode (idname is then ignored).
allow_unbalanced_panel (bool) – When True (default), the package does not drop units with missing observations in some periods. Set to False to force a balanced panel.
clustervar (str) – Name of an extra clustering variable. Up to two clustering variables are supported and one of them must be idname.
weights_name (str) – Name of a column with sampling weights. If None, equal weights are used.
anticipation (int) – Number of pre-treatment periods in which units are allowed to anticipate the treatment. Default 0.
cband (bool) – When True, store enough information for uniform confidence bands. Default False.
biters (int) – Number of multiplier-bootstrap iterations used for standard errors. Default 1000.
alp (float) – Significance level for confidence bands. Default 0.05.
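The gname coding described above (first treated period, with 0 for never-treated units) can be derived from a per-period treatment indicator. The sketch below is illustrative only: the helper first_treated_period and the toy units are hypothetical and are not part of csdid.

```python
def first_treated_period(treatment_by_period):
    """Map {period: treated flag} for one unit to its gname value."""
    treated = [period for period, flag in treatment_by_period.items() if flag]
    # Never-treated units are coded 0, as the gname description requires.
    return min(treated) if treated else 0


# Hypothetical treatment paths for three units.
units = {
    "a": {2003: 0, 2004: 1, 2005: 1},  # first treated in 2004
    "b": {2003: 0, 2004: 0, 2005: 1},  # first treated in 2005
    "c": {2003: 0, 2004: 0, 2005: 0},  # never treated
}
gname_values = {unit: first_treated_period(path) for unit, path in units.items()}
```

Each unit's value would then be repeated on every row of that unit in the long-format data.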
- fit(est_method='dr', base_period='varying', bstrap=True)
Estimate group-time average treatment effects.
- Parameters:
est_method (str) – The 2x2 DiD estimator used for each (g, t) cell. One of:
  "dr" (default): Sant’Anna and Zhao doubly-robust estimator.
  "ipw": inverse-probability-weighted estimator.
  "reg": outcome-regression estimator.
base_period (str) – Reference period used to construct each \(ATT(g,t)\).
  "varying" (default): the reference period changes with t. For pre-treatment periods it is t - 1; for post-treatment periods it is the last period before treatment for group g.
  "universal": a single, fixed reference period is used for each group (the last pre-treatment period, after anticipation). The entry at the base period itself is normalized to 0. With this option the result table contains one extra row per group.
bstrap (bool) – When True (default), standard errors and critical values come from the multiplier bootstrap. When False, asymptotic standard errors and a normal critical value (1.96) are used.
- Returns:
self (the call is chainable). After fitting, self.results, self.MP, and self.did_object are populated.
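The base_period rule can be made concrete with a small sketch. It assumes consecutive integer periods and anticipation=0, and the function reference_period is a hypothetical illustration of the rule as described here, not the package's internal code.

```python
def reference_period(g, t, base_period="varying"):
    """Reference period for the (g, t) cell.

    Assumes consecutive integer periods and no anticipation.
    """
    last_pre_treatment = g - 1
    if base_period == "universal":
        # One fixed reference period per group.
        return last_pre_treatment
    if base_period == "varying":
        # Pre-treatment cells compare t with t - 1; post-treatment cells
        # compare t with the last period before treatment.
        return t - 1 if t < g else last_pre_treatment
    raise ValueError(f"unknown base_period: {base_period!r}")
```

For a cohort first treated in 2006, the 2004 cell uses 2003 as its reference under "varying" and 2005 under "universal"; every post-treatment cell uses 2005 under both options.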
- summ_attgt(n=4)
Build a tidy DataFrame of the ATT(g,t) table.
- Parameters:
n (int) – Number of decimals to round to. Default 4.
- Returns:
self with the table available as self.summary2.
- aggte(typec='group', balance_e=None, min_e=-inf, max_e=inf, na_rm=False, bstrap=None, biters=None, cband=None, alp=None, clustervars=None)
Aggregate the group-time effects into a smaller number of parameters. Returns self; the aggregated object is stored on self.atte.
- Parameters:
typec (str) – Type of aggregation:
  "simple": weighted average of all post-treatment ATT(g,t) (weights proportional to group size).
  "dynamic": average effects by length of exposure (event-study).
  "group" (default): average effect per treatment cohort.
  "calendar": average effect per calendar period.
balance_e (int) – Only used when typec='dynamic'. If set, drops cohorts that are not observed for at least balance_e + 1 post-treatment periods, so the composition is constant in event time. Default None (no balancing).
min_e (float) – Smallest event time to include in the dynamic aggregation. Default -inf.
max_e (float) – Largest event time to include. Default inf.
na_rm (bool) – When True, drop missing aggregated estimates. Default False.
bstrap (bool) – Override the bstrap setting stored on the fitted object. None (default) inherits from fit.
biters (int) – Override the number of bootstrap iterations.
cband (bool) – Compute uniform confidence bands (requires bstrap=True). None inherits.
alp (float) – Override the significance level.
clustervars (list) – Override the clustering variables.
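The effect of balance_e, min_e and max_e on the dynamic aggregation can be illustrated by listing which (g, t) cells survive the filters. This is a sketch of the rules as worded above, assuming consecutive integer periods; the function dynamic_cells is hypothetical, and the package may apply further restrictions internally.

```python
def dynamic_cells(groups, periods, balance_e=None,
                  min_e=float("-inf"), max_e=float("inf")):
    """Return (g, t, e) triples kept for a dynamic aggregation, e = t - g."""
    last_period = max(periods)
    cells = []
    for g in groups:
        # A cohort first treated in g is observed at event times 0..last_period - g,
        # i.e. for last_period - g + 1 post-treatment periods.
        if balance_e is not None and last_period - g < balance_e:
            continue
        for t in periods:
            e = t - g
            if min_e <= e <= max_e:
                cells.append((g, t, e))
    return cells


# Hypothetical design: cohorts 2004, 2006, 2007 observed over 2003-2007.
kept = dynamic_cells([2004, 2006, 2007], range(2003, 2008),
                     balance_e=1, min_e=0, max_e=1)
```

Here the 2007 cohort is dropped because it has only one post-treatment period, fewer than balance_e + 1 = 2.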
- plot_attgt(ylim=None, xlab=None, ylab=None, title='Group', xgap=1, ncol=1, legend=True, group=None, ref_line=0, theming=True, grtitle='Group')
Faceted plot of the group-time treatment effects, one subplot per cohort.
- Parameters:
ylim (tuple) – (ymin, ymax) for the y-axis.
xlab (str) – Label for the x-axis.
ylab (str) – Label for the y-axis.
title (str) – Subplot title prefix. Default "Group".
xgap (int) – Spacing between x-axis ticks. Default 1.
ncol (int) – Number of columns in the facet grid. Default 1.
legend (bool) – Whether to draw the legend. Default True.
group (list) – Subset of groups to plot. None plots all.
ref_line (float) – y-value of the reference line. Default 0.
theming (bool) – Apply the package’s default styling. Default True.
grtitle (str) – Per-subplot title prefix (used together with the group value). Default "Group".
- plot_aggte(ylim=None, xlab=None, ylab=None, title='', xgap=1, legend=True, ref_line=0, theming=True, **kwargs)
Plot the aggregated object stored on self.atte (produced by a prior call to aggte()). For typec='group' it draws a splot; for the other types it draws a gplot.
- Parameters:
ylim (tuple) – y-axis limits.
xlab (str) – x-axis label.
ylab (str) – y-axis label.
title (str) – Plot title. Empty defaults to "Average Effect by Group" (group aggregation) or "Average Effect by Length of Exposure" (dynamic / calendar).
xgap (int) – Tick spacing on the x-axis.
legend (bool) – Show legend.
ref_line (float) – y-value of the reference line.
theming (bool) – Apply default styling.
Notes on argument inheritance
Some arguments can be specified in more than one place (for example
bstrap, biters, cband, alp). The resolution order is:
- Value passed to ATTgt.aggte() overrides the value set on ATTgt.fit(), which overrides the value set on ATTgt.
- Passing None to ATTgt.aggte() means “inherit from the MP object built during fit”.
This matches the R did package convention.
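The resolution order can be pictured as "the first value that is not None wins". The helper below is a hypothetical sketch of that rule only; it is not the package's code and the name resolve_option is an assumption.

```python
def resolve_option(aggte_value, fit_value, constructor_value):
    """Return the first non-None value in aggte -> fit -> constructor order."""
    for value in (aggte_value, fit_value, constructor_value):
        if value is not None:
            return value
    return None


# biters passed to aggte() overrides the constructor default.
biters = resolve_option(500, None, 1000)

# Passing None to aggte() inherits the setting chosen at fit time.
bstrap = resolve_option(None, True, None)
```

Note that the check is `is not None` rather than truthiness, so an explicit False passed to aggte() still overrides an earlier True.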