# !wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
# !dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
# !apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
# !apt update -q
# !apt install cuda gcc-6 g++-6 -y -q
# !ln -s /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
# !ln -s /usr/bin/g++-6 /usr/local/cuda/bin/g++
# !curl -sSL "https://julialang-s3.julialang.org/bin/linux/x64/1.7/julia-1.7.3-linux-x86_64.tar.gz" -o julia.tar.gz
# !tar -xzf julia.tar.gz -C /usr --strip-components 1
# !rm -rf julia.tar.gz*
# !julia -e 'using Pkg; pkg"add IJulia; precompile"'
6. Analyzing RCT reemployment experiment#
6.1. Analyzing RCT data with Precision Adjustment#
6.1.1. Data#
In this lab, we analyze the Pennsylvania re-employment bonus experiment, which was previously studied in “Sequential testing of duration data: the case of the Pennsylvania ‘reemployment bonus’ experiment” (Bilias, 2000), among others. These experiments were conducted in the 1980s by the U.S. Department of Labor to test the incentive effects of alternative compensation schemes for unemployment insurance (UI).
In these experiments, UI claimants were randomly assigned either to a control group or to one of six treatment groups. Here we focus on treatment group 4, but feel free to explore the other treatment groups. In the control group, the current rules of the UI applied. Individuals in the treatment groups were offered a cash bonus if they found a job within some pre-specified period of time (the qualification period), provided that the job was retained for a specified duration. The treatments differed in the level of the bonus, the length of the qualification period, and whether the bonus declined over time within the qualification period; see http://qed.econ.queensu.ca/jae/2000-v15.6/bilias/readme.b.txt for further details on the data.
#import Pkg
#Pkg.add("DataFrames")
#Pkg.add("FilePaths")
#Pkg.add("Queryverse")
#Pkg.add("GLM")
#Pkg.add("StatsModels")
#Pkg.add("Combinatorics")
#Pkg.add("Iterators")
#Pkg.add("CategoricalArrays")
#Pkg.add("StatsBase")
#Pkg.add("Lasso")
#Pkg.add("TypedTables")
#Pkg.add("MacroTools")
#Pkg.add("NamedArrays")
#Pkg.add("DataTables")
#Pkg.add("Latexify")
#Pkg.add("PrettyTables")
#Pkg.add("TexTables")
using GLM, StatsModels
using DataTables
using DelimitedFiles, DataFrames, Lasso
using FilePaths
using StatsModels, Combinatorics
using CategoricalArrays
using StatsBase, Statistics
using TypedTables
using MacroTools
using NamedArrays
using PrettyTables # Dataframe or Datatable to latex
using TexTables # pretty regression table and tex outcome
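# Loading data: a delimited text file whose first row holds the column names
url = "https://github.com/d2cml-ai/14.388_jl/raw/main/data/penn_jae.dat"
mat, head = readdlm(download(url), Float64, header=true)
# Build a DataFrame from the numeric matrix and the header row
df = DataFrame(mat, vec(head))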
describe(df)
 | variable | mean | min | median | max | nmissing | eltype |
---|---|---|---|---|---|---|---|
 | Symbol | Float64 | Float64 | Float64 | Float64 | Int64 | DataType |
1 | abdt | 10693.6 | 10404.0 | 10691.0 | 10880.0 | 0 | Float64 |
2 | tg | 2.56889 | 0.0 | 2.0 | 6.0 | 0 | Float64 |
3 | inuidur1 | 12.9148 | 1.0 | 10.0 | 52.0 | 0 | Float64 |
4 | inuidur2 | 12.1938 | 0.0 | 9.0 | 52.0 | 0 | Float64 |
5 | female | 0.402142 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
6 | black | 0.116653 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
7 | hispanic | 0.0363689 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
8 | othrace | 0.00575002 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
9 | dep | 0.444045 | 0.0 | 0.0 | 2.0 | 0 | Float64 |
10 | q1 | 0.0136563 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
11 | q2 | 0.206498 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
12 | q3 | 0.237691 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
13 | q4 | 0.232229 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
14 | q5 | 0.232948 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
15 | q6 | 0.0769784 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
16 | recall | 0.108675 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
17 | agelt35 | 0.543089 | 0.0 | 1.0 | 1.0 | 0 | Float64 |
18 | agegt54 | 0.106735 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
19 | durable | 0.148638 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
20 | nondurable | 0.10961 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
21 | lusd | 0.265435 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
22 | husd | 0.221807 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
23 | muld | 0.438008 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
# dimensions of the dataframe: number of rows (a) and number of columns (b)
a = size(df,1)
b = size(df,2)
23
# Filter control group and just treatment group number 4
penn = filter(row -> row[:tg] in [4,0], df)
first(penn,20)
20 rows × 23 columns (omitted printing of 14 columns)
 | abdt | tg | inuidur1 | inuidur2 | female | black | hispanic | othrace | dep |
---|---|---|---|---|---|---|---|---|---|
 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | 10824.0 | 0.0 | 18.0 | 18.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 |
2 | 10824.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | 10747.0 | 0.0 | 27.0 | 27.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | 10607.0 | 4.0 | 9.0 | 9.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
5 | 10831.0 | 0.0 | 27.0 | 27.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
6 | 10845.0 | 0.0 | 27.0 | 27.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
7 | 10831.0 | 0.0 | 9.0 | 9.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
8 | 10859.0 | 0.0 | 27.0 | 27.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
9 | 10516.0 | 0.0 | 15.0 | 15.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
10 | 10663.0 | 0.0 | 28.0 | 11.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
11 | 10747.0 | 0.0 | 12.0 | 12.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 |
12 | 10551.0 | 4.0 | 22.0 | 22.0 | 1.0 | 0.0 | 1.0 | 0.0 | 2.0 |
13 | 10768.0 | 0.0 | 18.0 | 18.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
14 | 10537.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 |
15 | 10600.0 | 4.0 | 7.0 | 7.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
16 | 10866.0 | 0.0 | 18.0 | 18.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
17 | 10572.0 | 0.0 | 14.0 | 14.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 |
18 | 10663.0 | 0.0 | 5.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
19 | 10789.0 | 0.0 | 9.0 | 9.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20 | 10768.0 | 0.0 | 3.0 | 3.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
# Recode treatment group 4 as 1 and rename the indicator to T4
replace!(penn.tg, 4 => 1)
rename!(penn, "tg" => "T4")
# Convert dep from float to string
penn[!,:dep] = string.(penn[!,:dep])
# Store dep as a categorical variable
penn[!,:dep] = categorical(penn[!,:dep])
describe(penn)
23 rows × 7 columns
 | variable | mean | min | median | max | nmissing | eltype |
---|---|---|---|---|---|---|---|
 | Symbol | Union… | Any | Union… | Any | Int64 | DataType |
1 | abdt | 10695.4 | 10404.0 | 10698.0 | 10880.0 | 0 | Float64 |
2 | T4 | 0.342224 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
3 | inuidur1 | 13.053 | 1.0 | 11.0 | 52.0 | 0 | Float64 |
4 | inuidur2 | 12.2812 | 0.0 | 10.0 | 52.0 | 0 | Float64 |
5 | female | 0.404001 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
6 | black | 0.121985 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
7 | hispanic | 0.0325554 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
8 | othrace | 0.00725632 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
9 | dep | | 0.0 | | 2.0 | 0 | CategoricalValue{String, UInt32} |
10 | q1 | 0.0127476 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
11 | q2 | 0.203765 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
12 | q3 | 0.235536 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
13 | q4 | 0.225927 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
14 | q5 | 0.25907 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
15 | q6 | 0.0629535 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
16 | recall | 0.110414 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
17 | agelt35 | 0.545009 | 0.0 | 1.0 | 1.0 | 0 | Float64 |
18 | agegt54 | 0.109433 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
19 | durable | 0.148068 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
20 | nondurable | 0.109237 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
21 | lusd | 0.261032 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
22 | husd | 0.21867 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
23 | muld | 0.444205 | 0.0 | 0.0 | 1.0 | 0 | Float64 |
6.1.1.1. Model#
To evaluate the impact of the treatments on unemployment duration, we consider the linear regression model:
\[ Y = D \beta_1 + W'\beta_2 + \varepsilon, \]
where \(Y\) is the log of the duration of unemployment, \(D\) is a treatment indicator, and \(W\) is a set of controls including age group dummies, gender, race, number of dependents, quarter of the experiment, location within the state, existence of recall expectations, and type of occupation. Here \(\beta_1\) is the ATE, if the RCT assumptions hold rigorously.
We also consider the interactive regression model:
\[ Y = D \alpha_1 + D W'\alpha_2 + W'\beta_2 + \varepsilon, \]
where the \(W\)’s are demeaned (apart from the intercept), so that \(\alpha_1\) is the ATE, if the RCT assumptions hold rigorously.
Under RCT, the projection coefficient \(\beta_1\) has the interpretation of the causal effect of the treatment on the average outcome. We thus refer to \(\beta_1\) as the average treatment effect (ATE). Note that the covariates here are independent of the treatment \(D\), so we can identify \(\beta_1\) by a simple linear regression of \(Y\) on \(D\), without adding covariates. However, we do add covariates in an effort to improve the precision of our estimates of the average treatment effect.
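The precision argument above can be illustrated with a small simulation. This is only a sketch on made-up data, not the Penn data: the data-generating process, the helper `ols_se`, and all variable names are assumptions introduced for illustration. Under random assignment, the regression with and without the covariate targets the same effect, but the adjusted regression should have the smaller standard error when the covariate predicts the outcome.

```julia
using Random, Statistics, LinearAlgebra

# Hypothetical helper: OLS coefficients and classical standard errors.
function ols_se(X::Matrix{Float64}, y::Vector{Float64})
    n, k = size(X)
    XtX_inv = inv(X' * X)
    beta = XtX_inv * (X' * y)
    resid = y - X * beta
    sigma2 = sum(abs2, resid) / (n - k)
    se = sqrt.(diag(sigma2 * XtX_inv))
    return beta, se
end

rng = MersenneTwister(123)
n = 5_000
D = Float64.(rand(rng, n) .< 0.35)            # randomized treatment, independent of W
W = randn(rng, n)                             # a pre-treatment covariate
Y = -0.1 .* D .+ 1.0 .* W .+ randn(rng, n)    # assumed true effect of D is -0.1

# Classical 2-sample approach (CL): intercept and D only
b_cl, se_cl = ols_se(hcat(ones(n), D), Y)
# Classical regression adjustment (CRA): add W as a control
b_cra, se_cra = ols_se(hcat(ones(n), D, W), Y)

println("CL : estimate = ", round(b_cl[2], digits = 3), ", se = ", round(se_cl[2], digits = 3))
println("CRA: estimate = ", round(b_cra[2], digits = 3), ", se = ", round(se_cra[2], digits = 3))
```

In the unadjusted regression, the coefficient on D is exactly the difference in mean outcomes between treated and control units, which is why it is called the 2-sample approach.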
6.1.2. Analysis#
We consider
the classical 2-sample approach, no adjustment (CL)
classical linear regression adjustment (CRA)
interactive regression adjustment (IRA)
and carry out inference with “lm” from the GLM package. (The R version of this lab carries out robust inference using the estimatr R package.)
6.2. Carry out covariate balance check#
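In the R version of this lab, this is done using the “lm_robust” command, which, unlike the base “lm” command, automatically computes the correct Eicker-Huber-White standard errors instead of the classical non-robust formula based on the homoskedasticity assumption. Here we regress the treatment indicator T4 on the controls and their pairwise interactions using “lm” from the GLM package; under successful randomization, the covariates should have little power to predict treatment.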
# All combinations of the elements of x of size 1 up to n
combinations_upto(x, n) = Iterators.flatten(combinations(x, i) for i in 1:n)
# Expand (a + b + ...)^n into interaction terms of order 1 up to n,
# never pairing a variable with itself
expand_exp(args, deg::ConstantTerm) =
    tuple(((&)(terms...) for terms in combinations_upto(args, deg.n))...)
# Make apply_schema expand ^ inside a formula using expand_exp
StatsModels.apply_schema(t::FunctionTerm{typeof(^)}, sch::StatsModels.Schema, ctx::Type) =
    apply_schema.(expand_exp(t.args_parsed...), Ref(sch), ctx)
# Balance-check formula: T4 on the controls and all their pairwise interactions
reg1 = @formula(T4 ~ (female+black+othrace+dep+q2+q3+q4+q5+q6+agelt35+agegt54+durable+lusd+husd)^2)
reg1 = apply_schema(reg1, schema(reg1, penn))
FormulaTerm
Response:
T4(continuous)
Predictors:
female(continuous)
black(continuous)
othrace(continuous)
dep(DummyCoding:3→2)
q2(continuous)
q3(continuous)
q4(continuous)
q5(continuous)
q6(continuous)
agelt35(continuous)
agegt54(continuous)
durable(continuous)
lusd(continuous)
husd(continuous)
female(continuous) & black(continuous)
female(continuous) & othrace(continuous)
female(continuous) & dep(DummyCoding:3→2)
female(continuous) & q2(continuous)
female(continuous) & q3(continuous)
female(continuous) & q4(continuous)
female(continuous) & q5(continuous)
female(continuous) & q6(continuous)
female(continuous) & agelt35(continuous)
female(continuous) & agegt54(continuous)
female(continuous) & durable(continuous)
female(continuous) & lusd(continuous)
female(continuous) & husd(continuous)
black(continuous) & othrace(continuous)
black(continuous) & dep(DummyCoding:3→2)
black(continuous) & q2(continuous)
black(continuous) & q3(continuous)
black(continuous) & q4(continuous)
black(continuous) & q5(continuous)
black(continuous) & q6(continuous)
black(continuous) & agelt35(continuous)
black(continuous) & agegt54(continuous)
black(continuous) & durable(continuous)
black(continuous) & lusd(continuous)
black(continuous) & husd(continuous)
othrace(continuous) & dep(DummyCoding:3→2)
othrace(continuous) & q2(continuous)
othrace(continuous) & q3(continuous)
othrace(continuous) & q4(continuous)
othrace(continuous) & q5(continuous)
othrace(continuous) & q6(continuous)
othrace(continuous) & agelt35(continuous)
othrace(continuous) & agegt54(continuous)
othrace(continuous) & durable(continuous)
othrace(continuous) & lusd(continuous)
othrace(continuous) & husd(continuous)
dep(DummyCoding:3→2) & q2(continuous)
dep(DummyCoding:3→2) & q3(continuous)
dep(DummyCoding:3→2) & q4(continuous)
dep(DummyCoding:3→2) & q5(continuous)
dep(DummyCoding:3→2) & q6(continuous)
dep(DummyCoding:3→2) & agelt35(continuous)
dep(DummyCoding:3→2) & agegt54(continuous)
dep(DummyCoding:3→2) & durable(continuous)
dep(DummyCoding:3→2) & lusd(continuous)
dep(DummyCoding:3→2) & husd(continuous)
q2(continuous) & q3(continuous)
q2(continuous) & q4(continuous)
q2(continuous) & q5(continuous)
q2(continuous) & q6(continuous)
q2(continuous) & agelt35(continuous)
q2(continuous) & agegt54(continuous)
q2(continuous) & durable(continuous)
q2(continuous) & lusd(continuous)
q2(continuous) & husd(continuous)
q3(continuous) & q4(continuous)
q3(continuous) & q5(continuous)
q3(continuous) & q6(continuous)
q3(continuous) & agelt35(continuous)
q3(continuous) & agegt54(continuous)
q3(continuous) & durable(continuous)
q3(continuous) & lusd(continuous)
q3(continuous) & husd(continuous)
q4(continuous) & q5(continuous)
q4(continuous) & q6(continuous)
q4(continuous) & agelt35(continuous)
q4(continuous) & agegt54(continuous)
q4(continuous) & durable(continuous)
q4(continuous) & lusd(continuous)
q4(continuous) & husd(continuous)
q5(continuous) & q6(continuous)
q5(continuous) & agelt35(continuous)
q5(continuous) & agegt54(continuous)
q5(continuous) & durable(continuous)
q5(continuous) & lusd(continuous)
q5(continuous) & husd(continuous)
q6(continuous) & agelt35(continuous)
q6(continuous) & agegt54(continuous)
q6(continuous) & durable(continuous)
q6(continuous) & lusd(continuous)
q6(continuous) & husd(continuous)
agelt35(continuous) & agegt54(continuous)
agelt35(continuous) & durable(continuous)
agelt35(continuous) & lusd(continuous)
agelt35(continuous) & husd(continuous)
agegt54(continuous) & durable(continuous)
agegt54(continuous) & lusd(continuous)
agegt54(continuous) & husd(continuous)
durable(continuous) & lusd(continuous)
durable(continuous) & husd(continuous)
lusd(continuous) & husd(continuous)
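As a sanity check on the expansion of `^2`, the number of terms and of design-matrix columns can be counted by hand. The sketch below is stdlib-only: `subsets_upto` is a hypothetical stand-in for `combinations_upto`, and the widths (one column for each continuous variable, two for `dep`, which is dummy-coded 3→2) are read off the schema printed above.

```julia
# Hypothetical stand-in for combinations_upto: all subsets of x with 1 to n elements.
function subsets_upto(x::Vector{Symbol}, n::Int)
    out = Vector{Vector{Symbol}}()
    for mask in 1:(2^length(x) - 1)
        idx = [i for i in eachindex(x) if (mask >> (i - 1)) & 1 == 1]
        length(idx) <= n && push!(out, x[idx])
    end
    return out
end

vars = [:female, :black, :othrace, :dep, :q2, :q3, :q4, :q5, :q6,
        :agelt35, :agegt54, :durable, :lusd, :husd]

# Number of model-matrix columns each variable contributes on its own.
width(v::Symbol) = v == :dep ? 2 : 1

term_sets = subsets_upto(vars, 2)
n_terms = length(term_sets)                                  # main effects + pairwise interactions
n_cols = sum(prod(width(v) for v in t) for t in term_sets)   # an interaction has the product of the widths

println("terms: ", n_terms, ", columns: ", n_cols)
```

This gives 105 terms (14 main effects plus 91 pairwise interactions), matching the predictors listed in the schema, and 119 columns, matching the 5099×119 model matrix printed later in this lab.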
m1 = lm(reg1, penn)
table = regtable( "Covariate Balance Check" => m1) # coefficients, standard errors, R squared, N (sample size)
| Covariate Balance Check
| (1)
---------------------------------------------
(Intercept) | 0.321*
| (0.167)
female | 0.104
| (0.138)
black | 0.072
| (0.087)
othrace | -0.345
| (0.294)
dep: 1.0 | -0.074
| (0.218)
dep: 2.0 | -0.109
| (0.165)
q2 | -0.027
| (0.168)
q3 | -0.006
| (0.167)
q4 | 0.043
| (0.168)
q5 | 0.094
| (0.167)
q6 | -0.222
| (0.167)
agelt35 | -0.109
| (0.135)
agegt54 | -0.437
| (0.302)
durable | -0.125
| (0.192)
lusd | 0.038
| (0.047)
husd | 0.095
| (0.080)
female & black | 0.089**
| (0.044)
female & othrace | -0.414**
| (0.201)
female & dep: 1.0 | 0.055
| (0.047)
female & dep: 2.0 | 0.045
| (0.041)
female & q2 | -0.189
| (0.137)
female & q3 | -0.165
| (0.137)
female & q4 | -0.176
| (0.137)
female & q5 | -0.203
| (0.136)
female & q6 | -0.043
| (0.145)
female & agelt35 | 0.073**
| (0.030)
female & agegt54 | 0.026
| (0.051)
female & durable | 0.020
| (0.044)
female & lusd | 0.002
| (0.034)
female & husd | 0.012
| (0.037)
black & othrace | 0.000
| (NaN)
black & dep: 1.0 | -0.117
| (0.071)
black & dep: 2.0 | -0.022
| (0.063)
black & q2 | -0.033
| (0.092)
black & q3 | -0.197**
| (0.089)
black & q4 | -0.125
| (0.088)
black & q5 | -0.210**
| (0.087)
black & q6 | 0.000
| (NaN)
black & agelt35 | 0.062
| (0.046)
black & agegt54 | 0.051
| (0.082)
black & durable | 0.105
| (0.067)
black & lusd | -0.021
| (0.055)
black & husd | 0.250
| (0.173)
othrace & dep: 1.0 | -0.858
| (0.615)
othrace & dep: 2.0 | 0.242
| (0.212)
othrace & q2 | 0.811**
| (0.371)
othrace & q3 | 0.000
| (NaN)
othrace & q4 | 0.789***
| (0.296)
othrace & q5 | 0.379
| (0.286)
othrace & q6 | 0.373
| (0.353)
othrace & agelt35 | 0.316
| (0.217)
othrace & agegt54 | 0.309
| (0.254)
othrace & durable | -0.191
| (0.230)
othrace & lusd | -0.091
| (0.215)
othrace & husd | 0.005
| (0.329)
dep: 1.0 & q2 | 0.166
| (0.217)
dep: 2.0 & q2 | 0.087
| (0.165)
dep: 1.0 & q3 | 0.121
| (0.217)
dep: 2.0 & q3 | 0.140
| (0.165)
dep: 1.0 & q4 | 0.085
| (0.216)
dep: 2.0 & q4 | 0.091
| (0.166)
dep: 1.0 & q5 | 0.110
| (0.215)
dep: 2.0 & q5 | 0.096
| (0.164)
dep: 1.0 & q6 | 0.101
| (0.229)
dep: 2.0 & q6 | 0.048
| (0.175)
dep: 1.0 & agelt35 | -0.074
| (0.048)
dep: 2.0 & agelt35 | -0.024
| (0.038)
dep: 1.0 & agegt54 | -0.081
| (0.070)
dep: 2.0 & agegt54 | 0.028
| (0.144)
dep: 1.0 & durable | -0.060
| (0.063)
dep: 2.0 & durable | 0.125**
| (0.050)
dep: 1.0 & lusd | 0.059
| (0.052)
dep: 2.0 & lusd | 0.052
| (0.046)
dep: 1.0 & husd | -0.043
| (0.057)
dep: 2.0 & husd | -0.088*
| (0.050)
q2 & q3 | 0.000
| (NaN)
q2 & q4 | 0.000
| (NaN)
q2 & q5 | 0.000
| (NaN)
q2 & q6 | 0.000
| (NaN)
q2 & agelt35 | 0.128
| (0.135)
q2 & agegt54 | 0.530*
| (0.302)
q2 & durable | 0.126
| (0.190)
q2 & lusd | 0.000
| (NaN)
q2 & husd | -0.019
| (0.080)
q3 & q4 | 0.000
| (NaN)
q3 & q5 | 0.000
| (NaN)
q3 & q6 | 0.000
| (NaN)
q3 & agelt35 | 0.131
| (0.135)
q3 & agegt54 | 0.478
| (0.301)
q3 & durable | 0.054
| (0.190)
q3 & lusd | 0.022
| (0.048)
q3 & husd | -0.060
| (0.079)
q4 & q5 | 0.000
| (NaN)
q4 & q6 | 0.000
| (NaN)
q4 & agelt35 | 0.131
| (0.135)
q4 & agegt54 | 0.540*
| (0.300)
q4 & durable | 0.096
| (0.191)
q4 & lusd | -0.027
| (0.049)
q4 & husd | -0.172**
| (0.080)
q5 & q6 | 0.000
| (NaN)
q5 & agelt35 | 0.071
| (0.134)
q5 & agegt54 | 0.444
| (0.300)
q5 & durable | 0.075
| (0.189)
q5 & lusd | 5.906e-05
| (0.048)
q5 & husd | -0.162**
| (0.078)
q6 & agelt35 | 0.170
| (0.143)
q6 & agegt54 | 0.566*
| (0.311)
q6 & durable | 0.256
| (0.199)
q6 & lusd | 0.076
| (0.075)
q6 & husd | 0.000
| (NaN)
agelt35 & agegt54 | 0.000
| (NaN)
agelt35 & durable | 0.009
| (0.041)
agelt35 & lusd | -0.026
| (0.035)
agelt35 & husd | 0.035
| (0.041)
agegt54 & durable | -0.062
| (0.068)
agegt54 & lusd | -0.037
| (0.057)
agegt54 & husd | -0.074
| (0.063)
durable & lusd | -0.055
| (0.044)
durable & husd | -1.692e-04
| (0.054)
lusd & husd | 0.000
| (NaN)
---------------------------------------------
N | 5099
$R^2$ | 0.029
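The standard errors in the table above are the ones reported by “lm”, which, as far as we can tell, are based on the classical formula. A minimal sketch of how Eicker-Huber-White (HC1) standard errors could be computed by hand from a design matrix is given below. It runs on simulated heteroskedastic data, not on the Penn data, and the function name `ols_classical_and_hc1` and all variable names are assumptions for illustration.

```julia
using Random, Statistics, LinearAlgebra

# Hypothetical helper: OLS with classical and HC1 (Eicker-Huber-White) standard errors.
function ols_classical_and_hc1(X::Matrix{Float64}, y::Vector{Float64})
    n, k = size(X)
    XtX_inv = inv(X' * X)
    beta = XtX_inv * (X' * y)
    e = y - X * beta
    # Classical variance: sigma^2 (X'X)^{-1}
    V_classical = (sum(abs2, e) / (n - k)) * XtX_inv
    # Sandwich variance: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}, scaled by n/(n-k) for HC1
    meat = X' * (X .* (e .^ 2))
    V_hc1 = (n / (n - k)) * XtX_inv * meat * XtX_inv
    return beta, sqrt.(diag(V_classical)), sqrt.(diag(V_hc1))
end

rng = MersenneTwister(42)
n = 2_000
x = randn(rng, n)
# Error variance grows with |x|, so the homoskedasticity assumption fails
y = 1.0 .+ 0.5 .* x .+ (0.5 .+ abs.(x)) .* randn(rng, n)

beta, se_classical, se_hc1 = ols_classical_and_hc1(hcat(ones(n), x), y)
println("slope        = ", round(beta[2], digits = 3))
println("classical se = ", round(se_classical[2], digits = 4))
println("HC1 se       = ", round(se_hc1[2], digits = 4))
```

In this design the robust standard error of the slope is noticeably larger than the classical one, which is the situation in which relying on the classical formula would overstate precision.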
6.3. Model specification#
# No adjustment (2-sample approach)
ols_cl = lm(@formula(log(inuidur1) ~ T4), penn)
table1 = regtable( "No adjustment model" => ols_cl)
| No adjustment model
| (1)
----------------------------------
(Intercept) | 2.057***
| (0.021)
T4 | -0.085**
| (0.036)
----------------------------------
N | 5099
$R^2$ | 0.001
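Because the outcome is the log of unemployment duration, the coefficient on T4 can be read as an approximate proportional change in duration. The following worked sketch uses the point estimate and standard error copied from the table above; the variable names are ours, and the confidence interval relies on a normal approximation.

```julia
# Values copied from the "No adjustment model" table above
coef_T4 = -0.085
se_T4 = 0.036

# Proportional change implied by a log-outcome coefficient: exp(b) - 1
pct_change = 100 * (exp(coef_T4) - 1)

# t-statistic and a normal-approximation 95% confidence interval for the coefficient
t_stat = coef_T4 / se_T4
ci_low = coef_T4 - 1.96 * se_T4
ci_high = coef_T4 + 1.96 * se_T4

println("percent change in duration: ", round(pct_change, digits = 1))
println("t-statistic: ", round(t_stat, digits = 2))
println("95% CI: [", round(ci_low, digits = 3), ", ", round(ci_high, digits = 3), "]")
```

The interval lies entirely below zero, in line with the two stars attached to the estimate in the table.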
# adding controls
# Omitted dummies: q1, nondurable, muld
reg2 = @formula(log(inuidur1) ~ T4 + (female+black+othrace+dep+q2+q3+q4+q5+q6+agelt35+agegt54+durable+lusd+husd)^2)
reg2 = apply_schema(reg2, schema(reg2, penn))
ols_cra = lm(reg2, penn)
table2 = regtable("CRA model" => ols_cra)
| CRA model
| (1)
-------------------------------
(Intercept) | 2.633***
| (0.420)
T4 | -0.080**
| (0.036)
female | -0.115
| (0.347)
black | -0.441***
| (0.158)
othrace | -0.883
| (0.903)
dep: 1.0 | -0.720
| (0.550)
dep: 2.0 | -0.041
| (0.417)
q2 | -0.160
| (0.423)
q3 | -0.540
| (0.422)
q4 | -0.433
| (0.422)
q5 | -0.345
| (0.420)
q6 | -0.494
| (0.420)
agelt35 | -0.626*
| (0.340)
agegt54 | -0.361
| (0.760)
durable | -0.279
| (0.483)
lusd | -0.223
| (0.184)
husd | -0.170
| (0.201)
female & black | -0.155
| (0.111)
female & othrace | 0.310
| (0.507)
female & dep: 1.0 | -0.028
| (0.118)
female & dep: 2.0 | 0.148
| (0.103)
female & q2 | -0.087
| (0.346)
female & q3 | 0.205
| (0.345)
female & q4 | 0.273
| (0.345)
female & q5 | 0.061
| (0.344)
female & q6 | 0.287
| (0.366)
female & agelt35 | 0.126*
| (0.076)
female & agegt54 | 0.040
| (0.128)
female & durable | 0.068
| (0.112)
female & lusd | 0.084
| (0.085)
female & husd | 0.062
| (0.094)
black & othrace | 0.000
| (NaN)
black & dep: 1.0 | 0.157
| (0.180)
black & dep: 2.0 | -0.104
| (0.160)
black & q2 | 0.000
| (NaN)
black & q3 | -0.042
| (0.168)
black & q4 | 0.090
| (0.165)
black & q5 | 0.276*
| (0.162)
black & q6 | -0.459**
| (0.233)
black & agelt35 | 0.015
| (0.115)
black & agegt54 | 0.444**
| (0.207)
black & durable | 0.244
| (0.168)
black & lusd | 0.291**
| (0.139)
black & husd | 1.315***
| (0.437)
othrace & dep: 1.0 | 1.074
| (1.549)
othrace & dep: 2.0 | 0.007
| (0.533)
othrace & q2 | 0.000
| (NaN)
othrace & q3 | -0.387
| (0.935)
othrace & q4 | -0.301
| (0.789)
othrace & q5 | -0.319
| (0.751)
othrace & q6 | -1.744*
| (0.929)
othrace & agelt35 | 1.185**
| (0.547)
othrace & agegt54 | -0.222
| (0.640)
othrace & durable | 1.628***
| (0.579)
othrace & lusd | -0.071
| (0.541)
othrace & husd | -0.771
| (0.830)
dep: 1.0 & q2 | 0.640
| (0.548)
dep: 2.0 & q2 | -0.044
| (0.415)
dep: 1.0 & q3 | 0.698
| (0.546)
dep: 2.0 & q3 | -0.030
| (0.415)
dep: 1.0 & q4 | 0.504
| (0.544)
dep: 2.0 & q4 | 0.139
| (0.417)
dep: 1.0 & q5 | 0.528
| (0.542)
dep: 2.0 & q5 | -0.170
| (0.413)
dep: 1.0 & q6 | 1.095*
| (0.578)
dep: 2.0 & q6 | 0.341
| (0.441)
dep: 1.0 & agelt35 | 0.075
| (0.122)
dep: 2.0 & agelt35 | 0.033
| (0.096)
dep: 1.0 & agegt54 | 0.072
| (0.175)
dep: 2.0 & agegt54 | 0.157
| (0.362)
dep: 1.0 & durable | 0.295*
| (0.158)
dep: 2.0 & durable | 0.045
| (0.126)
dep: 1.0 & lusd | 0.150
| (0.131)
dep: 2.0 & lusd | 0.183
| (0.116)
dep: 1.0 & husd | 0.081
| (0.143)
dep: 2.0 & husd | 0.152
| (0.125)
q2 & q3 | 0.000
| (NaN)
q2 & q4 | 0.000
| (NaN)
q2 & q5 | 0.000
| (NaN)
q2 & q6 | 0.000
| (NaN)
q2 & agelt35 | 0.430
| (0.340)
q2 & agegt54 | 0.671
| (0.762)
q2 & durable | -0.106
| (0.479)
q2 & lusd | -0.043
| (0.190)
q2 & husd | -0.042
| (0.202)
q3 & q4 | 0.000
| (NaN)
q3 & q5 | 0.000
| (NaN)
q3 & q6 | 0.000
| (NaN)
q3 & agelt35 | 0.454
| (0.340)
q3 & agegt54 | 0.866
| (0.759)
q3 & durable | 0.245
| (0.479)
q3 & lusd | 0.086
| (0.187)
q3 & husd | 0.170
| (0.200)
q4 & q5 | 0.000
| (NaN)
q4 & q6 | 0.000
| (NaN)
q4 & agelt35 | 0.388
| (0.339)
q4 & agegt54 | 0.555
| (0.757)
q4 & durable | 0.217
| (0.481)
q4 & lusd | -0.095
| (0.188)
q4 & husd | -0.121
| (0.201)
q5 & q6 | 0.000
| (NaN)
q5 & agelt35 | 0.278
| (0.337)
q5 & agegt54 | 0.451
| (0.755)
q5 & durable | 0.288
| (0.477)
q5 & lusd | -0.104
| (0.186)
q5 & husd | -0.152
| (0.197)
q6 & agelt35 | 0.338
| (0.361)
q6 & agegt54 | 0.949
| (0.783)
q6 & durable | 0.409
| (0.503)
q6 & lusd | 0.000
| (NaN)
q6 & husd | 0.000
| (NaN)
agelt35 & agegt54 | 0.000
| (NaN)
agelt35 & durable | 0.025
| (0.104)
agelt35 & lusd | -0.065
| (0.089)
agelt35 & husd | 0.058
| (0.102)
agegt54 & durable | 0.032
| (0.170)
agegt54 & lusd | -0.148
| (0.143)
agegt54 & husd | -0.302*
| (0.158)
durable & lusd | 0.116
| (0.110)
durable & husd | 0.238*
| (0.136)
lusd & husd | 0.000
| (NaN)
-------------------------------
N | 5099
$R^2$ | 0.060
# Demean function: subtract the column mean from every column of a
function desv_mean(a)
    A = mean(a, dims = 1)
    M = zeros(Float64, size(a,1), size(a,2))
    for i in 1:size(a,2)
        M[:,i] = a[:,i] .- A[i]
    end
    return M
end
# Matrix Model & demean
X = StatsModels.modelmatrix(reg1.rhs,penn)
X = desv_mean(X) # matrix format
5099×119 Matrix{Float64}:
-0.404001 -0.121985 -0.00725632 -0.112179 … -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 0.887821 0.945676 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 -0.112179 … -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 0.887821 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 0.887821 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 -0.112179 … -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
⋮ ⋱
0.595999 -0.121985 0.992744 -0.112179 0.945676 -0.0280447 0.0
0.595999 -0.121985 0.992744 -0.112179 0.945676 -0.0280447 0.0
-0.404001 -0.121985 0.992744 -0.112179 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 0.992744 -0.112179 … -0.0543244 -0.0280447 0.0
-0.404001 0.878015 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
0.595999 0.878015 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
0.595999 -0.121985 0.992744 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 … -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 -0.0543244 -0.0280447 0.0
-0.404001 -0.121985 -0.00725632 -0.112179 0.945676 -0.0280447 0.0
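The column-wise demeaning performed by `desv_mean` can also be written with broadcasting. A minimal sketch on a small made-up matrix follows; `demean_cols` is a hypothetical name, not part of the lab.

```julia
using Statistics

# Hypothetical broadcasting version of column-wise demeaning:
# mean(a, dims = 1) is a 1×k row of column means, broadcast over the rows of a.
demean_cols(a::AbstractMatrix) = a .- mean(a, dims = 1)

A = [1.0 10.0;
     2.0 20.0;
     3.0 30.0]

A_demeaned = demean_cols(A)
println(A_demeaned)
println(mean(A_demeaned, dims = 1))   # column means are zero after demeaning
```

A column that is constant, such as an interaction that is always zero, becomes a column of zeros after demeaning, which is what the last column of the matrix above shows.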
Y = select(penn, [:inuidur1,:T4]) # select inuidur1 and T4
X = DataFrame(hcat(X, Matrix(select(penn, [:T4])).*X), :auto) # join X and (T4*X)
base = hcat(Y, X) # join inuidur1, T4, X and (T4*X)
base.inuidur1 = log.(base.inuidur1) # log(inuidur1)
terms = term.(names(base)) # term.() turns every column name into a formula term
# interactive regression model
ols_ira = lm(terms[1] ~ sum(terms[2:end]), base)
table3 = regtable("Interactive model" => ols_ira)
# terms[1]: the first variable, here the outcome of interest
# sum(terms[2:end]): all remaining variables, used as regressors in the linear regression
| Interactive model
| (1)
--------------------------------
(Intercept) | 2.058***
| (0.021)
T4 | -0.076**
| (0.036)
x1 | -0.666
| (0.443)
x2 | -0.437**
| (0.196)
x3 | -1.735
| (2.163)
x4 | 0.036
| (0.682)
x5 | 0.212
| (0.495)
x6 | -0.255
| (0.525)
x7 | -0.621
| (0.523)
x8 | -0.480
| (0.524)
x9 | -0.372
| (0.522)
x10 | -0.677
| (0.519)
x11 | -0.678
| (0.433)
x12 | -0.304
| (0.811)
x13 | -0.838
| (0.586)
x14 | -0.099
| (0.220)
x15 | -0.063
| (0.237)
x16 | -0.215
| (0.138)
x17 | 0.599
| (0.889)
x18 | -0.173
| (0.145)
x19 | 0.217*
| (0.127)
x20 | 0.392
| (0.442)
x21 | 0.685
| (0.441)
x22 | 0.721
| (0.440)
x23 | 0.566
| (0.439)
x24 | 0.908*
| (0.465)
x25 | 0.170*
| (0.094)
x26 | 0.236
| (0.160)
x27 | 0.097
| (0.138)
x28 | 0.072
| (0.106)
x29 | 0.061
| (0.117)
x30 | 0.000
| (NaN)
x31 | 0.109
| (0.218)
x32 | -0.201
| (0.201)
x33 | 0.000
| (NaN)
x34 | -0.097
| (0.210)
x35 | 0.118
| (0.215)
x36 | 0.233
| (0.206)
x37 | -0.427
| (0.295)
x38 | 0.083
| (0.142)
x39 | 0.423
| (0.266)
x40 | 0.468**
| (0.216)
x41 | 0.309*
| (0.176)
x42 | 0.550
| (0.712)
x43 | 2.520
| (2.711)
x44 | 1.189
| (1.465)
x45 | 0.000
| (NaN)
x46 | -1.145
| (1.896)
x47 | 1.777
| (2.120)
x48 | -1.111
| (1.398)
x49 | -2.083
| (1.521)
x50 | 2.248
| (1.395)
x51 | 0.820
| (2.421)
x52 | 1.751**
| (0.848)
x53 | -0.694
| (0.855)
x54 | -0.180
| (1.322)
x55 | 0.234
| (0.682)
x56 | -0.218
| (0.492)
x57 | 0.154
| (0.678)
x58 | -0.254
| (0.493)
x59 | -0.024
| (0.676)
x60 | -0.162
| (0.494)
x61 | 0.140
| (0.674)
x62 | -0.563
| (0.489)
x63 | 0.483
| (0.711)
x64 | -0.005
| (0.521)
x65 | 0.051
| (0.151)
x66 | 0.119
| (0.118)
x67 | 0.037
| (0.213)
x68 | 0.223
| (0.492)
x69 | 0.340*
| (0.190)
x70 | 0.115
| (0.162)
x71 | -0.260
| (0.166)
x72 | 0.115
| (0.150)
x73 | -0.043
| (0.173)
x74 | 0.018
| (0.151)
x75 | 0.000
| (NaN)
x76 | 0.000
| (NaN)
x77 | 0.000
| (NaN)
x78 | 0.000
| (NaN)
x79 | 0.501
| (0.433)
x80 | 0.432
| (0.813)
x81 | 0.262
| (0.582)
x82 | -0.068
| (0.230)
x83 | -0.182
| (0.242)
x84 | 0.000
| (NaN)
x85 | 0.000
| (NaN)
x86 | 0.000
| (NaN)
x87 | 0.514
| (0.432)
x88 | 0.725
| (0.808)
x89 | 0.722
| (0.582)
x90 | -0.001
| (0.226)
x91 | 0.099
| (0.240)
x92 | 0.000
| (NaN)
x93 | 0.000
| (NaN)
x94 | 0.405
| (0.431)
x95 | 0.247
| (0.804)
x96 | 0.594
| (0.584)
x97 | -0.169
| (0.229)
x98 | -0.145
| (0.240)
x99 | 0.000
| (NaN)
x100 | 0.216
| (0.429)
x101 | 0.100
| (0.802)
x102 | 0.854
| (0.577)
x103 | -0.259
| (0.227)
x104 | -0.174
| (0.234)
x105 | 0.382
| (0.454)
x106 | 0.836
| (0.838)
x107 | 1.029*
| (0.610)
x108 | 0.000
| (NaN)
x109 | 0.000
| (NaN)
x110 | 0.000
| (NaN)
x111 | -0.003
| (0.128)
x112 | -0.057
| (0.111)
x113 | 0.055
| (0.124)
x114 | -0.124
| (0.207)
x115 | 0.091
| (0.184)
x116 | -0.315
| (0.192)
x117 | 0.263*
| (0.136)
x118 | 0.314*
| (0.167)
x119 | 0.000
| (NaN)
x120 | 1.569**
| (0.756)
x121 | -0.168
| (0.492)
x122 | 2.572
| (2.328)
x123 | -2.333**
| (1.182)
x124 | -0.562
| (0.967)
x125 | 0.343
| (0.935)
x126 | 0.292
| (0.933)
x127 | 0.176
| (0.933)
x128 | 0.104
| (0.928)
x129 | 0.596
| (0.942)
x130 | 0.184
| (0.726)
x131 | 0.227
| (0.580)
x132 | 1.159
| (1.134)
x133 | -0.138
| (0.234)
x134 | -0.447
| (0.474)
x135 | 0.138
| (0.237)
x136 | -0.559
| (1.301)
x137 | 0.396
| (0.255)
x138 | -0.108
| (0.220)
x139 | -1.329*
| (0.753)
x140 | -1.334*
| (0.752)
x141 | -1.268*
| (0.753)
x142 | -1.412*
| (0.750)
x143 | -1.709**
| (0.805)
x144 | -0.089
| (0.166)
x145 | -0.477*
| (0.277)
x146 | -0.077
| (0.245)
x147 | -0.037
| (0.181)
x148 | -0.152
| (0.203)
x149 | 0.000
| (NaN)
x150 | 0.190
| (0.407)
x151 | 0.290
| (0.338)
x152 | 0.149
| (0.505)
x153 | 0.309
| (0.497)
x154 | 0.061
| (0.484)
x155 | 0.303
| (0.483)
x156 | 0.000
| (NaN)
x157 | -0.170
| (0.251)
x158 | 0.107
| (0.435)
x159 | -0.733**
| (0.359)
x160 | 0.011
| (0.294)
x161 | 1.364
| (0.912)
x162 | 0.000
| (NaN)
x163 | -1.402
| (1.662)
x164 | -2.175
| (2.743)
x165 | 0.000
| (NaN)
x166 | -4.744*
| (2.577)
x167 | -0.444
| (1.955)
x168 | -1.039
| (2.548)
x169 | -0.775
| (1.813)
x170 | -0.570
| (2.814)
x171 | 0.378
| (1.467)
x172 | 1.213
| (1.265)
x173 | 0.000
| (NaN)
x174 | 1.434
| (1.173)
x175 | 0.225
| (0.967)
x176 | 1.720
| (1.172)
x177 | 0.441
| (0.965)
x178 | 1.661
| (1.166)
x179 | 0.707
| (0.974)
x180 | 1.155
| (1.162)
x181 | 0.900
| (0.965)
x182 | 1.945
| (1.260)
x183 | 0.984
| (1.038)
x184 | 0.038
| (0.261)
x185 | -0.268
| (0.206)
x186 | 0.331
| (0.384)
x187 | -0.317
| (0.755)
x188 | -0.394
| (0.359)
x189 | -0.234
| (0.271)
x190 | 1.135***
| (0.275)
x191 | 0.178
| (0.243)
x192 | 0.374
| (0.312)
x193 | 0.515*
| (0.278)
x194 | 0.000
| (NaN)
x195 | 0.000
| (NaN)
x196 | 0.000
| (NaN)
x197 | 0.000
| (NaN)
x198 | -0.302
| (0.724)
x199 | 0.115
| (0.610)
x200 | -0.517
| (1.121)
x201 | -0.149
| (0.260)
x202 | 0.532
| (0.469)
x203 | 0.000
| (NaN)
x204 | 0.000
| (NaN)
x205 | 0.000
| (NaN)
x206 | -0.226
| (0.724)
x207 | -0.194
| (0.608)
x208 | -0.849
| (1.123)
x209 | 0.000
| (NaN)
x210 | 0.311
| (0.466)
x211 | 0.000
| (NaN)
x212 | 0.000
| (NaN)
x213 | -0.135
| (0.724)
x214 | 0.353
| (0.584)
x215 | -0.628
| (1.127)
x216 | -0.012
| (0.247)
x217 | 0.303
| (0.470)
x218 | 0.000
| (NaN)
x219 | 0.097
| (0.720)
x220 | 0.482
| (0.580)
x221 | -1.240
| (1.120)
x222 | 0.301
| (0.244)
x223 | 0.295
| (0.462)
x224 | -0.001
| (0.794)
x225 | 0.000
| (NaN)
x226 | -1.365
| (1.178)
x227 | -0.315
| (0.428)
x228 | 0.000
| (NaN)
x229 | 0.000
| (NaN)
x230 | 0.144
| (0.227)
x231 | -0.014
| (0.189)
x232 | 0.029
| (0.224)
x233 | 0.545
| (0.389)
x234 | -0.579*
| (0.302)
x235 | 0.191
| (0.350)
x236 | -0.374
| (0.239)
x237 | -0.338
| (0.297)
x238 | 0.000
| (NaN)
--------------------------------
N | 5099
$R^2$ | 0.079
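The reason for demeaning the controls before interacting them with the treatment can be seen in a small simulation. This is a sketch on made-up data with an assumed heterogeneous effect, not the Penn data; all names are ours. With raw controls, the coefficient on D in the interactive regression is the effect at W = 0; with demeaned controls, it estimates the ATE.

```julia
using Random, Statistics, LinearAlgebra

rng = MersenneTwister(7)
n = 20_000
D = Float64.(rand(rng, n) .< 0.5)      # randomized treatment
W = 2.0 .+ randn(rng, n)               # covariate with mean 2
# Assumed individual treatment effect is 1 + W, so the ATE is 1 + E[W] = 3
Y = (1.0 .+ W) .* D .+ W .+ randn(rng, n)

# OLS coefficients via the backslash (least-squares) operator
ols(X, y) = X \ y

# Interactive regression with raw W: coefficient on D is the effect at W = 0
b_raw = ols(hcat(ones(n), D, W, D .* W), Y)

# Interactive regression with demeaned W: coefficient on D estimates the ATE
Wc = W .- mean(W)
b_centered = ols(hcat(ones(n), D, Wc, D .* Wc), Y)

println("coefficient on D, raw W:      ", round(b_raw[2], digits = 2))
println("coefficient on D, demeaned W: ", round(b_centered[2], digits = 2))
```

The first coefficient should be close to 1 and the second close to 3, the ATE in this simulated design.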
X = StatsModels.modelmatrix(reg2.rhs,penn)
X = desv_mean(X)
D = DataFrame([X[:,1]], :auto) # treatment variable
rename!(D, Dict(:x1 => :T4)) # rename x1 -> T4
X = DataFrame(hcat(X[:,2:end], X[:,1].*X[:,2:end]), :auto) # join controls (X) and the interactions T4*X
Y = select(penn, [:inuidur1]) # select just inuidur1
Y.inuidur1 = log.(Y.inuidur1) # log(inuidur1)
5099-element Vector{Float64}:
2.8903717578961645
0.0
3.295836866004329
2.1972245773362196
3.295836866004329
3.295836866004329
2.1972245773362196
3.295836866004329
2.70805020110221
3.332204510175204
2.4849066497880004
3.091042453358316
2.8903717578961645
⋮
3.295836866004329
2.70805020110221
2.995732273553991
0.0
3.1354942159291497
2.5649493574615367
1.791759469228055
2.302585092994046
1.3862943611198906
2.1972245773362196
1.3862943611198906
3.295836866004329
6.4. Using HDMJL#
include("hdmjl/hdmjl.jl")
D_reg_0 = rlasso_arg( X, D, nothing, true, true, true, false, false,
nothing, 1.1, nothing, 5000, 15, 10^(-5), -Inf, true, Inf, true )
rlasso_arg(5099×238 DataFrame
Row │ x1 x2 x3 x4 x5 x6 x7 ⋯
│ Float64 Float64 Float64 Float64 Float64 Float64 Fl ⋯
──────┼─────────────────────────────────────────────────────────────────────────
1 │ -0.404001 -0.121985 -0.00725632 -0.112179 0.836242 -0.203765 -0 ⋯
2 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
3 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
4 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 0
5 │ -0.404001 -0.121985 -0.00725632 0.887821 -0.163758 -0.203765 -0 ⋯
6 │ 0.595999 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
7 │ 0.595999 -0.121985 -0.00725632 0.887821 -0.163758 -0.203765 -0
8 │ 0.595999 -0.121985 -0.00725632 0.887821 -0.163758 -0.203765 -0
9 │ 0.595999 -0.121985 -0.00725632 -0.112179 -0.163758 0.796235 -0 ⋯
10 │ 0.595999 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 0
11 │ 0.595999 -0.121985 -0.00725632 -0.112179 0.836242 -0.203765 -0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
5090 │ -0.404001 -0.121985 0.992744 -0.112179 -0.163758 -0.203765 0
5091 │ 0.595999 -0.121985 0.992744 -0.112179 -0.163758 -0.203765 -0 ⋯
5092 │ -0.404001 0.878015 -0.00725632 -0.112179 -0.163758 -0.203765 -0
5093 │ 0.595999 0.878015 -0.00725632 -0.112179 -0.163758 -0.203765 -0
5094 │ 0.595999 -0.121985 0.992744 -0.112179 0.836242 -0.203765 -0
5095 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 0 ⋯
5096 │ -0.404001 -0.121985 -0.00725632 -0.112179 0.836242 0.796235 -0
5097 │ -0.404001 -0.121985 -0.00725632 -0.112179 0.836242 0.796235 -0
5098 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
5099 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0 ⋯
232 columns and 5078 rows omitted, 5099×1 DataFrame
Row │ T4
│ Float64
──────┼───────────
1 │ -0.342224
2 │ -0.342224
3 │ -0.342224
4 │ 0.657776
5 │ -0.342224
6 │ -0.342224
7 │ -0.342224
8 │ -0.342224
9 │ -0.342224
10 │ -0.342224
11 │ -0.342224
⋮ │ ⋮
5090 │ 0.657776
5091 │ -0.342224
5092 │ -0.342224
5093 │ 0.657776
5094 │ -0.342224
5095 │ 0.657776
5096 │ 0.657776
5097 │ -0.342224
5098 │ 0.657776
5099 │ -0.342224
5078 rows omitted, nothing, true, true, true, false, false, nothing, 1.1, nothing, 5000, 15, 1.0000000000000003e-5, -Inf, true, Inf, true)
# Treatment HDM model: Lasso of T4 on the controls and interactions
D_resid = rlasso(D_reg_0)
Dict{String, Any} with 19 entries:
"tss" => 1147.82
"dev" => [-0.342224, -0.342224, -0.342224, 0.657776, -0.342224, -0.3…
"model" => [-0.404001 -0.121985 … 0.0101738 0.0; -0.404001 -0.121985 ……
"loadings" => [0.232712 0.155595 … 0.0435631 0.0]
"sigma" => [0.474501]
"lambda0" => 637.701
"lambda" => 238×2 DataFrame…
"intercept" => 1.82896e-17
"Xy" => [-3.98137, 2.13669, 5.33771, -1.75211, 3.24299, -3.5707, -4…
"iter" => 4
"residuals" => [-0.342224, -0.342224, -0.342224, 0.657776, -0.342224, -0.3…
"rss" => 1147.82
"index" => [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 … 0.0, …
"beta" => 238×2 DataFrame…
"options" => Dict{String, Any}("intercept"=>true, "post"=>true, "meanx"=…
"x1" => Matrix{Float64}(undef, 5099, 0)
"pen" => Dict{String, Any}("lambda0"=>637.701, "lambda"=>[148.401; 9…
"startingval" => [-0.342224, -0.342224, -0.342224, 0.657776, -0.342224, -0.3…
"coefficients" => 239×2 DataFrame…
D_resid = rlasso(D_reg_0)["residuals"]
5099-element Vector{Float64}:
-0.34222396548342815
-0.34222396548342815
-0.34222396548342815
0.6577760345165719
-0.34222396548342815
-0.34222396548342815
-0.34222396548342815
-0.34222396548342815
-0.34222396548342815
-0.34222396548342815
-0.34222396548342815
0.6577760345165719
-0.34222396548342815
⋮
-0.34222396548342815
-0.34222396548342815
0.6577760345165719
-0.34222396548342815
-0.34222396548342815
0.6577760345165719
-0.34222396548342815
0.6577760345165719
0.6577760345165719
-0.34222396548342815
0.6577760345165719
-0.34222396548342815
# Outcome HDM model: Lasso of log(inuidur1) on the same controls and interactions
Y_reg_0 = rlasso_arg( X, Y, nothing, true, true, true, false, false,
nothing, 1.1, nothing, 5000, 15, 10^(-5), -Inf, true, Inf, true )
rlasso_arg(5099×238 DataFrame
Row │ x1 x2 x3 x4 x5 x6 x7 ⋯
│ Float64 Float64 Float64 Float64 Float64 Float64 Fl ⋯
──────┼─────────────────────────────────────────────────────────────────────────
1 │ -0.404001 -0.121985 -0.00725632 -0.112179 0.836242 -0.203765 -0 ⋯
2 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
3 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
4 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 0
5 │ -0.404001 -0.121985 -0.00725632 0.887821 -0.163758 -0.203765 -0 ⋯
6 │ 0.595999 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
7 │ 0.595999 -0.121985 -0.00725632 0.887821 -0.163758 -0.203765 -0
8 │ 0.595999 -0.121985 -0.00725632 0.887821 -0.163758 -0.203765 -0
9 │ 0.595999 -0.121985 -0.00725632 -0.112179 -0.163758 0.796235 -0 ⋯
10 │ 0.595999 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 0
11 │ 0.595999 -0.121985 -0.00725632 -0.112179 0.836242 -0.203765 -0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
5090 │ -0.404001 -0.121985 0.992744 -0.112179 -0.163758 -0.203765 0
5091 │ 0.595999 -0.121985 0.992744 -0.112179 -0.163758 -0.203765 -0 ⋯
5092 │ -0.404001 0.878015 -0.00725632 -0.112179 -0.163758 -0.203765 -0
5093 │ 0.595999 0.878015 -0.00725632 -0.112179 -0.163758 -0.203765 -0
5094 │ 0.595999 -0.121985 0.992744 -0.112179 0.836242 -0.203765 -0
5095 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 0 ⋯
5096 │ -0.404001 -0.121985 -0.00725632 -0.112179 0.836242 0.796235 -0
5097 │ -0.404001 -0.121985 -0.00725632 -0.112179 0.836242 0.796235 -0
5098 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0
5099 │ -0.404001 -0.121985 -0.00725632 -0.112179 -0.163758 -0.203765 -0 ⋯
232 columns and 5078 rows omitted, 5099×1 DataFrame
Row │ inuidur1
│ Float64
──────┼──────────
1 │ 2.89037
2 │ 0.0
3 │ 3.29584
4 │ 2.19722
5 │ 3.29584
6 │ 3.29584
7 │ 2.19722
8 │ 3.29584
9 │ 2.70805
10 │ 3.3322
11 │ 2.48491
⋮ │ ⋮
5090 │ 2.99573
5091 │ 0.0
5092 │ 3.13549
5093 │ 2.56495
5094 │ 1.79176
5095 │ 2.30259
5096 │ 1.38629
5097 │ 2.19722
5098 │ 1.38629
5099 │ 3.29584
5078 rows omitted, nothing, true, true, true, false, false, nothing, 1.1, nothing, 5000, 15, 1.0000000000000003e-5, -Inf, true, Inf, true)
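Y_resid = rlasso(Y_reg_0)["residuals"] # outcome residuals
D_resid = reshape(D_resid, length(D_resid), 1) # column vector -> n×1 matrix, as lm expects a design matrix
Lasso_ira = lm(D_resid, Y_resid) # partialling-out: outcome residuals on treatment residuals, no intercept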
LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, CholeskyPivoted{Float64, Matrix{Float64}}}}:
Coefficients:
───────────────────────────────────────────────────────────────────
Coef. Std. Error t Pr(>|t|) Lower 95% Upper 95%
───────────────────────────────────────────────────────────────────
x1 -0.0788861 0.0355478 -2.22 0.0265 -0.148575 -0.00919709
───────────────────────────────────────────────────────────────────
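The residual-on-residual regression above follows the partialling-out logic of the Frisch-Waugh-Lovell theorem. The sketch below illustrates that logic on synthetic data, with ordinary least squares swapped in for `rlasso` so that it needs only the standard library. The data, the helper `ols_residual`, and the true effect of `-0.08` are all assumptions made for illustration. With OLS the partialled-out coefficient equals the full-regression coefficient exactly; with Lasso residuals the equality is only approximate. In the output above the treatment residuals coincide with the demeaned treatment (`rss` equals `tss`), which suggests the Lasso selected no controls for T4, as one would expect under random assignment.

```julia
using LinearAlgebra, Random

Random.seed!(1)
n = 500
W = randn(n, 3)                              # synthetic controls
d = Float64.(rand(n) .< 0.35)                # randomly assigned binary treatment
y = -0.08 .* d .+ W * [0.5, -0.2, 0.1] .+ randn(n)

# Hypothetical helper: residual of v after least-squares projection on [1 W].
Z = hcat(ones(n), W)
ols_residual(v) = v .- Z * (Z \ v)

d_res = ols_residual(d)
y_res = ols_residual(y)

# Regress the outcome residuals on the treatment residuals, without an intercept.
beta_partial = dot(d_res, y_res) / dot(d_res, d_res)

# Coefficient on d in the full regression of y on [1 d W].
beta_full = (hcat(ones(n), d, W) \ y)[2]

println(isapprox(beta_partial, beta_full; atol=1e-8))
```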
# Comparative ATE estimation
table = NamedArray(zeros(4, 5))
table[1,2] = GLM.coeftable(ols_cl).cols[1][2]
table[2,2] = GLM.coeftable(ols_cl).cols[2][2]
table[3,2] = GLM.coeftable(ols_cl).cols[5][2]
table[4,2] = GLM.coeftable(ols_cl).cols[6][2]
table[1,3] = GLM.coeftable(ols_cra).cols[1][2]
table[2,3] = GLM.coeftable(ols_cra).cols[2][2]
table[3,3] = GLM.coeftable(ols_cra).cols[5][2]
table[4,3] = GLM.coeftable(ols_cra).cols[6][2]
table[1,4] = GLM.coeftable(ols_ira).cols[1][2]
table[2,4] = GLM.coeftable(ols_ira).cols[2][2]
table[3,4] = GLM.coeftable(ols_ira).cols[5][2]
table[4,4] = GLM.coeftable(ols_ira).cols[6][2]
table[1,5] = GLM.coeftable(Lasso_ira).cols[1][1]
table[2,5] = GLM.coeftable(Lasso_ira).cols[2][1]
table[3,5] = GLM.coeftable(Lasso_ira).cols[5][1]
table[4,5] = GLM.coeftable(Lasso_ira).cols[6][1]
T = DataFrame(table, [ :"Outcome", :"CL", :"CRA", :"IRA", :"IRA W Lasso"]) # table to dataframe
T[!,:Outcome] = string.(T[!,:Outcome]) # string - first column
T[1,1] = "Estimation"
T[2,1] = "Standard error"
T[3,1] = "Lower bound CI"
T[4,1] = "Upper bound CI"
header = (["Outcome", "CL", "CRA", "IRA", "IRA W Lasso"])
Outcome | CL | CRA | IRA | IRA W Lasso |
---|---|---|---|---|
Estimation | -0.0855 | -0.0797 | -0.0755 | -0.0789 |
Standard error | 0.0358 | 0.0356 | 0.0361 | 0.0355 |
Lower bound CI | -0.1557 | -0.1496 | -0.1462 | -0.1486 |
Upper bound CI | -0.0152 | -0.0098 | -0.0048 | -0.0092 |
\begin{table}
\begin{tabular}{ccccc}
\hline\hline
\textbf{Outcome} & \textbf{CL} & \textbf{CRA} & \textbf{IRA} & \textbf{IRA W Lasso} \\\hline
Estimation & -0.0855 & -0.0797 & -0.0755 & -0.0789 \\
Standard error & 0.0358 & 0.0356 & 0.0361 & 0.0355 \\
Lower bound CI & -0.1557 & -0.1496 & -0.1462 & -0.1486 \\
Upper bound CI & -0.0152 & -0.0098 & -0.0048 & -0.0092 \\\hline\hline
\end{tabular}
\end{table}
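Across the four estimators, treatment group 4 experiences an average decrease of roughly \(8\%\) in the length of the unemployment spell: the estimates range from \(-0.0755\) to \(-0.0855\) in log points, and the Lasso-based IRA estimate is \(-0.0789\).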
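Because the outcome is `log(inuidur1)`, the coefficient is measured in log points, and reading it as a percentage is an approximation. The sketch below converts the Lasso-based IRA estimate to an exact percent change and reproduces its confidence interval. The coefficient and standard error are copied from the regression output above; the names `beta_hat` and `se_hat` and the normal critical value of 1.96 are choices made here for illustration.

```julia
# Values copied from the residual-on-residual regression output.
beta_hat = -0.0788861
se_hat   = 0.0355478

# Exact percent change implied by a log-point effect: 100 * (exp(beta) - 1).
pct_exact = 100 * (exp(beta_hat) - 1)

# 95% confidence interval using the normal approximation.
z = 1.96
ci = (beta_hat - z * se_hat, beta_hat + z * se_hat)

println(round(pct_exact, digits=2))
println(round.(ci, digits=4))
```

The exact conversion comes out slightly smaller in magnitude than the log-point reading, and the interval agrees with the bounds in the table to the reported precision.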
Observe that the regression estimators are at best slightly more efficient than the simple two-means estimator: the standard errors are 0.0356 (CRA) and 0.0355 (IRA with Lasso) against 0.0358 (CL), while plain IRA is marginally higher at 0.0361, so essentially all methods have very similar standard errors. From the IRA results we also see that there is no statistically detectable heterogeneity. The regression estimators also give estimates that are slightly smaller in magnitude; these differences are perhaps due to a minor imbalance in the treatment allocation, which the regression estimators try to correct.