11. Heterogeneous Wage Effects

We use US census data from 2012 to jointly analyse the effect of gender on wages and the interactions of gender with other variables. The dependent variable is the logarithm of the wage (lnw); the target variable is female, on its own and in combination with the other variables. All remaining variables describe other socio-economic characteristics, e.g. marital status, education, and experience. For a detailed description of the variables, see the help page of the cps2012 data set (?cps2012).

This analysis allows a closer look at how gender-based wage discrimination is related to other socio-economic variables.

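# install librarian, then use it to install (if needed) and attach hdm
install.packages("librarian", quiet = TRUE)
librarian::shelf(hdm, quiet = TRUE)
data(cps2012) # load the wage data shipped with hdm
str(cps2012)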
'data.frame':	29217 obs. of  23 variables:
 $ year        : num  2012 2012 2012 2012 2012 ...
 $ lnw         : num  1.91 1.37 2.54 1.8 3.35 ...
 $ female      : num  1 1 0 1 0 0 0 0 0 1 ...
 $ widowed     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ divorced    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ separated   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ nevermarried: num  0 0 0 0 0 0 1 0 0 0 ...
 $ hsd08       : num  0 0 0 0 0 0 0 0 0 0 ...
 $ hsd911      : num  0 1 0 0 0 0 0 0 0 0 ...
 $ hsg         : num  0 0 1 1 0 1 1 0 0 0 ...
 $ cg          : num  0 0 0 0 1 0 0 0 1 0 ...
 $ ad          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mw          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ so          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ we          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ exp1        : num  22 30 19 14 15 23 33 23.5 15 15.5 ...
 $ exp2        : num  4.84 9 3.61 1.96 2.25 ...
 $ exp3        : num  10.65 27 6.86 2.74 3.38 ...
 $ exp4        : num  23.43 81 13.03 3.84 5.06 ...
 $ weight      : num  569 626 264 257 257 ...
 $ married     : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
 $ ne          : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
 $ sc          : logi  TRUE FALSE FALSE FALSE FALSE FALSE ...
# create the model matrix for the covariates: female, its interactions with
# each control, and all main effects and pairwise interactions of the controls
X <- model.matrix(~ -1 + female + female:(widowed + divorced + separated + nevermarried +
  hsd08 + hsd911 + hsg + cg + ad + mw + so + we + exp1 + exp2 + exp3) + (widowed +
  divorced + separated + nevermarried + hsd08 + hsd911 + hsg + cg + ad + mw + so +
  we + exp1 + exp2 + exp3)^2, data = cps2012)
X <- X[, which(apply(X, 2, var) != 0)] # exclude all constant variables
demean <- function(x) x - mean(x)
X <- apply(X, 2, FUN = demean) # center every column at zero
dim(X)

# target variables: index.gender specifies the coefficients we are interested in
index.gender <- grep("female", colnames(X))
y <- cps2012$lnw
  1. 29217
  2. 116
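To make the formula above easier to read, here is a minimal sketch of the same pattern on a small data frame. The data frame `toy` and its values are hypothetical and only illustrate which columns `model.matrix` generates; they are not taken from cps2012.

```r
# Hypothetical toy data (not cps2012): a gender dummy and two controls
toy <- data.frame(
  female = c(1, 0, 1, 0, 1, 0),
  cg     = c(0, 1, 1, 0, 0, 1),
  exp1   = c(5, 10, 15, 20, 25, 30)
)

# Same formula pattern as above, with two controls instead of fifteen:
#   female        -> gender main effect
#   female:(...)  -> gender interacted with each control
#   (...)^2       -> control main effects plus their pairwise interactions
X_toy <- model.matrix(~ -1 + female + female:(cg + exp1) + (cg + exp1)^2,
                      data = toy)

# Drop constant columns and center the rest, as in the main analysis
X_toy <- X_toy[, which(apply(X_toy, 2, var) != 0)]
X_toy <- apply(X_toy, 2, function(x) x - mean(x))

colnames(X_toy)

# Columns whose name contains "female" are the target coefficients
idx_toy <- grep("female", colnames(X_toy))
colnames(X_toy)[idx_toy]
```

With the fifteen controls of the main analysis, the same pattern yields 16 gender columns, 15 main effects and 105 pairwise interactions, i.e. 136 columns before the constant ones are removed; `dim(X)` reports the 116 that remain.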

The estimates of the target parameters, i.e. all coefficients involving gender (the female main effect and its interactions with the other variables), are calculated and summarized by the following commands:

# this cell takes a minute to run

effects.female <- rlassoEffects(x = X, y = y, index = index.gender)
result <- summary(effects.female)
result$coef
A matrix: 16 × 4 of type dbl
                         Estimate   Std. Error      t value      Pr(>|t|)
female               -0.154923281  0.050162447  -3.08843149  2.012161e-03
female:widowed        0.136095484  0.090662629   1.50111997  1.333245e-01
female:divorced       0.136939386  0.022181700   6.17352970  6.678200e-10
female:separated      0.023302763  0.053211795   0.43792476  6.614408e-01
female:nevermarried   0.186853483  0.019942393   9.36966209  7.276511e-21
female:hsd08          0.027810312  0.120914496   0.22999982  8.180919e-01
female:hsd911        -0.119335040  0.051879684  -2.30022682  2.143537e-02
female:hsg           -0.012889780  0.019223188  -0.67053290  5.025181e-01
female:cg             0.010138553  0.018326505   0.55321800  5.801141e-01
female:ad            -0.030463745  0.021806103  -1.39702838  1.624050e-01
female:mw            -0.001063439  0.019191770  -0.05541119  9.558109e-01
female:so            -0.008183343  0.019356818  -0.42276282  6.724683e-01
female:we            -0.004226129  0.021168404  -0.19964324  8.417596e-01
female:exp1           0.004935259  0.007804275   0.63237886  5.271393e-01
female:exp2          -0.159519328  0.045299884  -3.52140699  4.292632e-04
female:exp3           0.038450579  0.007861100   4.89124680  1.001992e-06
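The idea behind estimating one target coefficient in the presence of many controls can be illustrated with a partialling-out regression. The sketch below is not the hdm implementation: it uses ordinary least squares as a stand-in for the Lasso steps and runs on simulated data, so that the result can be checked against a full regression. All names (`W`, `d`, `beta_po`, `beta_full`) and the simulated coefficients are hypothetical.

```r
set.seed(1)
n <- 500

# Hypothetical simulated data: three controls W, a target regressor d that is
# correlated with the controls, and an outcome y
W <- matrix(rnorm(n * 3), n, 3)
d <- 0.5 * W[, 1] + rnorm(n)
y <- as.numeric(-0.15 * d + W %*% c(1, -1, 0.5) + rnorm(n))

# Step 1: remove the part of y that is explained by the controls
y_res <- resid(lm(y ~ W))
# Step 2: remove the part of d that is explained by the controls
d_res <- resid(lm(d ~ W))
# Step 3: regress the outcome residuals on the target residuals
beta_po <- coef(lm(y_res ~ d_res - 1))[["d_res"]]

# Benchmark: coefficient on d in the regression with all controls included
beta_full <- coef(lm(y ~ d + W))[["d"]]

# With OLS in every step the two estimates agree up to numerical precision
all.equal(beta_po, beta_full)
```

With more than a hundred candidate controls, as in the model matrix above, the two auxiliary regressions are where a regularized estimator such as the Lasso takes the place of OLS; `rlassoEffects` repeats this kind of estimation for every column listed in `index`.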

Now, we estimate and plot confidence intervals, first the pointwise and then the joint confidence intervals.

pointwise.CI <- confint(effects.female, level = 0.90)
pointwise.CI
plot(effects.female, level=0.90) # plot of the effects
A matrix: 16 × 2 of type dbl
                              5 %          95 %
female               -0.237433164  -0.072413398
female:widowed       -0.013031271   0.285222239
female:divorced       0.100453736   0.173425037
female:separated     -0.064222851   0.110828376
female:nevermarried   0.154051166   0.219655800
female:hsd08         -0.171076335   0.226696960
female:hsd911        -0.204669525  -0.034000554
female:hsg           -0.044509111   0.018729551
female:cg            -0.020005866   0.040282971
female:ad            -0.066331593   0.005404103
female:mw            -0.032631091   0.030504214
female:so            -0.040022474   0.023655789
female:we            -0.039045055   0.030592798
female:exp1          -0.007901632   0.017772149
female:exp2          -0.234031007  -0.085007650
female:exp3           0.025520220   0.051380937
[Figure: estimated gender effects with pointwise 90% confidence intervals]
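The pointwise bounds can be reproduced by hand from the estimates and standard errors in the summary table. The sketch below assumes a normal approximation, estimate ± z × standard error; this is inferred from the printed numbers, which it matches, rather than from the internals of `confint`. Only three rows are copied over to keep the example short.

```r
# Estimates and standard errors copied from the summary table above
est <- c("female"          = -0.154923281,
         "female:divorced" =  0.136939386,
         "female:exp3"     =  0.038450579)
se  <- c(0.050162447, 0.022181700, 0.007861100)

level <- 0.90
z <- qnorm(1 - (1 - level) / 2) # two-sided normal critical value for 90%

# Pointwise interval: estimate minus/plus z times the standard error
ci_manual <- cbind("5 %" = est - z * se, "95 %" = est + z * se)
round(ci_manual, 6)
```

Each interval is built for one coefficient at a time, so the 90% coverage statement holds per coefficient and not for all 16 coefficients simultaneously.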

Finally, we compare the pointwise confidence intervals to joint confidence intervals.

joint.CI <- confint(effects.female, level = 0.90, joint = TRUE)
joint.CI
plot(effects.female, joint=TRUE, level=0.90) # plot of the effects
A matrix: 16 × 2 of type dbl
                             5 %         95 %
female               -0.28721730  -0.02262927
female:widowed       -0.12010118   0.39229214
female:divorced       0.07792288   0.19595589
female:separated     -0.10967329   0.15627881
female:nevermarried   0.13215504   0.24155193
female:hsd08         -0.35426488   0.40988550
female:hsd911        -0.26149507   0.02282500
female:hsg           -0.06251119   0.03673163
female:cg            -0.03907505   0.05935216
female:ad            -0.09254849   0.03162100
female:mw            -0.05200948   0.04988260
female:so            -0.05997736   0.04361068
female:we            -0.06254002   0.05408776
female:exp1          -0.01541738   0.02528790
female:exp2          -0.27787565  -0.04116301
female:exp3           0.01793445   0.05896671
[Figure: estimated gender effects with joint 90% confidence intervals]
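To see what the comparison amounts to, the two sets of bounds printed above can be put side by side. The sketch below copies the numbers from the two outputs and checks which intervals exclude zero; the helper `excludes_zero` and the variable names are hypothetical and not part of hdm.

```r
terms <- c("female", "female:widowed", "female:divorced", "female:separated",
           "female:nevermarried", "female:hsd08", "female:hsd911", "female:hsg",
           "female:cg", "female:ad", "female:mw", "female:so", "female:we",
           "female:exp1", "female:exp2", "female:exp3")

# Pointwise 90% bounds, copied from the first confint() output
pw_lo <- c(-0.237433164, -0.013031271, 0.100453736, -0.064222851, 0.154051166,
           -0.171076335, -0.204669525, -0.044509111, -0.020005866, -0.066331593,
           -0.032631091, -0.040022474, -0.039045055, -0.007901632, -0.234031007,
           0.025520220)
pw_hi <- c(-0.072413398, 0.285222239, 0.173425037, 0.110828376, 0.219655800,
           0.226696960, -0.034000554, 0.018729551, 0.040282971, 0.005404103,
           0.030504214, 0.023655789, 0.030592798, 0.017772149, -0.085007650,
           0.051380937)

# Joint 90% bounds, copied from the confint(..., joint = TRUE) output
jt_lo <- c(-0.28721730, -0.12010118, 0.07792288, -0.10967329, 0.13215504,
           -0.35426488, -0.26149507, -0.06251119, -0.03907505, -0.09254849,
           -0.05200948, -0.05997736, -0.06254002, -0.01541738, -0.27787565,
           0.01793445)
jt_hi <- c(-0.02262927, 0.39229214, 0.19595589, 0.15627881, 0.24155193,
           0.40988550, 0.02282500, 0.03673163, 0.05935216, 0.03162100,
           0.04988260, 0.04361068, 0.05408776, 0.02528790, -0.04116301,
           0.05896671)

# An interval excludes zero if both bounds have the same sign
excludes_zero <- function(lo, hi) lo > 0 | hi < 0

sig_pw <- terms[excludes_zero(pw_lo, pw_hi)]
sig_jt <- terms[excludes_zero(jt_lo, jt_hi)]

sig_pw
sig_jt
setdiff(sig_pw, sig_jt) # significant pointwise, but not jointly

# How much wider is each joint interval than its pointwise counterpart?
width_ratio <- (jt_hi - jt_lo) / (pw_hi - pw_lo)
round(range(width_ratio), 2)
```

With the printed bounds, every joint interval is wider than the corresponding pointwise one, and female:hsd911 is the only term whose interval excludes zero pointwise but not jointly. This is the expected price of a statement that is meant to hold for all 16 coefficients at once.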