Some RCT Examples

# !wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
# !dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
# !apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
# !apt update -q
# !apt install cuda gcc-6 g++-6 -y -q
# !ln -s /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
# !ln -s /usr/bin/g++-6 /usr/local/cuda/bin/g++

# !curl -sSL "https://julialang-s3.julialang.org/bin/linux/x64/1.7/julia-1.7.3-linux-x86_64.tar.gz" -o julia.tar.gz
# !tar -xzf julia.tar.gz -C /usr --strip-components 1
# !rm -rf julia.tar.gz*
# !julia -e 'using Pkg; pkg"add IJulia; precompile"'

## Install package 

#import Pkg; Pkg.add("CairoMakie")
#import Pkg; Pkg.add("Plots")
#import Pkg; Pkg.add("Distributions")
#import Pkg; Pkg.add("HypothesisTests")
#Pkg.add("ClinicalTrialUtilities")
using CairoMakie  # for density plot 
using ClinicalTrialUtilities
using Random   # for random seeds
using Statistics 
using Distributions
using HypothesisTests
using Images, FileIO

4. Some RCT Examples#

One of the earliest randomized experiments were the Polio vaccination trias conducted by the Public Health Service in 1954. The question was whether Salk vaccine prevented polio. Children in the study were randomly assigned either a treatment (polio vaccine shot) or a placebo (saline solution shot), without knowing which one they received. The doctors in the study, making the diagnosis, did not know whether a child received a vaccine or not. In other words, the trial was a double-blind, randomized control trial. The trial had to be large, because the rate at which Polio occured in the population was 50 per 100,000. The treatment group saw 33 polio cases per 200,745; the control group saw 115 cases per 201,229. The estimated avearage treatment effect is about

\[-40\]

with the 95% confidence band (based on approximate normality of the two sample means and their differences):

\[[−52,−28]\]

As this is an RCT, the confidence band suggests that the Polio vaccine caused the reduction in the risk of polio.

The interesting thing here is that we don’t need the underlying individual data to evaluate the effectivess of the vaccine. This is because the outcomes are Bernoulli random variables, and we have enough information to compute the estimate of ATE as well as the confidence intervals from the group case counts.

We also compute the Vaccine Efficacy metric, which refers to the following measure according to the CDC:

\[VE=\frac{Risk for Unvaccinated - Risk for Vaccinated}{Risk for Unvaccinated}\]

It describes the relative reduction in risk caused by vaccination.

It is staighforward to get the VE estimate by just plugging-in the numbers, but how do we get the approximate variance estimate? I am too lazy to do calculations for the delta method, so I will just use a simulation (a form of approximate bootstrap) to obtain the confidence intervals.

NV = 200745 # number of vaccinated (treated)
NU = 201229 # number of unvaccinated (control)
RV= 33/NV  # average outcome for vaccinated
RU =115/NU  # average outcome for unvaccinated
VE = (RU - RV)/RU # vaccine efficacy

0.7123516206478694

## Incidence per 100000

n = 100000

IncidenceRV = RV*n
IncidenceRU = RU*n

#Treatment effect - estimated reduction in incidence of polio per 100000 people

ATE_hat = n*(RV-RU)

println("Incidence per 100000 among treated: ", round(IncidenceRV; digits =4))

println("Incidence per 100000 among untreated: ", round(IncidenceRU; digits =4))

println("Estimated ATE of occurances per 100,000 is: ", round(ATE_hat; digits =4))

Incidence per 100000 among treated: 16.4388
Incidence per 100000 among untreated: 57.1488
Estimated ATE of occurances per 100,000 is: -40.7101

# outcomes (RV, RU) are Bernoulli
# variance, standard deviation and confidence interval of ATE 

var_RV = RV*(1-RV)/NV
var_RU = RU*(1-RU)/NU

var_ATE_hat  = n^2*(var_RV+var_RU)

std_ATE_hat = sqrt(var_ATE_hat)

println("Standard deviation for ATE is:  ", round(std_ATE_hat; digits =4))

Standard deviation for ATE is:  6.0474

# Confidence interval 

CI_ATE_hat =[ round(ATE_hat - 1.96*std_ATE_hat; digits = 4), round(ATE_hat + 1.96*std_ATE_hat; digits = 4)]

println("95% confidence interval of ATE is $CI_ATE_hat")

println("Overall VE is: " , round(VE; digits =4))

95% confidence interval of ATE is [-52.563, -28.8571]
Overall VE is: 0.7124

# Confidence interval based on approximate bootstrap

# Monte Carlo draws

Random.seed!(1)

B = 10000 #  number of boostraps
RVs = RV*ones(Float16, B)  + randn(B)*sqrt(var_RV)
RUs = RU*ones(Float16, B)  + randn(B)*sqrt(var_RU)
VEs= (RUs - RVs)./RUs

10000-element Vector{Float64}:
6817224261974857
6953664347250816
7531459258609844
6867222213078893
6165362111226124
7817779666902431
6333934469608703
6556843828007883
8379647737634552
8132928763558297
6864550692620768
729847409124033
6893974356034072
 ⋮
7124512490775646
743478827635636
6243684604532374
7093989620521691
7272183082401089
588658896827336
6917235071731522
7243046193192478
7552945654793395
753296520825968
727182181643485
8022899199587447

## Confidence interval of VEs

CI_VE = [round(quantile!(VEs, 0.025); digits = 4 ), round(quantile!(VEs, 0.975); digits = 4)]

println("two-sided 95% confidence interval of VE is $CI_VE")

## Approximate distribution of VEs estimates 

f = Figure(resolution = (600, 400))
Axis(f[1,1], title = "Approximate distribution of VEs estimates ")
    
density!(VEs, color = (:red, 0.1), strokecolor = :red, strokewidth = 1, strokearound = true, bandwidth = 0.008257) 

f

two-sided 95% confidence interval of VE is [0.59, 0.8179]

../_images/72a94f808195be78513edce7455fc70625a1b05ddf1a82a1209b273ac923f47d.png

4.1. Pfizer/BNTX Covid-19 RCT application#

Pfizer/BNTX was the first vaccine approved for emergency use to reduce the risk of Covid-19 decease. In studies to assess vaccine efficacy, volunteers were randomly assigned to receive either a treatment (2-dose vaccination) or a placebo, without knowing which they recieved. The doctors making the diagnoses did not know now whether a given volunteer received a vaccination or not. The results of the study are given in the following table.

img_path = "../data/imagen_RCT.png"
img = load(img_path)
imresize(img, ratio=1)

Here we see both the overall effects and the effects by age group. The confidence intervals for the overall ATE are tight and suggest high effectiveness of the vaccine. The confidence intervals for the age group 65-75 are much wider due to the relatively small number of people in this group. We could group 65-75 and >75 groups to evaluate the effectiveness of the vaccine for this broader age group and narrow the width of the confidence band.

We use the same approach as that for the Polio example. This gives slightly different results than the FDA result, because the FDA used inversion of exact binomial tests to construct confidence intervals. We use asymptotic approches based on approximate normality, which is more crude, but delivers a rather similar result.

# define inputs 

NV =  19965; # number vaccinated
NU =  20172; # number unvaccinated
RV = 9/NV; # average outcome for vaccinated
RU = 169/NU; # average outcome for unvaccinated
VE = (RU - RV)/RU; # vaccine efficacy

## Incidence per 100000

n = 100000

IncidenceRV = RV*n
IncidenceRU = RU*n

#Treatment effect - estimated reduction in incidence of polio per 100000 people

ATE_hat = n*(RV-RU)

println("Incidence per 100000 among treated: ", round(IncidenceRV; digits =4))

println("Incidence per 100000 among untreated: ", round(IncidenceRU; digits =4))

println("Estimated ATE of occurances per 100 000 is: ", round(ATE_hat; digits =4))

# outcomes (RV, RU) are Bernoulli
# variance, standard deviation and confidence interval of ATE 

var_RV = RV*(1-RV)/NV
var_RU = RU*(1-RU)/NU

var_ATE_hat  = n^2*(var_RV+var_RU)

std_ATE_hat = sqrt(var_ATE_hat)

println("Standard deviation for ATE is:  ", round(std_ATE_hat; digits =4))

# Confidence interval 

CI_ATE_hat =[ round(ATE_hat - 1.96*std_ATE_hat; digits = 4), round(ATE_hat + 1.96*std_ATE_hat; digits = 4)]

println("95% confidence interval of ATE is $CI_ATE_hat")

println("Overall VE is: " , round(VE; digits =4))

Random.seed!(1)

B = 10000 #  number of boostraps
RVs = RV*ones(Float16, B)  + randn(B)*sqrt(var_RV)
RUs = RU*ones(Float16, B)  + randn(B)*sqrt(var_RU)
VEs= (RUs - RVs)./RUs


## Confidence interval of VEs

CI_VE = [round(quantile!(VEs, 0.025); digits = 4 ), round(quantile!(VEs, 0.975); digits = 4)]

println("two-sided 95% confidence interval of VE is $CI_VE")

## Approximate distribution of VEs estimates 

f = Figure(resolution = (500, 400))
Axis(f[1,1], title = "Approximate distribution of VEs estimates ")
    
density!(VEs, color = (:blue, 0.1), strokecolor = :blue, strokewidth = 1, strokearound = true) 

f

In the code cell below we calculate the effectiveness of the vaccine for the two groups that are 65 or older

NV =  3239+805;
NU =  3255+812;
RV = 1/NV;
RU = (14+5)/NU;
VE = (RU - RV)/RU;

println("Overall VE is: " , round(VE; digits =4))

var_RV = RV*(1-RV)/NV
var_RU = RU*(1-RU)/NU

Overall VE is: 0.9471

1.1433299694990984e-6

Random.seed!(1)

B = 10000 #  number of boostraps

RVs = RV*ones(Float16, B)  + randn(B)*sqrt(var_RV) + 10^(-10)*ones(Float16, B) 
RUs = RU*ones(Float16, B)  + randn(B)*sqrt(var_RU) + 10^(-10)*ones(Float16, B) 
VEs= (RUs - RVs)./RUs

## Confidence interval of VEs

CI_VE = [round(quantile!(VEs, 0.025); digits = 4 ), round(quantile!(VEs, 0.975); digits = 4)]

OneSidedCI_VE = [round(quantile!(VEs, 0.05); digits = 4 ), 1]

println("two-sided  95% confidence interval of VE is $CI_VE")

println("one side 95% confidence interval of VE is $OneSidedCI_VE")

## Approximate distribution of VEs estimates 

f = Figure(resolution = (500, 400))
Axis(f[1,1], title = "Approximate distribution of VEs estimates ")
    
density!(VEs, color = (:gray, 0.1), strokewidth = 1, strokearound = true) 

f

Let’s try the parametric boostrap next, using the fact that the outcome is Bernouli.

NV =  3239+805;
NU =  3255+812;
RV = 1/NV;
RU = (14+5)/NU;
VE = (RU - RV)/RU;

println("Overall VE is: " , round(VE; digits =4))

Overall VE is: 0.9471

Random.seed!(1)

B = 10000 #  number of boostraps
VEs= (RUs - RVs)./RUs

RVs = rand(Binomial(NV,RV), B)
RUs = rand(Binomial(NU,RU), B)
VEs= (RUs - RVs)./RUs

## Confidence interval of VEs

CI_VE = [round(quantile!(VEs, 0.025); digits = 4 ), round(quantile!(VEs, 0.975); digits = 4)]

OneSidedCI_VE = [round(quantile!(VEs, 0.05); digits = 4 ), 1]

println("two-sided 95% confidence interval of VE is $CI_VE")

println("one sided 95% confidence interval of VE is $OneSidedCI_VE")

## Approximate distribution of VEs estimates 

f = Figure(resolution = (500, 400))
Axis(f[1,1], title = "Approximate distribution of VEs estimates ")
    
density!(VEs, color = (:gray, 0.1), strokewidth = 1, strokearound = true) 

f

4.2. Exact Binomial Test Inversion#

It is possible to use exact inference by inverting tests based on the exact binomial nature of the outcome variable. Here, we perform the Cornfield Procedure to find the exact confidence interval on the estimate of vaccine efficacy.

# Exact CI exploiting Bernoulli outcome using the Cornfield Procedure

NV =  19965;
NU =  20172;
RV = 9/NV;
RU = 169/NU;
VE = (RU - RV)/RU;

confint = orpropci(9::Int, NV::Int, 169::Int, NU::Int; alpha = 0.05)::ConfInt

Estimate: 0.053379861731311845 (0.027616272602670834 - 0.10317864685976501)

l=1-0.027616272602670834
u=1-0.10317864685976501
confint = (l,u)

(0.9723837273973291, 0.896821353140235)

Note that this exactly recovers the result in the FDA table (first row).

# Exact CI exploiting Bernoulli outcome for the two groups that are 65 or older

NV =  3239+805;
NU =  3255+812;
RV = 1/NV;
RU = (14+5)/NU;
VE = (RU - RV)/RU;

confint = orpropci(1::Int, NV::Int, 19::Int, NU::Int; alpha = 0.05)::ConfInt

Estimate: 0.05269666870614577 (0.00896277602085047 - 0.30983022294266616)

Next we repeat the cornfield procedure to find the exact confidence interval on vaccine effectiveness for the two groups that are 65 or older. Here we see a big discrepancy between various asymptotic approaches and the exact finite-sample inference. This occurs because the binomial counts are too low for central limit theorems to work successfully.

l=1-0.00896277602085047
u=1-0.30983022294266616
confint = (l,u)

(0.9910372239791495, 0.6901697770573338)

Some RCT Examples

Contents

4. Some RCT Examples#

4.1. Pfizer/BNTX Covid-19 RCT application#

4.2. Exact Binomial Test Inversion#