The metafor Package

This is an old revision of the document!

Rothman et al. (2008)

The Methods

In chapter 15 of their book, Rothman et al. (2008) describe a variety of methods used to conduct stratified analyses in epidemiological research. These methods are actually closely tied to methods used for conducting meta-analyses, so it is instructive to examine how the examples from the chapter can be replicated with the metafor package.

Dataset 1: Table 15-1

The Data

Table 15-1 (page 260) provides data from a large multicenter clinical trial that examined the efficacy of tolbutamide in preventing the complications of diabetes. Despite the use of random assignment, subjects in the treatment group tended to be older than those in the placebo group. Since the outcome (death versus survival) is related to age, the results from a crude analysis may be biased. A stratified analysis can be used to adjust for this age difference.

We can create the same dataset with:

dat <- data.frame(
age = c("Age <55", "Age 55+"),
ai = c(8,22),
bi = c(98,76),
ci = c(5,16),
di = c(115,69))

The contents of dat are then:

      age ai bi ci  di
1 Age <55  8 98  5 115
2 Age 55+ 22 76 16  69

With the convenience function to.table(), we can look at the data in table format with:

to.table(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="OR", slab=age,
         rows=c("Tolbutamide", "Placebo"), cols=c("Dead", "Surviving"))

, , Age <55
 
            Dead Surviving
Tolbutamide    8        98
Placebo        5       115
 
, , Age 55+
 
            Dead Surviving
Tolbutamide   22        76
Placebo       16        69

Stratum-Specific and Crude Risk Differences

The stratum-specific risk differences can be computed with:

summary(escalc(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="RD", digits=3, append=FALSE))

     yi    vi   sei    zi  ci.lb ci.ub
1 0.034 0.001 0.031 1.074 -0.028 0.096
2 0.036 0.004 0.060 0.606 -0.081 0.153

On the other hand, the crude risk difference for these data can be obtained with:

summary(escalc(ai=sum(ai), bi=sum(bi), ci=sum(ci), di=sum(di), data=dat, measure="RD", digits=3, append=FALSE))

     yi    vi   sei    zi  ci.lb ci.ub
1 0.045 0.001 0.033 1.368 -0.019 0.109

As noted by Rothman et al. (2008), the crude risk difference (i.e., $0.045$) actually falls outside of the range of the stratum-specific risk differences (i.e., $0.034$ and $0.036$).

Stratum-Specific and Crude Risk Ratios

The stratum-specific risk ratios can be computed with:

summary(escalc(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="RR", digits=2), transf=exp, append=FALSE)

    yi   vi  sei   zi ci.lb ci.ub
1 1.81 0.31 0.55 1.07  0.61  5.37
2 1.19 0.09 0.29 0.60  0.67  2.12

The crude risk ratio can be obtained with:

summary(escalc(ai=sum(ai), bi=sum(bi), ci=sum(ci), di=sum(di), data=dat, measure="RR", digits=2, append=FALSE), transf=exp)

    yi   vi  sei   zi ci.lb ci.ub
1 1.44 0.07 0.27 1.36  0.85  2.42

Here, the confounding is not as obvious.

Mantel-Haenszel Method for the Risk Difference

Later in the chapter on page 275, Rothman et al. (2008) use the Mantel-Haenszel method to obtain a pooled estimate of the risk difference based on the stratified data. The same results can be obtained with:

rma.mh(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="RD", digits=3, level=90)

Fixed-Effects Model (k = 2)
 
Test for Heterogeneity: 
Q(df = 1) = 0.002, p-val = 0.967
 
Model Results:
 
estimate       se     zval     pval    ci.lb    ci.ub 
   0.035    0.032    1.094    0.274   -0.018    0.087

For direct comparability with the results reported in the chapter, a 90% confidence interval is requested. We obtain a Mantel-Haenszel risk difference of $0.035$ with an estimated SD of $0.032$ and a 90% confidence interval with limits $-0.018$ and $0.087$.

Mantel-Haenszel Method for the Risk Ratio

For the risk ratio, the results from the Mantel-Haenszel method can be obtained with:

rma.mh(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="RR", digits=2, level=90)

Fixed-Effects Model (k = 2)
 
Test for Heterogeneity: 
Q(df = 1) = 0.45, p-val = 0.50
 
Model Results (log scale):
 
estimate       se     zval     pval    ci.lb    ci.ub 
    0.28     0.26     1.09     0.28    -0.14     0.71 
 
Model Results (RR scale):
 
estimate    ci.lb    ci.ub 
    1.33     0.87     2.03

The Mantel-Haenszel risk-ratio estimate is $1.33$. The limits of the 90% confidence interval are $0.87$ and $2.03$.

Cochran-Mantel-Haenszel Test

The Cochran-Mantel-Haenszel test for stratified 2×2 table data is described on page 278. It can be obtained with:

rma.mh(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="OR", correct=FALSE, digits=2, level=90)

Fixed-Effects Model (k = 2)
 
Test for Heterogeneity: 
Q(df = 1) = 0.35, p-val = 0.56
 
Model Results (log scale):
 
estimate       se     zval     pval    ci.lb    ci.ub 
    0.34     0.31     1.09     0.28    -0.17     0.85 
 
Model Results (OR scale):
 
estimate    ci.lb    ci.ub 
    1.40     0.84     2.34 
 
Cochran-Mantel-Haenszel Test:    CMH = 1.19, df = 1, p-val = 0.28
Tarone's Test for Heterogeneity: X^2 = 0.35, df = 1, p-val = 0.55

Note that the equation given in the book does not include the continuity correction (hence, the use of correct=FALSE above). This again matches what is reported in the chapter on page 278 (note: Rothman et al. report $\chi_{score} = 1.09$, so this value needs to be squared before comparing it against the chi-square value of $1.19$ given above).

Testing Homogeneity

The Wald-type test for homogeneity is described on page 279. The results reported here ($\chi^2_{Wald} = 0.45$ for the risk ratios, $\chi^2_{Wald} = 0.001$ for the risk differences) essentially match the results given in the output earlier (Q(df = 1) = 0.45, p-val = 0.50 for the risk ratios, Q(df = 1) = 0.002, p-val = 0.967 for the risk differences).¹⁾

Dataset 2: Table 15-2

The Data

Table 15-2 (page 264) provides stratified person-time data on age-specific coronary disease deaths among British male doctors for smokers versus nonsmokers.

We can create the same dataset with:

dat <- data.frame(
age = c("35-44", "45-54", "55-64", "65-74", "75-84"),
x1i = c(32, 104, 206, 186, 102),
t1i = c(52407, 43248, 28612, 12663, 5317) / 10000,
x2i = c(2, 12, 28, 28, 31),
t2i = c(18790, 10673, 5710, 2585, 1462) / 10000)

The contents of dat are then:

    age x1i    t1i x2i    t2i
1 35-44  32 5.2407   2 1.8790
2 45-54 104 4.3248  12 1.0673
3 55-64 206 2.8612  28 0.5710
4 65-74 186 1.2663  28 0.2585
5 75-84 102 0.5317  31 0.1462

Note that the person-years have been divided by 10,000 so that rates are given in terms of 10,000 person-years.

With the convenience function to.table(), we can look at the data in table format with:

to.table(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRR", slab=age,
         rows=c("Smokers", "Nonsmokers"), cols=c("Deaths", "Years"))

, , 35-44
 
           Deaths  Years
Smokers        32 5.2407
Nonsmokers      2 1.8790
 
, , 45-54
 
           Deaths  Years
Smokers       104 4.3248
Nonsmokers     12 1.0673
 
, , 55-64
 
           Deaths  Years
Smokers       206 2.8612
Nonsmokers     28 0.5710
 
, , 65-74
 
           Deaths  Years
Smokers       186 1.2663
Nonsmokers     28 0.2585
 
, , 75-84
 
           Deaths  Years
Smokers       102 0.5317
Nonsmokers     31 0.1462

Stratum-Specific and Crude Rate Differences

The stratum-specific rate differences can be computed with:

summary(escalc(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRD", digits=1, append=FALSE))

     yi     vi  sei   zi  ci.lb ci.ub
1   5.0    1.7  1.3  3.8    2.5   7.6
2  12.8   16.1  4.0  3.2    4.9  20.7
3  23.0  111.0 10.5  2.2    2.3  43.6
4  38.6  535.0 23.1  1.7   -6.8  83.9
5 -20.2 1811.1 42.6 -0.5 -103.6  63.2

On the other hand, the crude rate difference for these data can be obtained with:

summary(escalc(x1i=sum(x1i), x2i=sum(x2i), t1i=sum(t1i), t2i=sum(t2i), data=dat, measure="IRD", digits=1, append=FALSE))

    yi  vi sei  zi ci.lb ci.ub
1 18.5 9.7 3.1 6.0  12.4  24.6

Stratum-Specific and Crude Rate Ratios

The stratum-specific rate ratios can be computed with:

summary(escalc(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRR", digits=1, append=FALSE), transf=exp)

   yi  vi sei   zi ci.lb ci.ub
1 5.7 0.5 0.7  2.4   1.4  23.9
2 2.1 0.1 0.3  2.5   1.2   3.9
3 1.5 0.0 0.2  1.9   1.0   2.2
4 1.4 0.0 0.2  1.5   0.9   2.0
5 0.9 0.0 0.2 -0.5   0.6   1.4

The values are also given in the last column of Table 15-2.

The crude rate ratio can be obtained with:

summary(escalc(x1i=sum(x1i), x2i=sum(x2i), t1i=sum(t1i), t2i=sum(t2i), data=dat, measure="IRR", digits=1, append=FALSE), transf=exp)

   yi  vi sei  zi ci.lb ci.ub
1 1.7 0.0 0.1 5.1   1.4   2.1

Mantel-Haenszel Method for the Rate Difference

Later in the chapter on page 274, Rothman et al. (2008) use the Mantel-Haenszel method to obtain a pooled estimate of the rate difference based on the stratified data. The same results can be obtained with:

rma.mh(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRD", digits=2, level=90)

Fixed-Effects Model (k = 5)
 
Test for Heterogeneity: 
Q(df = 4) = 26.88, p-val < .01
 
Model Results:
 
estimate       se     zval     pval    ci.lb    ci.ub 
   11.44     3.09     3.70     <.01     6.35    16.53

We obtain a Mantel-Haenszel rate difference of $11.44$ and a 90% confidence interval with limits $6.35$ and $16.53$ per 10,000 person-years.

Mantel-Haenszel Method for the Rate Ratio

For the rate ratio, the results from the Mantel-Haenszel method can be obtained with:

rma.mh(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRR", digits=2, level=90)

Fixed-Effects Model (k = 5)
 
Test for Heterogeneity: 
Q(df = 4) = 10.41, p-val = 0.03
 
Model Results (log scale):
 
estimate       se     zval     pval    ci.lb    ci.ub 
    0.35     0.11     3.30     <.01     0.18     0.53 
 
Model Results (IRR scale):
 
estimate    ci.lb    ci.ub 
    1.42     1.19     1.70 
 
Mantel-Haenszel Test: MH = 10.70, df = 1, p-val < .01

The Mantel-Haenszel risk-ratio estimate is $1.42$. The limits of the 90% confidence interval are $1.19$ and $1.70$.

Mantel-Haenszel Test

The Mantel-Haenszel test for stratified person-time data is described on page 277. It can be obtained with:

rma.mh(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRR", level=90, correct=FALSE)

Fixed-Effects Model (k = 5)
 
Test for Heterogeneity: 
Q(df = 4) = 10.4117, p-val = 0.0340
 
Model Results (log scale):
 
estimate       se     zval     pval    ci.lb    ci.ub 
  0.3539   0.1072   3.3018   0.0010   0.1776   0.5303 
 
Model Results (IRR scale):
 
estimate    ci.lb    ci.ub 
  1.4247   1.1944   1.6994 
 
Mantel-Haenszel Test: MH = 11.0162, df = 1, p-val = 0.0009

Note that the equation given in the book does not include the continuity correction (hence, the use of correct=FALSE above). After squaring, Rothman et al, report (630-595.2)^2 / 110.1, which is equal to $10.9995$. The difference should only be due to rounding in the book.

Testing Homogeneity

The results from the Wald-type test for homogeneity are described on page 280. The results reported here ($\chi^2_{Wald} = 10.41$ for the rate ratio) matches the results given in the output earlier (Q(df = 4) = 10.41, p-val = 0.03 for the rate ratios).

Maximum-Likelihood Estimation

On pages 271-272, Rothman et al. discuss maximum likelihood estimators and the difference between unconditional versus conditional analyses. The unconditional MLE of the common rate ratio can be obtained with:

res <- rma.glmm(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRR", digits=2, level=90, model="UM.FS", method="FE")
predict(res, transf=exp)

 pred ci.lb ci.ub
 1.43  1.19  1.70

The conditional MLE can be obtained with:

res <- rma.glmm(x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i, data=dat, measure="IRR", digits=2, level=90, model="CM.EL", method="FE")
predict(res, transf=exp)

 pred ci.lb ci.ub
 1.43  1.19  1.70

As noted on page 274, these estimates are essentially indistinguishable from those obtained with the Mantel-Haenszel method.

Dataset 3: Table 15-5

The Data

Table 15-5 (page 276) provides stratified 2×2 table data from a case-control study of Down syndrome and maternal spermicide use before conception.

The same dataset can be created with:

dat <- data.frame(
age = c("<35", "35+"),
ai = c(3,1),
bi = c(9,3),
ci = c(104,5),
di = c(1059,86))

The contents of dat are then:

  age ai bi  ci   di
1 <35  3  9 104 1059
2 35+  1  3   5   86

With the convenience function to.table(), we can look at the data in table format with:

to.table(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="OR", slab=age,
         rows=c("Down Syndrome", "Control"), cols=c("Spermicide Use", "No Spermicide"))

, , <35
 
              Spermicide Use No Spermicide
Down Syndrome              3             9
Control                  104          1059
 
, , 35+
 
              Spermicide Use No Spermicide
Down Syndrome              1             3
Control                    5            86

Mantel-Haenszel Method for the Odds Ratio

On page 276, Rothman et al. (2008) describe the Mantel-Haenszel method for odds ratios. The method can be applied to the data from the case-control study with:

rma.mh(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="OR", digits=2, level=90, correct=FALSE)

Fixed-Effects Model (k = 2)
 
Test for Heterogeneity: 
Q(df = 1) = 0.14, p-val = 0.71
 
Model Results (log scale):
 
estimate       se     zval     pval    ci.lb    ci.ub 
    1.33     0.59     2.25     0.02     0.36     2.30 
 
Model Results (OR scale):
 
estimate    ci.lb    ci.ub 
    3.78     1.43    10.00 
 
Cochran-Mantel-Haenszel Test:    CMH = 5.81, df = 1, p-val = 0.02
Tarone's Test for Heterogeneity: X^2 = 0.14, df = 1, p-val = 0.71

The Mantel-Haenszel estimate of the odds ratio is $3.78$. The limits of the 90% confidence interval are $1.43$ and $10.00$.

Cochran-Mantel-Haenszel Test

The Cochran-Mantel-Haenszel test (without continuity correction) is already included in the output above. On page 278, Rothman et al. report the value $2.41$, which is equal to $5.81$ after squaring.

Maximum-Likelihood Estimation

On page 276, the unconditional and conditional MLEs of the common odds ratio are given as $3.79$ and $3.76$, respectively. We can obtain those values with:

res <- rma.glmm(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="OR", digits=2, level=90, model="UM.FS", method="FE")
predict(res, transf=exp)

 pred ci.lb ci.ub
 3.79  1.43 10.03

and

res <- rma.glmm(ai=ai, bi=bi, ci=ci, di=di, data=dat, measure="OR", digits=2, level=90, model="CM.EL", method="FE")
predict(res, transf=exp)

 pred ci.lb ci.ub
 3.76  1.43  9.94

Despite the relatively small counts in some of the cells, the Mantel-Haenszel method and the results based on maximum likelihood estimation are very similar.

References

Rothman, K. J., Greenland, S., & Lash, T. L. (2008). Modern epidemiology (3rd ed.). Philadelphia: Lippincott Williams & Wilkins.

¹⁾

While Rothman et al. use the MLE of the common risk ratio and the common risk difference to compute these Wald-type tests, the rma.mh() substitutes the corresponding Mantel-Haenszel estimates. Rothman et al. do not recommend this practice, but it is not clear to me why this would 'invalidate' the test statistics.

Table of Contents

Rothman et al. (2008)

The Methods

Dataset 1: Table 15-1

The Data

Stratum-Specific and Crude Risk Differences

Stratum-Specific and Crude Risk Ratios

Mantel-Haenszel Method for the Risk Difference

Mantel-Haenszel Method for the Risk Ratio

Cochran-Mantel-Haenszel Test

Testing Homogeneity

Dataset 2: Table 15-2

The Data

Stratum-Specific and Crude Rate Differences

Stratum-Specific and Crude Rate Ratios

Mantel-Haenszel Method for the Rate Difference

Mantel-Haenszel Method for the Rate Ratio

Mantel-Haenszel Test

Testing Homogeneity

Maximum-Likelihood Estimation

Dataset 3: Table 15-5

The Data

Mantel-Haenszel Method for the Odds Ratio

Cochran-Mantel-Haenszel Test

Maximum-Likelihood Estimation

References