# The metafor Package

A Meta-Analysis Package for R

### Sidebar

analyses:gleser2009

## Gleser & Olkin (2009)

### Introduction

The chapter on stochastically dependent effect sizes by Gleser and Olkin (2009) in The handbook of research synthesis and meta-analysis (2nd ed.) describes how multiple-treatment and multiple-endpoint studies can be meta-analyzed. In particular, whenever there is at least some overlap in the individuals used to compute multiple effect size estimates in a particular study, then the estimates from that study can no longer be treated as independent. The chapter describes how the correlation between the effect size estimates can be computed for a variety of outcome measures and how this information can be incorporated into the analysis.

Below, the analyses described in the chapter are recreated using R and the metafor package, so let's load the package first:

library(metafor)

### Multiple-Treatment Studies

In a multiple-treatment study, more than one treatment (or variation of the treatment) is compared against a single control group. For each treatment group, we can compute an effect size estimate, contrasting the treatment group against the control group. Since the information from the control group is repeatedly used to compute the effect size estimates, the resulting effect size estimates are correlated.

#### Dichotomous Response Variable

When the response variable within the individual studies is dichotomous, commonly used outcome measures for the meta-analysis are the risk difference, odds ratio, risk ratio, and the arcsine transformed risk difference. The data in Table 19.1 are used to illustrate these measures. This dataset can be recreated with:

dat <- data.frame(study=c(1,1,2,3,3,3), trt=c(1,2,1,1,2,3),
ai=c( 40, 40, 10,150,150,150), n1i=c(1000,1000,200,2000,2000,2000),
ci=c(100,150, 15, 40, 80, 50), n2i=c(4000,4000,400,1000,1000,1000))
dat$pti <- with(dat, ci / n2i) dat$pci <- with(dat, ai / n1i)
dat
  study trt  ai  n1i  ci  n2i    pti   pci
1     1   1  40 1000 100 4000 0.0250 0.040
2     1   2  40 1000 150 4000 0.0375 0.040
3     2   1  10  200  15  400 0.0375 0.050
4     3   1 150 2000  40 1000 0.0400 0.075
5     3   2 150 2000  80 1000 0.0800 0.075
6     3   3 150 2000  50 1000 0.0500 0.075

Note that the dataset is created in the "long format", with multiple rows for the multi-treatment studies (i.e., studies 1 and 3).

The risk differences and corresponding sampling variances can be added to the dataset with:

dat <- escalc(measure="RD", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat)
dat
  study trt  ai  n1i  ci  n2i    pti   pci      yi     vi
1     1   1  40 1000 100 4000 0.0250 0.040  0.0150 0.0000
2     1   2  40 1000 150 4000 0.0375 0.040  0.0025 0.0000
3     2   1  10  200  15  400 0.0375 0.050  0.0125 0.0003
4     3   1 150 2000  40 1000 0.0400 0.075  0.0350 0.0001
5     3   2 150 2000  80 1000 0.0800 0.075 -0.0050 0.0001
6     3   3 150 2000  50 1000 0.0500 0.075  0.0250 0.0001

As in the article, the risk differences are computed with $p_C - p_T$ to avoid a large number of negative estimates.

Multiple risk differences from the same study are correlated. We therefore need to construct the entire variance-covariance matrix for the 6 effect size estimates. This can be done with:

calc.v <- function(x) {
v <- matrix(x$pci*(1-x$pci)/x$n1i, nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi
v
}
V <- bldiag(lapply(split(dat, dat$study), calc.v)) Since the numbers in V are quite small, we can use: round(V * 10^6, 3) to get a better impression of the structure of the V matrix:  [,1] [,2] [,3] [,4] [,5] [,6] [1,] 44.494 38.400 0.000 0.000 0.000 0.000 [2,] 38.400 47.423 0.000 0.000 0.000 0.000 [3,] 0.000 0.000 327.734 0.000 0.000 0.000 [4,] 0.000 0.000 0.000 73.088 34.688 34.688 [5,] 0.000 0.000 0.000 34.688 108.287 34.688 [6,] 0.000 0.000 0.000 34.688 34.688 82.188 The values along the diagonal are the (scaled) sampling variances of the risk differences (i.e., scaled by$10^6$). The off-diagonal elements are the (scaled) covariances. Note the block-diagonal structure: Multiple risk differences from the same study are correlated, while risk differences from different studies are uncorrelated. Now the model described in equation (19.5) can be fitted with: res <- rma.mv(yi, V, mods = ~ factor(trt) - 1, data=dat) res Multivariate Meta-Analysis Model (k = 6; method: FE) Variance Components: none Test for Residual Heterogeneity: QE(df = 3) = 7.1907, p-val = 0.0661 Test of Moderators (coefficient(s) 1,2,3): QM(df = 3) = 30.4173, p-val < .0001 Model Results: estimate se zval pval ci.lb ci.ub factor(trt)1 0.0200 0.0050 4.0347 <.0001 0.0103 0.0297 *** factor(trt)2 0.0043 0.0053 0.8011 0.4231 -0.0062 0.0148 factor(trt)3 0.0211 0.0084 2.5304 0.0114 0.0048 0.0375 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 The variance-covariance matrix (scaled by a factor of$10^6$) of the parameter estimates is equal to: round(vcov(res) * 10^6, 3)  factor(trt)1 factor(trt)2 factor(trt)3 factor(trt)1 24.612 19.954 13.323 factor(trt)2 19.954 28.538 13.255 factor(trt)3 13.323 13.255 69.806 These results match what is reported on pages 361-362.1) The analysis can also be conducted with (log) odds ratios as follows: dat <- escalc(measure="OR", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix(1/(x$n1i*x$pci*(1-x$pci)), nrow=nrow(x), ncol=nrow(x))
diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v))

res <- rma.mv(yi, V, mods = ~ factor(trt) - 1, data=dat)
res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 2.0563, p-val = 0.5608

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 31.1204, p-val < .0001

Model Results:

estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.5099  0.1188  4.2918  <.0001   0.2771  0.7428  ***
factor(trt)2    0.0044  0.1086  0.0403  0.9679  -0.2084  0.2171
factor(trt)3    0.4301  0.1644  2.6162  0.0089   0.1079  0.7523   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

These results are given on pages 362-363.

To analyze (log) risk ratios, we use:

dat <- escalc(measure="RR", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat)

calc.v <- function(x) {
v <- matrix((1-x$pci)/(x$n1i*x$pci), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi
v
}

V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ factor(trt) - 1, data=dat) res Multivariate Meta-Analysis Model (k = 6; method: FE) Variance Components: none Test for Residual Heterogeneity: QE(df = 3) = 1.8954, p-val = 0.5944 Test of Moderators (coefficient(s) 1,2,3): QM(df = 3) = 30.9802, p-val < .0001 Model Results: estimate se zval pval ci.lb ci.ub factor(trt)1 0.4875 0.1135 4.2972 <.0001 0.2652 0.7099 *** factor(trt)2 0.0006 0.1018 0.0055 0.9956 -0.1991 0.2002 factor(trt)3 0.4047 0.1554 2.6035 0.0092 0.1000 0.7094 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Finally, the chapter illustrates how to analyze the difference of arcsine square-root transformed proportions (a variance stabilizing transformation for proportions). These results can be replicated with: dat <- escalc(measure="AS", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix(1/(4*x$n1i), nrow=nrow(x), ncol=nrow(x))
diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v))

res <- rma.mv(yi, V, mods = ~ factor(trt) - 1, data=dat)
res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 4.2634, p-val = 0.2344

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 31.8052, p-val < .0001

Model Results:

estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.0505  0.0120  4.1921  <.0001   0.0269  0.0741  ***
factor(trt)2    0.0051  0.0123  0.4124  0.6800  -0.0191  0.0292
factor(trt)3    0.0491  0.0185  2.6503  0.0080   0.0128  0.0854   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

These results are given on pages 364-365.2)

#### Quantitative Response Variable

When quantitative response variables are measured in the individual studies, the standardized mean difference is often used as the effect size measure. The illustrative dataset from Table 19.3 can be recreated with:

dat <- data.frame(study=c(1,1,2,3,4,4), trt=c(1,2,1,1,1,2),
m1i=c(7.87, 4.35, 9.32, 8.08, 7.44, 5.34),
m2i=c(-1.36, -1.36, 0.98, 1.17, 0.45, 0.45),
sdpi=c(4.2593,4.2593,2.8831,3.1764,2.9344,2.9344),
n1i=c(25,22,38,50,30,30), n2i=c(25,25,40,50,30,30))

In addition, the total sample sizes of the individual studies are needed for further computations. This information can be added to the dataset with:

dat$Ni <- unlist(lapply(split(dat, dat$study), function(x) rep(sum(x$n1i) + x$n2i, each=nrow(x))))

Finally, the standardized mean differences and corresponding sampling variances can be computed with:

dat$yi <- with(dat, (m1i-m2i)/sdpi) dat$vi <- with(dat, 1/n1i + 1/n2i + yi^2/(2*Ni))
dat

The dataset now contains:

  study trt  m1i   m2i   sdpi n1i n2i  Ni       yi         vi
1     1   1 7.87 -1.36 4.2593  25  25  72 2.167023 0.11261102
2     1   2 4.35 -1.36 4.2593  22  25  72 1.340596 0.09793508
3     2   1 9.32  0.98 2.8831  38  40  78 2.892720 0.10495571
4     3   1 8.08  1.17 3.1764  50  50 100 2.175419 0.06366223
5     4   1 7.44  0.45 2.9344  30  30  90 2.382088 0.09819080
6     4   2 5.34  0.45 2.9344  30  30  90 1.666439 0.08209456

Again, multiple standardized mean differences from the same study are correlated due to the repeated use of the information from the control group. The variance-covariance matrix of the effect size estimates can be constructed with:

calc.v <- function(x) {
v <- matrix(1/x$n2i + outer(x$yi, x$yi, "*")/(2*x$Ni), nrow=nrow(x), ncol=nrow(x))
diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v))
V
           [,1]       [,2]      [,3]       [,4]      [,5]       [,6]
[1,] 0.11261102 0.06017432 0.0000000 0.00000000 0.0000000 0.00000000
[2,] 0.06017432 0.09793508 0.0000000 0.00000000 0.0000000 0.00000000
[3,] 0.00000000 0.00000000 0.1049557 0.00000000 0.0000000 0.00000000
[4,] 0.00000000 0.00000000 0.0000000 0.06366223 0.0000000 0.00000000
[5,] 0.00000000 0.00000000 0.0000000 0.00000000 0.0981908 0.05538670
[6,] 0.00000000 0.00000000 0.0000000 0.00000000 0.0553867 0.08209456

Again, we note the block-diagonal structure.

The analysis can now be carried out with:

res <- rma.mv(yi, V, mods = ~ factor(trt) - 1, data=dat)
print(res, digits=3)
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 4) = 3.945, p-val = 0.414

Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 252.165, p-val < .001

Model Results:

estimate     se    zval   pval  ci.lb  ci.ub
factor(trt)1     2.374  0.150  15.804  <.001  2.080  2.669  ***
factor(trt)2     1.570  0.189   8.330  <.001  1.201  1.940  ***

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The variance-covariance matrix of the parameter estimates is equal to:

round(vcov(res), 5)
             factor(trt)1 factor(trt)2
factor(trt)1      0.02257      0.01244
factor(trt)2      0.01244      0.03554

The same results are given on page 367.

### Multiple-Endpoint Studies

In a multiple-endpoint study, more than one response variable is measured in the same group of subjects. The response variables may represent different ways of measuring the same construct (e.g., two different depression scales), different constructs (e.g., depression and anxiety), or multiple measurements over time (e.g., immediately after the treatment and at a later follow-up assessment). In any case, the corresponding effect size estimates are then (likely to be) correlated.

Such a scenario is illustrated with the data in Table 19.5. The dataset can be recreated with:

dat <- data.frame(school=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7),
m1i=c(2.3,2.5,2.4,1.3,2.5,2.4,3.3,1.7,1.1,2.0,2.8,2.1,1.7,0.6),
m2i=c(10.3,6.6,9.7,3.1,8.7,3.7,7.5,8.5,2.2,2.1,3.8,1.4,1.8,3.9),
sdpi=c(8.2,7.3,8.3,8.9,8.5,8.3,7.7,9.8,9.1,10.4,9.6,7.9,9.2,10.2),
ri=rep(c(0.55,0.43,0.57,0.66,0.51,0.59,0.49), each=2),
n1i=rep(c(22,21,26,18,38,42,39), each=2),
n2i=rep(c(24,21,23,18,36,42,38), each=2))

The standardized mean differences, sampling variances, and covariances can be added to the dataset with:

dat$yi <- round(with(dat, (m2i-m1i)/sdpi), 3) dat$vi <- round(with(dat, 1/n1i + 1/n2i + yi^2/(2*(n1i+n2i))), 4)
dat$covi <- round(with(dat, (1/n1i + 1/n2i) * ri + (rep(sapply(split(dat$yi, dat$school), prod), each=2) / (2*(n1i+n2i))) * ri^2), 4) The contents of the resulting dataset are:  school outcome m1i m2i sdpi ri n1i n2i yi vi covi 1 1 math 2.3 10.3 8.2 0.55 22 24 0.976 0.0975 0.0497 2 1 reading 2.5 6.6 7.3 0.55 22 24 0.562 0.0906 0.0497 3 2 math 2.4 9.7 8.3 0.43 21 21 0.880 0.1045 0.0413 4 2 reading 1.3 3.1 8.9 0.43 21 21 0.202 0.0957 0.0413 5 3 math 2.5 8.7 8.5 0.57 26 23 0.729 0.0874 0.0471 6 3 reading 2.4 3.7 8.3 0.57 26 23 0.157 0.0822 0.0471 7 4 math 3.3 7.5 7.7 0.66 18 18 0.545 0.1152 0.0756 8 4 reading 1.7 8.5 9.8 0.66 18 18 0.694 0.1178 0.0756 9 5 math 1.1 2.2 9.1 0.51 38 36 0.121 0.0542 0.0276 10 5 reading 2.0 2.1 10.4 0.51 38 36 0.010 0.0541 0.0276 11 6 math 2.8 3.8 9.6 0.59 42 42 0.104 0.0477 0.0281 12 6 reading 2.1 1.4 7.9 0.59 42 42 -0.089 0.0477 0.0281 13 7 math 1.7 1.8 9.2 0.49 39 38 0.011 0.0520 0.0255 14 7 reading 0.6 3.9 10.2 0.49 39 38 0.324 0.0526 0.0255 The variance-covariance matrix for the entire dataset can be constructed with: V <- bldiag(lapply(split(dat, dat$school), function(x) matrix(c(x$vi, x$covi, x$covi, x$vi), nrow=2)))
V

The variance-covariance matrix again has a block-diagonal structure:

        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]  [,11]  [,12]  [,13]  [,14]
[1,] 0.0975 0.0497 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[2,] 0.0497 0.0906 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[3,] 0.0000 0.0000 0.1045 0.0413 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[4,] 0.0000 0.0000 0.0413 0.0957 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[5,] 0.0000 0.0000 0.0000 0.0000 0.0874 0.0471 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[6,] 0.0000 0.0000 0.0000 0.0000 0.0471 0.0822 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1152 0.0756 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0756 0.1178 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0542 0.0276 0.0000 0.0000 0.0000 0.0000
[10,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0276 0.0541 0.0000 0.0000 0.0000 0.0000
[11,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0477 0.0281 0.0000 0.0000
[12,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0281 0.0477 0.0000 0.0000
[13,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0520 0.0255
[14,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0255 0.0526

Finally, we can fit a model allowing for a different treatment effect depending on the outcome with:

res <- rma.mv(yi, V, mods = ~ outcome - 1, data=dat)
print(res, digits=3)
Multivariate Meta-Analysis Model (k = 14; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 12) = 19.626, p-val = 0.074

Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 13.005, p-val = 0.001

Model Results:

estimate     se   zval   pval  ci.lb  ci.ub
outcomemath        0.362  0.100  3.603  <.001  0.165  0.558  ***
outcomereading     0.205  0.099  2.062  0.039  0.010  0.400    *

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Categorical and/or quantitative moderator variables can also be included in such a model as described on pages 371-373.

Note that all of the models described above are fixed-effects models. Multivariate random/mixed-effects models can also be fitted with the rma.mv() function. For further examples, see Berkey et al. (1998) and van Houwelingen, Arends, and Stijnen (2002).

Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 357–376). New York: Russell Sage Foundation.

1)
The value of the $Q$ statistic given on page 362 (i.e., $Q = 17.191$) is a typo.
2)
Note that escalc() computes the arcsine transformed risk differences with the equation $arcsin[\sqrt{p_C}] - arcsin[\sqrt{p_T}]$, so the results differ by a multiplicative factor of 2 from those given in the chapter.

### Page Tools 