The metafor Package

This is an old revision of the document!

Assembling Data for a Meta-Analysis of Standardized Mean Differences

Suppose the goal of a meta-analysis is to aggregate the results from studies contrasting two groups (e.g., treatment versus control) and each study measured an outcome of interest using some quantitative scale. A commonly used effect size measure used to quantify the size of the group difference is then the standardized mean difference (also commonly known as Cohen's d).

As an example, consider the data reported in Normand (1999) on the length of the hospital stay of stroke patients under specialized care and under conventional/routine non-specialist care:

library(metafor)
dat.normand1999

  study             source n1i m1i sd1i n2i m2i sd2i
1     1          Edinburgh 155  55   47 156  75   64
2     2     Orpington-Mild  31  27    7  32  29    4
3     3 Orpington-Moderate  75  64   17  71 119   29
4     4   Orpington-Severe  18  66   20  18 137   48
5     5      Montreal-Home   8  14    8  13  18   11
6     6  Montreal-Transfer  57  19    7  52  18    4
7     7          Newcastle  34  52   45  33  41   34
8     8               Umea 110  21   16 183  31   27
9     9            Uppsala  60  30   27  52  23   20

Variables n1i and n2i indicate the number of patients under specialized and under routine care, respectively, variables m1i and m2i indicate the respective mean length of stay (in days) in the two groups during the acute hospital admission, and sd1i and sd2i are the corresponding standard deviations of the length of stay.

With this information, we can compute the standardized mean difference (and corresponding sampling variance) for each study with:

dat1 <- escalc(measure="SMD", m1i=m1i, sd1i=sd1i, n1i=n1i,
                              m2i=m2i, sd2i=sd2i, n2i=n2i, data=dat.normand1999)
dat1

  study             source n1i m1i sd1i n2i m2i sd2i      yi     vi
1     1          Edinburgh 155  55   47 156  75   64 -0.3552 0.0131
2     2     Orpington-Mild  31  27    7  32  29    4 -0.3479 0.0645
3     3 Orpington-Moderate  75  64   17  71 119   29 -2.3176 0.0458
4     4   Orpington-Severe  18  66   20  18 137   48 -1.8880 0.1606
5     5      Montreal-Home   8  14    8  13  18   11 -0.3840 0.2054
6     6  Montreal-Transfer  57  19    7  52  18    4  0.1721 0.0369
7     7          Newcastle  34  52   45  33  41   34  0.2721 0.0603
8     8               Umea 110  21   16 183  31   27 -0.4246 0.0149
9     9            Uppsala  60  30   27  52  23   20  0.2896 0.0363

Finally, a random-effects model can be fitted to these data with:

res1 <- rma(yi, vi, data=dat1)
res1

Random-Effects Model (k = 9; tau^2 estimator: REML)
 
tau^2 (estimated amount of total heterogeneity): 0.7908 (SE = 0.4281)
tau (square root of estimated tau^2 value):      0.8893
I^2 (total heterogeneity / total variability):   95.49%
H^2 (total variability / sampling variability):  22.20
 
Test for Heterogeneity: 
Q(df = 8) = 123.7293, p-val < .0001
 
Model Results:
 
estimate       se     zval     pval    ci.lb    ci.ub          
 -0.5371   0.3087  -1.7401   0.0818  -1.1421   0.0679        . 
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Therefore, stroke patients receiving specialized care are estimated to spent on average about half of a standard deviation fewer days in the hospital compared to patients receiving routine care ($\hat{\mu} = -0.54$ with 95% CI: $-1.14$ to $0.07$), but there is a considerable amount of heterogeneity in the findings (as indicated by the large estimate of $\tau^2$, the large $I^2$ value, and the significant $Q$-test).

Note: Since the exact same variable (number of days of the hospital stay) was measured in each study, it would also be possible (and probably better) to conduct the meta-analysis with the raw (unstandardized) mean difference as the effect size measure (or possibly with the (log transformed) ratio of means, since the variable was measured on a ratio scale). The data are used here just for illustration purposes.

Now suppose that the means and standard deviations of the length of stay variable are not reported in all studies, but that the following dataset could be assembled based on information reported in the studies:

dat2 <- dat.normand1999
dat2$dval <- with(dat2, (m1i - m2i) / sqrt(((n1i-1)*sd1i^2 + (n2i-1)*sd2i^2)/(n1i + n2i - 2)))
dat2$tval <- with(dat2, dval / sqrt(1/n1i + 1/n2i))
dat2$pval <- 2 * with(dat2, pt(abs(tval), df = n1i + n2i - 2, lower.tail=FALSE))
dat2$sign <- with(dat2, ifelse(m1i > m2i, 1, -1))
dat2[,c("dval","tval","pval")] <- round(dat2[,c("dval","tval","pval")], 2)
dat2[c(1,7),c(4,5,7,8,10,11,12)] <- NA
dat2[c(5,8),c(4,5,7,8, 9,11,12)] <- NA
dat2[c(2,9),c(4,5,7,8, 9,10)]    <- NA
dat2[c(3,4,6),c(9:12)] <- NA
dat2

  study             source n1i m1i sd1i n2i m2i sd2i  dval  tval pval sign
1     1          Edinburgh 155  NA   NA 156  NA   NA -0.36    NA   NA   NA
2     2     Orpington-Mild  31  NA   NA  32  NA   NA    NA    NA 0.17   -1
3     3 Orpington-Moderate  75  64   17  71 119   29    NA    NA   NA   NA
4     4   Orpington-Severe  18  66   20  18 137   48    NA    NA   NA   NA
5     5      Montreal-Home   8  NA   NA  13  NA   NA    NA -0.89   NA   NA
6     6  Montreal-Transfer  57  19    7  52  18    4    NA    NA   NA   NA
7     7          Newcastle  34  NA   NA  33  NA   NA  0.28    NA   NA   NA
8     8               Umea 110  NA   NA 183  NA   NA    NA -3.53   NA   NA
9     9            Uppsala  60  NA   NA  52  NA   NA    NA    NA 0.13    1

In particular, in studies 1 and 7, authors reported only the standardized mean difference value (Cohen's d value). In studies 5 and 8, the authors only reported the t-statistic from an independent samples t-test comparing the two groups, and in studies 2 and 9, the authors only reported the (two-sided) p-value corresponding to the t-test. For all studies, the group sizes are known.¹⁾

Given only this information, it is possible to reconstruct the full dataset for the meta-analysis (after each step, I would recommend examining the contents of the dat2 object to better appreciate what the code is doing).

For the studies reporting means and standard deviations (studies 3, 4, and 6), we can again use the escalc() function to obtain the (bias-corrected) standardized mean differences and corresponding sampling variances with:
```
dat2 <- escalc(measure="SMD", m1i=m1i, sd1i=sd1i, n1i=n1i,
                              m2i=m2i, sd2i=sd2i, n2i=n2i, data=dat2)
```
The p-values from the t-tests can be converted into the corresponding t-values with:
```
dat2$tval <- replmiss(dat2$tval, with(dat2, sign * qt(pval/2, df=n1i+n2i-2, lower.tail=FALSE)))
```
Note that the two-sided p-values need to be turned into one-sided p-values first (i.e., by dividing by 2) before using the qt() function. Also, the p-values do not contain information on the sign of the t-test values (i.e., whether the first or the second group had the lower mean), so that information needs to be available as well (as encoded in the sign variable). The replmiss() function is a little helper function in the metafor package that can be used to only replace the missing values.
The t-test values can be converted into the corresponding Cohen's d values with:
```
dat2$dval <- replmiss(dat2$dval, with(dat2, tval * sqrt(1/n1i + 1/n2i)))
```
This conversion is based on the well-known relationship between Cohen's d values and the test statistic of the independent samples t-test. Again, only missing values are replaced.
Cohen's d values are known to have a slight positive bias. The missing values for the yi variable can now be replaced with the bias-corrected values with:
```
dat2$yi <- replmiss(dat2$yi, with(dat2, (1 - 3/(4*(n1i+n2i-2) - 1)) * dval))
```
Any missing values for the vi variable (the sampling variances) can now be replaced with:
```
dat2$vi <- replmiss(dat2$vi, with(dat2, 1/n1i+ 1/n2i + yi^2/(2*(n1i+n2i))))
```
The code above uses the usual equation for computing (or rather: estimating) the large-sample sampling variance of standardized mean differences.

We can now examine the contents of the dataset:

dat2

  study             source n1i m1i sd1i n2i m2i sd2i       dval      tval pval sign      yi     vi
1     1          Edinburgh 155  NA   NA 156  NA   NA -0.3600000        NA   NA   NA -0.3591 0.0131
2     2     Orpington-Mild  31  NA   NA  32  NA   NA -0.3499419 -1.388614 0.17   -1 -0.3456 0.0645
3     3 Orpington-Moderate  75  64   17  71 119   29         NA        NA   NA   NA -2.3176 0.0458
4     4   Orpington-Severe  18  66   20  18 137   48         NA        NA   NA   NA -1.8880 0.1606
5     5      Montreal-Home   8  NA   NA  13  NA   NA -0.3999291 -0.890000   NA   NA -0.3839 0.2054
6     6  Montreal-Transfer  57  19    7  52  18    4         NA        NA   NA   NA  0.1721 0.0369
7     7          Newcastle  34  NA   NA  33  NA   NA  0.2800000        NA   NA   NA  0.2768 0.0603
8     8               Umea 110  NA   NA 183  NA   NA -0.4258794 -3.530000   NA   NA -0.4248 0.0149
9     9            Uppsala  60  NA   NA  52  NA   NA  0.2890341  1.525518 0.13    1  0.2871 0.0363

Any differences compared to dat1 are purely a result of the rounding of the dval, tval, and pval variables. However, the differences are negligible.

We can then fit a random-effects model to these data with:

res2 <- rma(yi, vi, data=dat2)
res2

Random-Effects Model (k = 9; tau^2 estimator: REML)
 
tau^2 (estimated amount of total heterogeneity): 0.7912 (SE = 0.4283)
tau (square root of estimated tau^2 value):      0.8895
I^2 (total heterogeneity / total variability):   95.50%
H^2 (total variability / sampling variability):  22.20
 
Test for Heterogeneity: 
Q(df = 8) = 123.7121, p-val < .0001
 
Model Results:
 
estimate       se     zval     pval    ci.lb    ci.ub          
 -0.5371   0.3087  -1.7398   0.0819  -1.1422   0.0680        . 
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The results are essentially the same as obtained earlier.

Some final notes:

The conversion of a p-value to the corresponding t-value as shown above assumes that the exact p-value is reported. If authors only report that the p-value fell below a certain threshold (e.g., $p < .01$ or if authors only state that the test was significant – which typically implies $p < .05$), then a common approach is to use the value of the cutoff reported (e.g., if $p < .01$ is reported, then assume $p = .01$), which is conservative (since the actual p-value was below that assumed value by some unknown amount). The conversion will therefore tend to be much less accurate.

The conversion of the p-value shown above is only applicable if authors actually conducted an independent samples t-test (i.e., it is not appropriate if that p-value was based on a Mann–Whitney U test or some other non-parametric test).

When authors report only the Cohen's d value and/or the test statistic from an independent samples t-test, care must be taken to ensure the correct sign of the value (authors may only report the absolute value or may have interchanged the groups).

A useful reference for these types of transformations is Lipsey and Wilson (2001).

References

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

Normand, S. T. (1999). Meta-analysis: Formulating, evaluating, combining, and reporting. Statistics in Medicine, 18(3), 321–359.

¹⁾

In some cases, only the total sample size of a study may be known (i.e., ni = n1i + n2i). If it is not at all possible to obtain information about the group sizes (e.g., by contacting the author(s) of the study), one possibility would be to assume that both groups are of the same size (i.e., n1i = n2i = ni/2). In the present dataset, this would not be far from the truth in most cases.