# The metafor Package

A Meta-Analysis Package for R

### Sidebar

tips:regression_with_rma

## Linear Regression and the Mixed-Effects Meta-Regression Model

The standard linear regression model is given by $$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_p x_{ip} + e_i,$$ where $e_i \sim N(0, \sigma^2)$. Models of this sort can be fitted with the R function lm(). The mixed-effects meta-regression model is given by $$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_p x_{ip} + u_i + e_i,$$ where $u_i \sim N(0, \tau^2)$ and $e_i \sim N(0, v_i)$, where $v_i$ are the (approximately) known sampling variances of the observed outcomes or effect size estimates (e.g., standardized mean differences, log odds ratios). Models of this sort can be fitted with the function rma() from the metafor package.

Consequently, if one were to set $v_i = 0$ for all outcomes, then the standard linear regression model and the mixed-effects meta-regression model are actually identical (with $\sigma^2$ denoting the same parameter as $\tau^2$). This equivalence can be demonstrated with an arbitrary dataset:

library(metafor)
stackloss
Air.Flow Water.Temp Acid.Conc. stack.loss
1        80         27         89         42
2        80         27         88         37
3        75         25         90         37
4        62         24         87         28
5        62         22         87         18
6        62         23         87         18
7        62         24         93         19
8        62         24         93         20
9        58         23         87         15
10       58         18         80         14
11       58         18         89         14
12       58         17         88         13
13       58         18         82         11
14       58         19         93         12
15       50         18         89          8
16       50         18         86          7
17       50         19         72          8
18       50         19         79          8
19       50         20         80          9
20       56         20         82         15
21       70         20         91         15

See help(stackloss) for more details on this dataset. Most importantly, variable stack.loss is the dependent variable with Air.Flow, Water.Temp, and Acid.Conc serving as potentially relevant predictors.

Now let's fit the standard linear regression model to these data with both the lm() and the rma() functions with:

res.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)

and

stackloss$vi <- 0 res.rma <- rma(stack.loss, vi, mods = ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss) Note that all sampling variances are set to 0 for rma() (the function will actually issue a warning that the dataset includes outcomes with non-positive sampling variances – which would be rather strange in the meta-analytic context – but this can be safely ignored here). We can now compare the output from the two models with: summary(res.lm) Call: lm(formula = stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss) Residuals: Min 1Q Median 3Q Max -7.2377 -1.7117 -0.4551 2.3614 5.6978 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -39.9197 11.8960 -3.356 0.00375 ** Air.Flow 0.7156 0.1349 5.307 5.8e-05 *** Water.Temp 1.2953 0.3680 3.520 0.00263 ** Acid.Conc. -0.1521 0.1563 -0.973 0.34405 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 3.243 on 17 degrees of freedom Multiple R-squared: 0.9136, Adjusted R-squared: 0.8983 F-statistic: 59.9 on 3 and 17 DF, p-value: 3.016e-09 and res.rma Mixed-Effects Model (k = 21; tau^2 estimator: REML) tau^2 (estimated amount of residual heterogeneity): 10.5194 (SE = 3.6081) tau (square root of estimated tau^2 value): 3.2434 R^2 (amount of heterogeneity accounted for): 89.83% Test of Moderators (coefficient(s) 2:4): QM(df = 3) = 179.7067, p-val < .0001 Model Results: estimate se zval pval ci.lb ci.ub intrcpt -39.9197 11.8960 -3.3557 0.0008 -63.2354 -16.6039 *** Air.Flow 0.7156 0.1349 5.3066 <.0001 0.4513 0.9800 *** Water.Temp 1.2953 0.3680 3.5196 0.0004 0.5740 2.0166 *** Acid.Conc. -0.1521 0.1563 -0.9733 0.3304 -0.4585 0.1542 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 A few things are worth noting: • The estimated model coefficients, corresponding standard errors, and the test statistics are exactly the same. However, lm() computes the p-values based on the t-distribution, while rma() uses (by default) the standard normal distribution. • The omnibus test of the model coefficients conducted by lm() is an F-test (F-statistic: 59.9 on 3 and 17 DF, p-value: 3.016e-09), while rma() uses by default a chi-square test (QM(df = 3) = 179.7067, p-val < .0001). • The estimated Residual standard error: 3.243 is the same as the estimate of tau: 3.2434. • The (pseudo) R-squared value reported by rma() (89.83%) actually corresponds to the adjusted R-squared value provided by lm() (0.8983). • The$I^2$and$H^2$statistics typically reported by the rma() function are missing, since these statistics cannot be computed when the dataset includes outcomes with non-positive sampling variances. Similarly, the usually reported results from the$Q\$-test for heterogeneity are omitted for the same reason.

To get full correspondence between the two models, we can use the 'Knapp & Hartung' method when fitting the model with the rma() function:

res.rma <- rma(stack.loss, vi, mods =  ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss, test="knha")
res.rma
Mixed-Effects Model (k = 21; tau^2 estimator: REML)

tau^2 (estimated amount of residual heterogeneity):     10.5194 (SE = 3.6081)
tau (square root of estimated tau^2 value):             3.2434
R^2 (amount of heterogeneity accounted for):            89.83%

Test of Moderators (coefficient(s) 2:4):
F(df1 = 3, df2 = 17) = 59.9022, p-val < .0001

Model Results:

estimate       se     tval    pval     ci.lb     ci.ub
intrcpt     -39.9197  11.8960  -3.3557  0.0038  -65.0180  -14.8213   **
Air.Flow      0.7156   0.1349   5.3066  <.0001    0.4311    1.0002  ***
Water.Temp    1.2953   0.3680   3.5196  0.0026    0.5188    2.0717   **
Acid.Conc.   -0.1521   0.1563  -0.9733  0.3440   -0.4819    0.1776

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now the p-values and the results from the omnibus test of the moderators also match up completely with the output from the lm() function.

tips/regression_with_rma.txt · Last modified: 2018/11/23 08:58 by Wolfgang Viechtbauer