method="ML"
in the call to rma()
, then the unadjusted R-square value will be provided.
The standard linear regression model is given by $$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_p x_{ip} + e_i,$$ where $e_i \sim N(0, \sigma^2)$. Models of this sort can be fitted with the R function lm()
. The mixed-effects meta-regression model is given by $$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_p x_{ip} + u_i + e_i,$$ where $u_i \sim N(0, \tau^2)$ and $e_i \sim N(0, v_i)$, where $v_i$ are the (approximately) known sampling variances of the observed outcomes or effect size estimates (e.g., standardized mean differences, log odds ratios). Models of this sort can be fitted with the function rma()
from the metafor package.
Consequently, if one were to set $v_i = 0$ for all outcomes, then the standard linear regression model and the mixed-effects meta-regression model are actually identical (with $\sigma^2$ denoting the same parameter as $\tau^2$). This equivalence can be demonstrated with an arbitrary dataset:
library(metafor) stackloss
Air.Flow Water.Temp Acid.Conc. stack.loss 1 80 27 89 42 2 80 27 88 37 3 75 25 90 37 4 62 24 87 28 5 62 22 87 18 6 62 23 87 18 7 62 24 93 19 8 62 24 93 20 9 58 23 87 15 10 58 18 80 14 11 58 18 89 14 12 58 17 88 13 13 58 18 82 11 14 58 19 93 12 15 50 18 89 8 16 50 18 86 7 17 50 19 72 8 18 50 19 79 8 19 50 20 80 9 20 56 20 82 15 21 70 20 91 15
See help(stackloss)
for more details on this dataset. Most importantly, variable stack.loss
is the dependent variable with Air.Flow
, Water.Temp
, and Acid.Conc
serving as potentially relevant predictors.
Now let's fit the standard linear regression model to these data with both the lm()
and the rma()
functions with:
res.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)
and
stackloss$vi <- 0 res.rma <- rma(stack.loss, vi, mods = ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss)
Note that all sampling variances are set to 0 for rma()
(the function will actually issue a warning that the dataset includes outcomes with non-positive sampling variances – which would be rather strange in the meta-analytic context – but this can be safely ignored here).
We can now compare the output from the two models with:
summary(res.lm)
Call: lm(formula = stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss) Residuals: Min 1Q Median 3Q Max -7.2377 -1.7117 -0.4551 2.3614 5.6978 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -39.9197 11.8960 -3.356 0.00375 ** Air.Flow 0.7156 0.1349 5.307 5.8e-05 *** Water.Temp 1.2953 0.3680 3.520 0.00263 ** Acid.Conc. -0.1521 0.1563 -0.973 0.34405 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 3.243 on 17 degrees of freedom Multiple R-squared: 0.9136, Adjusted R-squared: 0.8983 F-statistic: 59.9 on 3 and 17 DF, p-value: 3.016e-09
and
res.rma
Mixed-Effects Model (k = 21; tau^2 estimator: REML) tau^2 (estimated amount of residual heterogeneity): 10.5194 (SE = 3.6081) tau (square root of estimated tau^2 value): 3.2434 R^2 (amount of heterogeneity accounted for): 89.83% Test of Moderators (coefficients 2:4): QM(df = 3) = 179.7067, p-val < .0001 Model Results: estimate se zval pval ci.lb ci.ub intrcpt -39.9197 11.8960 -3.3557 0.0008 -63.2354 -16.6039 *** Air.Flow 0.7156 0.1349 5.3066 <.0001 0.4513 0.9800 *** Water.Temp 1.2953 0.3680 3.5196 0.0004 0.5740 2.0166 *** Acid.Conc. -0.1521 0.1563 -0.9733 0.3304 -0.4585 0.1542 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
A few things are worth noting:
lm()
computes the p-values based on the t-distribution, while rma()
uses (by default) the standard normal distribution.lm()
is an F-test (F-statistic: 59.9 on 3 and 17 DF, p-value: 3.016e-09
), while rma()
uses by default a chi-square test (QM(df = 3) = 179.7067, p-val < .0001
).Residual standard error: 3.243
is the same as the estimate of tau: 3.2434
.rma()
(89.83%
) corresponds to the adjusted R-squared value provided by lm()
(0.8983
).1)rma()
function are missing, since these statistics cannot be computed when the dataset includes outcomes with non-positive sampling variances. Similarly, the usually reported results from the $Q$-test for heterogeneity are omitted for the same reason.
To get full correspondence between the two models, we can use the 'Knapp & Hartung' method when fitting the model with the rma()
function:
res.rma <- rma(stack.loss, vi, mods = ~ Air.Flow + Water.Temp + Acid.Conc., data=stackloss, test="knha") res.rma
Mixed-Effects Model (k = 21; tau^2 estimator: REML) tau^2 (estimated amount of residual heterogeneity): 10.5194 (SE = 3.6081) tau (square root of estimated tau^2 value): 3.2434 R^2 (amount of heterogeneity accounted for): 89.83% Test of Moderators (coefficients 2:4): F(df1 = 3, df2 = 17) = 59.9022, p-val < .0001 Model Results: estimate se tval df pval ci.lb ci.ub intrcpt -39.9197 11.8960 -3.3557 17 0.0038 -65.0180 -14.8213 ** Air.Flow 0.7156 0.1349 5.3066 17 <.0001 0.4311 1.0002 *** Water.Temp 1.2953 0.3680 3.5196 17 0.0026 0.5188 2.0717 ** Acid.Conc. -0.1521 0.1563 -0.9733 17 0.3440 -0.4819 0.1776 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now the p-values and the results from the omnibus test of the moderators also match up completely with the output from the lm()
function.
method="ML"
in the call to rma()
, then the unadjusted R-square value will be provided.