The metafor Package

This is an old revision of the document!

Testing Factors and Linear Combinations of Parameters

The examples below show how to test factors and linear combinations of parameters in (mixed-effects) meta-regression models.

Testing Factors

Meta-regression models can easily handle categorical predictors (factors) via appropriate coding of factors in terms of dummy variables. We can use the dataset for the BCG vaccine meta-analysis (Colditz et al., 1994) as an illustration.

The (log) risk ratios and corresponding sampling variances for the 13 studies can be obtained with:

library(metafor)
dat <- escalc(measure="RR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
dat

   trial               author year tpos  tneg cpos  cneg ablat      alloc      yi     vi
1      1              Aronson 1948    4   119   11   128    44     random -0.8893 0.3256
2      2     Ferguson & Simes 1949    6   300   29   274    55     random -1.5854 0.1946
3      3      Rosenthal et al 1960    3   228   11   209    42     random -1.3481 0.4154
4      4    Hart & Sutherland 1977   62 13536  248 12619    52     random -1.4416 0.0200
5      5 Frimodt-Moller et al 1973   33  5036   47  5761    13  alternate -0.2175 0.0512
6      6      Stein & Aronson 1953  180  1361  372  1079    44  alternate -0.7861 0.0069
7      7     Vandiviere et al 1973    8  2537   10   619    19     random -1.6209 0.2230
8      8           TPT Madras 1980  505 87886  499 87892    13     random  0.0120 0.0040
9      9     Coetzee & Berjak 1968   29  7470   45  7232    27     random -0.4694 0.0564
10    10      Rosenthal et al 1961   17  1699   65  1600    42 systematic -1.3713 0.0730
11    11       Comstock et al 1974  186 50448  141 27197    18 systematic -0.3394 0.0124
12    12   Comstock & Webster 1969    5  2493    3  2338    33 systematic  0.4459 0.5325
13    13       Comstock et al 1976   27 16886   29 17825    33 systematic -0.0173 0.0714

For easier interpretation of the results, we will "center" the publication year and absolute latitude moderators at their minimum values (1948 and 13 degrees, respectively):

dat$year  <- dat$year  - 1948
dat$ablat <- dat$ablat - 13

Now let's fit a mixed-effects meta-regression model to these data, including publication year (year), absolute latitude (ablat), and the method of treatment allocation (alloc) as predictors/moderators.¹⁾ The standard way of handling a factor in regression models is to create dummy variables representing the various levels. One of the dummies is then left out of the model, with the corresponding level becoming the "reference" level. Instead of creating the dummy variables manually, we can let R handle the dummy coding of the allocation factor for us:

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat)
res

Mixed-Effects Model (k = 13; tau^2 estimator: REML)
 
tau^2 (estimated amount of residual heterogeneity):     0.1796 (SE = 0.1425)
tau (square root of estimated tau^2 value):             0.4238
I^2 (residual heterogeneity / unaccounted variability): 73.09%
H^2 (unaccounted variability / sampling variability):   3.72
R^2 (amount of heterogeneity accounted for):            42.67%
 
Test for Residual Heterogeneity: 
QE(df = 8) = 26.2030, p-val = 0.0010
 
Test of Moderators (coefficient(s) 2,3,4,5): 
QM(df = 4) = 9.5254, p-val = 0.0492
 
Model Results:
 
                         estimate      se     zval    pval    ci.lb   ci.ub   
intrcpt                   -0.2324  0.5550  -0.4187  0.6754  -1.3201  0.8554   
factor(alloc)random       -0.3421  0.4180  -0.8183  0.4132  -1.1613  0.4772   
factor(alloc)systematic    0.0101  0.4467   0.0226  0.9820  -0.8654  0.8856   
year                       0.0075  0.0194   0.3849  0.7003  -0.0306  0.0456   
ablat                     -0.0236  0.0132  -1.7816  0.0748  -0.0495  0.0024  .
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

By default, alternate was made the reference level (since the first letter "a" comes before "r" and "s"), so the corresponding dummy variable for this level has been left out.²⁾ Therefore, the model intercept (coefficients $b_0$) is the estimated (average) log risk ratio for alternate allocation in 1948 at 13 degrees absolute latitude. Coefficients $b_1$ and $b_2$ indicate how much lower/higher the estimated (average) log risk ratio is for random and systematic allocation, respectively, compared to alternate allocation. Finally, the coefficients for year and absolute latitude (i.e., $b_3$ and $b_4$) estimate how the (average) log risk ratio changes for a one-unit increase in the respective moderator.

The results under "Test of Moderators" indicate that we can (just) reject the null hypothesis $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ (i.e., this is a so-called omnibus test of coefficients 2 through 5 – the first being the intercept, which is typically denoted as $\beta_0$ in model equations and is not included in the test by default). In other words, it appears as if at least part of the heterogeneity in the true effects is related to some of the predictors/moderators included in the model.

Now consider the two coefficients for the allocation factor. Neither coefficient is significantly different from 0 (as indicated by the corresponding p-values), but this isn't a proper test of the factor as a whole. Instead, we want to simultaneously test whether both coefficients are simultaneously equal to 0. We can do this with:

anova(res, btt=2:3)

Test of Moderators (coefficient(s) 2,3):
QM(df = 2) = 1.3663, p-val = 0.5050

This is a (Wald-type) chi-square test of the null hypothesis $H_0: \beta_1 = \beta_2 = 0$ with two degrees of freedom. It indicates that the factor as a whole is not significant.

The car package includes the linearHypothesis() function that can be used for the same purpose. One can either specify a matrix (or vector) of one or more linear combinations of coefficients to test or a character vector giving the hypothesis in symbolic form. Therefore, both of the following commands produce the same output:

library(car)
linearHypothesis(res, rbind(c(0,1,0,0,0),c(0,0,1,0,0)))
linearHypothesis(res, c("factor(alloc)random = 0", "factor(alloc)systematic = 0"))

Linear hypothesis test
 
Hypothesis:
factor(alloc)random = 0
factor(alloc)systematic = 0
 
Model 1: restricted model
Model 2: res
 
  Res.Df Df  Chisq Pr(>Chisq)
1     10
2      8  2 1.3663      0.505

The multcomp package provides similar functionality. Here, the commands would be:

library(multcomp)
summary(glht(res, linfct=rbind(b1=c(0,1,0,0,0), b2=c(0,0,1,0,0))), test=Chisqtest())

         General Linear Hypotheses
 
Linear Hypotheses:
        Estimate
b1 == 0  -0.3421
b2 == 0   0.0101
 
Global Test:
  Chisq DF Pr(>Chisq)
1 1.366  2      0.505

The same results could also have been obtained directly if we had fitted the model with the command:

rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat, btt=2:3)

With btt=2:3, only coefficients 2 and 3 will be included in the omnibus test (output not shown).

Note that the reference level chosen for a factor does not alter the results of the omnibus test. For example, we could make random the reference level with:

res <- rma(yi, vi, mods = ~ relevel(factor(alloc), ref="random") + year + ablat, data=dat, btt=2:3)

The results of anova(res, btt=2:3) are unchanged (output not shown).

Likelihood Ratio Tests

We can also use likelihood ratio tests (LRT) to test sets of model coefficients. For this, we fit both the model with and the model without the treatment allocation factor (i.e., the "full" and the "reduced" model) and then conduct a model comparison. When the two models to be compared differ in terms of their fixed effects (i.e., model coefficients) – as in the present example – then one must fit both models using maximum likelihood (ML) estimation, and not restricted maximum likelihood (REML) estimation (which is the default). So, the LRT can be conducted with:

res0 <- rma(yi, vi, mods = ~ year + ablat, data=dat, method="ML")
res1 <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat, method="ML")
anova(res0, res1)

        df     AIC     BIC    AICc  logLik    LRT   pval      QE  tau^2 R^2
Full     6 25.8412 29.2309 39.8412 -6.9206               26.2030 0.0329
Reduced  4 23.2922 25.5520 28.2922 -7.6461 1.4510 0.4841 28.3251 0.0269  0%

In the present example, the test statistic for the LRT is equal to $1.4510$, which (asymptotically) follows a chi-square distribution with two degrees of freedom (i.e., the difference in the number of parameters in the full and the reduced model) under the null hypothesis (i.e., that the reduced model fits the data as well as the full model). This leads to a p-value of $.48$, which does not lead us to reject the null hypothesis and hence to the conclusion that the inclusion of the allocation factor in the model does not provide a significantly better fit.

Knapp & Hartung Adjustment

Knapp and Hartung (2003) described an adjustment to the standard Wald-type tests typically used in mixed-effects meta-regression models that leads to tests with better statistical properties (i.e., better control of the Type I error rate). For individual model coefficients, the method leads to t-test for individual model coefficients. The adjustment can also be applied when testing sets of model coefficients, which leads to F-tests. In the present example, the allocation factor can be tested with:

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat, btt=2:3, test="knha")
res

Mixed-Effects Model (k = 13; tau^2 estimator: REML)
 
tau^2 (estimated amount of residual heterogeneity):     0.1796 (SE = 0.1425)
tau (square root of estimated tau^2 value):             0.4238
I^2 (residual heterogeneity / unaccounted variability): 73.09%
H^2 (unaccounted variability / sampling variability):   3.72
R^2 (amount of heterogeneity accounted for):            42.67%
 
Test for Residual Heterogeneity: 
QE(df = 8) = 26.2030, p-val = 0.0010
 
Test of Moderators (coefficient(s) 2,3): 
F(df1 = 2, df2 = 8) = 0.6503, p-val = 0.5474
 
Model Results:
 
                         estimate      se     tval    pval    ci.lb   ci.ub   
intrcpt                   -0.2324  0.5688  -0.4085  0.6936  -1.5441  1.0794   
factor(alloc)random       -0.3421  0.4284  -0.7984  0.4477  -1.3300  0.6459   
factor(alloc)systematic    0.0101  0.4579   0.0221  0.9829  -1.0457  1.0659   
year                       0.0075  0.0199   0.3755  0.7170  -0.0385  0.0534   
ablat                     -0.0236  0.0136  -1.7382  0.1204  -0.0548  0.0077   
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We obtain $F(2,8) = .65, p = .55$, which yields the same conclusion as obtained earlier.

Here, the btt argument was used directly to only include the two coefficients corresponding to the allocation factor in the omnibus test (with the results again given under "Test of Moderators"). Equivalently, the same results could have been obtained with (output not shown):

anova(res, btt=2:3)

If we want to use the linearHypothesis() function from the car package, then we must explicitly request the F-test with:

linearHypothesis(res, c("factor(alloc)random = 0", "factor(alloc)systematic = 0"), test="F")

Linear hypothesis test
 
Hypothesis:
factor(alloc)random = 0
factor(alloc)systematic = 0
 
Model 1: restricted model
Model 2: res
 
  Res.Df Df      F Pr(>F)
1     10                 
2      8  2 0.6503 0.5474

Note that the model must have been fitted with test="knha" for this to yield the correct F-test.

Finally, using the multcomp package:

summary(glht(res, linfct=rbind(b1=c(0,1,0,0,0), b2=c(0,0,1,0,0))), test=Ftest())

         General Linear Hypotheses
 
Linear Hypotheses:
        Estimate
b1 == 0  -0.3421
b2 == 0   0.0101
 
Global Test:
       F DF1 DF2 Pr(>F)
1 0.6503   2   8 0.5474

Again, it only makes sense to request an F-test here when the original model was fitted with test="knha".

Permutation Test

Yet another possibility is to conduct a permutation test (Higgins & Thompson, 2004), which is implemented in the permutest() function. For this, the rows of the model matrix are permuted, so that any association between the predictors and the observed outcomes is purely due to chance. This process is repeated many times (ideally for all possible unique permutations), yielding the distribution of the test statistic under the null hypothesis. For the omnibus test, the p-value is the proportion of times that the test statistic under the permuted data is as extreme or more extreme than the actually observed one.

We can try to carry out an exact permutation test (for all possible unique permutations) with:

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat, btt=2:3)
permutest(res, exact=TRUE)

Running 6227020800 iterations for exact permutation test.
Error in permutest.rma.uni(res, exact = T) : 
  Number of iterations requested too large.

So, the exact test would require more than $6 \times 10^9$ model fits, which is just not feasible (even on a fast computer, this could easily take years to complete). So, instead, we can approximate the exact permutation-based p-value by going through a smaller number (as specified by the iter argument) of random permutations (the default is 1000). Let's go with 10000 iterations (this may still take a minutes or so to complete):

permutest(res, iter=10000)

Test of Moderators (coefficient(s) 2,3): 
QM(df = 2) = 1.3663, p-val* = 0.5344
 
Model Results:
 
                         estimate      se     zval   pval*    ci.lb   ci.ub   
intrcpt                   -0.2324  0.5550  -0.4187  1.0000  -1.3201  0.8554   
factor(alloc)random       -0.3421  0.4180  -0.8183  0.4540  -1.1613  0.4772   
factor(alloc)systematic    0.0101  0.4467   0.0226  0.9740  -0.8654  0.8856   
year                       0.0075  0.0194   0.3849  0.7486  -0.0306  0.0456   
ablat                     -0.0236  0.0132  -1.7816  0.1062  -0.0495  0.0024   
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The p-value for the omnibus test is equal to $.53$, which is very close to the one obtained with the Knapp and Hartung adjustment.³⁾

Testing Linear Combinations

In a similar way, one can also test linear combinations of model coefficients. For example, one could test whether random and systematic allocation lead to significantly different effects with:

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat)
anova(res, L=c(0,1,-1,0,0))

Hypothesis:                                                    
1: factor(alloc)random - factor(alloc)systematic = 0
 
Results:
   estimate     se    zval   pval
1:  -0.3522 0.3357 -1.0489 0.2942
 
Test of Hypothesis:
QM(df = 1) = 1.1002, p-val = 0.2942

We can also use the linearHypothesis() function from the car package for this.

linearHypothesis(res, c(0,1,-1,0,0))

Linear hypothesis test
 
Hypothesis:
factor(alloc)random - factor(alloc)systematic = 0
 
Model 1: restricted model
Model 2: res
 
  Res.Df Df  Chisq Pr(>Chisq)
1      9                     
2      8  1 1.1002     0.2942

Apparently there is no significant difference between random and systematic allocation.

If the model was fitted with the Knapp and Hartung adjustment, then we again get an F-test:

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat, test="knha")
anova(res, L=c(0,1,-1,0,0))

Hypothesis:                                                    
1: factor(alloc)random - factor(alloc)systematic = 0
 
Results:
   estimate     se    tval   pval
1:  -0.3522 0.3441 -1.0234 0.3361
 
Test of Hypothesis:
F(df1 = 1, df2 = 8) = 1.0473, p-val = 0.3361

linearHypothesis(res, c(0,1,-1,0,0), test="F")

Linear hypothesis test
 
Hypothesis:
factor(alloc)random - factor(alloc)systematic = 0
 
Model 1: restricted model
Model 2: res
 
  Res.Df Df      F Pr(>F)
1      9                 
2      8  1 1.0473 0.3361

As another example, we could test whether the estimated average log risk ratio based on random allocation in 1970 at an absolute latitude of 30 degrees is significantly different from zero. Recall that these two moderators were "centered" at 1948 and 13, respectively. So, this linear hypothesis can be tested with:

anova(res, L=c(1,1,0,1970-1948,30-13))

Hypothesis:                                                         
1: intrcpt + factor(alloc)random + 22*year + 17*ablat = 0
 
Results:
   estimate     se    tval   pval
1:  -0.8103 0.2155 -3.7607 0.0055
 
Test of Hypothesis:
F(df1 = 1, df2 = 8) = 14.1432, p-val = 0.0055

Or analogously with the linearHypothesis() function:

linearHypothesis(res, c(1,1,0,1970-1948,30-13), test="F")

Linear hypothesis test
 
Hypothesis:
intrcpt  + factor(alloc)random  + 22 year  + 17 ablat = 0
 
Model 1: restricted model
Model 2: res
 
  Res.Df Df      F   Pr(>F)   
1      9                      
2      8  1 14.143 0.005538 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

So, the estimated average log risk ratio for this combination of moderator values is significantly different from zero.

We can obtain the corresponding estimated average log risk ratio with:

tmp <- predict(res, newmods=c(1,0,1970-1948,30-13))
tmp

    pred     se   ci.lb   ci.ub   cr.lb  cr.ub
 -0.8103 0.2155 -1.3072 -0.3135 -1.9067 0.2860

In addition, the corresponding standard error, confidence interval, and credibility/prediction interval are given. Note that the intercept is automatically added (by default) and hence should not be included in the vector specified for the newmods argument for the predict() function.

Since the F-value given for the test above is based on one degree of freedom in the numerator, the value must in fact be identical with the squared ratio of the predicted value to its standard error:

(tmp$pred/tmp$se)^2

[1] 14.14316

As we see, that works out as expected.

All Pairwise Comparisons

Finally, when including a factor in a model, one may be interested in obtaining all pairwise comparisons (contrasts) between the factor levels. Let's start with the model without the Knapp and Hartung adjustment.

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat, data=dat)
res

The table of the model coefficients then look as follows:

                         estimate      se     zval    pval    ci.lb   ci.ub   
intrcpt                   -0.2324  0.5550  -0.4187  0.6754  -1.3201  0.8554   
factor(alloc)random       -0.3421  0.4180  -0.8183  0.4132  -1.1613  0.4772   
factor(alloc)systematic    0.0101  0.4467   0.0226  0.9820  -0.8654  0.8856   
year                       0.0075  0.0194   0.3849  0.7003  -0.0306  0.0456   
ablat                     -0.0236  0.0132  -1.7816  0.0748  -0.0495  0.0024

Note that coefficients $b_1$ and $b_2$ are already contrasts (between random and alternate allocation and between systematic and alternate allocation). The contrast between systematic and random allocation is just $b_2-b_1$ (the direction here is arbitrary, but we will use this to make things match up further below).

All these comparisons can be obtained in a single table with:

anova(res, L=rbind(c(0,1,0,0,0), c(0,0,1,0,0), c(0,-1,1,0,0)))

Hypotheses:                                                     
1:                            factor(alloc)random = 0
2:                        factor(alloc)systematic = 0
3: -factor(alloc)random + factor(alloc)systematic = 0
 
Results:
   estimate     se    zval   pval
1:  -0.3421 0.4180 -0.8183 0.4132
2:   0.0101 0.4467  0.0226 0.9820
3:   0.3522 0.3357  1.0489 0.2942

The multcomp package can also be used for this:

summary(glht(res, linfct=rbind(c(0,1,0,0,0), c(0,0,1,0,0), c(0,-1,1,0,0))), test=adjusted("none"))

         Simultaneous Tests for General Linear Hypotheses
 
Fit: rma(yi = yi, vi = vi, mods = ~factor(alloc) + year + ablat, data = dat)
 
Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)
1 == 0  -0.3421     0.4180  -0.818    0.413
2 == 0   0.0101     0.4467   0.023    0.982
3 == 0   0.3522     0.3357   1.049    0.294
(Adjusted p values reported -- none method)

The p-values are unadjusted, so the chances of making at least one Type I error can accumulate quickly when there are many factor levels.

Corresponding confidence intervals (CIs) can be obtained with:

confint(glht(res, linfct=rbind(c(0,1,0,0,0), c(0,0,1,0,0), c(0,-1,1,0,0))), calpha=univariate_calpha())

         Simultaneous Confidence Intervals
 
Fit: rma(yi = yi, vi = vi, mods = ~factor(alloc) + year + ablat, data = dat)
 
Quantile = 1.96
95% confidence level
 
Linear Hypotheses:
       Estimate lwr     upr    
1 == 0 -0.3421  -1.1613  0.4772
2 == 0  0.0101  -0.8654  0.8856
3 == 0  0.3522  -0.3059  1.0102

These are univariate confidence intervals (that have 95% coverage probability for each contrast individually). Simultaneous (family-wise) confidence intervals can also be obtained by leaving out the calpha=univariate_calpha() argument (these will have 95% coverage probability for all three contrasts simultaneously).

We can also use the predict() function to obtain the CIs (and also the credibility/prediction intervals). Recall that, by default, the intercept is automatically added to the vector(s) specified via the newmods argument (when the model actually includes an intercept), which we do not want to happen here. So, we use:

predict(res, newmods=rbind(c(1,0,0,0), c(0,1,0,0), c(-1,1,0,0)), intercept=FALSE)

     pred     se   ci.lb  ci.ub   cr.lb  cr.ub
1 -0.3421 0.4180 -1.1613 0.4772 -1.5087 0.8246
2  0.0101 0.4467 -0.8654 0.8856 -1.1967 1.2169
3  0.3522 0.3357 -0.3059 1.0102 -0.7075 1.4118

We could also use an alternative parameterization of the same model, leaving out the intercept altogether, and including all three dummy variables:

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat - 1, data=dat)
res

The table of coefficients now looks like this:

                         estimate      se     zval    pval    ci.lb   ci.ub   
factor(alloc)alternate    -0.2324  0.5550  -0.4187  0.6754  -1.3201  0.8554   
factor(alloc)random       -0.5744  0.6547  -0.8774  0.3803  -1.8576  0.7087   
factor(alloc)systematic   -0.2223  0.6636  -0.3349  0.7377  -1.5230  1.0784   
year                       0.0075  0.0194   0.3849  0.7003  -0.0306  0.0456   
ablat                     -0.0236  0.0132  -1.7816  0.0748  -0.0495  0.0024

So, $b_1$, $b_2$, and $b_3$ are now the estimated (average) log risk ratios for alternate, random, and systematic allocation in 1948 at 13 degrees absolute latitude.

To obtain all pairwise comparisons and interval bounds, we would now use:

anova(res, L=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0)))

Hypotheses:                                                        
1:     -factor(alloc)alternate + factor(alloc)random = 0
2: -factor(alloc)alternate + factor(alloc)systematic = 0
3:    -factor(alloc)random + factor(alloc)systematic = 0
 
Results:
   estimate     se    zval   pval
1:  -0.3421 0.4180 -0.8183 0.4132
2:   0.0101 0.4467  0.0226 0.9820
3:   0.3522 0.3357  1.0489 0.2942

Or with:

summary(glht(res, linfct=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0))), test=adjusted("none"))

Confidence intervals for the contrasts can be obtained with:

confint(glht(res, linfct=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0))), calpha=univariate_calpha())
predict(res, newmods=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0)))

The results are exactly the same as above.

The contrMat() function from the multcomp package can also be used to automatically set up the contrast matrix with:

summary(glht(res, linfct=cbind(contrMat(c("alternate"=1,"random"=1,"systematic"=1), type="Tukey"), 0, 0)), test=adjusted("none"))

This will again yield the same results as above. An advantage is that the contrasts are labeled more informatively in the output.

The p-values for the contrasts can be adjusted using various methods (see help(summary.glht) and help(p.adjust) for possible options). For example, Holm's method is often a good choice, as it provides strong control of the family-wise error rate, but is not as overly conservative as the Bonferroni correction. We can obtain the adjusted p-values with:

summary(glht(res, linfct=cbind(contrMat(c("alternate"=1,"random"=1,"systematic"=1), type="Tukey"), 0, 0)), test=adjusted("holm"))

         Simultaneous Tests for General Linear Hypotheses
 
Linear Hypotheses:
                            Estimate Std. Error z value Pr(>|z|)
random - alternate == 0      -0.3421     0.4180  -0.818    0.883
systematic - alternate == 0   0.0101     0.4467   0.023    0.982
systematic - random == 0      0.3522     0.3357   1.049    0.883
(Adjusted p values reported -- holm method)

The Knapp and Hartung adjustment can be used in this context as well.

res <- rma(yi, vi, mods = ~ factor(alloc) + year + ablat - 1, data=dat, test="knha")

Then all of the following commands yield the same results:

anova(res, L=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0)))
summary(glht(res, linfct=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0)), df=df.residual(res)), test=adjusted("none"))
summary(glht(res, linfct=cbind(contrMat(c("alternate"=1,"random"=1,"systematic"=1), type="Tukey"), 0, 0), df=df.residual(res)), test=adjusted("none"))

The output based on the anova() function looks like this:

Hypotheses:                                                        
1:     -factor(alloc)alternate + factor(alloc)random = 0
2: -factor(alloc)alternate + factor(alloc)systematic = 0
3:    -factor(alloc)random + factor(alloc)systematic = 0
 
Results:
   estimate     se    tval   pval
1:  -0.3421 0.4284 -0.7984 0.4477
2:   0.0101 0.4579  0.0221 0.9829
3:   0.3522 0.3441  1.0234 0.3361

Note that for the glht() function, we must use the df argument to specify the degrees of freedom for the t-tests. This value can be obtained via df.residual(res).

Similarly, we can obtain confidence interval bounds for the contrasts with:

confint(glht(res, linfct=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0)), df=df.residual(res)), calpha=univariate_calpha())

         Simultaneous Confidence Intervals
 
Fit: rma(yi = yi, vi = vi, mods = ~factor(alloc) + year + ablat - 1, data = dat, test="knha")
 
Quantile = 2.306
95% confidence level
 
Linear Hypotheses:
       Estimate lwr     upr    
1 == 0 -0.3421  -1.3300  0.6459
2 == 0  0.0101  -1.0457  1.0659
3 == 0  0.3522  -0.4414  1.1457

These intervals are now based on t-distributions. Simultaneous (family-wise) confidence interval bounds can again be obtained by leaving out the calpha=univariate_calpha() argument.

If we also want the credibility/prediction intervals, we could use the predict() command:

predict(res, newmods=rbind(c(-1,1,0,0,0), c(-1,0,1,0,0), c(0,-1,1,0,0)))

     pred     se   ci.lb  ci.ub   cr.lb  cr.ub
1 -0.3421 0.4284 -1.3300 0.6459 -1.7317 1.0476
2  0.0101 0.4579 -1.0457 1.0659 -1.4286 1.4488
3  0.3522 0.3441 -0.4414 1.1457 -0.9067 1.6110

References

Colditz, G. A., Brewer, T. F., Berkey, C. S., Wilson, M. E., Burdick, E., Fineberg, H. V., et al. (1994). Efficacy of BCG vaccine in the prevention of tuberculosis: Meta-analysis of the published literature. Journal of the American Medical Association, 271(9), 698–702.

Higgins, J. P. T., & Thompson, S. G. (2004). Controlling the risk of spurious findings from meta-regression. Statistics in Medicine, 23(11), 1663–1682.

Knapp, G., & Hartung, J. (2003). Improved tests for a random effects meta-regression with a single covariate. Statistics in Medicine, 22(17), 2693–2710.

¹⁾

With only $k=13$ studies, results from models including multiple moderators should be interpreted cautiously. The example is used here primarily for illustration purposes.

²⁾

Since alloc is a character variable, the coercion to a factor via the factor() function isn't strictly necessary (since this would have happened automatically anyway), but it's better to be explicit. In some cases, a numeric variable may in fact be used to represent levels of a categorical variable, in which case use of the factor() function is crucial.

³⁾

When running the permutation test with REML estimation (or some other iterative estimation method, such as ML or EB), there may be the occasional warning that the "Fisher scoring algorithm did not converge". Unless this happens a lot, this can be safely ignored. Alternatively, one could also fit the model with one of the non-iterative methods (such as DL), which will avoid non-convergence issues altogether.

Table of Contents