tips:i2_multilevel_multivariate
— | tips:i2_multilevel_multivariate [2021/01/06 10:58] – [General Equation for I^2] Wolfgang Viechtbauer | ||
---|---|---|---|
+ | ===== I^2 for Multilevel and Multivariate Models ===== | ||
+ | |||
+ | The $I^2$ statistic was introduced by Higgins and Thompson in their seminal 2002 paper and has become a rather popular statistic to report in meta-analyses, | ||
+ | |||
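+ | For a standard random-effects model, the $I^2$ statistic is computed with $$I^2 = 100\% \times \frac{\hat{\tau}^2}{\hat{\tau}^2 + \tilde{v}},$$ where $\hat{\tau}^2$ is the estimated value of $\tau^2$ and $$\tilde{v} = \frac{(k-1) \sum w_i}{(\sum w_i)^2 - \sum w_i^2}$$ is the 'typical' sampling variance, with $w_i = 1/v_i$ denoting the inverse of the sampling variance of the $i$th study. | ||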
+ | |||
+ | **Sidenote**: | ||
+ | |||
+ | However, this caveat aside, $I^2$ is a very useful measure because it directly indicates to what extent heterogeneity contributes to the total variance. In addition, most people find $I^2$ easier to interpret than estimates of $\tau^2$. | ||
+ | |||
+ | ==== Standard Random-Effects Model ==== | ||
+ | |||
+ | Let's try out the computation for a standard random-effects model (see [[analyses: | ||
+ | <code rsplus> | ||
+ | library(metafor) | ||
+ | dat <- escalc(measure=" | ||
+ | res <- rma(yi, vi, data=dat) | ||
+ | res$I2 | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 92.22139 | ||
+ | </code> | ||
+ | So, we estimate that roughly 92% of the total variance is due to heterogeneity (i.e., variance in the true effects), while the remaining 8% can be attributed to sampling variance. | ||
+ | |||
+ | Manually computing $I^2$ as described above yields the same result: | ||
+ | <code rsplus> | ||
+ | k <- res$k | ||
+ | wi <- 1/dat$vi | ||
+ | vt <- (k-1) * sum(wi) / (sum(wi)^2 - sum(wi^2)) | ||
+ | 100 * res$tau2 / (res$tau2 + vt) | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 92.22139 | ||
+ | </code> | ||
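The manual computation can also be wrapped in a small function. The following is only a sketch in base R: `i2_typical()` is a hypothetical helper (not a metafor function) and the sampling variances and $\tau^2$ value are made up for illustration. It also shows that, for a fixed amount of heterogeneity, the statistic becomes larger when the sampling variances are smaller.

```r
# Hypothetical helper (not part of metafor): I^2 computed from an estimate
# of tau^2 and a vector of sampling variances via the 'typical' variance.
i2_typical <- function(tau2, vi) {
  k  <- length(vi)
  wi <- 1 / vi
  vt <- (k - 1) * sum(wi) / (sum(wi)^2 - sum(wi^2))
  100 * tau2 / (tau2 + vt)
}

vi   <- c(0.04, 0.09, 0.02, 0.12, 0.06)  # made-up sampling variances
tau2 <- 0.05                              # made-up heterogeneity estimate

i2_large_v <- i2_typical(tau2, vi)       # less precise studies
i2_small_v <- i2_typical(tau2, vi / 10)  # same tau^2, more precise studies
c(i2_large_v, i2_small_v)
```

When all sampling variances are equal to some value $v$, the typical variance reduces to $v$, so `i2_typical(0.05, rep(0.05, 4))` is exactly 50.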
+ | |||
+ | ==== General Equation for $\boldsymbol{I}^2$ ==== | ||
+ | |||
+ | Before we continue with more complex models, it is useful to point out a more general equation for computing $I^2$, which also applies to models involving moderator variables (i.e., mixed-effects meta-regression models). This will also become important when dealing with models where sampling errors are no longer independent. So, let us define $$\mathbf{P} = \mathbf{W} - \mathbf{W} \mathbf{X} (\mathbf{X}' \mathbf{W} \mathbf{X})^{-1} \mathbf{X}' \mathbf{W},$$ where $\mathbf{W}$ is the weight matrix and $\mathbf{X}$ is the model matrix. | ||
+ | |||
+ | Let's try this out for the example above: | ||
+ | <code rsplus> | ||
+ | W <- diag(1/ | ||
+ | X <- model.matrix(res) | ||
+ | P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W | ||
+ | 100 * res$tau2 / (res$tau2 + (res$k-res$p)/ | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 92.22139 | ||
+ | </code> | ||
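To see why the two computations agree, the following base R sketch compares the 'typical' sampling variance from the manual computation with the quantity $(k-p)/\mathrm{tr}[\mathbf{P}]$ for an intercept-only model. It assumes that the general equation replaces $\tilde{v}$ with this trace-based term; the sampling variances are made up and no fitted model object is involved.

```r
vi <- c(0.04, 0.09, 0.02, 0.12, 0.06)  # made-up sampling variances
k  <- length(vi)

# 'Typical' sampling variance as in the manual computation
wi <- 1 / vi
vt <- (k - 1) * sum(wi) / (sum(wi)^2 - sum(wi^2))

# Trace-based form with an intercept-only model matrix (p = 1)
W <- diag(wi)
X <- matrix(1, nrow = k, ncol = 1)
p <- ncol(X)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
vt_general <- (k - p) / sum(diag(P))

all.equal(vt, vt_general)
```

With a single column of ones in $\mathbf{X}$, the trace of $\mathbf{P}$ equals $\sum w_i - \sum w_i^2 / \sum w_i$, which is why the two expressions coincide.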
+ | |||
+ | For a model with moderators, this is also how '' | ||
+ | <code rsplus> | ||
+ | res <- rma(yi, vi, mods = ~ ablat, data=dat) | ||
+ | res$I2 | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 68.39313 | ||
+ | </code> | ||
+ | <code rsplus> | ||
+ | X <- model.matrix(res) | ||
+ | P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W | ||
+ | 100 * res$tau2 / (res$tau2 + (res$k-res$p)/ | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 68.39313 | ||
+ | </code> | ||
+ | (although instead of using '' | ||
+ | |||
+ | In models with moderators, the $I^2$ statistic indicates how much of the unaccounted variance in the observed effects or outcomes (which is composed of unaccounted variance in the true effects, that is, residual heterogeneity, | ||
+ | |||
+ | ==== Multilevel Models ==== | ||
+ | |||
+ | Multilevel structures arise when the estimates can be grouped together based on some higher-level clustering variable (e.g., paper, lab or research group, species). In that case, true effects belonging to the same group may be more similar to each other than true effects for different groups. Meta-analytic multilevel models can be used to account for the between- and within-cluster heterogeneity and hence the intracluster (or intraclass) correlation in the true effects. See [[analyses: | ||
+ | |||
+ | In fact, let's use the same example here. First, we can fit the multilevel random-effects model with: | ||
+ | <code rsplus> | ||
+ | dat <- dat.konstantopoulos2011 | ||
+ | res <- rma.mv(yi, vi, random = ~ 1 | district/ | ||
+ | res | ||
+ | </code> | ||
+ | <code output> | ||
+ | Multivariate Meta-Analysis Model (k = 56; method: REML) | ||
+ | |||
+ | Variance Components: | ||
+ | |||
+ | estim sqrt nlvls fixed | ||
+ | sigma^2.1 | ||
+ | sigma^2.2 | ||
+ | |||
+ | Test for Heterogeneity: | ||
+ | Q(df = 55) = 578.8640, p-val < .0001 | ||
+ | |||
+ | Model Results: | ||
+ | |||
+ | estimate | ||
+ | 0.1847 | ||
+ | |||
+ | --- | ||
+ | Signif. codes: | ||
+ | </code> | ||
+ | Note that the model contains two variance components ($\sigma^2_1$ and $\sigma^2_2$), | ||
+ | |||
+ | Based on the discussion above, it is now very easy to generalize the concept of $I^2$ to such a model (see also Nakagawa & Santos, 2012). That is, we can first compute: | ||
+ | <code rsplus> | ||
+ | W <- diag(1/ | ||
+ | X <- model.matrix(res) | ||
+ | P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W | ||
+ | 100 * sum(res$sigma2) / (sum(res$sigma2) + (res$k-res$p)/ | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 95.18731 | ||
+ | </code> | ||
+ | Note that we have summed up the two variance components in the numerator and denominator. Therefore, this statistic can be thought of as the overall $I^2$ value that indicates how much of the total variance can be attributed to the total amount of heterogeneity (which is the sum of between- and within-cluster heterogeneity). In this case, the value is again very large, with approximately 95% of the total variance due to heterogeneity. | ||
+ | |||
+ | However, we can also break things down to estimate how much of the total variance can be attributed to between- and within-cluster heterogeneity separately: | ||
+ | <code rsplus> | ||
+ | 100 * res$sigma2 / (sum(res$sigma2) + (res$k-res$p)/ | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 63.32484 31.86248 | ||
+ | </code> | ||
+ | Therefore, about 63% of the total variance is estimated to be due to between-cluster heterogeneity, | ||
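The overall and component-wise values can be obtained in one step. The sketch below uses base R only; `i2_multilevel()` is a hypothetical helper, and the variance components and sampling variances are made up rather than taken from the fitted model. It assumes independent sampling errors, so that the weight matrix is diagonal.

```r
# Hypothetical helper: overall and component-wise I^2 for a model with
# several variance components and independent sampling errors.
i2_multilevel <- function(sigma2, vi,
                          X = matrix(1, nrow = length(vi), ncol = 1)) {
  W  <- diag(1 / vi)
  P  <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
  vt <- (length(vi) - ncol(X)) / sum(diag(P))  # 'typical' sampling variance
  parts <- 100 * sigma2 / (sum(sigma2) + vt)   # one value per component
  list(total = sum(parts), parts = parts)
}

out <- i2_multilevel(sigma2 = c(0.06, 0.03),   # made-up variance components
                     vi = c(0.04, 0.09, 0.02, 0.12, 0.06, 0.05))
out
```

Because every component shares the same denominator, the component-wise values always add up to the overall value, and their ratio equals the ratio of the variance components.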
+ | |||
+ | ==== Multivariate Models ==== | ||
+ | |||
+ | Now we will consider the same type of generalization, | ||
+ | |||
+ | <code rsplus> | ||
+ | dat <- dat.berkey1998 | ||
+ | V <- lapply(split(dat[, | ||
+ | V <- bldiag(V) | ||
+ | res <- rma.mv(yi, V, mods = ~ outcome - 1, random = ~ outcome | trial, struct=" | ||
+ | res | ||
+ | </code> | ||
+ | <code output> | ||
+ | Multivariate Meta-Analysis Model (k = 10; method: REML) | ||
+ | |||
+ | Variance Components: | ||
+ | |||
+ | outer factor: trial | ||
+ | inner factor: outcome (nlvls = 2) | ||
+ | |||
+ | estim sqrt k.lvl fixed level | ||
+ | tau^2.1 | ||
+ | tau^2.2 | ||
+ | |||
+ | rho.AL | ||
+ | AL | ||
+ | PD 0.6088 | ||
+ | |||
+ | Test for Residual Heterogeneity: | ||
+ | QE(df = 8) = 128.2267, p-val < .0001 | ||
+ | |||
+ | Test of Moderators (coefficient(s) 1,2): | ||
+ | QM(df = 2) = 108.8616, p-val < .0001 | ||
+ | |||
+ | Model Results: | ||
+ | |||
+ | | ||
+ | outcomeAL | ||
+ | outcomePD | ||
+ | |||
+ | --- | ||
+ | Signif. codes: | ||
+ | </code> | ||
+ | |||
+ | Two things are worth noting here. First of all, we allow the amount of heterogeneity to differ for the two outcomes (AL = attachment level; PD = probing depth) by using an unstructured variance-covariance matrix for the true effects (i.e., '' | ||
+ | |||
+ | Therefore, a possible generalization of $I^2$ to this model is: | ||
+ | <code rsplus> | ||
+ | W <- solve(V) | ||
+ | X <- model.matrix(res) | ||
+ | P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W | ||
+ | 100 * res$tau2 / (res$tau2 + (res$k-res$p)/ | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 93.07407 82.84449 | ||
+ | </code> | ||
+ | Hence, about 93% of the total (unaccounted for) variance is due to heterogeneity in the true effects for outcome AL and about 83% due to heterogeneity in the true effects for outcome PD. | ||
+ | |||
+ | The approach above computes the ' | ||
+ | <code rsplus> | ||
+ | c(100 * res$tau2[1] / (res$tau2[1] + (sum(dat$outcome == " | ||
+ | 100 * res$tau2[2] / (res$tau2[2] + (sum(dat$outcome == " | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 94.8571 75.1876 | ||
+ | </code> | ||
+ | The two sets of values do not differ much here, but the difference could be larger if the sampling variances were very dissimilar for the two outcomes. | ||
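The difference between a pooled and an outcome-specific 'typical' sampling variance can be sketched in base R without a fitted model. Everything below is illustrative: the variance-covariance blocks and $\tau^2$ values are made up, and the outcome-specific version (restricting the count and the diagonal of $\mathbf{P}$ to one outcome at a time) is an assumed reading of the approach rather than a definitive implementation.

```r
# Made-up 2x2 sampling variance-covariance blocks for three trials,
# each reporting the outcomes "AL" and "PD" (illustrative values only).
blocks <- list(matrix(c(0.010, 0.003, 0.003, 0.020), 2),
               matrix(c(0.015, 0.004, 0.004, 0.030), 2),
               matrix(c(0.008, 0.002, 0.002, 0.025), 2))
k <- 2 * length(blocks)
V <- matrix(0, k, k)
for (i in seq_along(blocks)) {
  idx <- (2 * i - 1):(2 * i)
  V[idx, idx] <- blocks[[i]]  # place each block on the diagonal
}
outcome <- rep(c("AL", "PD"), times = length(blocks))
tau2    <- c(AL = 0.03, PD = 0.01)  # made-up heterogeneity estimates

W <- solve(V)
X <- model.matrix(~ outcome - 1)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W

# One 'typical' sampling variance across both outcomes
vt_all    <- (k - ncol(X)) / sum(diag(P))
i2_pooled <- 100 * tau2 / (tau2 + vt_all)

# Outcome-specific 'typical' sampling variances (assumed approach)
vt_by <- sapply(c("AL", "PD"), function(o)
  (sum(outcome == o) - 1) / sum(diag(P)[outcome == o]))
i2_by <- 100 * tau2 / (tau2 + vt_by)

rbind(i2_pooled, i2_by)
```

The more the sampling variances of the two outcomes differ, the further `vt_by` moves away from `vt_all`, and the two rows of the result diverge accordingly.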
+ | |||
+ | ==== Jackson et al. (2012) Approach ==== | ||
+ | |||
+ | For multivariate models, Jackson et al. (2012) describe a different approach for computing $I^2$-type statistics that is based on the variance-covariance matrix of the fixed effects under the model with random effects and the model without. So, we fit these two models: | ||
+ | <code rsplus> | ||
+ | res.R <- rma.mv(yi, V, mods = ~ outcome - 1, random = ~ outcome | trial, struct=" | ||
+ | res.F <- rma.mv(yi, V, mods = ~ outcome - 1, data=dat) | ||
+ | </code> | ||
+ | Then $I^2$-type statistics for the two outcomes can be computed with: | ||
+ | <code rsplus> | ||
+ | c(100 * (vcov(res.R)[1, | ||
+ | 100 * (vcov(res.R)[2, | ||
+ | </code> | ||
+ | <code output> | ||
+ | [1] 95.49916 76.42214 | ||
+ | </code> | ||
+ | These values are very similar to the ones obtained above when computing separate values for the ' | ||
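The logic of this approach is easiest to see in the univariate case. The base R sketch below assumes that the statistic is $100\% \times (\mbox{Var}_R - \mbox{Var}_F) / \mbox{Var}_R$, where $\mbox{Var}_R$ and $\mbox{Var}_F$ are the variances of the pooled estimate with and without random effects. `i2_jackson()` is a hypothetical helper and the input values are made up.

```r
# Hypothetical helper: I^2-type statistic from the variance of a fixed
# effect under the model with (var_R) and without (var_F) random effects.
i2_jackson <- function(var_R, var_F) 100 * (var_R - var_F) / var_R

vi   <- c(0.04, 0.09, 0.02, 0.12, 0.06)  # made-up sampling variances
tau2 <- 0.05                              # made-up heterogeneity estimate

var_F <- 1 / sum(1 / vi)           # pooled estimate, no random effects
var_R <- 1 / sum(1 / (vi + tau2))  # pooled estimate, with random effects

i2_R <- i2_jackson(var_R, var_F)
i2_R
```

When all sampling variances are equal to $v$, this reduces to $100\% \times \tau^2 / (\tau^2 + v)$, which matches the standard definition in that special case.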
+ | |||
+ | ==== References ==== | ||
+ | |||
+ | Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). // | ||
+ | |||
+ | Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. // | ||
+ | |||
+ | Jackson, D., White, I. R., & Riley, R. D. (2012). Quantifying the impact of between-study heterogeneity in multivariate meta-analyses. // | ||
+ | |||
+ | Konstantopoulos, | ||
+ | |||
+ | Nakagawa, S., & Santos, E. S. A. (2012). Methodological issues and advances in biological meta-analysis. // | ||
+ | |||
+ | Takkouche, B., Cadarso-Suárez, | ||
+ | |||
+ | Takkouche, B., Khudyakov, P., Costa-Bouzas, | ||
tips/i2_multilevel_multivariate.txt · Last modified: 2022/10/24 10:10 by Wolfgang Viechtbauer