Differences

This shows you the differences between two versions of the page.

--- analyses:konstantopoulos2011 [2018/12/08 12:56] – external edit 127.0.0.1
+++ analyses:konstantopoulos2011 [2020/05/01 14:01] – Wolfgang Viechtbauer
@@ Line 21: / Line 21: @@
 </code>
 <code output>
-   district study year    yi    vi
+   district school study year     yi    vi
-        11     1 1976 -0.18 0.118
+        11      1     1 1976 -0.180 0.118
-        11     2 1976 -0.22 0.118
+        11      2     2 1976 -0.220 0.118
-        11     3 1976  0.23 0.144
+        11      3     3 1976  0.230 0.144
-        11     4 1976 -0.30 0.144
+        11      4     4 1976 -0.300 0.144
-        12     5 1989  0.13 0.014
+        12      1     5 1989  0.130 0.014
-        12     6 1989 -0.26 0.014
+        12      2     6 1989 -0.260 0.014
-        12     7 1989  0.19 0.015
+        12      3     7 1989  0.190 0.015
-        12     8 1989  0.32 0.024
+        12      4     8 1989  0.320 0.024
-        18     9 1994  0.45 0.023
+        18      1     9 1994  0.450 0.023
-       18    10 1994  0.38 0.043
+       18      2    10 1994  0.380 0.043
-       18    11 1994  0.29 0.012
+       18      3    11 1994  0.290 0.012
-       27    12 1976  0.16 0.020
+       27      1    12 1976  0.160 0.020
-       27    13 1976  0.65 0.004
+       27      2    13 1976  0.650 0.004
-       27    14 1976  0.36 0.004
+       27      3    14 1976  0.360 0.004
-       27    15 1976  0.60 0.007
+       27      4    15 1976  0.600 0.007
 </code>
 So, 4 studies were conducted in district 11, 4 studies in district 12, 3 studies in district 18, and so on. Variables ''yi'' and ''vi'' are the standardized mean differences and corresponding sampling variances.
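 As a quick check of these counts, the rows shown above can be tabulated by district. This is only a minimal sketch: it re-enters the identifiers of the 15 displayed rows by hand, and the object name ''dat.head'' is made up for illustration (the full dataset contains further districts).
 <code rsplus>
 # district, school, and study identifiers for the 15 rows displayed above
 # (dat.head is a hypothetical name, not an object from the original analysis)
 dat.head <- data.frame(
    district = rep(c(11, 12, 18, 27), times = c(4, 4, 3, 4)),
    school   = c(1:4, 1:4, 1:3, 1:4),
    study    = 1:15)
 # number of studies (rows) per district
 tab <- table(dat.head$district)
 tab
 </code>
 For these rows, the table gives 4, 4, 3, and 4 studies for districts 11, 12, 18, and 27, in line with the description above.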
@@ Line 76: / Line 76: @@
    4   3   4   4  11   3   8   6   5   4
 </code>
-So, as noted in the article, the data have an unbalanced structure, with the number of studies per district ranging from 3 to 11 (''range(table(dat$district))'') with an average of 5.1 (''round(mean(table(dat$district)), 1)'').
+So, as noted in the article, the data have an unbalanced structure, with the number of studies/schools per district ranging from 3 to 11 (''range(table(dat$district))'') with an average of 5.1 (''round(mean(table(dat$district)), 1)'').
 To obtain the descriptives about the effect size estimates per district (Table 3 in the paper), we can use:
@@ Line 99: / Line 99: @@
 ==== Two-Level Model ====
-First, a standard (two-level) random-effects model is fitted to the data. We can do the same with:
+First, a standard (two-level) random-effects model is fitted to the data. Here, we treat the 56 studies as independent (which, as we will see later, is not justified). We can fit such a model with:
 <code rsplus>
 res <- rma(yi, vi, data=dat)
@@ Line 112: / Line 112: @@
 H^2 (total variability / sampling variability):  18.89
 Test for Heterogeneity:
 Q(df = 55) = 578.864, p-val < .001
 Model Results:
-estimate       se     zval     pval    ci.lb    ci.ub
+estimate     se   zval   pval  ci.lb  ci.ub
-   0.128    0.044    2.916    0.004    0.042    0.214       **
+   0.128  0.044  2.916  0.004  0.042  0.214  **
 ---
@@ Line 148: / Line 148: @@
 R^2 (amount of heterogeneity accounted for):            0.00%
 Test for Residual Heterogeneity:
 QE(df = 54) = 550.260, p-val < .001
-Test of Moderators (coefficient(s) 2):
+Test of Moderators (coefficient 2):
 QM(df = 1) = 1.383, p-val = 0.240
 Model Results:
                       estimate     se   zval   pval   ci.lb  ci.ub
 intrcpt                  0.126  0.044  2.859  0.004   0.040  0.212  **
 I(year - mean(year))     0.005  0.004  1.176  0.240  -0.003  0.014
 ---
@@ Line 173: / Line 173: @@
 Multivariate Meta-Analysis Model (k = 56; method: REML)
 Variance Components:
            estim   sqrt  nlvls  fixed  factor
 sigma^2    0.088  0.297     56     no   study
 Test for Heterogeneity:
 Q(df = 55) = 578.864, p-val < .001
 Model Results:
-estimate       se     zval     pval    ci.lb    ci.ub
+estimate     se   zval   pval  ci.lb  ci.ub
-   0.128    0.044    2.916    0.004    0.042    0.214       **
+   0.128  0.044  2.916  0.004  0.042  0.214  **
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 </code>
-The ''random = ~ 1 | study'' argument adds random effects corresponding to the study level to the model. Note that these are the same results as we obtained earlier. As before, moderators/covariates can be added to the model via the ''mods'' argument.
+The ''random = ~ 1 | study'' argument adds random effects corresponding to the study level to the model (which is a unique value for every row in the dataset, so we are essentially adding random effects for every estimate to the model). Note that these are the same results as we obtained earlier. As before, moderators/covariates can be added to the model via the ''mods'' argument.
 ==== Three-Level Model ====
@@ Line 201: / Line 201: @@
 Multivariate Meta-Analysis Model (k = 56; method: REML)
 Variance Components:
            estim   sqrt  nlvls  fixed          factor
 sigma^2.1  0.065  0.255     11     no        district
 sigma^2.2  0.033  0.181     56     no  district/study
 Test for Heterogeneity:
 Q(df = 55) = 578.864, p-val < .001
 Model Results:
-estimate       se     zval     pval    ci.lb    ci.ub
+estimate     se   zval   pval  ci.lb  ci.ub
-   0.185    0.085    2.185    0.029    0.019    0.350        *
+   0.185  0.085  2.185  0.029  0.019  0.350  *
 ---
@@ Line 219: / Line 219: @@
 </code>
 These results correspond to those given on the left-hand side of Table 5 in the paper. Somewhat confusingly, the results given in the table appear to be based on ML (instead of REML) estimation. With the argument ''method="ML"'', we could reproduce the results given in the paper more closely.
+**Note**: We would obtain the same results when using ''random = ~ 1 | district/school''. While the school variable always starts at 1 within each district, using this notation means that random effects should be added for each level of ''district'' and for each level of ''school'' within each level of ''district''. The latter is the same as adding random effects for every level of ''study'', so the results will be identical.
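 The equivalence of the two nesting notations can be illustrated with a small base R sketch. It uses only the identifiers of the 15 rows displayed at the top of the page, re-entered by hand, and the object names (''n.school'', ''nested'', and so on) are chosen here for illustration. No model is fitted; the point is simply that school within district identifies the same units as study.
 <code rsplus>
 # identifiers for the 15 rows displayed at the top of the page
 district <- rep(c(11, 12, 18, 27), times = c(4, 4, 3, 4))
 school   <- c(1:4, 1:4, 1:3, 1:4)
 study    <- 1:15
 # school on its own is not a unique identifier (it restarts at 1 in each district)
 n.school <- length(unique(school))
 # school nested within district identifies every row uniquely, just like study
 nested   <- interaction(district, school, drop = TRUE)
 n.nested <- nlevels(nested)
 n.study  <- length(unique(study))
 c(school = n.school, nested = n.nested, study = n.study)
 </code>
 Here ''school'' has only 4 distinct values, while the district-by-school combinations and ''study'' both have 15 levels, one per row. This is why the nested term leads to the same random effects structure as ''district/study''.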
 ==== Profile Likelihood Plots ====
-Whenever we start fitting more complicated models with the ''rma.mv()'' function, it is a good idea to check the profile likelihood plots of the variance components of the model. The ''profile()'' function can be used to obtain such plots. Here, the model includes two variance components, which are denoted as $\sigma^2_1$ and $\sigma^2_2$. Likelihood profiles for these two components can be obtained with:
+Whenever we start fitting more complicated models with the ''rma.mv()'' function, it is a good idea to check the profile likelihood plots of the variance components of the model. The ''profile()'' function can be used to obtain such plots. Here, the model includes two variance components, which are denoted as $\sigma^2_1$ (for between-district heterogeneity) and $\sigma^2_2$ (for between-school-within-district heterogeneity). Likelihood profiles for these two components can be obtained with:
 <code rsplus>
 par(mfrow=c(2,1))
@@ Line 244: / Line 246: @@
 [1] 0.665
 </code>
-Therefore, the underlying true effects within districts are estimated to correlate quite strongly.
+Therefore, the underlying true effects within districts are estimated to correlate quite strongly (the simpler two-level model we fitted at the beginning ignores this dependence).
 Also, it is worth noting that the sum of the two variance components can be interpreted as the total amount of heterogeneity in the true effects:
@@ Line 264: / Line 266: @@
 Multivariate Meta-Analysis Model (k = 56; method: REML)
 Variance Components:
 outer factor: district      (nlvls = 11)
-inner factor: factor(study) (nlvls = 11)
+inner factor: factor(study) (nlvls = 56)
            estim   sqrt  fixed
 tau^2      0.098  0.313     no
 rho        0.665            no
 Test for Heterogeneity:
 Q(df = 55) = 578.864, p-val < .001
 Model Results:
-estimate       se     zval     pval    ci.lb    ci.ub
+estimate     se   zval   pval  ci.lb  ci.ub
-   0.185    0.085    2.185    0.029    0.019    0.350        *
+   0.185  0.085  2.185  0.029  0.019  0.350  *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 </code>
-The ''random = ~ factor(study) | district'' argument adds correlated random effects for the different studies within districts to the model, where the variance-covariance matrix of the random effects takes on a compound symmetric structure (''struct="CS"'' is the default). Note that the estimate of $\rho$ that is obtained is exactly the same as the ICC value we computed earlier based on the multilevel model. Also, the estimate of $\tau^2$ obtained from the multivariate parameterization is the same as the total amount of heterogeneity computed earlier based on the multilevel model.
+The ''random = ~ factor(study) | district'' argument adds correlated random effects for the different studies within districts to the model, where the variance-covariance matrix of the random effects takes on a compound symmetric structure (''struct="CS"'' is the default). Note that the estimate of $\rho$ that is obtained is exactly the same as the ICC value we computed earlier based on the multilevel model. Also, the estimate of $\tau^2$ obtained from the multivariate parameterization is the same as the total amount of heterogeneity computed earlier based on the multilevel model. Note that ''random = ~ factor(school) | district'' would again yield the same results.
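 The correspondence between the two parameterizations can be sketched in base R. The values below are the rounded variance components printed in the multilevel output above (0.065 and 0.033), so the resulting correlation differs slightly from the reported 0.665. The district with three studies and all object names are hypothetical and used only for illustration.
 <code rsplus>
 sigma2.1 <- 0.065   # between-district variance (rounded, from the multilevel output)
 sigma2.2 <- 0.033   # between-study-within-district variance (rounded)
 tau2 <- sigma2.1 + sigma2.2   # total heterogeneity
 rho  <- sigma2.1 / tau2       # correlation of true effects within a district
 k <- 3                        # hypothetical district with three studies
 # multilevel form: shared district effect plus independent study effects
 G.ml <- sigma2.1 * matrix(1, k, k) + sigma2.2 * diag(k)
 # multivariate form: compound symmetry with variance tau2 and correlation rho
 G.cs <- tau2 * (rho * matrix(1, k, k) + (1 - rho) * diag(k))
 all.equal(G.ml, G.cs)
 round(c(tau2 = tau2, rho = rho), 3)
 </code>
 Both constructions give the same variance-covariance matrix for the true effects within a district: the diagonal elements equal the sum of the two variance components and the off-diagonal elements equal the between-district variance. Since a variance component cannot be negative, the multilevel form can only represent non-negative values of $\rho$.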
 As long as $\rho$ is estimated to be positive, the multilevel and multivariate parametrizations are in essence identical. In fact, the log likelihoods of the two models should be identical, which we can confirm with:
@@ Line 309: / Line 311: @@
 Again, both plots indicate that the estimates obtained in fact correspond to the peaks of the respective likelihood profiles, with decreasing log likelihoods as the values of parameters are moved away from the actual estimates.
-Since the log likelihood drops off quite dramatically when $\rho$ is set equal to a value very close to 1, the left-hand side of the profile gets 'squished' together at the top and it is more difficult to see the curvature around the estimate. One can change the x-axis limits with the ''xlim'' argument. For example, with ''profile(res.mv, rho=1, xlim=c(0.3,0.9))'', a nicer profile plot for $\rho$ can be obtained.
+Since the log likelihood drops off quite dramatically when $\rho$ is set equal to a value very close to 1, the left-hand side of the profile gets 'squished' together at the top and it is more difficult to see the curvature around the estimate. One can change the x-axis limits with the ''xlim'' argument. For example, with ''profile(res.mv, rho=1, xlim=c(0.3,0.9))'' we can obtain a nicer profile plot for $\rho$.
 ==== Uncorrelated Sampling Errors ====
@@ Line 324: / Line 326: @@
 Multivariate Meta-Analysis Model (k = 56; method: REML)
 Variance Components:
            estim   sqrt  nlvls  fixed    factor
 sigma^2    0.083  0.288     11     no  district
 Test for Heterogeneity:
 Q(df = 55) = 578.864, p-val < .001
 Model Results:
-estimate       se     zval     pval    ci.lb    ci.ub
+estimate     se   zval   pval  ci.lb  ci.ub
-   0.196    0.090    2.179    0.029    0.020    0.372        *
+   0.196  0.090  2.179  0.029  0.020  0.372  *
 ---
@@ Line 352: / Line 354: @@
 Multivariate Meta-Analysis Model (k = 56; method: REML)
 Variance Components:
            estim   sqrt  nlvls  fixed    factor
 sigma^2.1  0.041  0.203     11     no  district
 sigma^2.2  0.026  0.162     56     no     study
 outer factor: district      (nlvls = 11)
 inner factor: factor(study) (nlvls = 56)
            estim   sqrt  fixed
 tau^2      0.030  0.174     no
 rho        0.784            no
 Test for Heterogeneity:
 Q(df = 55) = 578.864, p-val < .001
 Model Results:
-estimate       se     zval     pval    ci.lb    ci.ub
+estimate     se   zval   pval  ci.lb  ci.ub
-   0.185    0.085    2.185    0.029    0.019    0.350        *
+   0.185  0.085  2.185  0.029  0.019  0.350  *
 ---