Credé et al. (2010)
The Methods and Data
The meta-analysis by Credé, Roch, and Kieszczynka (2010) examined the extent to which class attendance in college is related to college grades and, more broadly, to students' grade point average (GPA). The studies included in the meta-analysis reported either the correlation between attendance and grades for a sample of students in a particular class or the correlation between attendance and GPA. Aside from the interesting subject matter, the meta-analysis is also instructive because it provides a nice illustration of the use of a multilevel meta-analytic model. The reason for using such a model will be discussed further below.
The first author (who was a fellow PhD student at the University of Illinois, Urbana-Champaign, in the early 2000s) was kind enough to share the dataset with me, and it is now included in the metafor package. Note that this dataset differs just slightly from the one that was actually used by Credé et al. (2010) in their meta-analysis, but any differences are negligible and not relevant for the purposes of this illustration.
The data can be loaded with:
library(metafor)
dat <- dat.crede2010
(I copy the dataset into dat, which is a bit shorter and therefore easier to type further below.) Let's examine the first 15 rows of the dataset:
head(dat, 15)
   studyid year       source sampleid criterion      class  ni      ri
1        1 2009 dissertation        1     grade nonscience  76  0.8860
2        2 1975      journal        1     grade nonscience 297  0.3000
3        3 2007 dissertation        1       gpa       <NA> 191  0.7200
4        4 1989      journal        1     grade nonscience 265  0.4750
5        4 1989      journal        2     grade nonscience 154  0.3340
6        5 2008      journal        1     grade nonscience 162  0.6150
7        6 1999      journal        1     grade nonscience  28  0.1450
8        6 1999      journal        2     grade nonscience  33  0.2300
9        6 1999      journal        3     grade nonscience  47  0.2700
10       6 1999      journal        4     grade nonscience  25 -0.0228
11       6 1999      journal        5     grade nonscience  48  0.4290
12       6 1999      journal        6     grade nonscience  39  0.3490
13       6 1999      journal        7     grade nonscience  41  0.2200
14       6 1999      journal        8     grade nonscience  35  0.3390
15       6 1999      journal        9     grade nonscience  46  0.4470
Variable studyid is a study identifier, year denotes the publication year of the study, and the source variable indicates whether a particular study was a published journal article, a dissertation, or was obtained from some other source. Variable sampleid is a sample (within study) identifier (some studies included multiple samples), criterion indicates whether the correlation reported in a particular row corresponds to the relationship between attendance and grades within a particular class or to overall GPA, and the class variable distinguishes science from non-science classes (missing when the criterion is GPA). Finally, variables ni and ri contain the sample sizes and the observed correlation coefficients, respectively.
For the analysis, we will use r-to-z transformed correlation coefficients and focus on the relationship between attendance and grades in particular classes. We use the escalc() function to compute the transformed coefficients and select only the rows of the dataset where the criterion variable is equal to grade.
dat <- escalc(measure="ZCOR", ri=ri, ni=ni, data=dat, subset=criterion=="grade")
head(dat, 15)
   studyid . sampleid .  ni      ri      yi     vi
1        1 .        1 .  76  0.8860  1.4030 0.0137
2        2 .        1 . 297  0.3000  0.3095 0.0034
4        4 .        1 . 265  0.4750  0.5165 0.0038
5        4 .        2 . 154  0.3340  0.3473 0.0066
6        5 .        1 . 162  0.6150  0.7169 0.0063
7        6 .        1 .  28  0.1450  0.1460 0.0400
8        6 .        2 .  33  0.2300  0.2342 0.0333
9        6 .        3 .  47  0.2700  0.2769 0.0227
10       6 .        4 .  25 -0.0228 -0.0228 0.0455
11       6 .        5 .  48  0.4290  0.4587 0.0222
12       6 .        6 .  39  0.3490  0.3643 0.0278
13       6 .        7 .  41  0.2200  0.2237 0.0263
14       6 .        8 .  35  0.3390  0.3530 0.0312
15       6 .        9 .  46  0.4470  0.4809 0.0233
17       7 .        1 . 350  0.4010  0.4248 0.0029
Variables yi and vi contain the transformed correlation coefficients and the corresponding sampling variances, respectively.
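As a side note, measure="ZCOR" applies Fisher's r-to-z transformation, $z = \frac{1}{2} \ln\left(\frac{1+r}{1-r}\right)$, with sampling variance $1/(n-3)$. As a quick optional check, we can reproduce the yi and vi values for the first row by hand:

r <- 0.886   # observed correlation in the first row
n <- 76      # corresponding sample size
round(c(yi=atanh(r), vi=1/(n-3)), digits=4)   # should give 1.4030 and 0.0137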
Descriptive Statistics
First, we will check how many studies and how many correlations are included in the dataset:
c(studies=length(unique(dat$studyid)), correlations=nrow(dat))
     studies correlations
          54           67
Hence, the dataset includes 54 studies, which provide a total of 67 (transformed) correlation coefficients. So, as noted above, some studies included multiple samples (e.g., for different classes or for multiple sections of the same class) and reported separate correlation coefficients for the different samples. We can check how often this was the case with:
table(table(dat$studyid))
 1  2  9
48  5  1
Therefore, most (i.e., 48) studies only provided a single correlation coefficient, 5 studies reported two coefficients, and one study reported the results for 9 different samples (which was study 6, as we can see in the data output further above).
We can obtain some descriptive statistics about the observed correlation coefficients with:
round(c(summary(dat$ri), SD=sd(dat$ri)), digits=2)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.      SD
 -0.02    0.30    0.39    0.40    0.49    0.89    0.17
The observed correlations range from -0.02 to 0.89, with a median of 0.39 and a mean of 0.40.
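To get a quick visual impression of the distribution of the observed correlations, one could also draw a simple histogram (an optional sketch using base R graphics):

hist(dat$ri, breaks=20, xlab="Observed Correlation", main="Distribution of the Observed Correlations")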
Standard Random-Effects Model
We start out by fitting a standard random-effects model to the data. Here, we treat the 67 coefficients in the dataset as independent (which we will later see is not justified). We can fit such a model with:
res <- rma(yi, vi, data=dat)
res
Random-Effects Model (k = 67; tau^2 estimator: REML)

tau^2 (estimated amount of total heterogeneity): 0.0511 (SE = 0.0104)
tau (square root of estimated tau^2 value):      0.2261
I^2 (total heterogeneity / total variability):   93.83%
H^2 (total variability / sampling variability):  16.21

Test for Heterogeneity:
Q(df = 66) = 1068.7213, p-val < .0001

Model Results:

estimate      se     zval    pval   ci.lb   ci.ub
  0.4547  0.0300  15.1343  <.0001  0.3958  0.5136  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The Q-test and the estimate of $\tau^2$ suggest that the observed outcomes are heterogeneous. Hence, the strength of the relationship between class attendance and grades appears to vary across studies. The estimated average transformed correlation coefficient (i.e., 0.4547) is clearly above 0, indicating that there is on average a positive relationship between these two variables. For easier interpretation, we can back-transform the results using the z-to-r transformation:
predict(res, transf=transf.ztor, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
 0.43  0.38  0.47  0.01  0.72
Hence, the estimated average correlation is equal to 0.43 (95% CI: 0.38 to 0.47), which is a fairly sizable association between these two variables.2) In fact, as the authors of the meta-analysis note, this makes class attendance "a better predictor of college grades than any other known predictor of academic performance, including scores on standardized admissions tests such as the SAT, high school GPA, study habits, and study skills." Of course, we do not know whether attendance per se leads to better grades, whether better students simply tend to attend more classes, or whether some third variable (e.g., motivation) influences both attendance and grades, but this issue is beyond the scope of this illustration (see the paper by Credé et al., 2010, for some further discussion of this point).
Finally, it is worth noting that the 95% prediction interval is quite wide, ranging from a value just above 0 to a correlation of 0.72, which reflects the substantial heterogeneity in the results. Hence, while there appears to be a sizable correlation between these two variables on average, the actual strength of the association in any particular situation/class might be quite a bit weaker or stronger.
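To make the connection with the model output explicit, the bounds of this prediction interval can be reproduced by hand (a minimal sketch of the default computation, which assumes normally distributed true outcomes): on the r-to-z scale, the bounds are the estimated average plus/minus 1.96 times the square root of the sum of the estimated heterogeneity and the squared standard error, which are then back-transformed.

pred <- coef(res)   # estimated average transformed correlation (0.4547)
transf.ztor(pred + c(-1,1) * qnorm(0.975) * sqrt(res$tau2 + res$se^2))   # approx. 0.01 and 0.72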
Hunter and Schmidt Method
When comparing the results reported in the paper (see the row labeled "All Class Grades" in Table 1) with what we obtained above, we see some slight discrepancies. This is because the analyses in the paper used a different meta-analytic approach, based on the methods described by Hunter and Schmidt (see here for some further details on this method). We can obtain essentially the same results as those given in the paper with:
tmp <- escalc(measure="COR", ri=ri, ni=ni, data=dat, vtype="AV")
res <- rma(yi, vi, weights=ni, data=tmp, method="HS")
print(res, digits=2)
Random-Effects Model (k = 67; tau^2 estimator: HS)

tau^2 (estimated amount of total heterogeneity): 0.02 (SE = 0.01)
tau (square root of estimated tau^2 value):      0.14
I^2 (total heterogeneity / total variability):   90.57%
H^2 (total variability / sampling variability):  10.61

Test for Heterogeneity:
Q(df = 66) = 741.15, p-val < .01

Model Results:

estimate    se   zval  pval  ci.lb  ci.ub
    0.44  0.04  12.45  <.01   0.37   0.51  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since we directly meta-analyze the correlation coefficients in this approach, the value given under estimate is the estimated average correlation coefficient, which matches what the authors report (i.e., 0.44). The value reported as tau in the output is the estimated standard deviation of the true correlations, which also matches what the authors report as $SD_{\rho}$ (i.e., 0.14). Finally, the authors report an 80% prediction interval (with bounds labeled as "10%CV" and "90%CV"), which we can obtain with:4)
predict(res, digits=2, level=80, pi.type="simple")
 pred   se ci.lb ci.ub pi.lb pi.ub
 0.44 0.04  0.40  0.49  0.26  0.63
There is a slight difference in the upper bound (0.63 versus 0.62), but this might just be a rounding issue.
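If I understand the pi.type="simple" computation correctly, these bounds are simply the estimated average correlation plus/minus the 90th-percentile standard normal quantile times tau (i.e., the uncertainty in the estimated average is not incorporated). We can check this against the output above:

round(coef(res) + c(-1,1) * qnorm(0.90) * sqrt(res$tau2), digits=2)   # should (approximately) reproduce pi.lb and pi.ub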
Standard RE Model with the rma.mv() Function
Let us return for the moment to the standard random-effects model we fitted earlier using the rma() function. We can also fit the same model with the rma.mv() function (see also here for a more detailed discussion on this). The syntax for the random-effects model is:
dat$id <- 1:nrow(dat)
res <- rma.mv(yi, vi, random = ~ 1 | id, data=dat)
res
Multivariate Meta-Analysis Model (k = 67; method: REML)

Variance Components:

          estim    sqrt  nlvls  fixed  factor
sigma^2  0.0511  0.2261     67     no      id

Test for Heterogeneity:
Q(df = 66) = 1068.7213, p-val < .0001

Model Results:

estimate      se     zval    pval   ci.lb   ci.ub
  0.4547  0.0300  15.1343  <.0001  0.3958  0.5136  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As shown above, we first create a variable called id that takes on a unique value for every row in the dataset and then use random = ~ 1 | id in rma.mv() to add a random effect corresponding to each row of the dataset to the model. This is in fact what happens automatically in rma(), but when using the rma.mv() function, we have to add such random effects manually. As we can see above, the results are identical to those we obtained earlier with the rma() function.
Multilevel Meta-Analysis Model
As noted earlier, some studies reported correlation coefficients for multiple samples within the same study. The dataset therefore has a multilevel structure, with samples nested within studies. A multilevel meta-analysis of these data can be used to estimate and account for the amount of heterogeneity between studies and between samples within studies. For this, we add a random effect not only at the study level, but also at the sample (within studies) level. The syntax for this is:
res <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
res
Multivariate Meta-Analysis Model (k = 67; method: REML)

Variance Components:

            estim    sqrt  nlvls  fixed            factor
sigma^2.1  0.0376  0.1939     54     no           studyid
sigma^2.2  0.0159  0.1259     67     no  studyid/sampleid

Test for Heterogeneity:
Q(df = 66) = 1068.7213, p-val < .0001

Model Results:

estimate      se     zval    pval   ci.lb   ci.ub
  0.4798  0.0331  14.5167  <.0001  0.4151  0.5446  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
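For reference, the model fitted here can be written out as follows (a sketch, using $z_{ij}$ to denote the $j$th transformed correlation coefficient from the $i$th study): $$z_{ij} = \mu + u_i + w_{ij} + \varepsilon_{ij}, \quad u_i \sim N(0, \sigma^2_1), \quad w_{ij} \sim N(0, \sigma^2_2), \quad \varepsilon_{ij} \sim N(0, v_{ij}),$$ where $\mu$ denotes the average true outcome, $u_i$ and $w_{ij}$ are the study- and sample-level random effects (whose variances correspond to sigma^2.1 and sigma^2.2 in the output), and $v_{ij}$ is the known sampling variance of the estimate.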
The estimated variance components indicate that there is more heterogeneity between studies (0.0376) than between samples within studies (0.0159). Again, we can back-transform the results with:
predict(res, transf=transf.ztor, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
 0.45  0.39  0.50  0.02  0.73
Although these results are not substantially different from what we obtained earlier, this model properly accounts for the multilevel structure of the data.
Intraclass Correlation of the True Outcomes
The multilevel model above accounts for potential dependence in multiple outcomes coming from the same study. In fact, it can be shown that the model implies an intraclass correlation coefficient (ICC) of $$\rho = \frac{\sigma^2_1}{\sigma^2_1 + \sigma^2_2}$$ for the true outcomes within the same study (where $\sigma^2_1$ is the variance component corresponding to the study level and $\sigma^2_2$ is the variance component corresponding to the sample level).
In the present case, the ICC is estimated to be:
round(res$sigma2[1] / sum(res$sigma2), digits=4)
[1] 0.7033
Therefore, the underlying true outcomes (i.e., transformed correlation coefficients) are estimated to correlate quite strongly within studies. On the other hand, the standard random-effects model we fitted earlier ignores this dependence and treats all rows in the dataset as independent.
Also, it is worth noting that the sum of the two variance components can be interpreted as the total amount of heterogeneity in the true outcomes:
round(sum(res$sigma2), digits=4)
[1] 0.0535
The relevance of this value will become clear in a moment.
Multivariate Parameterization of the Model
We can fit the same model using a multivariate parameterization, which directly provides us with the ICC. For this, we use the syntax:
res <- rma.mv(yi, vi, random = ~ sampleid | studyid, data=dat)
res
Multivariate Meta-Analysis Model (k = 67; method: REML)

Variance Components:

outer factor: studyid  (nlvls = 54)
inner factor: sampleid (nlvls = 9)

        estim    sqrt  fixed
tau^2  0.0535  0.2312     no
rho    0.7033             no

Test for Heterogeneity:
Q(df = 66) = 1068.7213, p-val < .0001

Model Results:

estimate      se     zval    pval   ci.lb   ci.ub
  0.4798  0.0331  14.5167  <.0001  0.4151  0.5446  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The random = ~ sampleid | studyid argument adds correlated random effects for the different samples within studies to the model, where the variance-covariance matrix of the random effects takes on a compound symmetric structure (struct="CS" is the default). Note that the estimate given next to rho is exactly the same as the ICC value we computed earlier based on the multilevel model. Also, the estimate given next to tau^2 in the multivariate parameterization is the same as the total amount of heterogeneity computed earlier based on the multilevel model.
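We can confirm these correspondences numerically (res here is the multivariate fit from above; the two reference values were computed earlier from the variance components of the multilevel fit):

round(c(tau2=res$tau2, rho=res$rho), digits=4)   # should equal 0.0535 and 0.7033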
As long as rho is estimated to be positive, the multilevel and multivariate parameterizations are in essence identical. In fact, the log likelihoods of the two models should be identical, which we can confirm with:
res1 <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
res2 <- rma.mv(yi, vi, random = ~ sampleid | studyid, data=dat)
fitstats(res1, res2)
               res1       res2
logLik:    1.304449   1.304449
deviance: -2.608898  -2.608898
AIC:       3.391102   3.391102
BIC:       9.960066   9.960066
AICc:      3.778198   3.778198
Likelihood Ratio Test to Compare Models
The analysis above raises the question whether the additional complexity of the multilevel model is warranted. To address this question, we can conduct a likelihood ratio test, comparing the fit of the multilevel model with that of the standard random-effects model:
res0 <- rma.mv(yi, vi, random = ~ 1 | id, data=dat)
res1 <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
anova(res0, res1)
        df    AIC     BIC   AICc  logLik    LRT   pval        QE
Full     3 3.3911  9.9601 3.7782  1.3044               1068.7213
Reduced  2 6.5060 10.8853 6.6964 -1.2530 5.1149 0.0237 1068.7213
The multilevel model provides a significantly better fit, as indicated by the likelihood ratio test ($\chi^2 = 5.11$, $\mbox{df} = 1$, $p = 0.0237$). Also, the information criteria (i.e., the AIC, BIC, and the corrected AIC) all point towards the multilevel model as being the preferable model (for these statistics, smaller is better).
Uncorrelated Sampling Errors
The multilevel model above accounts for possible dependence in the underlying true outcomes (i.e., the true transformed correlation coefficients) within studies. That is, if the true association is quite strong for a particular section of a class, then it probably also tends to be quite strong for other sections (and vice-versa). In fact, this is what the high ICC value above suggests.
It is important to note that the model still assumes that the sampling errors of the observed outcomes are independent. This is typically a safe assumption as long as there is no overlap in the data/subjects used to compute the various estimates. In the present example, this would seem to be the case, given that multiple correlation coefficients within the same study were obtained from independent samples.
However, when multiple estimates are obtained from the same group of subjects (e.g., the correlation between attendance and grades in different classes for the same group of subjects), then this assumption would be violated. In this case, one also needs to account for the dependence in the sampling errors of the estimates, which requires additional work, but is beyond the scope of this example.
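Although a full treatment of this issue is beyond the scope here, one commonly used safeguard in such situations is to combine the multilevel model with cluster-robust inference. The sketch below uses metafor's robust() function, clustering the estimates by study; note that this does not model the dependence in the sampling errors directly (which would require constructing an approximate variance-covariance matrix of the sampling errors), but it makes the standard error and the tests for the average effect robust to such dependence:

res <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
robust(res, cluster=dat$studyid)   # cluster-robust (sandwich-type) standard error, test, and CI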
A Common Mistake when Fitting the Multilevel Model
When fitting the multilevel model, a common mistake is to forget to add random effects at both levels. For example, one might mistakenly fit the model:
rma.mv(yi, vi, random = ~ 1 | studyid, data=dat)
Multivariate Meta-Analysis Model (k = 67; method: REML)

Variance Components:

          estim    sqrt  nlvls  fixed   factor
sigma^2  0.0528  0.2298     54     no  studyid

Test for Heterogeneity:
Q(df = 66) = 1068.7213, p-val < .0001

Model Results:

estimate      se     zval    pval   ci.lb   ci.ub
  0.4865  0.0332  14.6416  <.0001  0.4214  0.5517  ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This model only accounts for heterogeneity between studies, so it assumes that the true (transformed) correlation coefficients for different samples within the same study are all identical (or put differently, that the ICC is equal to 1). In the present case, the ICC in the multilevel model is actually quite high, so the results are not dramatically different, but this may not be true in other applications.
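In fact, under the multivariate parameterization, this mistaken model corresponds to fixing rho to 1, which we can verify with a quick check (the two fits should be essentially identical):

res0 <- rma.mv(yi, vi, random = ~ 1 | studyid, data=dat)
res2 <- rma.mv(yi, vi, random = ~ sampleid | studyid, rho=1, data=dat)
fitstats(res0, res2)   # (essentially) identical log likelihoods and information criteria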
We could in fact again conduct a likelihood ratio test to compare these two models:
res0 <- rma.mv(yi, vi, random = ~ 1 | studyid, data=dat)
res1 <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, data=dat)
anova(res0, res1)
        df     AIC     BIC    AICc  logLik     LRT   pval        QE
Full     3  3.3911  9.9601  3.7782  1.3044                1068.7213
Reduced  2 14.8533 19.2326 15.0438 -5.4267 13.4622 0.0002 1068.7213
These results indicate that the multilevel model with random effects at both levels fits significantly better than the simpler model that only adds a random effect at the study level.
Regardless of the result of this test, not adding random effects at both levels is typically an inadvertent mistake. Unless the estimate of the variance component at the sample level happens to be equal to zero (which is identical to simply leaving out this random effect from the model altogether), it should always be included.
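Relatedly, we can check the parenthetical remark above by fixing the sample-level variance component to zero (a quick check using the sigma2 argument of rma.mv(), where NA means that a component is estimated and a numeric value fixes it); this should give exactly the same fit as leaving out the sample-level random effect altogether:

res3 <- rma.mv(yi, vi, random = ~ 1 | studyid/sampleid, sigma2=c(NA, 0), data=dat)
res0 <- rma.mv(yi, vi, random = ~ 1 | studyid, data=dat)
fitstats(res3, res0)   # identical fits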
References
Credé, M., Roch, S. G., & Kieszczynka, U. M. (2010). Class attendance in college: A meta-analytic review of the relationship of class attendance with grades and student characteristics. Review of Educational Research, 80(2), 272–295. https://doi.org/10.3102/0034654310362998
Footnotes

2) A better estimate of the average correlation can be obtained with predict(res, transf=transf.ztor.int, targs=res$tau2, digits=2), which yields an estimate of 0.41 (95% CI: 0.36 to 0.46). Since the difference is relatively small in this case, we will ignore this technicality for the remainder of this illustration.

4) The argument pi.type="simple" requests a slightly simpler method for constructing the prediction interval, which corresponds to the approach used in the Hunter and Schmidt method.