--- title: "Meta-analyzing correlations" author: "Brenton Wiernik" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Meta-analyzing correlations} %\VignetteEngine{knitr::rmarkdown} %\usepackage[utf8]{inputenc} csl: "apa.csl" bibliography: vignette.yaml --- **Note: This vignette is a work in progress.** ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(psychmeta) options(width = 160) ``` This vignette will walk you through estimating barebones meta-analyses of correlations between multiple constructs. For more vignettes, see the [**psychmeta** overview](overview.html). ## Getting Started To begin, you will need your meta-analytic data sheet for analysis. We recommend the **rio** package for importing data to **R**. For an introduction to **rio**, see `vignette("rio", "rio")`. **psychmeta** assumes that your data are in "long" format, with each row corresponding to one effect size. For example, this is the format used in this data frame: ```{r} knitr::kable(data_r_meas_multi[1:10,]) ``` In this table, - **`sample_id`** contains labels indicating the sample each effect size is drawn from; - **`moderator`** is a moderator variable, each row containing the effect size's level for that moderator; - **`x_name`** and **`y_name`** are columns indicating the variables/constructs being related in the effect size; - **`n`** is the sample size for the effect size; - **`rxyi`** is the effect size (the correlation between the two constructs/variables); - **`rxxi`** and **`ryyi`** are the sample reliability values for the measures of the `x_name` and `y_name` variables, respectively; - **`citekey`** contains the citations keys for each study (used to generate bibliographies of included studies). You can see this data set includes correlations among three variables: `X`, `Y`, and `Z`, and that each sample contributes several effect sizes, one each for for different pairs of variables/constructs. If your data are in a different format, you can use the `reshape_wide2long()` function to reshape it. ## Estimating a Barebones Meta-Analysis Let's assume your data frame is called `coding_sheet`. ```{r} coding_sheet <- data_r_meas_multi head(coding_sheet) ``` The primary function to conduct meta-analyses of correlations is `ma_r()`. To conduct barebones meta-analyses, run: ```{r} ma_res <- ma_r(rxyi = rxyi, n = n, construct_x = x_name, construct_y = y_name, sample_id = sample_id, moderators = moderator, data = coding_sheet ) ``` - **`data`** is your data frame. - **`rxyi`**, **`n`**, **`construct_x`**, **`construct_y`**, **`sample_id`**, and **`moderators`** are the names of the columns in your data frame that contain the appropriate values. - **`rxyi`** is the correlation effect sizes; - **`n`** is the sample sizes; - **`construct_x`** and **`construct_y`** are the labels for the variables/constructs being correlated; - **`sample_id`** is the sample identification labels; - **`moderators`** is a vector of moderator variable names for the meta-analyses. - Column names can be provided either with quotes (e.g., `"rxyi"`, `"n"`) or without (e.g., `rxyi`, `n`). To conduct a barebones meta-analysis, at minimum, `n` and `rxyi` are needed. ### Modeling Options - By default, correlations are weighted by sample size. You can specify alternative weights using the `wt_type` argument. - Random-effects variance (τ^2^ or *SD~res~*^2^) is estimated using the Hunter-Schmidt estimator, computed using the unbiased sample variance estimator (i.e., dividing by $k-1$ rather than $k$). To use the maximum-likelihood estimator instead, specify `var_unbiased = FALSE`. - Barebones results are corrected for the small-sample bias in the correlation coefficient. To disable this correction, specify `correct_bias = FALSE`. - By default, confidence and credibility intervals are constructed using a _t_ distribution with $k-1$ degrees of freedom. To use a normal distribution instead, specify, `conf_method = "norm"` and `cred_method = "norm"`. To customize the coverage levels for these intervals, use the `conf_level` and `cred_level` arguments. ## The **psychmeta** Meta-Analysis Object A **psychmeta** meta-analsyis object is a data frame, with each row being a meta-analysis or subanalysis and each column containing information about or results from that analysis. For example, the results of the analysis above look like this: ```{r} ma_res ``` Each row corresponds to a different pair of variables/constructs (X-Y; X-Z; Y-Z) and level of the moderator variable (overall/all levels pooled together; moderator = 1; moderator = 2). - **`analysis_id`** is a numeric label for each analysis; - **`pair_id`** is a numeric label for each pair of variables/constructs (X-Y; X-Z; Y-Z); - **`construct_x`** and **`construct_y`** indicate which variables/constructs are being meta-analyzed. - **`analysis_type`** indicates the type of analysis. - "Overall" means an overall meta-analysis, pooling across all moderator levels. - "Simple Moderator" means a subgroup moderator analysis of only studies with the specified levels of the moderator variable(s) in the next column(s). (See below for how to conduct meta-analyses with multiple moderator variables or with continuous moderators) - **`meta_tables`** contains the principal meta-analysis results tables. - **`escalc`** contains tables of effect sizes, sampling error variances, weights, residuals, and other data. These tables can be used for follow-up analyses or with the **metafor** package for additional meta-analysis techniques. ## Viewing Results Summaries To view meta-anlaysis results tables, use the `summary()` function: ```{r} summary(ma_res) ``` In this table, - **`analysis_id`**, **`pair_id`**, **`construct_x`**, **`construct_y`**, and the moderator columns are defined as above. - **`k`** is the number of effect sizes contributing to each meta-analysis. `N` is the total sample size contributing to each meta-analysis. - **`mean_r`** is the weighted mean correlation. - **`sd_r`** is the weighted observed standard deviation of correlations. - **`se_r`** is the standard error of `mean_r`. - **`sd_res`** is the estimated random-effects standard deviation (residual _SD_ of correlations after accounting for sampling error). - **`CI_LL_95`** and **`CI_UL_95`** are the upper and lower bounds of the confidence interval for `mean_r`; the number indicates the coverage level (default: 95%). - **`CR_LL_80`** and **`CR_UL_80`** are the upper and lower bounds of the credibility interval for the estimated population distribution; the number indicates the coverage level (default: 80%). To view additional results, such as observed variance (`var_r`) or standard deviation of sampling errors (`sd_e`), use the `get_metatab()` function and select the appropriate columns: ```{r} names(get_metatab(ma_res)) get_metatab(ma_res)$var_r ``` To view all columns of this table, convert it to a `data.frame` or `tibble`: ```{r} dplyr::as_tibble(get_metatab(ma_res)) ``` ```{r} as.data.frame(get_metatab(ma_res)) ``` ### Moderator Analyses Results for subgroup analyses for different levels of categorical moderators are shown in the rows of the meta-analysis results table. To estimate confidence intervals for differences between levels or an omnibus ANOVA statistic, use the `anova()` function: ```{r} anova(ma_res) ``` ### Correcting for Statistical Artifacts See Artifact corrections](artifact_corrections.html). ## Outputting Results To output the main meta-analysis results table to RMarkdown, Word, HTML, PDF, or other formats, use the `metabulate()` function. For example, to output the above results to a Word document, run: ```{r, eval=FALSE} metabulate(ma_res, file = "meta-analysis_results.docx", output_format = "word") ``` ## Follow-Up Analyses ### Plotting You can add plots for each meta-analysis in `ma_res` using the `plot_forest()` and `plot_funnel()` functions: ```{r} ma_res <- plot_funnel(ma_res) ma_res <- plot_forest(ma_res) ``` You can view these plots using the `get_plots()` function. This will return a list of all of the plots in this results. Specify which meta-analysis you want to view plots for by passing its `analysis_id` to `[[`: ```{r, fig.show='hold'} get_plots(ma_res)[["forest"]][[2]] get_plots(ma_res)[["funnel"]][[2]] ``` For forest plots, if you select an "Overall" meta-analysis, it will include plots faceted by moderator levels (`"moderated"`) and not (`"unmoderated"`): ```{r, fig.show='hold'} get_plots(ma_res)[["forest"]][[1]][["moderated"]][["barebones"]] get_plots(ma_res)[["forest"]][[1]][["unmoderated"]][["barebones"]] ``` ### Heterogeneity Analyses **psychmeta** reports the random-effects standard deviaton (τ or *SD_res_*) and credibility intervals (`mean_r` ± _crit_ × *SD~res~*) in the main meta-analaysis results tables. To view confidence intervals for *SD_res_* or additional heterogeneity statistics, use the `heterogeneity()` function: ```{r} ma_res <- heterogeneity(ma_res) get_heterogeneity(ma_res)[[1]][["barebones"]] ``` ### Publication Bias and Sensitivity Analyses **psychmeta** supports cumulative meta-analysis for publication/small-sample bias detection, leave-1-out sensitivity analyses, and bootstrap confidence intervals using the sensitivity function: ```{r, fig.show='hold'} ma_res <- sensitivity(ma_res) get_cumulative(ma_res)[[1]][["barebones"]] get_cumulative(ma_res)[[1]][["barebones"]][["plots"]] ``` ```{r} get_bootstrap(ma_res)[[1]][["barebones"]] ```