Assessing Usefulness of Databases for Evidence Synthesis

About this vignette

When developing search strategies for evidence synthesis, it is standard practice to test different versions of a search against a set of studies already known to be relevant, called benchmark studies. This makes it possible to balance precision and sensitivity before screening begins.
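As a minimal sketch of that trade-off, the base R snippet below scores two hypothetical versions of a search against a toy benchmark set. The record IDs and the helper `score_search()` are invented for illustration and are not part of CiteSource. Precision is only approximated here, because before screening the benchmark hits are the only records known to be relevant.

```r
# Hypothetical record IDs; none of these come from the example data.
benchmark <- c("r01", "r02", "r03", "r04", "r05")

# Two illustrative versions of a search: a narrow one and a broad one.
narrow <- c("r01", "r02", "r10")
broad  <- c("r01", "r02", "r03", "r04", paste0("x", 1:16))

# score_search() is an illustrative helper, not a CiteSource function.
score_search <- function(retrieved, benchmark) {
  hits <- length(intersect(retrieved, benchmark))
  c(
    sensitivity = hits / length(benchmark),  # share of benchmark studies found
    precision   = hits / length(retrieved)   # share of retrieved records that are benchmark hits
  )
}

scores <- rbind(narrow = score_search(narrow, benchmark),
                broad  = score_search(broad, benchmark))
print(round(scores, 2))
```

In this toy case the broad search finds more of the benchmark set (higher sensitivity) at the cost of many more records to screen (lower precision).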

Until now, this within-database testing has been the primary method of pre-screening search validation. With CiteSource, we can test search strategies across databases to assess the usefulness of certain databases before finalizing our database set. This vignette provides a workflow for testing a search strategy across multiple databases and against a set of benchmark studies.

In this example, we are running a search about loneliness and gambling addiction. We developed a search strategy for PsycInfo, our main database, and want to see if searching Web of Science and PubMed adds useful records and helps us find more of our benchmark studies.

Installation and setup

#install.packages("CiteSource")
library(CiteSource)

Import files from multiple sources

Here we import three database searches and a set of benchmark studies. The benchmark file is assigned cite_source = NA since it does not represent a database search, and cite_label = "benchmark" to identify it as the reference set.

citation_files <- list.files(path = "valid_data", pattern = "\\.ris", full.names = TRUE)
citation_files
#> [1] "valid_data/WoS_79.ris"      "valid_data/benchmark.ris"  
#> [3] "valid_data/psycinfo_64.ris" "valid_data/pubmed_46.ris"

# cite_sources and cite_labels are matched to the files by position, so they
# must follow the order printed above (WoS, benchmark, PsycInfo, PubMed).
citations <- read_citations(citation_files,
                            cite_sources = c("wos", NA, "psycinfo", "pubmed"),
                            cite_labels  = c("search", "benchmark", "search", "search"),
                            tag_naming   = "best_guess")
#> Note: the following cite_label value(s) are not in the standard vocabulary (search / screened / final): benchmark. Phase-analysis functions expect these exact labels.
#> Import completed - with the following details:
#>              file cite_source cite_string cite_label citations
#> 1      WoS_79.ris         wos        <NA>     search        79
#> 2   benchmark.ris        <NA>        <NA>  benchmark        13
#> 3 psycinfo_64.ris    psycinfo        <NA>     search        64
#> 4   pubmed_46.ris      pubmed        <NA>     search        46
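Because the import summary pairs each file with a cite_source and cite_label by position, one way to guard against mismatches is to key the metadata by file name and reorder it to whatever order list.files() returned. The sketch below uses only base R. The lookup table `file_meta` and its column names are assumptions made for illustration, and the file names are copied from the listing above.

```r
# File order as printed by list.files() in this vignette.
citation_files <- c("valid_data/WoS_79.ris", "valid_data/benchmark.ris",
                    "valid_data/psycinfo_64.ris", "valid_data/pubmed_46.ris")

# Illustrative lookup keyed by file name, so the pairing does not depend on sort order.
file_meta <- data.frame(
  file        = c("benchmark.ris", "psycinfo_64.ris", "pubmed_46.ris", "WoS_79.ris"),
  cite_source = c(NA, "psycinfo", "pubmed", "wos"),
  cite_label  = c("benchmark", "search", "search", "search"),
  stringsAsFactors = FALSE
)

# Reorder the metadata to match the files exactly; stop if any file is unmatched.
idx <- match(basename(citation_files), file_meta$file)
stopifnot(!anyNA(idx))
file_meta <- file_meta[idx, ]
print(file_meta, row.names = FALSE)

# The aligned vectors could then be passed on, for example:
# read_citations(citation_files,
#                cite_sources = file_meta$cite_source,
#                cite_labels  = file_meta$cite_label)
```

File sort order can differ between locales (for example, uppercase names sorting before lowercase ones), so a name-keyed lookup of this kind keeps the labels attached to the intended files.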

Deduplication and source information

CiteSource merges duplicate records while preserving the cite_source and cite_label metadata fields, so the origin of each record is retained through deduplication.

unique_citations <- dedup_citations(citations)
n_unique         <- count_unique(unique_citations)
source_comparison <- compare_sources(unique_citations, comp_type = "sources")

Plot heatmap to compare source overlap

Heatmap by number of records

A heatmap shows the total number of records from each database and the count of overlapping records for each pair. Web of Science yielded the most records on gambling addiction and loneliness; PubMed yielded the fewest.

plot_source_overlap_heatmap(source_comparison)

Heatmap by percentage of records

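The percentage heatmap shows what share of each row’s records were also found in each column. Here, 55% of PsycInfo records were also found in Web of Science, while 44% of Web of Science records were found in PsycInfo. The shares differ because the same overlapping records are divided by a different total in each direction, and the larger source (Web of Science) necessarily has the smaller share.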

plot_source_overlap_heatmap(source_comparison, plot_type = "percentages")
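To make the row-wise reading of the percentage heatmap concrete, here is a small base R sketch that computes pairwise overlap counts and row percentages by hand. The sources `A`, `B`, and `C` and their record IDs are hypothetical, and the code only illustrates the arithmetic; it is not how CiteSource computes the plot internally.

```r
# Hypothetical record IDs per source (not the vignette's data).
sources <- list(
  A = c("r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"),
  B = c("r1", "r2", "r3", "r4", "r9"),
  C = c("r1", "r9")
)

# Pairwise overlap counts; the diagonal holds each source's total.
counts <- sapply(sources, function(col_src)
  sapply(sources, function(row_src) length(intersect(row_src, col_src))))

# Row percentages: share of the row source's records also found in the column source.
percentages <- round(100 * counts / diag(counts), 1)

print(counts)
print(percentages)
```

In this toy data the same four shared records make up 50% of A but 80% of B, which is why the percentage heatmap is not symmetric.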

Plot an upset plot to compare source overlap

An upset plot provides more detail about shared and unique records across all source combinations. Web of Science had the most unique records not found in any other database (n=29); PubMed had only four unique records. Twenty-four records were found in every database.

plot_source_overlap_upset(source_comparison, decreasing = c(TRUE, TRUE))
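The bars of an upset plot count records by the exact combination of sources that returned them, so each record is counted once. The base R sketch below reproduces that grouping on hypothetical record IDs. The data are invented for illustration, and the code is not the plotting function's implementation.

```r
# Hypothetical record IDs per source (not the vignette's data).
sources <- list(
  psycinfo = c("r1", "r2", "r3", "r4"),
  pubmed   = c("r1", "r2", "r5"),
  wos      = c("r1", "r3", "r6", "r7", "r8")
)

all_ids <- sort(unique(unlist(sources)))

# Logical membership matrix: one row per record, one column per source.
membership <- sapply(sources, function(ids) all_ids %in% ids)
rownames(membership) <- all_ids

# Label each record with the exact combination of sources that returned it.
combo <- apply(membership, 1, function(x) paste(names(which(x)), collapse = " & "))

# Each record falls in exactly one combination, as in an upset plot's bars.
combo_counts <- sort(table(combo), decreasing = TRUE)
print(combo_counts)
```

Records that appear under a single source name are that source's unique records; the combination naming all three sources corresponds to records found in every database.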

Bar plots of unique and shared records

plot_contributions() visualizes unique and shared record counts by source, and can include the benchmark label to show how each database contributed to the benchmark set.

plot_contributions(n_unique, center = TRUE)

Analyzing unique contributions

To examine which records are exclusive to each database, filter n_unique for unique == TRUE and rejoin with unique_citations to recover full bibliographic data.

unique_psycinfo <- n_unique |>
  dplyr::filter(cite_source == "psycinfo", unique == TRUE) |>
  dplyr::inner_join(unique_citations, by = "duplicate_id")

unique_pubmed <- n_unique |>
  dplyr::filter(cite_source == "pubmed", unique == TRUE) |>
  dplyr::inner_join(unique_citations, by = "duplicate_id")

unique_wos <- n_unique |>
  dplyr::filter(cite_source == "wos", unique == TRUE) |>
  dplyr::inner_join(unique_citations, by = "duplicate_id")

# To export for manual review:
# export_csv(unique_pubmed, "pubmed_unique.csv")

Record-level table

Filtering unique_citations to only the benchmark records and passing to record_level_table() shows which databases contained each benchmark study.

unique_citations |>
  dplyr::filter(stringr::str_detect(cite_label, "benchmark")) |>
  record_level_table(return = "DT")

Search summary table

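citation_summary_table() calculates sensitivity and precision scores for each database against the benchmark set, providing a concise overview of each source’s performance before screening begins. Rows are keyed by the cite_source assigned at import, so the figures are only meaningful if each file was paired with the intended source in read_citations().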

citation_summary_table(unique_citations, screening_label = "benchmark")

Source       Total records   Unique records   Unique contribution   Sensitivity   Precision
search
  pubmed     64              33               41.77%                71.11%        -
  wos        46              18               22.78%                51.11%        -
  psycinfo   13              7                8.86%                 14.44%        -
  Total¹     90              58               64.44%                -             -
benchmark
  wos        39              14               17.72%                49.37%        84.78%
  pubmed     35              9                11.39%                44.30%        54.69%
  NA         27              27               34.18%                34.18%        -
  psycinfo   6               2                2.53%                 7.59%         46.15%
  Total¹     79              52               65.82%                -             87.78%
Included fields: Total records are all records returned by a source, while unique records are those found only in that source (or, in the Total rows, in only one source). The unique contribution is the share of records found only in that source (or, in the Total rows, in only one source). Sensitivity is the share of all deduplicated records retained at a stage that were found in that source. Precision is the share of a source's initial records that are retained at that stage.
¹ After deduplication
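The definitions can be checked against the table with a little arithmetic. The sketch below recomputes the benchmark-stage row labelled wos from counts printed in the table: 39 records retained from that source, 14 of them unique to it, 79 deduplicated records at that stage, and 46 records for that source at the search stage. The variable names are illustrative, and reading sensitivity as "retained from this source divided by all retained records" is an inference from the printed values rather than a documented formula.

```r
# Counts copied from the table above (row "wos" under "benchmark").
retained_from_source <- 39  # benchmark-stage records found in this source
unique_to_source     <- 14  # of those, found in this source only
retained_total       <- 79  # deduplicated benchmark-stage total
initial_in_source    <- 46  # this source's records at the search stage

# Format a proportion as a percentage with two decimals.
pct <- function(x) sprintf("%.2f%%", 100 * x)

unique_contribution <- pct(unique_to_source / retained_total)
sensitivity         <- pct(retained_from_source / retained_total)
precision           <- pct(retained_from_source / initial_in_source)

cat(unique_contribution, sensitivity, precision, "\n")
```

The three values match the 17.72%, 49.37%, and 84.78% shown in that row.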

Exporting for further analysis

CiteSource can export deduplicated results as CSV, RIS, or BibTeX files, and reimport them to resume analysis later.

#export_csv(unique_citations, filename = "unique-by-source.csv", separate = "cite_source")
#export_ris(unique_citations, filename = "unique_citations.ris", source_field = "DB", label_field = "N1")
#export_bib(unique_citations, filename = "unique_citations.bib", include = c("sources", "labels", "strings"))
#reimport_csv("unique-by-source.csv")

In summary

CiteSource can evaluate the usefulness of different databases against a set of benchmark studies before screening begins. In this example, both PsycInfo and Web of Science made unique contributions to the benchmark set and returned a substantial share of records not found elsewhere. PubMed did not contribute any unique benchmark records and mostly overlapped with the other two databases, which suggests it may not be an effective addition for this topic.