Retrieve Citation Metrics

This vignette demonstrates how to query the EDI repository for journal articles that cite specific data packages. The resulting statistics may be used in reports.

library(EDIutils)
library(dplyr)
library(tidyr)

Get the Journal Citation Report for Specific Data Packages

Get all identifiers for a certain scope.

scope <- "edi"
identifiers <- list_data_package_identifiers(scope)

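Build a package ID for each identifier and retrieve all journal citations. Revision 1 is appended to every identifier; with list_all = TRUE, citations for all revisions in the series are returned, not only those for the specified package ID.

The citation information is returned as one data frame per package ID. These are then combined into a single data frame.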

identifiers <- paste0(scope, ".", identifiers, ".1")
res <- lapply(identifiers, list_data_package_citations, list_all = TRUE)
df_return_all <- do.call(rbind, res)
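The stacking pattern used above can be tried offline with a small sketch. Here fake_citations() is a hypothetical stand-in for list_data_package_citations(): it makes no network request, and the records it returns are made up for illustration. Only the column names packageId and articleUrl are taken from this vignette.

```r
# Hypothetical stand-in for list_data_package_citations(); it makes no
# network request and returns made-up records for illustration only.
fake_citations <- function(package_id) {
  n <- if (package_id == "edi.1.1") 2 else 1
  data.frame(
    packageId = rep(package_id, n),
    articleUrl = paste0("https://example.org/", package_id, "/", seq_len(n)),
    stringsAsFactors = FALSE
  )
}

# Build package IDs the same way as above: scope, identifier, revision 1
toy_ids <- paste0("edi", ".", c(1, 2), ".1")

# One data frame per package ID, collected in a list
toy_res <- lapply(toy_ids, fake_citations)

# Stack the list elements row-wise into a single data frame
toy_all <- do.call(rbind, toy_res)
nrow(toy_all)
```

Because every element of the list has the same columns, do.call(rbind, ...) yields one data frame with a row per citation record.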

Various analyses may now be conducted on this data frame, for example, counting the unique data packages being cited

df_data_packages <- df_return_all %>%
  separate(
    packageId,
    into = c("scope", "datasetNum", "revision"),
    sep = "\\."
  ) %>%
  mutate(datasetId = paste(scope, datasetNum, sep = ".")) %>%
  distinct(datasetId)

print(paste("Unique data packages being cited:", nrow(df_data_packages)))
#> [1] "Unique data packages being cited: 375"

or counting the distinct articles that cite data packages

df_articles <- df_return_all %>% distinct(articleUrl)
print(paste("Number of articles:", nrow(df_articles)))
#> [1] "Number of articles: 503"
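Another summary that may be useful in reports is the number of distinct articles citing each data package. The following is a minimal sketch on made-up data, assuming only that the citation data frame has the packageId and articleUrl columns used above. The toy data frame and the n_articles column name are illustrative and not part of EDIutils.

```r
library(dplyr)
library(tidyr)

# Made-up citation records for illustration only
toy <- data.frame(
  packageId = c("edi.1.1", "edi.1.2", "edi.2.1", "edi.2.1"),
  articleUrl = c(
    "https://example.org/a", "https://example.org/b",
    "https://example.org/a", "https://example.org/a"
  ),
  stringsAsFactors = FALSE
)

toy_counts <- toy %>%
  # Split the package ID so that revisions of one data package group together
  separate(
    packageId,
    into = c("scope", "datasetNum", "revision"),
    sep = "\\."
  ) %>%
  mutate(datasetId = paste(scope, datasetNum, sep = ".")) %>%
  # Count each article once per data package, regardless of revision
  distinct(datasetId, articleUrl) %>%
  count(datasetId, name = "n_articles") %>%
  arrange(desc(n_articles))

print(toy_counts)
```

Replacing toy with df_return_all would apply the same steps to the real query result, provided its columns match the assumption above.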