Accessing Publication Data

Helen Miller

2022-06-15

DataSpace maintains a curated collection of relevant publications, which can be accessed through the Publications page through the app. Some publications laos include datasets which can be downloaded as a zip file. DataSpaceR provides an interface for browsing publications in DataSpace and downloading publication data where available.

Browsing publications in DataSpace

The DataSpaceConnection object includes methods for browsing and downloading publications and publication data.

library(DataSpaceR)
library(data.table)
con <- connectDS()
con
#> <DataSpaceConnection>
#>   URL: https://dataspace.cavd.org
#>   User: jmtaylor@scharp.org
#>   Available studies: 273
#>     - 77 studies with data
#>     - 5049 subjects
#>     - 423195 data points
#>   Available groups: 3
#>   Available publications: 1530
#>     - 12 publications with data

The DataSpaceConnection print method summarizes the publications and publication data. More details about publications can be accessed through con$availablePublications.

knitr::kable(head(con$availablePublications[, -"link"]))
publication_id first_author all_authors title journal publication_date pubmed_id related_studies studies_with_data publication_data_available
1006 Abbink P Abbink P, Maxfield LF, Ng’ang’a D, Borducchi EN, Iampietro MJ, Bricault CA, Teigler JE, Blackmore S, Parenteau L, Wagh K, Handley SA, Zhao G, Virgin HW, Korber B, Barouch DH Construction and evaluation of novel rhesus monkey adenovirus vaccine vectors J Virol 2015 Feb 25410856 NA NA FALSE
1303 Abbott RK Abbott RK, Lee JH, Menis S, Skog P, Rossi M, Ota T, Kulp DW, Bhullar D, Kalyuzhniy O, Havenar-Daughton C, Schief WR, Nemazee D, Crotty S Precursor frequency and affinity determine B Cell competitive fitness in germinal centers, tested with germline-targeting HIV vaccine immunogens Immunity 2018 Jan 16 29287996 NA NA FALSE
1136 Abu-Raddad LJ Abu-Raddad LJ, Boily MC, Self S, Longini IM Jr Analytic insights into the population level impact of imperfect prophylactic HIV vaccines J Acquir Immune Defic Syndr 2007 Aug 1 17554215 NA NA FALSE
1485 Abu-Raddad LJ Abu-Raddad LJ, Barnabas RV, Janes H, Weiss HA, Kublin JG, Longini IM Jr, Wasserheit JN, HIV Viral Load Working Group Have the explosive HIV epidemics in sub-Saharan Africa been driven by higher community viral load? AIDS 2013 Mar 27 23196933 NA NA FALSE
842 Acharya P Acharya P, Tolbert WD, Gohain N, Wu X, Yu L, Liu T, Huang W, Huang CC, Kwon YD, Louder RK, Luongo TS, McLellan JS, Pancera M, Yang Y, Zhang B, Flinko R, Foulke JS Jr, Sajadi MM, Kamin-Lewis R, Robinson JE, Martin L, Kwong PD, Guan Y, DeVico AL, Lewis GK, Pazgier M Structural definition of an antibody-dependent cellular cytotoxicity response implicated in reduced risk for HIV-1 infection J Virol 2014 Nov 25165110 NA NA FALSE
1067 Ackerman ME Ackerman ME, Mikhailova A, Brown EP, Dowell KG, Walker BD, Bailey-Kellogg C, Suscovich TJ, Alter G Polyfunctional HIV-specific antibody responses are associated with spontaneous HIV control PLOS Pathog 2016 Jan 26745376 NA NA FALSE

This table summarizes all publications, providing some information like first author, journal where it was published, and title as a data.table. It also includes a pubmed url where available under link. Related studies under related_studies, and related studies with data available under studies_with_data. We can use data.table methods to filter and sort this table to browse available publications.

For example, we can filter this table to view only publications related to a particular study:

vtn096_pubs <- con$availablePublications[grepl("vtn096", related_studies)]
knitr::kable(vtn096_pubs[, -"link"])
publication_id first_author all_authors title journal publication_date pubmed_id related_studies studies_with_data publication_data_available
1466 Cram JA Cram JA, Fiore-Gartland AJ, Srinivasan S, Karuna S, Pantaleo G, Tomaras GD, Fredricks DN, Kublin JG Human gut microbiota is associated with HIV-reactive immunoglobulin at baseline and following HIV vaccination PLoS One 2019 31869338 vtn096 NA TRUE
250 Huang Y Huang Y, DiazGranados C, Janes H, Huang Y, deCamp A, Metch B, Grant S, Sanchez B, Phogat S, Koutsoukos M, Kanesa-Thasan N, Bourguignon P, Collard A, Buchbinder S, Tomaras GD, McElrath J, Gray G, Kublin J, Corey L, Gilbert PB Selection of HIV vaccine candidates for concurrent testing in an efficacy trial Curr Opin Virol 2016 Jan 28 26827165 mrv144, vtn096 NA FALSE
267 Huang Y Huang Y, Zhang L, Janes H, Frahm N, Isaacs A, Kim JH, Montefiori D, McElrath MJ, Tomaras GD, Gilbert PB Predictors of durable immune responses six months after the last vaccination in preventive HIV vaccine trials Vaccine 2017 Feb 22 28131393 mrv144, vtn096, vtn205 NA FALSE
268 Huang Y Huang Y, Gilbert P, Fu R, Janes H Statistical methods for down-selection of treatment regimens based on multiple endpoints, with application to HIV vaccine trials Biostatistics 2017 Apr 1 27649715 mrv144, vtn096 NA FALSE
1392 Pantaleo G Pantaleo G, Janes H, Karuna S, Grant S, Ouedraogo GL, Allen M, Tomaras GD, Frahm N, Montefiori DC, Ferrari G, Ding S, Lee C, Robb ML, Esteban M, Wagner R, Bart PA, Rettby N, McElrath MJ, Gilbert PB, Kublin JG, Corey L, NIAID HIV Vaccine Trials Network Safety and immunogenicity of a multivalent HIV vaccine comprising envelope protein with either DNA or NYVAC vectors (HVTN 096): a phase 1b, double-blind, placebo-controlled trial Lancet HIV 2019 Oct 7 31601541 mrv144, vtn096, vtn100 NA TRUE
1420 Westling T Westling T, Juraska M, Seaton KE, Tomaras GD, Gilbert PB, Janes H Methods for comparing durability of immune responses between vaccine regimens in early-phase trials Stat Methods Med Res 2019 Jan 9 30623732 vtn094, vtn096 NA FALSE
281 Yates NL Yates NL, deCamp AC, Korber BT, Liao HX, Irene C, Pinter A, Peacock J, Harris LJ, Sawant S, Hraber P, Shen X, Rerks-Ngarm S, Pitisuttithum P, Nitayapan S, Berman PW, Robb ML, Pantaleo G, Zolla-Pazner S, Haynes BF, Alam SM, Montefiori DC, Tomaras GD HIV-1 envelope glycoproteins from diverse clades differentiate antibody responses and durability among vaccinees J Virol 2018 Mar 28 29386288 mrv144, vtn096 NA FALSE

or publications that have related studies with integrated data in DataSpace:

pubs_with_study_data <- con$availablePublications[!is.na(studies_with_data)]
knitr::kable(head(pubs_with_study_data[, -"link"]))
publication_id first_author all_authors title journal publication_date pubmed_id related_studies studies_with_data publication_data_available
1540 Andersen-Nissen E Andersen-Nissen E, Fiore-Gartland A, Ballweber Fleming L, Carpp LN, Naidoo AF, Harper MS, Voillet V, Grunenberg N, Laher F, Innes C, Bekker L-G, Kublin JG, Huang Y, Ferrari G, Tomaras GD, Gray G, Gilbert PB, McElrath MJ Innate immune signatures to a partially-efficacious HIV vaccine predict correlates of HIV-1 infection risk PLOS Pathog 2021 NA vtn097 vtn097 TRUE
213 Andrasik MP Andrasik MP, Yoon R, Mooney J, Broder G, Bolton M, Votto T, Davis-Vogel A; HVTN 505 study team; NIAID HIV Vaccine Trials Network Exploring barriers and facilitators to participation of male-to-female transgender persons in preventive HIV vaccine clinical trials. Prev Sci 2014 Jun 23446435 vtn505 vtn505 FALSE
218 Andrasik MP Andrasik MP, Karuna ST, Nebergall M, Koblin BA, Kublin JG; NIAID HIV Vaccine Trials Network Behavioral risk assessment in HIV Vaccine Trials Network (HVTN) clinical trials: A qualitative study exploring HVTN staff perspectives Vaccine 2013 Sep 13 23859840 vtn069, vtn404, vtn502, vtn503, vtn504, vtn505, vtn802, vtn903, vtn906, vtn907 vtn505 FALSE
225 Andrasik MP Andrasik MP, Chandler C, Powell B, Humes D, Wakefield S, Kripke K, Eckstein D Bridging the divide: HIV prevention research and Black men who have sex with men Am J Public Health 2014 Apr 24524520 vtn505 vtn505 FALSE
226 Arnold MP Arnold MP, Andrasik M, Landers S, Karuna S, Mimiaga MJ, Wakefield S, Mayer K, Buchbinder S, Koblin BA Sources of racial/ethnic differences in HIV vaccine trial awareness: Exposure, attention, or both? Am J Public Health 2014 Aug 24922153 vtn505 vtn505 FALSE
7 Asbach B Asbach B, Kliche A, Köstler J, Perdiguero B, Esteban M, Jacobs BL, Montefiori DC, LaBranche CC, Yates NL, Tomaras GD, Ferrari G, Foulds KE, Roederer M, Landucci G, Forthal DN, Seaman MS, Hawkins N, Self SG, Sato A, Gottardo R, Phogat S, Tartaglia J, Barnett SW, Burke B, Cristillo AD, Weiss DE, Francis J, Galmin L, Ding S, Heeney JL, Pantaleo G, Wagner R Potential to streamline heterologous DNA prime and NYVAC/protein boost HIV vaccine regimens in rhesus macaques by employing improved antigens J Virol 2016 Mar 28 26865719 cvd277, cvd281 cvd277, cvd281 FALSE

We can also use this information to connect to related studies and pull integrated data. Say we are interested in this Rouphael (2019) publication:

rouphael2019 <- con$availablePublications[first_author == "Rouphael NG"]
knitr::kable(rouphael2019[, -"link"])
publication_id first_author all_authors title journal publication_date pubmed_id related_studies studies_with_data publication_data_available
1390 Rouphael NG Rouphael NG, Morgan C, Li SS, Jensen R, Sanchez B, Karuna S, Swann E, Sobieszczyk ME, Frank I, Wilson GJ, Tieu HV, Maenza J, Norwood A, Kobie J, Sinangil F, Pantaleo G, Ding S, McElrath MJ, De Rosa SC, Montefiori DC, Ferrari G, Tomaras GD, Keefer MC, HVTN 105 Protocol Team and the NIAID HIV Vaccine Trials Network DNA priming and gp120 boosting induces HIV-specific antibodies in a randomized clinical trial J Clin Invest 2019 Sep 30 31566579 vtn105 vtn105 TRUE

We can find this publication in the available publications table, determine related studies, and pull data for those studies where available.

related_studies <- rouphael2019$related_studies
related_studies
#> [1] "vtn105"
rouphael2019_study <- con$getStudy(related_studies)
rouphael2019_study
#> <DataSpaceStudy>
#>   Study: vtn105
#>   URL: https://dataspace.cavd.org/CAVD/vtn105
#>   Available datasets:
#>     - Binding Ab multiplex assay
#>     - Demographics
#>     - Intracellular Cytokine Staining
#>     - Neutralizing antibody
#>   Available non-integrated datasets:
dim(rouphael2019_study$availableDatasets)
#> [1] 4 4

We can see that there are datasets available for this study. We can pull any of them using rouphael2019_study$getDataset().

Downloading Publication Data

DataSpace also includes publication datasets for some publications. The format of this data will vary from publication to publication, and is stored in a zip file. The publication_data_available field specifies publications where publication data is available.

pubs_with_data <- con$availablePublications[publication_data_available == TRUE]
knitr::kable(head(pubs_with_data[, -"link"]))
publication_id first_author all_authors title journal publication_date pubmed_id related_studies studies_with_data publication_data_available
1540 Andersen-Nissen E Andersen-Nissen E, Fiore-Gartland A, Ballweber Fleming L, Carpp LN, Naidoo AF, Harper MS, Voillet V, Grunenberg N, Laher F, Innes C, Bekker L-G, Kublin JG, Huang Y, Ferrari G, Tomaras GD, Gray G, Gilbert PB, McElrath MJ Innate immune signatures to a partially-efficacious HIV vaccine predict correlates of HIV-1 infection risk PLOS Pathog 2021 NA vtn097 vtn097 TRUE
136 Bekker LG Bekker LG, Moodie Z, Grunenberg N, Laher F, Tomaras GD, Cohen KW, Allen M, Malahleha M, Mngadi K, Daniels B, Innes C, Bentley C, Frahm N, Morris DE, Morris L, Mkhize NN, Montefiori DC, Sarzotti-Kelsoe M, Grant S, Yu C, Mehra VL, Pensiero MN, Phogat S, DiazGranados CA, Barnett SW, Kanesa-thasan N, Koutsoukos M, Michael NL, Robb ML, Kublin JG, Gilbert PB, Corey L, Gray GE, McElrath MJ on behalf of the HVTN 100 Protocol Team A phase 1/2 HIV-1 vaccine trial of a Subtype C ALVAC-HIV and Bivalent Subtype C gp120/MF59 vaccine regimen in low risk HIV uninfected South African adults Lancet HIV 2018 Jul 29898870 mrv144, vtn100 NA TRUE
1466 Cram JA Cram JA, Fiore-Gartland AJ, Srinivasan S, Karuna S, Pantaleo G, Tomaras GD, Fredricks DN, Kublin JG Human gut microbiota is associated with HIV-reactive immunoglobulin at baseline and following HIV vaccination PLoS One 2019 31869338 vtn096 NA TRUE
80 Fong Y Fong Y, Shen X, Ashley VC, Deal A, Seaton KE, Yu C, Grant SP, Ferrari G, deCamp AC, Bailer RT, Koup RA, Montefiori D, Haynes BF, Sarzotti-Kelsoe M, Graham BS, Carpp LN, Hammer SM, Sobieszczyk M, Karuna S, Swann E, DeJesus E, Mulligan M, Frank I, Buchbinder S, Novak RM, McElrath MJ, Kalams S, Keefer M, Frahm NA, Janes HE, Gilbert PB, Tomaras GD Modification of the association between T-Cell immune responses and human immunodeficiency virus type 1 infection risk by vaccine-induced antibody responses in the HVTN 505 trial J Infect Dis 2018 Mar 28 29325070 vtn505 vtn505 TRUE
79 Hammer SM Hammer SC, Sobieszczyk ME, Janes H, Karuna ST, Mulligan MJ, Grove D, Koblin BA, Buchbinder SP, Keefer MC, Tomaras GD, Frahm N, Hural J, Anude C, Graham BS, Enama ME, Adams E, DeJesus E, Novak RM, Frank I, Bentley C, Ramirez S, Fu R, Koup RA, Mascola JR, Nabel GJ, Montefiori DC, Kublin J, McElrath MJ, Corey L, Gilbert PB Efficacy trial of a DNA/rAd5 HIV-1 preventive vaccine N Engl J Med 2013 Nov 28 24099601 vtn505 vtn505 TRUE
1467 Hosseinipour MC Hosseinipour MC, Innes C, Naidoo S, Mann P, Hutter J, Ramjee G, Sebe M, Maganga L, Herce ME, deCamp AC, Marshall K, Dintwe O, Andersen-Nissen E, Tomaras GD, Mkhize N, Morris L, Jensen R, Miner MD, Pantaleo G, Ding S, Van Der Meeren O, Barnett SW, McElrath MJ, Corey L, Kublin JG Phase 1 HIV vaccine trial to evaluate the safety and immunogenicity of HIV subtype C DNA and MF59-adjuvanted subtype C Env protein Clin Infect Dis 2020 Jan 4 31900486 vtn111 NA TRUE

Data for a publication can be accessed through DataSpaceR with con$downloadPublicationData(). The publication ID must be specified, as found under publication_id in con$availablePublications. The file is presented as a zip file. The unzip argument gives us the option whether to unzip this file. By default, the file will be unzipped. You may also specify the directory to download the file. By default, it will be saved to your Downloads directory. This function invisibly returns the paths to the downloaded files.

Here, we download data for publication with ID 1461 (Westling, 2020), and view the resulting downloads.

publicationDataFiles <- con$downloadPublicationData("1461", outputDir = tempdir(), unzip = TRUE, verbose = TRUE)
basename(publicationDataFiles)
#> [1] "causal.isoreg.fns.R"           "CD.SuperLearner.R"            
#> [3] "cd4_analysis.R"                "cd4_data.csv"                 
#> [5] "cd8_analysis.R"                "cd8_data.csv"                 
#> [7] "README.txt"                    "Westling_1461_file_format.pdf"

All zip files will include a file format document as a PDF, as well as a README. These documents will give an overview of the remaining contents of the files. In this case, data is separated by CD8+ T-cell responses and CD4+ T-cell responses, as described in the README.txt.

cd4 <- fread(publicationDataFiles[grepl("cd4_data", publicationDataFiles)])
cd4
#>       pub_id age    sex      bmi   prot num_vacc dose response sexFemale
#>   1: 069-071  23   Male 33.31000 vtn069        3  4.0        0         0
#>   2: 069-068  20 Female 32.74000 vtn069        3  4.0        1         1
#>   3: 069-020  19   Male 20.24000 vtn069        3  4.0        0         0
#>   4: 069-008  19   Male 29.98000 vtn069        3  4.0        0         0
#>   5: 069-058  27   Male 31.48000 vtn069        3  4.0        0         0
#>  ---                                                                    
#> 368: 100-182  36 Female 24.32323 vtn100        4  1.5        0         1
#> 369: 100-034  20   Male 25.15590 vtn100        4  1.5        1         0
#> 370: 100-115  20   Male 22.94812 vtn100        4  1.5        1         0
#> 371: 100-144  19   Male 20.95661 vtn100        4  1.5        0         0
#> 372: 100-189  24   Male 19.03114 vtn100        4  1.5        1         0
#>      studyHVTN052 studyHVTN068 studyHVTN069 studyHVTN204 studyHVTN100 vacc_type
#>   1:            0            0            1            0            0   VRC4or6
#>   2:            0            0            1            0            0   VRC4or6
#>   3:            0            0            1            0            0   VRC4or6
#>   4:            0            0            1            0            0   VRC4or6
#>   5:            0            0            1            0            0   VRC4or6
#>  ---                                                                           
#> 368:            0            0            0            0            1   VRC4or6
#> 369:            0            0            0            0            1   VRC4or6
#> 370:            0            0            0            0            1   VRC4or6
#> 371:            0            0            0            0            1   VRC4or6
#> 372:            0            0            0            0            1   VRC4or6
#>      vacc_typeSinglePlasmid vacc_typeVRC4or6
#>   1:                      0                1
#>   2:                      0                1
#>   3:                      0                1
#>   4:                      0                1
#>   5:                      0                1
#>  ---                                        
#> 368:                      0                1
#> 369:                      0                1
#> 370:                      0                1
#> 371:                      0                1
#> 372:                      0                1

This publication also includes analysis scripts used for the publication, which can allow users to reproduce the analysis and results.

Session information

sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C             
#>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8    
#>  [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8   
#>  [7] LC_PAPER=en_US.utf8       LC_NAME=C                
#>  [9] LC_ADDRESS=C              LC_TELEPHONE=C           
#> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] data.table_1.14.2 DataSpaceR_0.7.5  knitr_1.37       
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.8       digest_0.6.29    assertthat_0.2.1 R6_2.5.1        
#>  [5] jsonlite_1.8.0   magrittr_2.0.2   evaluate_0.15    highr_0.9       
#>  [9] httr_1.4.2       stringi_1.7.6    curl_4.3.2       tools_4.1.2     
#> [13] stringr_1.4.0    Rlabkey_2.8.3    xfun_0.29        compiler_4.1.2