Let’s first load the required packages.
library(CDMConnector)
library(CohortSurvival)
library(dplyr)
library(ggplot2)
We’ll create a cdm reference to use our example MGUS2 survival dataset. In practice you would use the CDMConnector package to connect to your data mapped to the OMOP CDM.
cdm <- CohortSurvival::mockMGUS2cdm()
We will proceed as we did with the single event survival analysis, but this time we will study progression of the disease as the outcome, with death as a competing risk.
We would typically need to define study cohorts ourselves, but our example data already has these cohorts available. The diagnosis cohort also has a number of additional features recorded for each individual, which we’ll use for stratification.
cdm$mgus_diagnosis %>%
  glimpse()
#> Rows: ??
#> Columns: 10
#> Database: DuckDB 0.8.1 [eburn@Windows 10 x64:R 4.2.1/:memory:]
#> $ cohort_definition_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ subject_id <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
#> $ cohort_start_date <date> 1981-01-01, 1968-01-01, 1980-01-01, 1977-01-01, …
#> $ cohort_end_date <date> 1981-01-01, 1968-01-01, 1980-01-01, 1977-01-01, …
#> $ age <dbl> 88, 78, 94, 68, 90, 90, 89, 87, 86, 79, 86, 89, 8…
#> $ sex <fct> F, F, M, M, F, M, F, F, F, F, M, F, M, F, M, F, F…
#> $ hgb <dbl> 13.1, 11.5, 10.5, 15.2, 10.7, 12.9, 10.5, 12.3, 1…
#> $ creat <dbl> 1.30, 1.20, 1.50, 1.20, 0.80, 1.00, 0.90, 1.20, 0…
#> $ mspike <dbl> 0.5, 2.0, 2.6, 1.2, 1.0, 0.5, 1.3, 1.6, 2.4, 2.3,…
#> $ age_group <chr> ">=70", ">=70", ">=70", "<70", ">=70", ">=70", ">…
cdm$death_cohort %>%
  glimpse()
#> Rows: ??
#> Columns: 4
#> Database: DuckDB 0.8.1 [eburn@Windows 10 x64:R 4.2.1/:memory:]
#> $ cohort_definition_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ subject_id <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 1…
#> $ cohort_start_date <date> 1981-01-31, 1968-01-26, 1980-02-16, 1977-04-03, …
#> $ cohort_end_date <date> 1981-01-31, 1968-01-26, 1980-02-16, 1977-04-03, …
cdm$progression %>%
  glimpse()
#> Rows: ??
#> Columns: 4
#> Database: DuckDB 0.8.1 [eburn@Windows 10 x64:R 4.2.1/:memory:]
#> $ cohort_definition_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ subject_id <dbl> 56, 81, 83, 111, 124, 127, 147, 163, 165, 167, 18…
#> $ cohort_start_date <date> 1978-01-30, 1985-01-15, 1974-08-17, 1993-01-14, …
#> $ cohort_end_date <date> 1978-01-30, 1985-01-15, 1974-08-17, 1993-01-14, …
The package allows us to estimate survival for both an outcome and a competing risk outcome. We can then stratify, see information on events, summarise the estimates, and check the contributing participants in the same way as for the single event survival analysis.
MGUS_death_prog <- estimateCompetingRiskSurvival(cdm,
  targetCohortTable = "mgus_diagnosis",
  outcomeCohortTable = "progression",
  competingOutcomeCohortTable = "death_cohort"
)
MGUS_death_prog %>%
  glimpse()
#> Rows: 2,550
#> Columns: 14
#> $ cdm_name <chr> "mock", "mock", "mock", "mock", "mock", "mock", "mock",…
#> $ result_type <chr> "Survival estimate", "Survival estimate", "Survival est…
#> $ group_name <chr> "Cohort", "Cohort", "Cohort", "Cohort", "Cohort", "Coho…
#> $ group_level <chr> "mgus_diagnosis", "mgus_diagnosis", "mgus_diagnosis", "…
#> $ strata_name <chr> "Overall", "Overall", "Overall", "Overall", "Overall", …
#> $ strata_level <chr> "Overall", "Overall", "Overall", "Overall", "Overall", …
#> $ variable <chr> "Outcome", "Outcome", "Outcome", "Outcome", "Outcome", …
#> $ variable_level <chr> "progression", "progression", "progression", "progressi…
#> $ variable_type <chr> "estimate", "estimate_95CI_lower", "estimate_95CI_upper…
#> $ estimate_type <chr> "Cumulative failure probability", "Cumulative failure p…
#> $ time <dbl> 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6…
#> $ analysis_type <chr> "Competing risk", "Competing risk", "Competing risk", "…
#> $ outcome <chr> "progression", "progression", "progression", "progressi…
#> $ estimate <dbl> 0.0000, 0.0000, 0.0000, 0.0000, NA, NA, 0.0014, 0.0004,…
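For intuition about the "Cumulative failure probability" estimates, the following is a minimal base R sketch of the standard nonparametric cumulative incidence estimator on made-up toy data. It is not taken from the package, and the package's internal implementation may differ; the vectors `time` and `status` and all variable names are hypothetical. At each event time, the probability of still being event-free just before that time is multiplied by the cause-specific hazard, so the competing event reduces the pool of people who can go on to have the outcome.

```r
# Made-up toy data: follow-up time and event status
# (0 = censored, 1 = outcome of interest, 2 = competing event)
time   <- c(1, 2, 3, 4, 5)
status <- c(1, 2, 0, 1, 2)

event_times   <- sort(unique(time[status != 0]))
surv_prev     <- 1  # probability of being free of any event just before tj
cif_outcome   <- 0  # cumulative incidence of the outcome
cif_competing <- 0  # cumulative incidence of the competing event
cif <- data.frame()

for (tj in event_times) {
  n_at_risk   <- sum(time >= tj)
  d_outcome   <- sum(time == tj & status == 1)
  d_competing <- sum(time == tj & status == 2)

  # Increment each cumulative incidence by P(event-free so far) * cause-specific hazard
  cif_outcome   <- cif_outcome + surv_prev * d_outcome / n_at_risk
  cif_competing <- cif_competing + surv_prev * d_competing / n_at_risk

  # Update overall event-free survival using events of either type
  surv_prev <- surv_prev * (1 - (d_outcome + d_competing) / n_at_risk)

  cif <- rbind(cif, data.frame(time = tj,
                               outcome = cif_outcome,
                               competing = cif_competing))
}

cif
```

In this toy example both cumulative incidences reach 0.5 by time 5, and at every time point the two cumulative incidences plus the event-free survival sum to 1.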
As the glimpse output shows, our results are returned in long format. We can plot these results like so.
plotCumulativeIncidence(MGUS_death_prog,
colour = "outcome")
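Because the results are in long format, it can be convenient to reshape them so that each estimate sits next to its confidence limits. The sketch below illustrates this with dplyr and tidyr on a small made-up tibble, `toy_results`, that mimics only a few columns of the real output (`outcome`, `time`, `variable_type`, `estimate`); the values are invented for illustration and are not results from the mock data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy table mimicking a few columns of the long-format results.
# All estimates below are invented for illustration only.
toy_results <- tibble(
  outcome = rep(c("progression", "death_cohort"), each = 6),
  time = rep(rep(c(50, 100), each = 3), times = 2),
  variable_type = rep(
    c("estimate", "estimate_95CI_lower", "estimate_95CI_upper"),
    times = 4
  ),
  estimate = c(0.02, 0.01, 0.03, 0.05, 0.03, 0.07,
               0.20, 0.17, 0.23, 0.40, 0.36, 0.44)
)

# Keep a single time point and spread the estimate and its
# confidence limits into separate columns, one row per outcome
wide_results <- toy_results %>%
  filter(time == 100) %>%
  pivot_wider(names_from = "variable_type", values_from = "estimate")

wide_results
```

The same `filter()` and `pivot_wider()` steps should apply to the real results table, assuming its columns match those shown in the glimpse output.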
Our returned results also have attributes containing information that summarises survival.
MGUS_death_prog %>%
  dplyr::filter(estimate_type == "Survival summary") %>%
  tidyr::pivot_wider(names_from = "variable_type", values_from = "estimate")
#> # A tibble: 0 × 12
#> # ℹ 12 variables: cdm_name <chr>, result_type <chr>, group_name <chr>,
#> # group_level <chr>, strata_name <chr>, strata_level <chr>, variable <chr>,
#> # variable_level <chr>, estimate_type <chr>, time <dbl>, analysis_type <chr>,
#> # outcome <chr>
To estimate survival for particular strata of interest, the stratifying features must first be added to the target cohort table. As seen above, our diagnosis cohort already has a number of example characteristics, so we can add stratifications like so.
MGUS_death_prog <- estimateCompetingRiskSurvival(cdm,
  targetCohortTable = "mgus_diagnosis",
  outcomeCohortTable = "progression",
  competingOutcomeCohortTable = "death_cohort",
  strata = list(c("sex"))
)
As well as results for each stratum, overall results are always returned. We can filter the output table to plot only the cumulative failure probability for the different strata levels.
plotCumulativeIncidence(MGUS_death_prog %>%
    dplyr::filter(strata_name != "Overall"),
  facet = "strata_level",
  colour = "outcome")
And we also now have summary statistics for each of the strata as well as overall.
MGUS_death_prog %>%
  dplyr::filter(estimate_type == "Survival summary") %>%
  tidyr::pivot_wider(names_from = "variable_type", values_from = "estimate")
#> # A tibble: 0 × 12
#> # ℹ 12 variables: cdm_name <chr>, result_type <chr>, group_name <chr>,
#> # group_level <chr>, strata_name <chr>, strata_level <chr>, variable <chr>,
#> # variable_level <chr>, estimate_type <chr>, time <dbl>, analysis_type <chr>,
#> # outcome <chr>
If we set returnParticipants to TRUE, we can also access the individuals who contributed to the analysis.
MGUS_death_prog <- estimateCompetingRiskSurvival(cdm,
  targetCohortTable = "mgus_diagnosis",
  outcomeCohortTable = "progression",
  competingOutcomeCohortTable = "death_cohort",
  returnParticipants = TRUE
)
survivalParticipants(MGUS_death_prog)
#> # Source: table<dbplyr_083> [?? x 4]
#> # Database: DuckDB 0.8.1 [eburn@Windows 10 x64:R 4.2.1/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date
#> <int> <dbl> <date> <date>
#> 1 1 56 1978-01-01 1978-01-01
#> 2 1 124 1974-01-01 1974-01-01
#> 3 1 127 1978-01-01 1978-01-01
#> 4 1 147 1975-01-01 1975-01-01
#> 5 1 163 1966-01-01 1966-01-01
#> 6 1 167 1968-01-01 1968-01-01
#> 7 1 186 1989-01-01 1989-01-01
#> 8 1 195 1981-01-01 1981-01-01
#> 9 1 206 1986-01-01 1986-01-01
#> 10 1 229 1981-01-01 1981-01-01
#> # ℹ more rows
cdm_disconnect(cdm)