Statistical Supplement

This file contains details of the statistical methodology and the analysis code for the manuscript: ‘Assessing the impact of primary tumour location on survival after resection of colorectal liver metastases: A propensity weighted retrospective cohort study.’

See also the HTML version of this file: <https://aborakati.github.io/CRLM-Project/>

Statistical Methods

Multiple imputation was performed using chained equations and classification and regression trees to account for missing data with the number of imputed samples equal to the percentage of missing data as suggested previously (White, Royston, and Wood 2011). Outcome variables and PTL were not imputed.

Propensity score weights calculated by generalised boosted modelling (McCaffrey, Ridgeway, and Morral 2004) were used to adjust for baseline imbalanced factors which may plausibly pre-dispose to having a particular PTL. Balance was assessed using simple difference-in-means testing at baseline, with analysis of variance (ANOVA) for normally distributed variables and the Kruskal-Wallis test for non-normally distributed variables. After weighting a mean difference of <0.1 for each covariate between each PTL group was considered a good balance (Appendix Table 2).

Cox regression was performed to further adjust for recorded covariates on 10-year overall and 5-year disease free survival. The proportional hazards assumption was confirmed with plots of Schoenfeld residuals (appendix figure 3) (SCHOENFELD 1982). Variables included in the final models were selected with a stepwise method using a p-value threshold of <0.10 on univariable analysis. Weighting and regression was performed on each imputed dataset and pooled according to Rubin’s rules (Rubin 1987).  

Hazard ratios for each variable with 95% confidence intervals were derived with p-values for statistical significance; the statistical significance threshold was set a priori at <0.05.

Univariable survival curves for each PTL, and other variables with the Kaplan-Meier estimator before multiple imputation, propensity score weighting and Cox-regression were also produced, with the log rank test for statistical significance.

Analysis was performed using R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria) with survival (Terry M. Therneau and Patricia M. Grambsch 2000), survminer (Kassambara, Kosinski, and Biecek 2020), mice (Buuren and Groothuis-Oudshoorn 2011), WeightIt (Greifer 2020), cobalt (Greifer 2020) and MatchThem (Pishgar and Greifer 2020) packages.

See also the HTML version of this file: https://aborakati.github.io/CRLM-Project/

References

Buuren, Stef van, and Karin Groothuis-Oudshoorn. 2011. “Mice: Multivariate Imputation by Chained Equations in r.” Journal of Statistical Software 45 (3): 1–67. https://www.jstatsoft.org/v45/i03/.
Greifer, Noah. 2020. WeightIt: Weighting for Covariate Balance in Observational Studies. https://CRAN.R-project.org/package=WeightIt.
Kassambara, Alboukadel, Marcin Kosinski, and Przemyslaw Biecek. 2020. Survminer: Drawing Survival Curves Using ’Ggplot2’. https://CRAN.R-project.org/package=survminer.
McCaffrey, Daniel F., Greg Ridgeway, and Andrew R. Morral. 2004. “Propensity Score Estimation With Boosted Regression for Evaluating Causal Effects in Observational Studies.” Psychological Methods 9 (4): 403–25. https://doi.org/10.1037/1082-989x.9.4.403.
Pishgar, Farhad, and Noah Greifer. 2020. MatchThem: Matching and Weighting Multiply Imputed Datasets. https://CRAN.R-project.org/package=MatchThem.
Rubin, Donald B., ed. 1987. Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons, Inc. https://doi.org/10.1002/9780470316696.
SCHOENFELD, DAVID. 1982. “Partial Residuals for the Proportional Hazards Regression Model.” Biometrika 69 (1): 239–41. https://doi.org/10.1093/biomet/69.1.239.
Terry M. Therneau, and Patricia M. Grambsch. 2000. Modeling Survival Data: Extending the Cox Model. New York: Springer.
White, Ian R., Patrick Royston, and Angela M. Wood. 2011. “Multiple Imputation Using Chained Equations: Issues and Guidance for Practice.” Statistics in Medicine 30 (4): 377–99. https://doi.org/10.1002/sim.4067.

Load Libraries


library(lubridate)
library(tidyverse)
library(survival)
library(tableone)
library(qdapTools)
library(MatchIt)
library(finalfit)
library(readxl)
library(broom)
library(survminer)
library(readxl)
library(ggfortify)
library(ggsci)
library(labelled)
library(janitor)
library(mice)
library(MatchThem)
library(cobalt)
library(forestplot)
library(Gmisc, quietly = TRUE)
library(glue)
library(htmlTable)
library(grid)
library(magrittr)

Load data


data <- read_excel("CRLM Data 11.01.21.xlsx", 
    col_types = c("numeric", "date", "text", 
        "date", "numeric", "numeric", "numeric", 
        "numeric", "text", "text", "numeric", 
        "text", "text", "text", "text", "text", 
        "text", "text", "text", "date", "text", 
        "date", "numeric", "text", "text", 
        "numeric", "numeric", "text", "text", 
        "text", "text", "text", "numeric", 
        "text", "text", "text", "text", "date", 
        "text", "date", "text", "text", "text", 
        "date", "date", "date", "text", "text", 
        "text", "text", "numeric", "date", 
        "logical", "logical", "logical", 
        "logical"), na = "NA")

Filter data to those who survived until discharge:

postopdead <- data %>% filter(ClavienDindo == 
    "V")

data <- data[!(data$ID2 %in% postopdead$ID2), 
    ]

Data Cleaning


data$age <- floor(time_length(difftime(data$DateofLiverResection, 
    data$DOB), "years"))

data <- data %>% relocate(age)

## Age as categoric variable:
data$agecat <- case_when(data$age > 70 ~ 
    ">70", data$age > 60 ~ "61-70", data$age > 
    50 ~ "51-60", data$age > 18 ~ "<50")
data$agecat <- factor(data$agecat, levels = c("<50", 
    "51-60", "61-70", ">70"))

data$Gender = data$Gender %>% as.factor() %>% 
    plyr::revalue(c(F = "Female", M = "Male"))

Time Period Data:

data$TimeToLivMet <- time_length(as.period(interval(data$DatePrimaryResection, 
    data$DateDiseaseRecurrence)), unit = "months")
### Time to last follow up if no death
data$DateofDeath <- replace_na(data$DateofDeath, 
    as.Date("2020-08-30"))
data$TimeToDeath <- time_length(as.period(interval(data$DateofLiverResection, 
    data$DateofDeath)), unit = "months")
data$DateOfRecurrencePostLiverResection <- replace_na(data$DateOfRecurrencePostLiverResection, 
    as.Date("2020-08-30"))

data$TimeToRecurrence <- time_length(as.period(interval(data$DateofLiverResection, 
    data$DateOfRecurrencePostLiverResection)), 
    unit = "months")
### Time to last follow up if no recurrence


data$RecurrenceToDeath <- time_length(as.period(interval(data$DateOfRecurrencePostLiverResection, 
    data$DateofDeath)), unit = "months")


### Recode 5 year survival as FALSE or TRUE
### (these are the status variables for OS
### and DFS)
data$five_year_all_cause_survival <- ifelse(data$dateof_death == 
    as.Date("2020-08-30"), FALSE, TRUE)
data$five_year_disease_free_survival <- ifelse(data$date_of_recurrence_post_liver_resection == 
    as.Date("2020-08-30"), FALSE, TRUE)

Change 0s to NA (these values were intented as ‘unavailable’ in the CRF)

data$InitialCEA[data$InitialCEA == "0"] <- NA
data$CEALiverMetastasis[data$CEALiverMetastasis == 
    "0"] <- NA
data$NoSegmentectomies[data$NoSegmentectomies == 
    "0"] <- NA

Separate delimited values into new columns and delete original column

data <- cbind(data[, -grep("LocationLiverMetastasisLobes", 
    colnames(data))], mtabulate(strsplit(as.character(data$LocationLiverMetastasisLobes), 
    "[;]")))
data <- cbind(data[, -grep("LocationLiverMetastasisSegments", 
    colnames(data))], mtabulate(strsplit(data$LocationLiverMetastasisSegments, 
    "[;]")))
data <- cbind(data[, -grep("ExtraHepaticMetastases", 
    colnames(data))], mtabulate(strsplit(data$ExtraHepaticMetastases, 
    "[;]")))
data <- cbind(data[, -grep("AdjuvantTherapies", 
    colnames(data))], mtabulate(strsplit(data$AdjuvantTherapies, 
    "[;]")))

data <- cbind(data[, -grep("LiverResectionProcedure", 
    colnames(data))], mtabulate(strsplit(data$LiverResectionProcedure, 
    "[;]")))

## New column for each mutation type:

data$RAS <- grepl("RAS", data$Mutations)
data$BRAF <- grepl("BRAF", data$Mutations)
data$p53 <- grepl("p53", data$Mutations)
## New Bilobar mets column

data$Bilobar <- data$Right == 1 & data$Left == 
    1

Order of Surgery in Synchronous Tumours:

##Liver First or Colorectal First
difftime(data$dateof_liver_resection, data$date_primary_resection, units="days") %>% as.numeric()->data$primtoliver
case_when(
  
  data$primtoliver==0 | data$primtoliver==-1 | data$primtoliver==1~"Synchronous Resection",
  data$primtoliver>0~"Primary First",
  data$primtoliver<0~"Liver First",
  is.na(data$primtoliver)~"Liver First")->data$sequence


## Filter into synchronous tumours only


data %>% filter(synchronous_liver_metastasis=="Yes")->synch

Relabel columns

data %>% rename("OtherGI"='Other GI', "SystemicChemo"='Systemic- Chemo', 
                "IRAblation"='IR ablation', "PVE"='Portal vein embolization')->data 
### Convert columns to logical

data[,50:90] <-lapply(data[50:90],as.logical) 
data %<>% mutate_at(c("Diaphragm","Lung","None","OtherGI"), as.logical)->data 

###Collapse Right sided and Transverse into one 

data$PrimarySite<-fct_collapse(data$PrimarySite, "Right" = c("Right","Transverse"), "Left" = c("Sigmoid","Left") ) 

## Collapse TNM stages:

data$InitialTumourStageT<-fct_collapse(data$InitialTumourStageT, "T5"= c("T5a", "T5b")) data$InitialTumourStageN<-fct_collapse(data$InitialTumourStageN, "N1"= c("N1a", "N1b", "N1c"), "N2"= c("N2a","N2b")) 
data$StageTimeMetastasisN<-fct_collapse(data$StageTimeMetastasisN, "N1"= c("N1a", "N1b", "N1c"), "N2"= c("N2a","N2b")) 

###Modified Charlson Index to Factor:

data$ModifiedCharlsonIndex<-fct_collapse(as.factor(data$ModifiedCharlsonIndex), ">5"= c("5","6","7","8","9","10","11","12"))

Number of Unresectable Resections

data$Unresectable <- as.numeric(data$LiverRecurrence) - 
    as.numeric(data$Resections)
data$UnresectablePct <- data$Unresectable/as.numeric(data$LiverRecurrence) * 
    100
data$LiverRecurrenceYN = ifelse(data$LiverRecurrence > 
    0, TRUE, FALSE)

data$LiverRecurrence = data$LiverRecurrence %>% 
    as.factor()
data$Resections = data$Resections %>% as.factor()
data$Unresectable = data$Unresectable %>% 
    as.factor()

Label Variables

var_label(data) <- list(age = "Age", Gender = "Gender", 
    ID2 = "ID", ASA = "American Society of Anesthesiology Grade", 
    ModifiedCharlsonIndex = "Modified Charlson Index", 
    PerformanceStatus = "Performance Status", 
    InitialCEA = "Initial Carcinoembroynic Antigen (µ/L)", 
    PrimarySite = "Primary Tumour Location", 
    SynchronousLiverMetastasis = "Synchronous Liver Metastasis", 
    TumourGrade = "Tumour Grade", InitialTumourStageT = "T Stage", 
    InitialTumourStageN = "N Stage", InitialTumourStageM = "M Stage", 
    PrimaryProcedure = "Primary Procedure", 
    UrgencyPrimary = "Urgency of Primary Procedure", 
    Stoma = "Stoma Formation in Primary Procedure", 
    OperativeRoute = "Operative Route- Primary Procedure", 
    MarginsPrimaryResection = "Resection Margins- Primary Procedure", 
    DateDiseaseRecurrence = "Date of Recurrence in Liver", 
    CEALiverMetastasis = "CEA at Liver Recurrence (µ/L)", 
    NumberLiverMetastases = "Number of Liver Metastases", 
    SizeLargestLiverMetastasis = "Size of Largest Liver Metastasis (mm)", 
    StageTimeMetastasisN = "Stage at Time of Metastasis - N", 
    StageTimeMetastasisM = "Stage at Time of Metastasis - M", 
    NoSegmentectomies = "No. of Segmentectomies", 
    Route = "Operative Route - Liver Resection", 
    LiverResectionMargins = "Resection Margins- Liver Resection", 
    ClavienDindo = "Clavien Dindo Grade during Admission", 
    FiveYearAllCauseSurvival = "Five Year All Cause Survival", 
    FiveYearDiseaseFreeSurvival = "Five Year Disease Free Survival", 
    TimeToLivMet = "Time  to Liver Metastasis after Primary Resection (months)", 
    TimeToDeath = "Time to Death after Liver Resection (months)", 
    TimeToRecurrence = "Time to Recurrence after Liver Resection (months)", 
    RecurrenceToDeath = "Time to Death after Recurrence following Liver Resection (months)", 
    Bilobar = "Bilobar Liver Metastases", 
    Lung = "Lung Metastases", Peritoneum = "Peritoneal Metastases", 
    OtherGI = "Other Gastrointestinal Metastasis", 
    SystemicChemo = "Systemic Chemotherapy", 
    IRAblation = "Radiofrequency Ablation", 
    CNS = "Central Nervous System Metastasis", 
    Bone = "Bone Metastasis", Radiotherapy = "Radiotherapy", 
    Immunotherapy = "Immunotherapy", p53 = "p53 Mutation", 
    RAS = "RAS Mutation", BRAF = "BRAF Mutation", 
    LiverRecurrence = "Number of Recurrences in Liver after Liver Resection", 
    Resections = "Number of Repeat Liver Resections", 
    Unresectable = "Number of Unresectable Liver Recurrences", 
    UnresectablePct = "Percentage Unresectable Liver Recurrences", 
    LiverRecurrenceYN = "Liver Recurrence")




data <- expss::apply_labels(data, age = "Age", 
    Gender = "Gender", ID2 = "ID", ASA = "American Society of Anesthesiology Grade", 
    ModifiedCharlsonIndex = "Modified Charlson Index", 
    PerformanceStatus = "Performance Status", 
    InitialCEA = "Initial Carcinoembroynic Antigen (µ/L)", 
    PrimarySite = "Primary Tumour Location", 
    SynchronousLiverMetastasis = "Synchronous Liver Metastasis", 
    TumourGrade = "Tumour Grade", InitialTumourStageT = "T Stage", 
    InitialTumourStageN = "N Stage", InitialTumourStageM = "M Stage", 
    PrimaryProcedure = "Primary Procedure", 
    UrgencyPrimary = "Urgency of Primary Procedure", 
    Stoma = "Stoma Formation in Primary Procedure", 
    OperativeRoute = "Operative Route- Primary Procedure", 
    MarginsPrimaryResection = "Resection Margins- Primary Procedure", 
    DateDiseaseRecurrence = "Date of Recurrence in Liver", 
    CEALiverMetastasis = "CEA at Liver Recurrence (µ/L)", 
    NumberLiverMetastases = "Number of Liver Metastases", 
    SizeLargestLiverMetastasis = "Size of Largest Liver Metastasis (mm)", 
    StageTimeMetastasisN = "Stage at Time of Metastasis - N", 
    StageTimeMetastasisM = "Stage at Time of Metastasis - M", 
    NoSegmentectomies = "No. of Segmentectomies", 
    Route = "Operative Route - Liver Resection", 
    LiverResectionMargins = "Resection Margins- Liver Resection", 
    ClavienDindo = "Clavien Dindo Grade during Admission", 
    FiveYearAllCauseSurvival = "Five Year All Cause Survival", 
    FiveYearDiseaseFreeSurvival = "Five Year Disease Free Survival", 
    TimeToLivMet = "Time  to Liver Metastasis after Primary Resection (months)", 
    TimeToDeath = "Time to Death after Liver Resection (months)", 
    TimeToRecurrence = "Time to Recurrence after Liver Resection (months)", 
    RecurrenceToDeath = "Time to Death after Recurrence following Liver Resection (months)", 
    Bilobar = "Bilobar Liver Metastases", 
    Lung = "Lung Metastases", Peritoneum = "Peritoneal Metastases", 
    OtherGI = "Other Gastrointestinal Metastasis", 
    SystemicChemo = "Systemic Chemotherapy", 
    IRAblation = "Radiofrequency Ablation", 
    CNS = "Central Nervous System Metastasis", 
    Bone = "Bone Metastasis", Radiotherapy = "Radiotherapy", 
    Immunotherapy = "Immunotherapy", p53 = "p53 Mutation", 
    RAS = "RAS Mutation", BRAF = "BRAF Mutation", 
    LiverRecurrence = "Number of Recurrences in Liver after Liver Resection", 
    Resections = "Number of Repeat Liver Resections", 
    Unresectable = "Number of Unresectable Liver Recurrences", 
    UnresectablePct = "Percentage Unresectable Liver Recurrences", 
    LiverRecurrenceYN = "Liver Recurrence")

Clean variable names:

data <- janitor::clean_names(data)
attach(data)

Demographics Table


explanatorydemog <- c("age", "gender", "modified_charlson_index", 
    "asa", "performance_status", "initial_cea", 
    "primary_site", "synchronous_liver_metastasis", 
    "tumour_grade", "initial_tumour_stage_t", 
    "initial_tumour_stage_n", "initial_tumour_stage_m", 
    "primary_procedure", "urgency_primary", 
    "stoma", "operative_route", "margins_primary_resection", 
    "cea_liver_metastasis", "number_liver_metastases", 
    "size_largest_liver_metastasis", "route", 
    "liver_resection_margins", "clavien_dindo", 
    "five_year_all_cause_survival", "five_year_disease_free_survival", 
    "time_to_liv_met", "time_to_death", "time_to_recurrence", 
    "recurrence_to_death", "bilobar", "lung", 
    "peritoneum", "bone", "cns", "systemic_chemo", 
    "radiotherapy", "immunotherapy", "ir_ablation", 
    "p53", "ras", "braf", "alpps", "extended_left_hepatectomy", 
    "left_hepatectomy", "right_hepatectomy", 
    "right_lobectomy_extended_right_trisegmentectomy", 
    "traditional_two_stage_hepatectomy", 
    "segmentectomy_wedge_resection", "no_segmentectomies", 
    "liver_recurrence", "resections", "unresectable", 
    "unresectable_pct", "liver_recurrence_yn", 
    "time_to_death", "i", "ii", "iii", "iv", 
    "v", "vi", "vii", "viii", "sequence", 
    "primtoliver", "neo_adj_prim", "adj_prim", 
    "neo_adj_liv", "adj_liv")

nnormal <- c("asa", "modified_charlson_index", 
    "performance_status", "number_liver_metastasis", 
    "size_liver_metastasis", "no_segmentectomies", 
    "tumour_grade", "initial_cea", "cea_liver_metastasis", 
    "time_to_liv_met", "time_to_death", "time_to_recurrence", 
    "recurrence_to_death", "no_segmentectomies", 
    "unresectable_pct", "time_to_death", 
    "i", "time_to_death", "ii", "iii", "iv", 
    "v", "vi", "vii", "viii", "primtoliver")
nnormal <- append(nnormal, names(data[, 57:51]))
exact <- (c("gender", "mutations", "primary_procedure", 
    "urgency_procedure", "margins_primary_resection", 
    "stome", "operative_route", "stage_time_metastasis_n", 
    "stage_time_metastasis_m", "liver_resection_procedure", 
    "route", "liver_resection_margins", "fiver_year_all_cause_survival", 
    "five_year_disease_free_survival", "bilobar", 
    "tumour_grade", "liver_recurrence", "resections", 
    "unresectable", "unresectable_pct", "liver_recurrence_yn", 
    "i", "braf", "p53", "ras", "ii", "iii", 
    "iv", "v", "vi", "vii", "viii", "sequence"))

data[, c("i", "ii", "iii", "iv", "v", "vi", 
    "vii", "viii")] <- data %>% select("i", 
    "ii", "iii", "iv", "v", "vi", "vii", 
    "viii") %>% lapply(as.logical)
data$tumour_grade <- data$tumour_grade %>% 
    as.factor()
data$tumour_grade <- fct_collapse(data$tumour_grade, 
    `3` = c("3", "4"))
demogtable <- tableone::CreateTableOne(explanatorydemog, 
    "primary_site", data) %>% print(nonnormal = nnormal, 
    exact = exact, varLabels = TRUE, missing = TRUE)
write.csv(demogtable, "table1.csv")

Kaplan-Meier Survival Curves


Create Survival object for 10 year all cause survival:

fitoverallsurv <- survfit(Surv(time_to_death, 
    ten_year_all_cause_survival) ~ primary_site, 
    data = data)

Create survival curve:

Code adapted from: survminer R package: Survival Data Analysis and Visualization - Easy Guides - Wiki - STHDA by Alboukadel Kassambara
ggsurvplot(fitoverallsurv, data = data,
           size = 1,                 # change line size
           conf.int = "TRUE", 
           conf.int.style="step",# Add confidence interval
           pval = TRUE,              # Add p-value
           risk.table = "abs_pct",        # Add risk table
           risk.table.col = "strata",# Risk table color by groups
           legend.labs =
             c("Left", "Rectum", "Right"),    # Change legend labels
           risk.table.height = 0.25, # Useful to change when you have multiple groups
           palette = "lancet",
           xlim= c(0,120),
           break.time.by = 10,
           legend.title = "Primary Site",
           xlab = "Time (months)",
           legend="right",
           title= "All Cause Survival from Time of Liver Resection", title.size=500
           )


### Life table which shows proportion alive at each time period below KM curve:

gtsummary::tbl_survfit(fitoverallsurv, times=c(12,36,60,120), label="Primary Tumour Location", 
                       label_header = "**{time} Months**", statistic="{estimate} (95% CI {conf.low}-{conf.high})"
)

Create survival object for time to recurrence in liver after primary resection:

New Column - LiverRecurrence (all 2 as everyone has liver mets)
data$liver_recurrence=2

survfit(Surv(time_to_liv_met, liver_recurrence) ~ primary_site, data = data)->fittimetorecurrence
ggsurvplot(fittimetorecurrence, data = data,
           size = 1,                 # change line size
           conf.int = "TRUE", 
           conf.int.style="step",# Add confidence interval
           pval = TRUE,
           pval.coord= c(75,0.25),# Add p-value
           risk.table = "abs_pct",        # Add risk table
           risk.table.col = "strata",# Risk table color by groups
           legend.labs =
             c("Left", "Rectosigmoid", "Right"),    # Change legend labels
           risk.table.height = 0.25, # Useful to change when you have multiple groups
           palette = "lancet",
           legend.title = "Primary Site",
           xlab = "Time (months)",
           ylab = "Recurrence Probability",
           legend="right",
           title="Time from Primary Procedure to Recurrence in Liver"
)

Disease Free Survival

survfit(Surv(time_to_recurrence, five_year_disease_free_survival) ~ primary_site, data = data)->fitdiseasefreesurvival
ggsurvplot(fitdiseasefreesurvival, data = data,
           size = 1,                 # change line size
           conf.int = TRUE,          # Add confidence interval
           pval = TRUE,              # Add p-value
           risk.table = "abs_pct",
           conf.int.style="step",# Add risk table
           risk.table.col = "strata",# Risk table color by groups
           legend.labs =
             c("Left", "Rectum", "Right"),    # Change legend labels
           risk.table.height = 0.25, # Useful to change when you have multiple groups
           palette = "lancet",# Change ggplot2 theme
           legend.title = "Primary Site",
           xlab = "Time (months)",
           ylab = "Disease Free Probability",
           legend="right",
           xlim= c(0,60),
           break.time.by = 10,
           pval.coord = c(5, 0.1), 
           title= "Disease Free Survival Post-Liver Resection",
)

gtsummary::tbl_survfit(fitdiseasefreesurvival, times=c(12,36,60), label="Primary Tumour Location", 
                       label_header = "**{time} Months**", statistic="{estimate} (95% CI {conf.low}-{conf.high})"
)

Time from Recurrence after Liver Resection to Death

survfit(Surv(recurrence_to_death, five_year_all_cause_survival) ~ primary_site, data = data)->fitrecurrencetodeath

ggsurvplot(fitrecurrencetodeath, data = data,
           size = 1,                 # change line size
           conf.int = TRUE, 
           conf.int.style="step",# Add risk table
           # Add confidence interval
           pval = TRUE,              # Add p-value
           risk.table = "abs_pct",
           risk.table.col = "strata",# Risk table color by groups
           legend.labs =
             c("Left", "Rectum", "Right"),    # Change legend labels
           risk.table.height = 0.25, # Useful to change when you have multiple groups
           palette = "lancet",# Change ggplot2 theme
           legend.title = "Primary Site",
           xlab = "Time (months)",
           ylab = "Probability of Survival after Recurrence Post Liver Resection",
           legend="right",
           xlim= c(0,120),
           break.time.by = 10,
           pval.coord = c(5, 0.05), 
           title= "Recurrence after Liver Resection to Death"
)

Faceted Kaplan-Meier Survival Curves for each variable


### Custom function to generate KM plots:

survplot<-function(data=data,time, event, var,title=NULL, ylab="Survival probability", 
                   xlim=c(0,120), pval.coord=NULL, legend.labs=NULL){
  
  survminer::surv_fit(Surv(time, 
                           event)~ var, data=data)->fit
  
  
  ggsurvplot(fit, data=data,
             size = 1,                 # change line size
             conf.int = "TRUE", 
             conf.int.style="step",# Add confidence interval
             pval = TRUE,              # Add p-value
             risk.table = FALSE,        # Add risk table
             risk.table.col = "strata",# Risk table color by groups
             legend.labs =
               legend.labs,    # Change legend labels
             risk.table.height = 0.25, # Useful to change when you have multiple groups
             palette = "lancet",
             xlim= xlim,
             break.time.by = 10,
             legend.title = "",
             xlab = "Time (months)",
             ylab=ylab,
             pval.coord=pval.coord,
             legend="right",
             title= title, title.size=100,
             font.legend=c(5,"plain","black"),
             font.main=c(6,"plain","black"),
             font.tickslab=c(4,"plain","black"),
             font.x=c(5,"plain","black"),
             font.y=c(5,"plain","black"),
  )
  
}

### Overall Survival

oslist<-list(
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$agecat,
           legend.labs=c("<50","51-60","61-70",">70"), title = "Overall Survival- Age"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$gender,
           legend.labs = c("Female","Male"), title = "Overall Survival-Gender"), 
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$modified_charlson_index,
           legend.labs = c("0","1","2","3","4",">=5") , title = "Overall Survival-Modified Charlson Index"),
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$asa,
           legend.labs = c("1","2","3","4") , title = "Overall Survival-ASA"),
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$performance_status,
           legend.labs = c("1","2","3","4") , title = "Overall Survival-Performance Status"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$primary_site,
           legend.labs=c("Left","Rectum","Right"), title = "Overall Survival- Primary Tumour Location"),
  
  
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$synchronous_liver_metastasis,
           legend.labs = c("Metachronous","Synchronous"), title = "Overall Survival- Synchronous vs. Metachronous"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$initial_tumour_stage_t,
           legend.labs=c("T1","T2","T3","T4a","T4b"), title = "Overall Survival- Tumour Stage T"),
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$initial_tumour_stage_n,
           legend.labs=c("N0","N1","N2"), title = "Overall Survival- Tumour Stage N"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, as.factor(data$tumour_grade),
           legend.labs = c("1","2","3"),title = "Overall Survival- Tumour Grade"),
  
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$urgency_primary,
           legend.labs = c("Elective","Emergency"), title = "Overall Survival-Urgency Primary Procedure"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$neo_adj_prim,
           legend.labs = c("No Neoadjuvant\nChemotherapy","Neoadjuvant\nChemotherapy"), title = "Overall Survival-Neoadjuvant Chemotherapy Primary"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$adj_prim,
           legend.labs = c("No Adjuvant\nChemotherapy","Adjuvant\nChemotherapy"), title = "Overall Survival-Adjuvant Chemotherapy Primary"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$neo_adj_liv,
           legend.labs = c("No Neoadjuvant\nChemotherapy","Neodjuvant\nChemotherapy"), title = "Overall Survival-Neoadjuvant Chemotherapy Liver"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$adj_liv,
           legend.labs = c("No Adjuvant\nChemotherapy","Adjuvant\nChemotherapy"), title = "Overall Survival-Adjuvant Chemotherapy Liver"),
  
  
  survplot(data=synch,synch$time_to_death,synch$five_year_all_cause_survival, synch$sequence,
           legend.labs = c("Liver First","Primary First","Synchronous") , title = "Overall Survival-Sequence of Surgery in Patients with Synchronous Tumours"),
  
  
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$clavien_dindo,
           legend.labs = c("0 or I","II","IIIA","IIIB","IVA","IVB") , title = "Overall Survival-Clavien-Dindo Classification after Liver Surgery"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$margins_primary_resection,
           legend.labs = c("R0","R1") , title = "Overall Survival-Margins after Primary Surgery"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$liver_resection_margins,
           legend.labs = c("R0","R1") , title = "Overall Survival-Margins after Liver Surgery"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$lung,
           legend.labs = c("No Lung\nMetastases", "Lung\nMetastases") , title = "Overall Survival- Lung Metastases"),
  
  survplot(data=data,data$time_to_death,data$five_year_all_cause_survival, data$liver_recurrence_yn,
           legend.labs = c("No Liver\nRecurrence", "Liver\nRecurrence") , title = "Overall Survival- Liver Recurrence")
)


arrange_ggsurvplots(oslist, ncol=6,nrow = 4,risk.table = FALSE)->osfacet
ggsave("osfacet.svg",osfacet, dpi=300, width=45,height=23, units="cm")


### Disease Free Survival


dflist<-list(
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$agecat,
           xlim=c(0,60), legend.labs=c("<50","51-60","61-70",">70"), title = "Disease Free Survival- Age"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$gender,
           xlim=c(0,60), legend.labs = c("Female","Male"), title = "Disease Free Survival-Gender"), 
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$modified_charlson_index,
           xlim=c(0,60), legend.labs = c("0","1","2","3","4",">=5") , title = "Disease Free Survival-Modified Charlson Index"),
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$asa,
           xlim=c(0,60), legend.labs = c("1","2","3","4") , title = "Disease Free Survival-ASA"),
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$performance_status,
           xlim=c(0,60), legend.labs = c("1","2","3","4") , title = "Disease Free Survival-Performance Status"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$primary_site,
           xlim=c(0,60), legend.labs=c("Left","Rectum","Right"), title = "Disease Free Survival- Primary Tumour Location"),
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$synchronous_liver_metastasis,
           xlim=c(0,60), legend.labs = c("Metachronous","Synchronous"), title = "Disease Free Survival- Synchronous vs. Metachronous"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$initial_tumour_stage_t,
           legend.labs=c("T1","T2","T3","T4a","T4b"), title = "Disease Free Survival- Tumour Stage T"),
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$initial_tumour_stage_n,
           legend.labs=c("N0","N1","N2"), title = "Disease Free Survival- Tumour Stage N"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, as.factor(data$tumour_grade),
           legend.labs = c("1","2","3"),title = "Disease Free Survival- Tumour Grade"),
  
  
  
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$urgency_primary,
           xlim=c(0,60), legend.labs = c("Elective","Emergency"), title = "Disease Free Survival-Urgency Primary Procedure"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$neo_adj_prim,
           xlim=c(0,60), legend.labs = c("No Neoadjuvant\nChemotherapy","Neoadjuvant\nChemotherapy"), title = "Disease Free Survival-Neoadjuvant Chemotherapy Primary"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$adj_prim,
           xlim=c(0,60), legend.labs = c("No Adjuvant\nChemotherapy","Adjuvant\nChemotherapy"), title = "Disease Free Survival-Adjuvant Chemotherapy Primary"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$neo_adj_liv,
           xlim=c(0,60), legend.labs = c("No Neoadjuvant\nChemotherapy","Neodjuvant\nChemotherapy"), title = "Disease Free Survival-Neoadjuvant Chemotherapy Liver"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$adj_liv,
           xlim=c(0,60), legend.labs = c("No Adjuvant\nChemotherapy","Adjuvant\nChemotherapy"), title = "Disease Free Survival-Adjuvant Chemotherapy Liver"),
  
  
  survplot(data=synch,synch$time_to_recurrence,synch$five_year_disease_free_survival, synch$sequence,
           xlim=c(0,60), legend.labs = c("Liver First","Primary First","Synchronous") , title = "Disease Free Survival-Sequence of Surgery in Patients with Synchronous Tumours"),
  
  
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$clavien_dindo,
           xlim=c(0,60), legend.labs = c("0 or I","II","IIIA","IIIB","IVA","IVB") , title = "Disease Free Survival-Clavien-Dindo Classification after Liver Surgery"),
  
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$lung,
           xlim=c(0,60), legend.labs = c("R0", "R1") , title = "Disease Free Survival- Primary Margins"),
  
  survplot(data=data,data$time_to_recurrence,data$five_year_disease_free_survival, data$liver_resection_margins,
           xlim=c(0,60), legend.labs = c("R0", "R1") , title = "Disease Free Survival- Liver Resection Margins")
  
  
  
)


arrange_ggsurvplots(dflist, ncol=5,nrow = 4,risk.table = FALSE)->dffacet

ggsave("dffacet.svg",dffacet, dpi=300, width=40,height=20, units="cm")

Univariable and Multivariable Cox Regression before Propensity Score Weighting:

coxreg <- coxph(Surv(TimeToDeath, FiveYearAllCauseSurvival) ~ 
    PrimarySite, data = data)

coxregmulti <- coxph(Surv(TimeToDeath, FiveYearAllCauseSurvival) ~ 
    PrimarySite + Age + Gender + ASA + LiverResectionMargins, 
    data = data)

Multiple Imputation


Visualize Distribution of Missing Data:

naniar::gg_miss_var(longtermsurvdata, show_pct = TRUE)

Remove unused variables:

longtermsurvdata_slim <- data %>% select(-c(id2, 
    dob, dateof_liver_resection, dateof_death, 
    date_of_recurrence_post_liver_resection, 
    , initial_cea, stoma, operative_route, 
    date_primary_resection, margins_primary_resection, 
    date_disease_recurrence, cea_liver_metastasis, 
    stage_time_metastasis_m, stage_time_metastasis_n, 
    no_segmentectomies, route, right, i, 
    ii, iii, iv, v, vi, vii, viii, adrenal, 
    aortocaval_node, bladder, bone, cns, 
    diaphragm, lymphnodes_aortocaval_and_hilar, 
    peritoneum, skin, ureter, uterus, immunotherapy, 
    ir_ablation, na, pve, braf_2, other, 
    ras_2, regional_chemo, braf, ovary, other_gi, 
    paraortic_nodes, none, primary_procedure))
longtermsurvdata_slim[sapply(longtermsurvdata_slim, 
    is.character)] <- lapply(longtermsurvdata_slim[sapply(longtermsurvdata_slim, 
    is.character)], as.factor)

longtermsurvdata_slim$modified_charlson_index <- longtermsurvdata_slim$modified_charlson_index %>% 
    as.numeric()

Run MICE with 25 imputed datasets, using classification and regression trees

set.seed(125)
imputed <- mice(longtermsurvdata_slim, m = 25, 
    method = "cart", seed = 125)

Propensity Score Weighting


Weighting conducted with multiple methods, GBM chosen as best balance

GBM

weighted <- weightthem(primary_site ~ age + 
    gender + urgency_primary + modified_charlson_index + 
    +initial_tumour_stage_t + initial_tumour_stage_n + 
    performance_status + synchronous_liver_metastasis + 
    tumour_grade + neo_adj_prim + adj_prim + 
    neo_adj_liv + adj_liv + ras + p53, datasets = imputed, 
    method = "gbm", approach = "within", 
    estimand = "ATT", focal = "Right", verbose = TRUE)

CB

weighted <- weightthem(primary_site ~ age + 
    gender + urgency_primary + modified_charlson_index + 
    +initial_tumour_stage_t + initial_tumour_stage_n + 
    performance_status + synchronous_liver_metastasis + 
    tumour_grade + neo_adj_prim + adj_prim + 
    neo_adj_liv + adj_liv + ras + p53, datasets = imputed, 
    method = "cb", approach = "within", estimand = "ATT", 
    focal = "Right", verbose = TRUE)

Optimal Matching

weighted <- weightthem(primary_site ~ age + 
    gender + urgency_primary + modified_charlson_index + 
    +initial_tumour_stage_t + initial_tumour_stage_n + 
    performance_status + synchronous_liver_metastasis + 
    tumour_grade + neo_adj_prim + adj_prim + 
    neo_adj_liv + adj_liv + ras + p53, datasets = imputed, 
    method = "optimal", approach = "within", 
    estimand = "ATT", focal = "Right", verbose = TRUE)

Other Methods

weightedebal<-weightthem(primary_site~age+gender+urgency_primary+modified_charlson_index+
                       +initial_tumour_stage_t+
                       initial_tumour_stage_n+
                       performance_status+synchronous_liver_metastasis+tumour_grade+neo_adj_prim+
                       adj_prim+neo_adj_liv+adj_liv+ras+p53,
                     datasets = imputed,
                     method = "ebal", approach = "within",
                     estimand = "ATT", focal="Right", verbose=TRUE)"


weightednpcbps<-
weightthem(primary_site~age+gender+urgency_primary+modified_charlson_index+
                       +initial_tumour_stage_t+
                       initial_tumour_stage_n+
                       performance_status+synchronous_liver_metastasis+tumour_grade+neo_adj_prim+
                       adj_prim+neo_adj_liv+adj_liv+ras+p53,
                     datasets = imputed,
                     method = "npcbps", approach = "within",
                     estimand = "ATT", focal="Right", verbose=TRUE)

Create list to compare weighting methods:

weights <- list(weighted, weightedps, weightedcb, 
    weightedopt, weightedebal, weightednpcbps)

Balance Assessment

library(cobalt)

Function to create balance table:

bal.tabvar <- function(x) {
    bal.tab(x, thresholds = c(m = 0.1, v = 2), 
        abs = TRUE)
}

Balance Tables with all methods of weighting:

map(weights, bal.tabvar)->baltables 

for (i in 1:6) { print( baltables[[i]][["Pair.Balance"]][["Right vs. Left"]][["Balance.Across.Imputations"]][["Mean.Diff.Adj"]]) } for (i in 1:6) { print( baltables[[i]][["Pair.Balance"]][["Right vs. Rectum"]][["Balance.Across.Imputations"]][["Mean.Diff.Adj"]]) } for (i in 1:6) { print( baltables[[i]][["Pair.Balance"]][["Rectum vs. Left"]][["Balance.Across.Imputations"]][["Mean.Diff.Adj"]]) }

for (i in 1:6) { print( baltables[[i]][["Pair.Balance"]][["Right vs. Left"]][["Balance.Across.Imputations"]][["Mean.V.Ratio.Adj"]]) } for (i in 1:6) { print( baltables[[i]][["Pair.Balance"]][["Right vs. Rectum"]][["Balance.Across.Imputations"]][["Mean.V.Ratio.Adj"]]) } for (i in 1:6) { print( baltables[[i]][["Pair.Balance"]][["Rectum vs. Left"]][["Balance.Across.Imputations"]][["Mean.V.Ratio.Adj"]]) }

bal.plot.both<-function(x){ bal.plot(weightedopt, which = "both", var.name = x) }

c("age","gender","urgency_primary","modified_charlson_index", "initial_tumour_stage_t", "initial_tumour_stage_n", "performance_status","initial_tumour_stage_m","tumour_grade")->balvars

map(balvars, bal.plot.both)->balpots

Stack Imputed Datasets into One Large Dataset and Remove Original

weighted[["datasets"]] %>% bind_rows()->weightedstacked weightedstacked %>% filter(.imp != 0)->weightedstacked

Explanatory Variables for Overall and Disease Free Survival:

explan = c("age", "gender", "tumour_grade", 
    "primary_site", "initial_tumour_stage_t", 
    "initial_tumour_stage_n", "synchronous_liver_metastasis", 
    "bilobar", "lung", "neo_adj_liv", "neo_adj_prim", 
    "adj_prim", "adj_liv", "ras", "p53", 
    "asa", "number_liver_metastases", "size_largest_liver_metastasis", 
    "performance_status", "modified_charlson_index", 
    "urgency_primary", "liver_resection_margins", 
    "clavien_dindo", "liver_recurrence_yn")

explandiseasefree = c("age", "gender", "tumour_grade", 
    "primary_site", "initial_tumour_stage_t", 
    "initial_tumour_stage_n", "synchronous_liver_metastasis", 
    "bilobar", "neo_adj_liv", "neo_adj_prim", 
    "adj_prim", "adj_liv", "ras", "p53", 
    "p53", "asa", "number_liver_metastases", 
    "size_largest_liver_metastasis", "performance_status", 
    "modified_charlson_index", "urgency_primary", 
    "liver_resection_margins", "clavien_dindo")

Cox Regression after Propensity Weighting


Function to Conduct Weighted Univariable Cox Regression on Each Weighted Imputed Dataset and then pool the results (for all cause survival):

allcausereg <- function(x) {
    weightedstacked %>% split(.$.imp) %>% 
        map(~coxph(formula = as.formula(paste("Surv(time_to_death, ten_year_all_cause_survival)", 
            x, sep = "~")), data = ., weights = weights)) %>% 
        as.mira(.) %>% pool(., dfcom = NULL) %>% 
        summary(., conf.int = TRUE, exp = TRUE) %>% 
        print()
}

Function to Conduct Weighted Univariable Cox Regression on Each Weighted Imputed Dataset and then pool the results (for disease free survival):

diseasefreereg <- function(x) {
    weightedstacked %>% split(.$.imp) %>% 
        map(~coxph(formula = as.formula(paste("Surv(time_to_recurrence, five_year_disease_free_survival)", 
            x, sep = "~")), data = ., weights = weights)) %>% 
        as.mira(.) %>% pool(., dfcom = NULL) %>% 
        summary(., conf.int = TRUE, exp = TRUE) %>% 
        print()
}

Run Univariable Regression with all explanatory variables

All cause survival:

univarallcause <- map(explan, allcausereg) %>% 
    bind_rows()

Disease free survival:

univardiseasefree <- map(explandiseasefree, 
    diseasefreereg) %>% bind_rows()

Multivariable Analysis

All Cause Survival:

coxregtablematched <- weightedstacked %>% 
    split(.$.imp) %>% map(~coxph(Surv(time_to_death, 
    five_year_all_cause_survival) ~ age + 
    bilobar + lung + asa + initial_tumour_stage_n + 
    performance_status + synchronous_liver_metastasis + 
    neo_adj_liv + adj_prim + neo_adj_prim + 
    number_liver_metastases + size_largest_liver_metastasis + 
    primary_site + liver_recurrence_yn + 
    clavien_dindo + asa, , data = ., , weights = weights)) %>% 
    as.mira(.) %>% pool(., dfcom = NULL) %>% 
    summary(., conf.int = TRUE, exp = TRUE)

Disease Free Survival:

coxregtablematcheddiseasefree <- weightedstacked %>% 
    split(.$.imp) %>% map(~coxph(Surv(time_to_recurrence, 
    five_year_disease_free_survival) ~ bilobar + 
    initial_tumour_stage_n + synchronous_liver_metastasis + 
    adj_prim + adj_liv + neo_adj_liv + neo_adj_prim + 
    number_liver_metastases + primary_site + 
    liver_resection_margins, data = ., , 
    weights = weights)) %>% as.mira(.) %>% 
    pool(., dfcom = NULL) %>% summary(., 
    conf.int = TRUE, exp = TRUE)

Forest Plot of Multivariable Cox Regression

coxregtablematched %>% 
  mutate_if(is.numeric, round, digits=2)->coxregtablematchedlabel


coxregtablematchedlabel %>% glue::glue_data("{estimate} (95% CI {X2.5..}-{X97.5..}, p={p.value})")->coxregtablematchedlabel$summary


tabletext1<-cbind(coxregtablematchedlabel$term, coxregtablematchedlabel$summary)


forestplot(labeltext=tabletext1,
           
           line.margin = 1,
           mean=coxregtablematchedlabel$estimate, 
           lower=coxregtablematchedlabel$X2.5.., upper=coxregtablematchedlabel$X97.5..,
           xlab="Hazard Ratio for All Cause Mortality after Liver Resection",
           zero=1,  boxsize=0.7, xlog=TRUE,
           
           lwd.ci=2, ci.vertices=TRUE, ci.vertices.height = 0.1,
           lwd.zero=3,
           title="Hazard Ratios for All Cause Mortality after Liver Resection",
           txt_gp=fpTxtGp(label=gpar(cex=1.5, fontfamily="Helvetica"),
                          ticks=gpar(cex=1.1),
                          xlab=gpar(cex = 1.2),
                          title=gpar(cex = 1.5)
           ),
           
           
           graphwidth=unit(10,"cm"),
           is.summary=c(rep(FALSE,15)),
           col=fpColors(box="darkblue",axes="black",lines = "gray52"))

Disease Free Forest Plot:

coxregtablematcheddiseasefreelabel <- coxregtablematcheddiseasefree %>% 
    mutate_if(is.numeric, round, digits = 2)

coxregtablematcheddiseasefreelabel$summary <- coxregtablematcheddiseasefreelabel %>% 
    glue::glue_data("{estimate} (95% CI {X2.5..}-{X97.5..}, p={p.value})")

tabletext2 <- cbind(coxregtablematcheddiseasefreelabel$term, 
    coxregtablematcheddiseasefreelabel$summary)
forestplot(labeltext = tabletext2, mean = coxregtablematcheddiseasefreelabel$estimate, 
    lower = coxregtablematcheddiseasefreelabel$X2.5.., 
    upper = coxregtablematcheddiseasefreelabel$X97.5.., 
    xlab = "Hazard Ratio for Recurrence after Liver Resection", 
    zero = 1, boxsize = 0.5, xlog = TRUE, 
    lwd.ci = 2, ci.vertices = TRUE, ci.vertices.height = 0.1, 
    lwd.zero = 3, title = "Hazard Ratios for Recurrence after Liver Resection", 
    txt_gp = fpTxtGp(label = gpar(cex = 1.25, 
        fontfamily = "Helvetica"), ticks = gpar(cex = 1.1), 
        xlab = gpar(cex = 1.2), title = gpar(cex = 1.5)), 
    graphwidth = unit(10, "cm"), is.summary = c(rep(FALSE, 
        15)), col = fpColors(box = "darkblue", 
        axes = "black", lines = "gray52"))

Plot of Schoenfeld Residuals to test Proportional Hazards Assumption:

form <- paste("Surv(data$time_to_death, data$five_year_all_cause_survival)", 
    paste(explanatory, collapse = " + "), 
    sep = " ~ ") %>% as.formula()

res.cox <- coxph(form, data = data)
test.ph <- cox.zph(res.cox)
ggcoxzph(test.ph)

Create Flow Chart of Inclusion/ Exclusion:


Code adpated from: Easiest flowcharts eveR? | R-bloggers (r-bloggers.com) by Max Gordon
library(Gmisc, quietly = TRUE); library(glue); library(htmlTable); library(grid); library(magrittr)

org_cohort <- boxGrob(glue("Patients in Institutional Registry", "n = {pop}", pop = txtInt(573), .sep = "\n"))

included <- boxGrob(glue("Included Patients", "n = {incl}", incl = txtInt(414), .sep = "\n")) grp_a <- boxGrob(glue("Right Sided Tumours", "n = {recr} (24.6%)", recr = txtInt(102), .sep = "\n"))

grp_b <- boxGrob(glue("Left Sided Tumours", "n = {recr} (39.4%)", recr = txtInt(163), .sep = "\n")) grp_c <- boxGrob(glue("Rectosigmoid Tumours", "n = {recr} (36.0%)", recr = txtInt(149), .sep = "\n")) excluded <- boxGrob(glue("Excluded (n = {159}):", " - Repeat Liver Resection: {uninterested}", " - Procedure Abandoned: {contra}", " - Died before Discharge: {died}", uninterested = 134, contra = 573 - 134 - 9-414, died = 9, .sep = "\n"), just = "left")

grid.newpage() vert <- spreadVertical(org_cohort, included = included, grps = grp_a) grps <- alignVertical(reference = vert$grps,  grp_a, grp_b, grp_c) %>%  spreadHorizontal() vert$grps <- NULL

excluded <- moveBox(excluded, x = .8, y = .7)

for (i in 1:(length(vert) - 1)) { connectGrob(vert[[i]], vert[[i + 1]], type = "vert") %>% print } connectGrob(vert$included, grps[[1]], type = "N") connectGrob(vert$included, grps[[2]], type = "N") connectGrob(vert$included, grps[[3]], type = "N") connectGrob(included, excluded, type = "L")

vert grps excluded

Session Info


    sessionInfo()
    R version 4.0.2 (2020-06-22)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 19042)

    Matrix products: default

    locale:
    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
    [4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     

    loaded via a namespace (and not attached):
     [1] tidyr_1.1.2        splines_4.0.2      carData_3.0-4      assertthat_0.2.1   ggstance_0.3.4     cellranger_1.1.0  
     [7] yaml_2.2.1         pillar_1.4.6       backports_1.1.10   lattice_0.20-41    rlist_0.4.6.1      visdat_0.5.3      
    [13] glue_1.4.2         digest_0.6.25      ggsignif_0.6.0     checkmate_2.0.0    snakecase_0.11.0   colorspace_1.4-1  
    [19] htmltools_0.5.0    Matrix_1.2-18      survey_4.0         plyr_1.8.6         pkgconfig_2.0.3    broom_0.7.1       
    [25] haven_2.3.1        purrr_0.3.4        xtable_1.8-4       scales_1.1.1       naniar_0.6.0       km.ci_0.5-2       
    [31] openxlsx_4.2.2     rio_0.5.16         KMsurv_0.1-5       htmlTable_2.1.0    tibble_3.0.3       generics_0.0.2    
    [37] car_3.0-10         ggplot2_3.3.2      ellipsis_0.3.1     ggpubr_0.4.0       cobalt_4.2.3       janitor_2.0.1     
    [43] lazyeval_0.2.2     cli_2.0.2          survival_3.2-7     magrittr_1.5       crayon_1.3.4       readxl_1.3.1      
    [49] MatchIt_3.0.2      evaluate_0.14      fansi_0.4.1        MASS_7.3-53        rstatix_0.6.0      forcats_0.5.0     
    [55] foreign_0.8-80     tableone_0.12.0    tools_4.0.2        data.table_1.13.0  hms_0.5.3          mitools_2.4       
    [61] lifecycle_0.2.0    matrixStats_0.57.0 stringr_1.4.0      munsell_0.5.0      zip_2.1.1          compiler_4.0.2    
    [67] survminer_0.4.8    rlang_0.4.7        grid_4.0.2         rstudioapi_0.11    htmlwidgets_1.5.2  rmarkdown_2.4     
    [73] gtable_0.3.0       DBI_1.1.0          abind_1.4-5        curl_4.3           ggforestplot_0.1.0 R6_2.4.1          
    [79] gridExtra_2.3      zoo_1.8-8          lubridate_1.7.9    knitr_1.30         dplyr_1.0.2        survMisc_0.5.5    
    [85] stringi_1.5.3      Rcpp_1.0.5         vctrs_0.3.4        expss_0.10.6       tidyselect_1.1.0   xfun_0.18