Custom ORSF control
Arguments
- beta_fun
(function) a function to define coefficients used in linear combinations of predictor variables. beta_fun must accept three inputs named x_node, y_node and w_node, and should expect the following types and dimensions:
  - x_node (matrix; n rows, p columns)
  - y_node (matrix; n rows, 2 columns)
  - w_node (matrix; n rows, 1 column)
In addition, beta_fun must return a matrix with p rows and 1 column. If any of these conditions are not met, orsf_control_custom() will let you know.
- ...
Further arguments passed to or from other methods (not currently used).
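The shape requirements on beta_fun can be checked by hand before a forest is fit. The sketch below uses base R only. check_beta_fun and f_equal are hypothetical names introduced for illustration and are not part of aorsf, and treating the two columns of y_node as time and status is an assumption.

```r
# Hypothetical helper (not part of aorsf): call beta_fun on toy inputs that
# have the documented shapes, then check the shape of the returned value.
check_beta_fun <- function(beta_fun, n = 20, p = 3) {
  x_node <- matrix(rnorm(n * p), nrow = n, ncol = p)   # n rows, p columns
  # assumption: the two columns of y_node hold time and status
  y_node <- cbind(time = rexp(n), status = rbinom(n, 1, 0.5))
  w_node <- matrix(1, nrow = n, ncol = 1)              # n rows, 1 column
  beta <- beta_fun(x_node = x_node, y_node = y_node, w_node = w_node)
  # beta_fun must return a matrix with p rows and 1 column
  is.matrix(beta) && nrow(beta) == p && ncol(beta) == 1
}

# A trivial beta_fun that gives every predictor the same coefficient
f_equal <- function(x_node, y_node, w_node) {
  matrix(1, nrow = ncol(x_node), ncol = 1)
}

check_beta_fun(f_equal)  # TRUE

# A plain vector is not a matrix, so it does not satisfy the contract
check_beta_fun(function(x_node, y_node, w_node) rep(1, ncol(x_node)))  # FALSE
```

This only mirrors the conditions listed above. orsf_control_custom() performs its own checks, which may differ in detail.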
Value
an object of class 'orsf_control', which should be used as an input for the control argument of orsf().
Examples
Two customized functions to identify linear combinations of predictors are shown here.
The first uses random coefficients.
The second derives coefficients from principal component analysis.
Random coefficients
f_rando() is our function to get the random coefficients:
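The definition of f_rando is not reproduced in this section. A minimal sketch that satisfies the beta_fun contract, assuming the coefficients are independent uniform draws (the choice of distribution is an assumption), might look like this:

```r
# Assumed definition: one uniform random coefficient per predictor,
# returned as a matrix with p rows and 1 column, as beta_fun requires.
f_rando <- function(x_node, y_node, w_node) {
  matrix(runif(ncol(x_node)), ncol = 1)
}
```

y_node and w_node are accepted but unused here, since purely random coefficients do not depend on the outcome or the weights.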
We can plug f_rando into orsf_control_custom(), and then pass the result into orsf():
library(aorsf)
fit_rando <- orsf(pbc_orsf,
                  Surv(time, status) ~ . - id,
                  control = orsf_control_custom(beta_fun = f_rando),
                  n_tree = 500)
fit_rando
## ---------- Oblique random survival forest
##
## Linear combinations: Custom user function
## N observations: 276
## N events: 111
## N trees: 500
## N predictors total: 17
## N predictors per node: 5
## Average leaves per tree: 23
## Min observations in leaf: 5
## Min events in leaf: 1
## OOB stat value: 0.82
## OOB stat type: Harrell's C-statistic
## Variable importance: anova
##
## -----------------------------------------
Principal components
Follow the same steps as above, starting with the custom function:
f_pca <- function(x_node, y_node, w_node) {
  # estimate two principal components.
  pca <- stats::prcomp(x_node, rank. = 2)
  # use the second principal component to split the node
  pca$rotation[, 2L, drop = FALSE]
}
Then plug the function into orsf_control_custom() and pass the result into orsf():
fit_pca <- orsf(pbc_orsf,
                Surv(time, status) ~ . - id,
                control = orsf_control_custom(beta_fun = f_pca),
                n_tree = 500)
Evaluate
How well do our two customized ORSFs do? Let’s compute their indices of prediction accuracy based on out-of-bag predictions:
library(riskRegression)
## riskRegression version 2022.11.28
library(survival)
risk_preds <- list(rando = 1 - fit_rando$pred_oobag,
                   pca = 1 - fit_pca$pred_oobag)
sc <- Score(object = risk_preds,
            formula = Surv(time, status) ~ 1,
            data = pbc_orsf,
            summary = 'IPA',
            times = fit_pca$pred_horizon)
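The PCA ORSF does quite well, with an IPA within one percentage point of the random-coefficient ORSF (higher IPA is better). The difference in Brier scores between the two is not statistically significant (p = 0.78).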
sc$Brier
##
## Results by model:
##
##         model times  Brier  lower  upper    IPA
##        <fctr> <num> <char> <char> <char> <char>
## 1: Null model  1788 20.479 18.090 22.868  0.000
## 2:      rando  1788 12.381 10.175 14.588 39.541
## 3:        pca  1788 12.496 10.476 14.515 38.983
##
## Results of model comparisons:
##
##    times  model  reference delta.Brier   lower  upper            p
##    <num> <fctr>     <fctr>      <char>  <char> <char>        <num>
## 1:  1788  rando Null model      -8.098 -10.392 -5.804 4.558033e-12
## 2:  1788    pca Null model      -7.983  -9.888 -6.078 2.142713e-16
## 3:  1788    pca      rando       0.114  -0.703  0.932 7.838255e-01
##
## NOTE: Values are multiplied by 100 and given in %.
## NOTE: The lower Brier the better, the higher IPA the better.
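To make the relationship between the two columns concrete, the IPA values can be recomputed from the Brier scores, since IPA is one minus the ratio of a model's Brier score to the null model's Brier score. This short sketch copies the printed Brier scores by hand into a vector (the names brier and ipa are illustrative) rather than extracting them from sc:

```r
# Brier scores (in %) copied from the 'Results by model' table above
brier <- c(null = 20.479, rando = 12.381, pca = 12.496)

# IPA = 1 - Brier(model) / Brier(null model), expressed in %
ipa <- 100 * (1 - brier / brier[["null"]])
round(ipa, 1)
```

Because the inputs are the rounded scores from the printout, the recomputed values match the IPA column only to within rounding.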
See also
Linear combination control functions: orsf_control_cph(), orsf_control_fast(), orsf_control_net()