Use a trained elastic net model to predict fitted values from new data
Source: R/patient-level_modeling.R
tof_predict.Rd
This function uses a trained `tof_model` to make predictions on new data.
Usage
tof_predict(
tof_model,
new_data,
prediction_type = c("response", "class", "link", "survival curve")
)
Arguments
- tof_model
A `tof_model` trained using `tof_train_model()`.
- new_data
A tibble of new observations for which predictions should be made. If `new_data` isn't provided, predictions will be made for the training data used to fit the model.
- prediction_type
A string indicating which type of prediction should be provided by the model:
- "response" (the default)
For "linear" models, the predicted response for each observation. For "two-class" and "multiclass" models, the fitted probabilities of each class for each observation. For "survival" models, the fitted relative-risk for each observation.
- "class"
Only applies to "two-class" and "multiclass" models. For both, the class label corresponding to the class with the maximum fitted probability.
- "link"
The linear predictions of the model (the output of the link function for each model family).
- "survival curve"
Only applies to "survival" models. Returns a tibble indicating each patient's probability of survival (1 - probability(event)) at each timepoint in the dataset. Obtained using the `survfit` function.
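For a two-class model, the relationship among the "link", "response", and "class" prediction types can be sketched in base R. This is only an illustration of the general logistic-model relationship, not the package's internal code: the link values, the class labels (`class1`, `class2`), and the choice that the fitted probability refers to `class2` are all assumptions made for the example. The last lines show the analogous "maximum fitted probability" rule for a hypothetical three-class probability matrix.

```r
# Hypothetical linear predictors (log-odds) for five observations.
# These numbers are made up for illustration.
link <- c(-2.0, -0.5, 0.2, 0.8, 3.0)

# "response": the inverse logit maps log-odds to fitted probabilities.
# Assumption: the probability refers to the placeholder label "class2".
response <- plogis(link)

# "class": the label whose fitted probability is larger.
predicted_class <- ifelse(response > 0.5, "class2", "class1")

two_class <- data.frame(link, response, predicted_class)
print(two_class)

# Multiclass analogue: one column of fitted probabilities per class
# (hypothetical values), with the predicted class being the column
# holding the row-wise maximum.
probs <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c"))
)
multiclass_label <- colnames(probs)[max.col(probs, ties.method = "first")]
print(multiclass_label)
```

Note that "link" values are unbounded, whereas "response" values for classification models lie between 0 and 1.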
Value
A tibble with a single column (`.pred`) containing the predictions or, for multiclass models with `prediction_type = "response"`, a tibble with one column for each class. Each row in the output corresponds to a row in `new_data` (or, if `new_data` is not provided, to a row in the `tof_model`'s training data). In the latter case, be sure to check `tof_model$training_data` to confirm the order of observations, as the resampling procedure can change their ordering relative to the original input data.
See also
Other modeling functions: tof_assess_model(), tof_create_grid(), tof_split_data(), tof_train_model()
Examples
feature_tibble <-
    dplyr::tibble(
        sample = as.character(1:100),
        cd45 = runif(n = 100),
        pstat5 = runif(n = 100),
        cd34 = runif(n = 100),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(100)
    )

new_tibble <-
    dplyr::tibble(
        sample = as.character(1:20),
        cd45 = runif(n = 20),
        pstat5 = runif(n = 20),
        cd34 = runif(n = 20),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(20)
    )

split_data <- tof_split_data(feature_tibble, split_method = "simple")

# train a regression model
regression_model <-
    tof_train_model(
        split_data = split_data,
        predictor_cols = c(cd45, pstat5, cd34),
        response_col = outcome,
        model_type = "linear"
    )

# apply the model to new data
tof_predict(tof_model = regression_model, new_data = new_tibble)
#> # A tibble: 20 × 1
#>    .pred
#>    <dbl>
#>  1  4.28
#>  2  4.22
#>  3  1.48
#>  4  3.39
#>  5  5.08
#>  6  5.18
#>  7  3.63
#>  8  3.99
#>  9  3.87
#> 10  4.39
#> 11  4.42
#> 12  2.71
#> 13  2.90
#> 14  2.95
#> 15  2.56
#> 16  1.42
#> 17  3.34
#> 18  4.18
#> 19  4.30
#> 20  3.06
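
Because each output row corresponds to a row in `new_data`, the predictions can be placed next to the simulated `outcome` column and compared with it. The following is a minimal sketch that repeats the setup above so it runs on its own. It assumes the tidytof package, which provides these functions, is installed. The seed, the `comparison` and `rmse` names, and the use of root-mean-squared error as the summary are illustrative choices, not something this page prescribes.

```r
library(tidytof)

set.seed(2020) # arbitrary seed so the simulated data are reproducible

# simulated training data, as in the example above
feature_tibble <-
    dplyr::tibble(
        sample = as.character(1:100),
        cd45 = runif(n = 100),
        pstat5 = runif(n = 100),
        cd34 = runif(n = 100),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(100)
    )

# simulated new observations with a known outcome
new_tibble <-
    dplyr::tibble(
        sample = as.character(1:20),
        cd45 = runif(n = 20),
        pstat5 = runif(n = 20),
        cd34 = runif(n = 20),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(20)
    )

split_data <- tof_split_data(feature_tibble, split_method = "simple")

regression_model <-
    tof_train_model(
        split_data = split_data,
        predictor_cols = c(cd45, pstat5, cd34),
        response_col = outcome,
        model_type = "linear"
    )

predictions <-
    tof_predict(tof_model = regression_model, new_data = new_tibble)

# rows of `predictions` line up with rows of `new_tibble`,
# so the two tibbles can be bound column-wise
comparison <- dplyr::bind_cols(new_tibble, predictions)

# root-mean-squared error between the simulated outcome and the predictions
rmse <- sqrt(mean((comparison$outcome - comparison$.pred)^2))
rmse
```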