Use a trained elastic net model to predict fitted values from new data
Source: R/patient-level_modeling.R (tof_predict.Rd)
This function uses a trained `tof_model` to make predictions on new data.
Usage
tof_predict(
    tof_model,
    new_data,
    prediction_type = c("response", "class", "link", "survival curve")
)
Arguments
- tof_model
A `tof_model` trained using `tof_train_model()`.
- new_data
A tibble of new observations for which predictions should be made. If `new_data` isn't provided, predictions will be made for the training data used to fit the model.
- prediction_type
A string indicating which type of prediction should be provided by the model:
- "response" (the default)
For "linear" models, the predicted response for each observation. For "two-class" and "multiclass" models, the fitted probabilities of each class for each observation. For "survival" models, the fitted relative risk for each observation.
- "class"
Only applies to "two-class" and "multiclass" models. For both, the class label corresponding to the class with the maximum fitted probability.
- "link"
The linear predictors of the model, i.e. the predictions on the scale of the link function for each model family.
- "survival curve"
Only applies to "survival" models. Returns a tibble indicating each patient's probability of survival (1 - probability(event)) at each timepoint in the dataset. Obtained using the `survfit` function.
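As a rough illustration of how these prediction types relate to one another, the sketch below uses base R only and does not call `tof_predict()`. The linear predictors and class labels are hypothetical values. The transformations shown (inverse logit for two-class models, `exp()` for survival models) are the standard ones for these generalized linear model families; it is an assumption that a given `tof_model` applies them in exactly this way.

```r
# Hypothetical linear predictors ("link" scale) for four observations
link <- c(-2, -0.5, 0.5, 2)

# "response" for a two-class model: the inverse logit of the linear
# predictor, i.e. the fitted probability of the second class
prob_second_class <- plogis(link)

# "class" for a two-class model: the label with the larger fitted probability
class_labels <- c("healthy", "sick") # hypothetical class labels
pred_class <- ifelse(prob_second_class > 0.5, class_labels[2], class_labels[1])

# "response" for a survival model: the relative risk, exp(linear predictor)
relative_risk <- exp(link)
```

For a "linear" model the link is the identity, so "link" and "response" predictions coincide.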
Value
A tibble with a single column (`.pred`) containing the predictions or, for multiclass models with `prediction_type` == "response", a tibble with one column for each class. Each row in the output corresponds to a row in `new_data` (or, if `new_data` is not provided, to a row in the `tof_model`'s training data). In the latter case, be sure to check `tof_model$training_data` to confirm the order of observations, as the resampling procedure can change their ordering relative to the original input data.
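Because output rows line up one-to-one with the rows of `new_data`, predictions can be bound back onto the input column-wise. The sketch below is a minimal illustration under stated assumptions: `new_tibble` and `predictions` are small hypothetical stand-ins (the latter mimicking the single-column `.pred` output described above), so the code runs without a trained model.

```r
# Stand-in for new_data: three hypothetical observations with known outcomes
new_tibble <-
    dplyr::tibble(
        sample = c("a", "b", "c"),
        outcome = c(2.0, 4.0, 6.0)
    )

# Stand-in for the output of tof_predict(): one `.pred` value per input row
predictions <- dplyr::tibble(.pred = c(2.5, 3.5, 6.0))

# Rows correspond one-to-one, so the predictions can be attached column-wise
results <- dplyr::bind_cols(new_tibble, predictions)

# Root-mean-squared error between the observed and predicted outcomes
rmse <- sqrt(mean((results$outcome - results$.pred)^2))
```

When `new_data` is omitted, the same binding should be done against `tof_model$training_data` rather than the original input, for the ordering reason noted above.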
See also
Other modeling functions: tof_assess_model(), tof_create_grid(), tof_split_data(), tof_train_model()
Examples
feature_tibble <-
    dplyr::tibble(
        sample = as.character(1:100),
        cd45 = runif(n = 100),
        pstat5 = runif(n = 100),
        cd34 = runif(n = 100),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(100)
    )

new_tibble <-
    dplyr::tibble(
        sample = as.character(1:20),
        cd45 = runif(n = 20),
        pstat5 = runif(n = 20),
        cd34 = runif(n = 20),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(20)
    )

split_data <- tof_split_data(feature_tibble, split_method = "simple")

# train a regression model
regression_model <-
    tof_train_model(
        split_data = split_data,
        predictor_cols = c(cd45, pstat5, cd34),
        response_col = outcome,
        model_type = "linear"
    )

# apply the model to new data
tof_predict(tof_model = regression_model, new_data = new_tibble)
#> # A tibble: 20 × 1
#> .pred
#> <dbl>
#> 1 2.83
#> 2 4.41
#> 3 2.19
#> 4 3.22
#> 5 5.11
#> 6 4.16
#> 7 5.26
#> 8 3.61
#> 9 1.97
#> 10 3.15
#> 11 1.60
#> 12 2.15
#> 13 5.23
#> 14 5.90
#> 15 6.37
#> 16 2.67
#> 17 3.10
#> 18 2.74
#> 19 3.98
#> 20 4.61