
This function uses a trained `tof_model` to make predictions on new data.

Usage

tof_predict(
  tof_model,
  new_data,
  prediction_type = c("response", "class", "link", "survival curve")
)

Arguments

tof_model

A `tof_model` trained using `tof_train_model()`.

new_data

A tibble of new observations for which predictions should be made. If `new_data` isn't provided, predictions will be made for the training data used to fit the model.

prediction_type

A string indicating which type of prediction should be provided by the model:

"response" (the default)

For "linear" models, the predicted response for each observation. For "two-class" and "multiclass" models, the fitted probabilities of each class for each observation. For "survival" models, the fitted relative risk for each observation.

"class"

Only applies to "two-class" and "multiclass" models. For both, returns the label of the class with the maximum fitted probability for each observation.

"link"

The linear predictions of the model (the output of the link function for each model family).

"survival curve"

Only applies to "survival" models. Returns a tibble indicating each patient's probability of survival (1 - probability(event)) at each timepoint in the dataset. The curves are obtained using the `survfit` function.
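The prediction types above can be compared side by side on a classifier. The following is a minimal sketch, not part of the original examples: the `class` column, the object names, and the random seed are assumptions made for illustration. The response columns are kept in the new data only to mirror the package's own examples.

```r
library(tidytof)

set.seed(2024)

# simulated training data; `class` is a hypothetical two-level outcome
feature_tibble <-
    dplyr::tibble(
        sample = as.character(1:100),
        cd45 = runif(n = 100),
        pstat5 = runif(n = 100),
        cd34 = runif(n = 100),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(100),
        class = as.factor(
            dplyr::if_else(outcome > median(outcome), "class1", "class2")
        )
    )

# simulated new observations with the same columns
new_tibble <-
    dplyr::tibble(
        sample = as.character(1:20),
        cd45 = runif(n = 20),
        pstat5 = runif(n = 20),
        cd34 = runif(n = 20),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(20),
        class = as.factor(
            dplyr::if_else(outcome > median(outcome), "class1", "class2")
        )
    )

split_data <- tof_split_data(feature_tibble, split_method = "simple")

# train a two-class model
classifier <-
    tof_train_model(
        split_data = split_data,
        predictor_cols = c(cd45, pstat5, cd34),
        response_col = class,
        model_type = "two-class"
    )

# fitted probabilities (the default prediction type)
class_probs <-
    tof_predict(classifier, new_data = new_tibble, prediction_type = "response")

# predicted class labels
class_labels <-
    tof_predict(classifier, new_data = new_tibble, prediction_type = "class")

# predictions on the scale of the linear predictor
link_preds <-
    tof_predict(classifier, new_data = new_tibble, prediction_type = "link")
```

Each call should return one row per row of `new_tibble`, so the three outputs can be inspected together.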

Value

A tibble with a single column (`.pred`) containing the predictions or, for multiclass models with `prediction_type = "response"`, a tibble with one column for each class. Each row in the output corresponds to a row in `new_data` (or, if `new_data` is not provided, to a row in the `tof_model`'s training data). In the latter case, be sure to check `tof_model$training_data` to confirm the order of observations, as the resampling procedure can change their ordering relative to the original input data.
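Because the output rows line up with the rows of `new_data`, the predictions can be attached to the input by position. The sketch below assumes a regression model like the one in the Examples; the names `predictions`, `results`, and `training_preds` and the random seed are illustrative only.

```r
library(tidytof)

set.seed(2024)

# simulated training data and new observations
feature_tibble <-
    dplyr::tibble(
        sample = as.character(1:100),
        cd45 = runif(n = 100),
        pstat5 = runif(n = 100),
        cd34 = runif(n = 100),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(100)
    )

new_tibble <-
    dplyr::tibble(
        sample = as.character(1:20),
        cd45 = runif(n = 20),
        pstat5 = runif(n = 20),
        cd34 = runif(n = 20),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(20)
    )

split_data <- tof_split_data(feature_tibble, split_method = "simple")

regression_model <-
    tof_train_model(
        split_data = split_data,
        predictor_cols = c(cd45, pstat5, cd34),
        response_col = outcome,
        model_type = "linear"
    )

# one prediction per row of new_tibble, in the same order
predictions <- tof_predict(tof_model = regression_model, new_data = new_tibble)

# attach the predictions to the inputs by row position
results <- dplyr::bind_cols(new_tibble, predictions)

# with new_data omitted, predictions are made for the model's training data;
# compare against regression_model$training_data to confirm the row order
training_preds <- tof_predict(tof_model = regression_model)
```

Binding by position is only safe when the row order is known, which is why the training-data case calls for checking `tof_model$training_data` first.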

See also

Other modeling functions: tof_assess_model(), tof_create_grid(), tof_split_data(), tof_train_model()

Examples

feature_tibble <-
    dplyr::tibble(
        sample = as.character(1:100),
        cd45 = runif(n = 100),
        pstat5 = runif(n = 100),
        cd34 = runif(n = 100),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(100)
    )

new_tibble <-
    dplyr::tibble(
        sample = as.character(1:20),
        cd45 = runif(n = 20),
        pstat5 = runif(n = 20),
        cd34 = runif(n = 20),
        outcome = (3 * cd45) + (4 * pstat5) + rnorm(20)
    )

split_data <- tof_split_data(feature_tibble, split_method = "simple")

# train a regression model
regression_model <-
    tof_train_model(
        split_data = split_data,
        predictor_cols = c(cd45, pstat5, cd34),
        response_col = outcome,
        model_type = "linear"
    )

# apply the model to new data
tof_predict(tof_model = regression_model, new_data = new_tibble)
#> # A tibble: 20 × 1
#>    .pred
#>    <dbl>
#>  1  4.28
#>  2  4.22
#>  3  1.48
#>  4  3.39
#>  5  5.08
#>  6  5.18
#>  7  3.63
#>  8  3.99
#>  9  3.87
#> 10  4.39
#> 11  4.42
#> 12  2.71
#> 13  2.90
#> 14  2.95
#> 15  2.56
#> 16  1.42
#> 17  3.34
#> 18  4.18
#> 19  4.30
#> 20  3.06