Fit a glmnet model and calculate performance metrics using a single rsplit object

This function trains a glmnet model on the training set of an rsplit object, then calculates performance metrics of that model on the validation/holdout set at all combinations of the mixture and penalty hyperparameters provided in a hyperparameter grid.
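
The fit-then-evaluate loop described above can be sketched directly with glmnet for the "linear" case. This is only an illustration of the idea, not tidytof's internal code: `fit_and_score()`, the simulated data, and the grid values are all hypothetical, and the sketch assumes the glmnet package is installed.

```r
# Illustrative sketch only; NOT tidytof's implementation.
# `fit_and_score()` and every variable name below are hypothetical.
fit_and_score <- function(x_train, y_train, x_holdout, y_holdout, grid) {
  rows <- lapply(seq_len(nrow(grid)), function(i) {
    # In glmnet, `alpha` plays the role of mixture and `lambda` of penalty.
    fit <- glmnet::glmnet(
      x_train, y_train,
      family = "gaussian",
      alpha = grid$mixture[i],
      lambda = grid$penalty[i]
    )
    # Predict on the holdout set only, then score those predictions.
    pred <- as.numeric(predict(fit, newx = x_holdout))
    data.frame(
      mixture = grid$mixture[i],
      penalty = grid$penalty[i],
      mse = mean((y_holdout - pred)^2),
      mae = mean(abs(y_holdout - pred))
    )
  })
  do.call(rbind, rows)
}

if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(100 * 5), nrow = 100)
  y <- x[, 1] - 2 * x[, 2] + rnorm(100)
  train <- 1:80  # first 80 rows for training, the rest held out
  grid <- expand.grid(penalty = c(0.01, 0.1, 1), mixture = c(0, 0.5, 1))
  results <- fit_and_score(x[train, ], y[train], x[-train, ], y[-train], grid)
  print(results)  # one row per grid combination
}
```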

Usage

tof_fit_split(
  split_data,
  prepped_recipe,
  hyperparameter_grid,
  model_type,
  outcome_colnames
)

Arguments

split_data: An `rsplit` object from the rsample package. Alternatively, an unsplit tbl_df can be provided, though this is not recommended.
prepped_recipe: A trained recipe.
hyperparameter_grid: A tibble containing the hyperparameter values to tune. Can be created using tof_create_grid.
model_type: A string representing the type of glmnet model being fit.
outcome_colnames: Quoted column names indicating which columns in the data being fit represent the outcome variables (with all others assumed to be predictors).
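
To make the shape of `hyperparameter_grid` concrete, the sketch below builds a small grid by hand with base R. The column names `penalty` and `mixture` follow the hyperparameters named on this page; the specific values are arbitrary, and a plain data frame stands in for the tibble that tof_create_grid would return.

```r
# Hand-built stand-in for a hyperparameter grid; the values are arbitrary.
hyperparameter_grid <- expand.grid(
  penalty = 10^seq(-3, 0, length.out = 4),  # regularization strength (glmnet's lambda)
  mixture = c(0, 0.5, 1)                    # glmnet's alpha: 0 = ridge, 1 = lasso
)

# Every penalty is paired with every mixture: 4 x 3 = 12 rows.
nrow(hyperparameter_grid)
```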

Value

A tibble with the same number of rows as the input hyperparameter grid. Each row represents a combination of mixture and penalty, and each column contains a performance metric for the fitted glmnet model on `split_data`'s holdout set. The specific performance metrics depend on the type of model being fit:

"linear": mean-squared error (`mse`) and mean absolute error (`mae`)
"two-class": binomial deviance (`binomial_deviance`); misclassification error rate (`misclassification_error`); the area under the receiver operating characteristic curve (`roc_auc`); and `mse` and `mae` as above
"multiclass": multinomial deviance (`multinomial_deviance`); misclassification error rate (`misclassification_error`); the area under the receiver operating characteristic curve (`roc_auc`) computed using the Hand-Till method in roc_auc; and `mse` and `mae` as above
"survival": the negative log2-transformed partial likelihood (`neg_log_partial_likelihood`) and Harrell's concordance index (often simply called "C"; `concordance_index`)

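For orientation, several of the metrics listed above can be computed by hand on toy holdout values. The sketch below uses base R only; all numbers are made up, and the helper `concordance_index()` is a simplified, hypothetical version of Harrell's C (tied risks count as half-concordant) that may treat ties and censoring differently from glmnet or tidytof.

```r
# Toy holdout values; every number here is invented for illustration.
truth <- c(2.0, 3.5, 1.0, 4.0)
pred  <- c(2.5, 3.0, 1.0, 5.0)

mse <- mean((truth - pred)^2)   # mean squared error
mae <- mean(abs(truth - pred))  # mean absolute error

# Misclassification error: share of holdout observations whose
# predicted class differs from the true class.
true_class <- c("a", "b", "a", "b", "a")
pred_class <- c("a", "b", "b", "b", "a")
misclassification_error <- mean(true_class != pred_class)

# Simplified concordance index (hypothetical helper).
# A pair (i, j) is comparable when i has an observed event strictly
# before j's time; it is concordant when i also has the higher risk.
concordance_index <- function(time, event, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (event[i] == 1 && time[i] < time[j]) {
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5
        }
      }
    }
  }
  concordant / comparable
}

c_index <- concordance_index(
  time  = c(1, 2, 3, 4),
  event = c(1, 1, 0, 1),  # 0 marks a censored observation
  risk  = c(0.9, 0.4, 0.5, 0.1)
)
```

Lower values are better for `mse`, `mae`, and `misclassification_error`, while a concordance index of 0.5 corresponds to random ranking and 1 to perfect ranking.
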
References

Harrell Jr, F. E., Lee, K. L., and Mark, D. B. (1996). Tutorial in biostatistics: multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine, 15, pages 361–387.