This function is a wrapper around tidytof's `tof_cluster_*` function family. It clusters high-dimensional cytometry data using one of five user-specified methods and that method's corresponding input parameters.

Usage

tof_cluster(
  tof_tibble,
  cluster_cols = where(tof_is_numeric),
  group_cols = NULL,
  ...,
  augment = TRUE,
  method
)

Arguments

tof_tibble

A `tof_tbl` or `tibble`.

cluster_cols

Unquoted column names indicating which columns in `tof_tibble` to use in computing the clusters. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.

group_cols

Optional. Unquoted column names indicating which columns should be used to group cells before clustering. Clustering is then performed on each group independently. Supports tidyselect helpers.

...

Additional arguments to pass to the `tof_cluster_*` function family member corresponding to the chosen method.

augment

A boolean value indicating if the output should column-bind the cluster ids of each cell as a new column in `tof_tibble` (TRUE, the default) or if a single-column tibble including only the cluster ids should be returned (FALSE).

method

A string indicating which clustering method should be used. Valid values are "flowsom", "phenograph", "kmeans", "ddpr", and "xshift".

Value

A `tof_tbl` or `tibble`. If `augment = FALSE`, it will have a single column encoding the cluster ids for each cell in `tof_tibble`. If `augment = TRUE`, it will have `ncol(tof_tibble) + 1` columns: each of the (unaltered) columns in `tof_tibble` plus an additional column encoding the cluster ids.

Examples

sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 500),
        cd38 = rnorm(n = 500),
        cd34 = rnorm(n = 500),
        cd19 = rnorm(n = 500)
    )

tof_cluster(tof_tibble = sim_data, method = "kmeans")
#> # A tibble: 500 × 5
#>      cd45   cd38   cd34   cd19 .kmeans_cluster
#>     <dbl>  <dbl>  <dbl>  <dbl> <chr>          
#>  1  1.33  -0.447  1.50   0.436 11             
#>  2 -1.20  -0.481 -0.391 -1.54  9              
#>  3 -0.541  0.666 -1.68  -0.986 16             
#>  4 -1.22   1.32   0.689 -0.791 10             
#>  5  0.639  0.519 -1.32  -0.204 18             
#>  6 -0.239  0.397 -0.780  0.372 1              
#>  7  0.651  0.997 -0.665  0.805 18             
#>  8  0.788  1.26   0.584 -0.953 19             
#>  9 -0.344  0.388 -0.407 -0.442 13             
#> 10  0.120  0.885 -2.26   0.583 17             
#> # ℹ 490 more rows
tof_cluster(tof_tibble = sim_data, method = "phenograph")
#> # A tibble: 500 × 5
#>      cd45   cd38   cd34   cd19 .phenograph_cluster
#>     <dbl>  <dbl>  <dbl>  <dbl> <chr>              
#>  1  1.33  -0.447  1.50   0.436 2                  
#>  2 -1.20  -0.481 -0.391 -1.54  1                  
#>  3 -0.541  0.666 -1.68  -0.986 1                  
#>  4 -1.22   1.32   0.689 -0.791 3                  
#>  5  0.639  0.519 -1.32  -0.204 5                  
#>  6 -0.239  0.397 -0.780  0.372 5                  
#>  7  0.651  0.997 -0.665  0.805 4                  
#>  8  0.788  1.26   0.584 -0.953 8                  
#>  9 -0.344  0.388 -0.407 -0.442 1                  
#> 10  0.120  0.885 -2.26   0.583 5                  
#> # ℹ 490 more rows
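
# The two calls above rely on the defaults. What follows is a minimal sketch
# of the remaining documented arguments (cluster_cols, group_cols, ..., and
# augment), not output taken from this page.
# Assumptions: the `sample` column and the object names `sim_grouped`,
# `ids_only`, and `by_sample` are hypothetical, and `num_clusters` is assumed
# to be the tof_cluster_kmeans() argument that sets the number of clusters;
# check that function's help page before relying on it.
library(tidytof)

set.seed(2020)
sim_grouped <-
    dplyr::tibble(
        sample = rep(c("a", "b"), each = 250),
        cd45 = rnorm(n = 500),
        cd38 = rnorm(n = 500),
        cd34 = rnorm(n = 500),
        cd19 = rnorm(n = 500)
    )

# Cluster on two markers only, forward `num_clusters` through `...`, and
# return just the cluster ids as a single-column tibble (augment = FALSE).
ids_only <-
    tof_cluster(
        tof_tibble = sim_grouped,
        cluster_cols = c(cd45, cd19),
        num_clusters = 5,
        augment = FALSE,
        method = "kmeans"
    )

# Cluster the cells of each sample independently (group_cols = sample).
# With the default augment = TRUE, the input columns are kept and one
# cluster column is added.
by_sample <-
    tof_cluster(
        tof_tibble = sim_grouped,
        cluster_cols = c(cd45, cd38, cd34, cd19),
        group_cols = sample,
        method = "kmeans"
    )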