Perform k-means clustering on high-dimensional cytometry data.
Source: R/clustering.R
This function performs k-means clustering on high-dimensional cytometry data using a user-specified
selection of input variables (high-dimensional cytometry measurements). It is mostly a convenience
wrapper around `kmeans`.
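To make the "wrapper" idea concrete, here is a minimal base-R sketch of what such a function might do: pick the clustering columns, hand them to `stats::kmeans()`, and return a one-column table of cluster ids. The name `sketch_cluster_kmeans` is hypothetical, it takes column names as strings rather than tidyselect expressions, and it is not the package's actual source. The character output is an assumption based on the `<chr>` column shown in the Examples output.

```r
# Hypothetical stand-in for the wrapper; NOT the package's actual implementation.
sketch_cluster_kmeans <- function(data, cluster_cols = names(data),
                                  num_clusters = 20, ...) {
  # Keep only the clustering columns and coerce them to a numeric matrix.
  cluster_matrix <- as.matrix(data[, cluster_cols, drop = FALSE])
  # stats::kmeans() receives the number of clusters through `centers`;
  # any extra arguments in `...` are forwarded unchanged.
  fit <- stats::kmeans(x = cluster_matrix, centers = num_clusters, ...)
  # One row per input row; cluster ids stored as character labels (assumption).
  data.frame(.kmeans_cluster = as.character(fit$cluster),
             stringsAsFactors = FALSE)
}

set.seed(2024)  # k-means uses random starting centers
sim_data <- data.frame(
  cd45 = rnorm(1000), cd38 = rnorm(1000),
  cd34 = rnorm(1000), cd19 = rnorm(1000)
)
result <- sketch_cluster_kmeans(
  sim_data,
  cluster_cols = c("cd45", "cd19"),
  num_clusters = 5,
  iter.max = 50
)
head(result)
```

Because the starting centers are random, cluster ids are only reproducible when the seed is fixed before the call.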
Usage
tof_cluster_kmeans(
  tof_tibble,
  cluster_cols = where(tof_is_numeric),
  num_clusters = 20,
  ...
)
Arguments
- tof_tibble
A `tof_tibble`.
- cluster_cols
Unquoted column names indicating which columns in `tof_tibble` to use in computing the k-means clusters. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.
- num_clusters
An integer indicating the maximum number of clusters that should be returned. Defaults to 20.
- ...
Optional additional arguments that can be passed to `kmeans`.
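As an illustration of the kind of arguments that `...` could carry, the sketch below calls `stats::kmeans()` directly with two of its documented arguments, `nstart` (number of random starts) and `iter.max` (iteration cap). The matrix `cluster_matrix` and the choice of 5 clusters are made up for the example.

```r
set.seed(1)
# Made-up two-marker matrix standing in for the selected cluster_cols.
cluster_matrix <- matrix(
  rnorm(2000),
  ncol = 2,
  dimnames = list(NULL, c("cd45", "cd19"))
)

# A single random start (the kmeans default), with a higher iteration cap.
single_start <- stats::kmeans(cluster_matrix, centers = 5, iter.max = 50)

# 25 random starts; kmeans keeps the solution with the lowest
# total within-cluster sum of squares.
multi_start <- stats::kmeans(cluster_matrix, centers = 5,
                             nstart = 25, iter.max = 50)

# Compare the quality of the two solutions.
c(single = single_start$tot.withinss, multi = multi_start$tot.withinss)
```

Assuming `...` is forwarded unchanged, the equivalent wrapper call would look like `tof_cluster_kmeans(sim_data, num_clusters = 5, nstart = 25, iter.max = 50)`. Multiple starts usually help when a single run lands in a poor local optimum, at the cost of extra run time.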
Value
A tibble with one column named `.kmeans_cluster`. This column will contain a character vector of length `nrow(tof_tibble)` indicating the id of the k-means cluster to which each cell (i.e. each row) in `tof_tibble` was assigned.
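Since the result has one row per cell, in the same order as the input, it can be bound back onto the original data column-wise. The sketch below uses base R only and builds a stand-in `clusters` table from `stats::kmeans()` instead of calling the package, so the data and the choice of 4 clusters are assumptions. It also treats the ids as character labels, as in the `<chr>` column shown in the Examples output.

```r
set.seed(42)
sim_data <- data.frame(cd45 = rnorm(1000), cd19 = rnorm(1000))

# Stand-in for the function's return value: one character id per row.
fit <- stats::kmeans(as.matrix(sim_data), centers = 4, iter.max = 50)
clusters <- data.frame(.kmeans_cluster = as.character(fit$cluster),
                       stringsAsFactors = FALSE)

# Rows line up one-to-one, so a column bind attaches each cell's cluster id.
annotated <- cbind(sim_data, clusters)

# Number of cells assigned to each cluster.
cluster_sizes <- table(annotated$.kmeans_cluster)

# Character ids sort lexicographically ("10" before "2"); convert to
# integer when numeric ordering matters.
numeric_ids <- as.integer(annotated$.kmeans_cluster)
```

With tibbles, `dplyr::bind_cols()` plays the same role as `cbind()` here.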
See also
Other clustering functions: tof_cluster(), tof_cluster_ddpr(), tof_cluster_flowsom(), tof_cluster_phenograph()
Examples
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 1000),
    cd38 = rnorm(n = 1000),
    cd34 = rnorm(n = 1000),
    cd19 = rnorm(n = 1000)
  )
tof_cluster_kmeans(tof_tibble = sim_data)
#> # A tibble: 1,000 × 1
#>    .kmeans_cluster
#>    <chr>
#>  1 8
#>  2 14
#>  3 9
#>  4 5
#>  5 4
#>  6 11
#>  7 5
#>  8 15
#>  9 7
#> 10 5
#> # ℹ 990 more rows
tof_cluster_kmeans(tof_tibble = sim_data, cluster_cols = c(cd45, cd19))
#> # A tibble: 1,000 × 1
#>    .kmeans_cluster
#>    <chr>
#>  1 18
#>  2 5
#>  3 4
#>  4 5
#>  5 2
#>  6 12
#>  7 11
#>  8 6
#>  9 16
#> 10 13
#> # ℹ 990 more rows