Metacluster clustered CyTOF data using FlowSOM's built-in metaclustering algorithm
Source: R/metaclustering.R
tof_metacluster_flowsom.Rd
This function performs metaclustering on a `tof_tbl` containing CyTOF data
using a user-specified selection of input variables/CyTOF measurements and
the number of desired metaclusters. It takes advantage of the FlowSOM package's
built-in functionality for automatically detecting the number of metaclusters
and can use several strategies as adapted by the FlowSOM team: consensus
metaclustering, hierarchical metaclustering, k-means metaclustering, or
metaclustering using the FlowSOM algorithm itself.
See `FlowSOM::MetaClustering` for additional details.
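The automatic detection of the number of metaclusters is described in the arguments as an "Elbow method". As a rough illustration of the general idea only (this is not FlowSOM's internal code), the sketch below runs k-means on simulated cluster-level summaries and picks the k whose within-cluster sum of squares lies farthest below the chord joining the ends of the curve. The names `cluster_summaries`, `max_k`, and `elbow_k`, and the chord-distance rule itself, are assumptions made for this example.

```r
# Generic elbow heuristic on simulated data.
# Illustration only; NOT FlowSOM's implementation.
set.seed(1)

# Hypothetical input: 30 cluster-level summaries (rows) over 2 markers,
# simulated around 3 well-separated centers.
cluster_summaries <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 10, sd = 0.3), ncol = 2)
)

max_k <- 6 # upper bound on the number of metaclusters to consider
k_values <- seq_len(max_k)

# Total within-cluster sum of squares for each candidate k.
wss <- vapply(
  k_values,
  function(k) kmeans(cluster_summaries, centers = k, nstart = 20)$tot.withinss,
  numeric(1)
)

# Elbow: the k whose WSS lies farthest below the straight line (chord)
# joining the first and last points of the WSS curve.
chord <- wss[1] + (wss[max_k] - wss[1]) * (k_values - 1) / (max_k - 1)
elbow_k <- k_values[which.max(chord - wss)]

elbow_k
```

Because the chosen k can be smaller than `max_k`, a procedure of this kind can return fewer groups than the upper bound it was given.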
Arguments
- tof_tibble
A `tof_tbl` or `tibble`.
- cluster_col
An unquoted column name indicating which column in `tof_tibble` stores the id of the cluster to which each cell belongs. Cluster labels can be produced by any method the user chooses, including manual gating, any of the functions in the `tof_cluster_*` function family, or any other method.
- metacluster_cols
Unquoted column names indicating which columns in `tof_tibble` to use in computing the metaclusters. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.
- central_tendency_function
The function that should be used to calculate the measurement of central tendency for each cluster before metaclustering. This function will be used to compute a summary statistic for each input cluster in `cluster_col` across all columns specified by `metacluster_cols`, and the resulting vector (one for each cluster) will be used as the input for metaclustering. Defaults to `median`.
- num_metaclusters
An integer indicating the maximum number of metaclusters that should be returned. Defaults to 10. Note that for this function, the output may contain fewer metaclusters than requested. This is because `MetaClustering` uses the "Elbow method" to automatically detect the optimal number of metaclusters.
- clustering_algorithm
A string indicating which clustering algorithm `MetaClustering` should use to perform the metaclustering. Options are "consensus" (the default), "hierarchical", "kmeans", and "som" (i.e. self-organizing map; the FlowSOM algorithm itself).
- ...
Optional additional arguments to pass to `MetaClustering`.
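The role of `central_tendency_function` can be pictured as a grouped summary: one row per input cluster and one summary value per selected column. The sketch below shows that idea with plain dplyr on simulated data. It is an approximation for illustration, not the package's internal code, and the object names (`sim_data`, `cluster_summaries`) are made up for this example.

```r
library(dplyr)

set.seed(2024)

# Simulated stand-in for clustered CyTOF data (column names are illustrative).
sim_data <- tibble(
  cd45 = rnorm(1000),
  cd38 = rnorm(1000),
  cluster_id = sample(letters, size = 1000, replace = TRUE)
)

# Collapse each input cluster to one row by applying the central tendency
# function (here: median) to every selected marker column.
cluster_summaries <- sim_data %>%
  group_by(cluster_id) %>%
  summarize(across(c(cd45, cd38), median), .groups = "drop")

cluster_summaries
```

Each row of `cluster_summaries` is the per-cluster vector that a metaclustering step would then group; swapping `median` for `mean` changes only the summary statistic.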
Value
A tibble with a single column (`.flowsom_metacluster`) and the same number of rows as the input `tof_tibble`. Each entry in the column indicates the metacluster label assigned to the same row in `tof_tibble`.
See also
Other metaclustering functions: `tof_metacluster()`, `tof_metacluster_consensus()`, `tof_metacluster_hierarchical()`, `tof_metacluster_kmeans()`, `tof_metacluster_phenograph()`
Examples
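library(tidytof)

# Simulate 1,000 cells with four markers and a random cluster label
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 1000),
    cd38 = rnorm(n = 1000),
    cd34 = rnorm(n = 1000),
    cd19 = rnorm(n = 1000),
    cluster_id = sample(letters, size = 1000, replace = TRUE)
  )

# Metacluster using consensus clustering (the default algorithm)
tof_metacluster_flowsom(
  tof_tibble = sim_data,
  cluster_col = cluster_id,
  clustering_algorithm = "consensus"
)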
#> # A tibble: 1,000 × 1
#> .flowsom_metacluster
#> <chr>
#> 1 3
#> 2 6
#> 3 2
#> 4 3
#> 5 3
#> 6 4
#> 7 1
#> 8 1
#> 9 3
#> 10 3
#> # ℹ 990 more rows
# Metacluster using a self-organizing map (the FlowSOM algorithm itself)
tof_metacluster_flowsom(
  tof_tibble = sim_data,
  cluster_col = cluster_id,
  clustering_algorithm = "som"
)
#> # A tibble: 1,000 × 1
#> .flowsom_metacluster
#> <chr>
#> 1 1
#> 2 1
#> 3 3
#> 4 3
#> 5 1
#> 6 2
#> 7 2
#> 8 2
#> 9 1
#> 10 1
#> # ℹ 990 more rows
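Because the returned tibble has one row per input cell, in the same order as `tof_tibble`, it can be column-bound back onto the input. The sketch below also sets the non-default arguments documented above. It assumes that tidytof, dplyr, and the FlowSOM package are installed. The specific argument values are arbitrary choices for illustration, and the object names `metaclusters` and `annotated` are made up for this example.

```r
library(dplyr)
library(tidytof)

set.seed(123)

# Simulated clustered data, as in the examples above
sim_data <- tibble(
  cd45 = rnorm(n = 1000),
  cd38 = rnorm(n = 1000),
  cd34 = rnorm(n = 1000),
  cd19 = rnorm(n = 1000),
  cluster_id = sample(letters, size = 1000, replace = TRUE)
)

# Metacluster on the marker columns only, summarizing each cluster by its
# mean and allowing at most 5 metaclusters
metaclusters <- tof_metacluster_flowsom(
  tof_tibble = sim_data,
  cluster_col = cluster_id,
  metacluster_cols = starts_with("cd"),
  central_tendency_function = mean,
  num_metaclusters = 5,
  clustering_algorithm = "hierarchical"
)

# Attach the metacluster labels to the original cells (rows are aligned)
annotated <- bind_cols(sim_data, metaclusters)

# Number of cells assigned to each metacluster
count(annotated, .flowsom_metacluster)
```

Since `num_metaclusters` is an upper bound, the count table may show fewer than 5 metaclusters.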