This function performs PhenoGraph clustering on high-dimensional cytometry data, using a user-specified set of columns (cytometry measurements) as input variables.

Usage

tof_cluster_phenograph(
  tof_tibble,
  cluster_cols = where(tof_is_numeric),
  num_neighbors = 30,
  distance_function = c("euclidean", "cosine"),
  ...
)

Arguments

tof_tibble

A `tof_tbl` or `tibble`.

cluster_cols

Unquoted column names indicating which columns in `tof_tibble` to use in computing the PhenoGraph clusters. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.

num_neighbors

An integer indicating the number of neighbors to use when constructing PhenoGraph's k-nearest-neighbor graph. Smaller values emphasize local graph structure; larger values emphasize global graph structure (and will add time to the computation). Defaults to 30.

distance_function

A string indicating which distance function to use for the nearest-neighbor calculation. Either "euclidean" (the default) or "cosine".

...

Optional additional parameters passed to `tof_find_knn`.
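
To make the two `distance_function` options more concrete, the following base R sketch compares Euclidean and cosine distance for two hypothetical cells. The helper functions `euclidean_distance()` and `cosine_distance()` are illustrative names defined here, not part of the package, and the exact computation performed inside `tof_find_knn` may differ.

```r
# Two hypothetical cells measured on the same four markers; the second cell
# has the same marker profile as the first at twice the intensity.
cell_a <- c(cd45 = 1, cd38 = 2, cd34 = 0.5, cd19 = 3)
cell_b <- 2 * cell_a

# Euclidean distance: straight-line distance, sensitive to overall magnitude.
euclidean_distance <- function(x, y) {
    sqrt(sum((x - y)^2))
}

# Cosine distance: 1 minus cosine similarity, sensitive only to direction.
cosine_distance <- function(x, y) {
    1 - sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

euclidean_distance(cell_a, cell_b) # greater than 0: the intensities differ
cosine_distance(cell_a, cell_b)    # approximately 0: the profiles have the same shape
```

Under this reading, "cosine" may be the more natural choice when relative marker patterns matter more than absolute signal intensity, while "euclidean" treats differences in intensity as differences between cells.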

Value

A tibble with one column named `.phenograph_cluster`. This column contains a character vector of length `nrow(tof_tibble)` indicating the id of the PhenoGraph cluster to which each cell (i.e. each row) in `tof_tibble` was assigned.

Details

For additional details about the PhenoGraph algorithm, see this paper.

Examples

sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000)
    )
tof_cluster_phenograph(tof_tibble = sim_data)
#> # A tibble: 1,000 × 1
#>    .phenograph_cluster
#>    <chr>              
#>  1 6                  
#>  2 2                  
#>  3 4                  
#>  4 7                  
#>  5 10                 
#>  6 8                  
#>  7 7                  
#>  8 9                  
#>  9 8                  
#> 10 4                  
#> # ℹ 990 more rows
tof_cluster_phenograph(tof_tibble = sim_data, cluster_cols = c(cd45, cd19))
#> # A tibble: 1,000 × 1
#>    .phenograph_cluster
#>    <chr>              
#>  1 9                  
#>  2 12                 
#>  3 1                  
#>  4 13                 
#>  5 5                  
#>  6 2                  
#>  7 10                 
#>  8 7                  
#>  9 11                 
#> 10 1                  
#> # ℹ 990 more rows
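
The remaining arguments can be combined in the same call. The sketch below is an illustration rather than output from the package documentation: it uses a smaller `num_neighbors`, the "cosine" distance, and then attaches the cluster ids to the input with `dplyr::bind_cols()`. It assumes the function is loaded via `library(tidytof)`; the seed value and the object names `clusters` and `annotated` are arbitrary choices. Cluster ids and cluster counts depend on the simulated data, so no output is shown.

```r
library(dplyr)
library(tidytof) # assumed to be the package that provides tof_cluster_phenograph()

# Fix the seed so the simulated measurements are the same on each run.
set.seed(2023)
sim_data <-
    tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000)
    )

# Cluster on two columns with a smaller neighborhood and cosine distance.
clusters <-
    tof_cluster_phenograph(
        tof_tibble = sim_data,
        cluster_cols = c(cd45, cd19),
        num_neighbors = 15,
        distance_function = "cosine"
    )

# The result has one row per cell, so it can be bound column-wise to the input.
annotated <- bind_cols(sim_data, clusters)

# Count the number of cells assigned to each cluster.
count(annotated, .phenograph_cluster)
```

Because the returned tibble has exactly `nrow(tof_tibble)` rows in the input's row order, column-binding is sufficient here and no join key is needed.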