Visualize clusters in CyTOF data using a minimum spanning tree (MST).
Source: R/visualization.R
tof_plot_clusters_mst.Rd
This function plots a minimum spanning tree (MST) built from clustered single-cell data in order to summarize cluster-level characteristics. Each node in the MST represents a single cluster and is colored by a user-specified variable (either continuous or discrete).
Usage
tof_plot_clusters_mst(
  tof_tibble,
  cluster_col,
  knn_cols = where(tof_is_numeric),
  color_col,
  num_neighbors = 5L,
  graph_type = c("unweighted", "weighted"),
  graph_layout = "nicely",
  central_tendency_function = stats::median,
  distance_function = c("euclidean", "cosine"),
  edge_alpha = 0.4,
  node_size = "cluster_size",
  theme = ggplot2::theme_void(),
  ...
)
Arguments
- tof_tibble
A `tof_tbl` or a `tibble`.
- cluster_col
An unquoted column name indicating which column in `tof_tibble` stores the id of the cluster to which each cell belongs. Cluster labels can be produced by any method, including manual gating or any of the functions in the `tof_cluster_*` function family.
- knn_cols
Unquoted column names indicating which columns in `tof_tibble` should be used to compute the cluster-to-cluster distances used to construct the k-nearest-neighbor graph. Supports tidyselect helpers. Defaults to all numeric columns.
- color_col
Unquoted column name indicating which column in `tof_tibble` should be used to color the nodes in the MST.
- num_neighbors
An integer specifying how many neighbors should be used to construct the k-nearest neighbor graph.
- graph_type
A string specifying whether the k-nearest neighbor graph should be "unweighted" (the default) or "weighted".
- graph_layout
This argument specifies a layout for the MST in one of two ways. Option 1: Provide a string specifying which algorithm should be used to compute the force-directed layout. Passed to `ggraph`. Defaults to "nicely", which tries to automatically select a visually appealing layout. Other examples include "fr", "gem", and "kk". See `layout_tbl_graph_igraph` for other examples. Option 2: Provide a ggraph object previously generated with this function. The layout used to plot this ggraph object will then be used as a template for the new plot. With this option, the number of clusters (and their labels) must be identical to those in the template. This option is useful if you want to make multiple plots of the same tof_tibble colored by different protein markers, for example.
- central_tendency_function
A function to use for computing the measure of central tendency that will be aggregated from each cluster in cluster_col. Defaults to the median.
- distance_function
A string indicating which distance function to use when computing the cluster-to-cluster distances used to construct the MST. Valid options include "euclidean" (the default) and "cosine".
- edge_alpha
A numeric value between 0 and 1 specifying the transparency of the edges drawn in the force-directed layout. Defaults to 0.4.
- node_size
Either a numeric value specifying the size of the nodes in the MST or the string "cluster_size", in which case the size of the node representing each cluster will be scaled according to the number of cells in that cluster (the default).
- theme
A ggplot2 theme to apply to the force-directed layout. Defaults to `theme_void`.
- ...
Optional additional arguments to `hnsw_knn`.
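The arguments above imply a pipeline: summarize each cluster, compute cluster-to-cluster distances, then extract a spanning tree. The base R sketch below illustrates that idea under simplifying assumptions. It is not the function's actual implementation: it skips the k-nearest-neighbor graph step and instead runs Prim's algorithm on the full distance matrix, and it assumes cosine distance means one minus cosine similarity. Names such as `cells`, `centroid_mat`, and `prim_mst` are hypothetical.

```r
set.seed(2023)

# Simulated data shaped like the sim_data used in the Examples section
# (hypothetical; only 6 clusters to keep the output small)
cells <- data.frame(
  cd45 = rnorm(1000),
  cd38 = rnorm(1000),
  cd34 = rnorm(1000),
  cd19 = rnorm(1000),
  cluster_id = sample(letters[1:6], size = 1000, replace = TRUE)
)
markers <- c("cd45", "cd38", "cd34", "cd19")

# 1. One row per cluster: apply the central tendency function to each marker
centroids <- aggregate(
  cells[markers],
  by = list(cluster_id = cells$cluster_id),
  FUN = stats::median
)
centroid_mat <- as.matrix(centroids[markers])
rownames(centroid_mat) <- centroids$cluster_id

# Number of cells per cluster (what node_size = "cluster_size" would scale by)
cluster_sizes <- table(cells$cluster_id)

# 2. Cluster-to-cluster distances under the two documented options
euclidean_dist <- as.matrix(dist(centroid_mat, method = "euclidean"))
row_norms <- sqrt(rowSums(centroid_mat^2))
# Assumption: cosine distance = 1 - cosine similarity
cosine_dist <- 1 - tcrossprod(centroid_mat) / (row_norms %o% row_norms)

# 3. Minimum spanning tree via Prim's algorithm on a full distance matrix
prim_mst <- function(d) {
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  from <- character(n - 1)
  to <- character(n - 1)
  weight <- numeric(n - 1)
  for (i in seq_len(n - 1)) {
    # Distances from clusters already in the tree to those not yet in it
    candidate <- d[in_tree, !in_tree, drop = FALSE]
    best <- which(candidate == min(candidate), arr.ind = TRUE)[1, ]
    from[i] <- rownames(candidate)[best[["row"]]]
    to[i] <- colnames(candidate)[best[["col"]]]
    weight[i] <- min(candidate)
    in_tree[rownames(d) == to[i]] <- TRUE
  }
  data.frame(from = from, to = to, weight = weight)
}

mst_edges <- prim_mst(euclidean_dist)
mst_edges
```

The resulting `mst_edges` table has one row per tree edge, so a tree over k clusters has k - 1 rows. Passing `cosine_dist` instead of `euclidean_dist` shows how the choice of distance function can change which clusters end up connected.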
Examples
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 1000),
    cd38 = rnorm(n = 1000),
    cd34 = rnorm(n = 1000),
    cd19 = rnorm(n = 1000),
    cluster_id = sample(letters, size = 1000, replace = TRUE)
  )

# make a layout colored by a marker
layout_cd38 <-
  tof_plot_clusters_mst(
    tof_tibble = sim_data,
    cluster_col = cluster_id,
    color_col = cd38
  )

# use the same layout as the plot above to color the same
# tree using a different marker
layout_cd45 <-
  tof_plot_clusters_mst(
    tof_tibble = sim_data,
    cluster_col = cluster_id,
    color_col = cd45,
    graph_layout = layout_cd38
  )
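
As a further sketch, the call below varies several of the documented arguments at once: cosine distance, the "fr" layout, fixed-size nodes, and more opaque edges. It assumes the function is provided by the tidytof package and that the package is installed; the guard and the names `sim_data_2` and `mst_cosine` are only there for illustration.

```r
# Assumption: tof_plot_clusters_mst() comes from the tidytof package
if (requireNamespace("tidytof", quietly = TRUE)) {
  set.seed(2023)
  sim_data_2 <-
    dplyr::tibble(
      cd45 = rnorm(n = 1000),
      cd38 = rnorm(n = 1000),
      cd34 = rnorm(n = 1000),
      cd19 = rnorm(n = 1000),
      cluster_id = sample(letters, size = 1000, replace = TRUE)
    )

  # cosine distances between clusters, an explicit layout algorithm,
  # constant node size, and less transparent edges
  mst_cosine <-
    tidytof::tof_plot_clusters_mst(
      tof_tibble = sim_data_2,
      cluster_col = cluster_id,
      color_col = cd19,
      distance_function = "cosine",
      graph_layout = "fr",
      node_size = 3,
      edge_alpha = 0.8
    )
}
```

Because the layout here is computed from a string rather than a template, the node positions are not guaranteed to match those of `layout_cd38`; pass a previously generated plot as `graph_layout` when positions need to stay fixed across plots.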