
This function adds a column to a `tibble` or `tof_tbl` so that users can attach manual cell type labels to clusters identified using unsupervised algorithms.

Usage

tof_annotate_clusters(tof_tibble, cluster_col, annotations)

Arguments

tof_tibble

A `tof_tbl` or `tibble`.

cluster_col

An unquoted column name indicating which column in `tof_tibble` contains the id of the cluster to which each cell belongs. Cluster ids can be produced by any method the user chooses, including manual gating, any of the functions in the `tof_cluster_*` function family, or any other method.

annotations

A data structure indicating how to annotate each cluster id in `cluster_col`. `annotations` can be provided as a data.frame with two columns: the first should have the same name as `cluster_col` and contain each unique cluster id; the second can have any name and should contain a character vector giving the manual annotation to match with each cluster id in the first column. `annotations` can also be provided as a named character vector; in this case, each value in `annotations` should be a unique cluster id, and the name of each value should be the corresponding manual cluster annotation. See the Examples section.
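
As an illustration of the two accepted shapes, the following base R sketch builds the same mapping both ways. The object names (`annotation_vector`, `annotation_df`) and the column name `cluster_annotation` are made up for this example, and the cluster column is assumed to be called `cluster_id`. The conversion is only a convenience, not something the function requires.

```r
# Named character vector: values are cluster ids, names are annotations.
annotation_vector <- c("macrophage" = "a", "dendritic cell" = "b")

# Equivalent two-column data.frame. The first column carries the name of
# the cluster column (assumed here to be `cluster_id`); the second column
# name is arbitrary.
annotation_df <-
    data.frame(
        cluster_id = unname(annotation_vector),
        cluster_annotation = names(annotation_vector),
        stringsAsFactors = FALSE
    )

annotation_df
```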

Value

A `tof_tbl` with the same number of rows as `tof_tibble` and one additional column containing the manual cluster annotations for each cell (as a character vector). If `annotations` was provided as a data.frame, the new column will have the same name as the column containing the cluster annotations in `annotations`. If `annotations` was provided as a named character vector, the new column will be named `{cluster_col}_annotation`.
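
One way to picture the shape of the return value is as a left join of the annotation table onto the cells. The sketch below calls `dplyr::left_join()` directly on toy data (the objects `cells` and `annotation_df` are hypothetical). It is a mental model of the result, not a statement about how `tof_annotate_clusters()` is implemented internally.

```r
library(dplyr)

# Toy single-cell data: one row per cell, with a cluster id per cell.
cells <-
    tibble(
        cd45 = c(0.1, 0.4, 1.2, 2.3),
        cluster_id = c("a", "a", "b", "b")
    )

# Toy annotation table: one row per unique cluster id.
annotation_df <-
    tibble(
        cluster_id = c("a", "b"),
        cluster_annotation = c("macrophage", "dendritic cell")
    )

# Every cell row is kept, and one annotation column is appended.
annotated <- left_join(cells, annotation_df, by = "cluster_id")

nrow(annotated) == nrow(cells)
colnames(annotated)
```

In this sketch the new column takes its name from the second column of the annotation table, which mirrors the naming rule described above for data.frame input.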

Examples


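# assumes the tidytof package, which provides tof_annotate_clusters(), is installed
library(tidytof)

sim_data <-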
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = c(rnorm(n = 500), rnorm(n = 500, mean = 2)),
        cd34 = c(rnorm(n = 500), rnorm(n = 500, mean = 4)),
        cd19 = rnorm(n = 1000),
        cluster_id = c(rep("a", 500), rep("b", 500))
    )

# using named character vector
sim_data |>
    tof_annotate_clusters(
        cluster_col = cluster_id,
        annotations = c("macrophage" = "a", "dendritic cell" = "b")
    )
#> # A tibble: 1,000 × 6
#>        cd45   cd38   cd34   cd19 cluster_id cluster_id_annotation
#>       <dbl>  <dbl>  <dbl>  <dbl> <chr>      <chr>                
#>  1 -1.40    -0.337 -0.166  1.12  a          macrophage           
#>  2  0.255   -0.216  0.120  0.400 a          macrophage           
#>  3 -2.44     0.621 -0.662 -0.985 a          macrophage           
#>  4 -0.00557 -1.28  -0.531 -0.503 a          macrophage           
#>  5  0.622   -1.30  -0.301  0.987 a          macrophage           
#>  6  1.15    -0.377 -0.602  2.19  a          macrophage           
#>  7 -1.82     0.104 -0.318 -0.165 a          macrophage           
#>  8 -0.247   -0.704  0.308 -0.686 a          macrophage           
#>  9 -0.244    1.50   0.799  0.941 a          macrophage           
#> 10 -0.283   -0.303  1.75  -0.164 a          macrophage           
#> # ℹ 990 more rows

# using two-column data.frame
annotation_data_frame <-
    data.frame(
        cluster_id = c("a", "b"),
        cluster_annotation = c("macrophage", "dendritic cell")
    )

sim_data |>
    tof_annotate_clusters(
        cluster_col = cluster_id,
        annotations = annotation_data_frame
    )
#> # A tibble: 1,000 × 6
#>        cd45   cd38   cd34   cd19 cluster_id cluster_annotation
#>       <dbl>  <dbl>  <dbl>  <dbl> <chr>      <chr>             
#>  1 -1.40    -0.337 -0.166  1.12  a          macrophage        
#>  2  0.255   -0.216  0.120  0.400 a          macrophage        
#>  3 -2.44     0.621 -0.662 -0.985 a          macrophage        
#>  4 -0.00557 -1.28  -0.531 -0.503 a          macrophage        
#>  5  0.622   -1.30  -0.301  0.987 a          macrophage        
#>  6  1.15    -0.377 -0.602  2.19  a          macrophage        
#>  7 -1.82     0.104 -0.318 -0.165 a          macrophage        
#>  8 -0.247   -0.704  0.308 -0.686 a          macrophage        
#>  9 -0.244    1.50   0.799  0.941 a          macrophage        
#> 10 -0.283   -0.303  1.75  -0.164 a          macrophage        
#> # ℹ 990 more rows
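
Before annotating, it can help to confirm that every cluster id in the data has an entry in the annotation table. The following check is a sketch in base R using hypothetical toy objects (`cluster_ids`, `annotation_table`). How `tof_annotate_clusters()` itself treats unmatched ids is not covered here.

```r
# Toy cluster ids for four cells; cluster "c" has no annotation below.
cluster_ids <- c("a", "a", "b", "c")

annotation_table <-
    data.frame(
        cluster_id = c("a", "b"),
        cluster_annotation = c("macrophage", "dendritic cell"),
        stringsAsFactors = FALSE
    )

# Cluster ids present in the data but absent from the annotation table.
missing_ids <- setdiff(unique(cluster_ids), annotation_table$cluster_id)

if (length(missing_ids) > 0) {
    message(
        "Cluster ids without an annotation: ",
        paste(missing_ids, collapse = ", ")
    )
}
```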