Apply uniform manifold approximation and projection (UMAP) to single-cell data

This function calculates a UMAP embedding from single-cell data in a `tof_tibble`.

Usage

tof_reduce_umap(
  tof_tibble,
  umap_cols = where(tof_is_numeric),
  num_comp = 2,
  neighbors = 5,
  min_dist = 0.01,
  learn_rate = 1,
  epochs = NULL,
  verbose = FALSE,
  n_threads = 1,
  return_recipe = FALSE,
  ...
)

Arguments

tof_tibble: A `tof_tbl` or `tibble`.
umap_cols: Unquoted column names indicating which columns in `tof_tibble` to use in computing the UMAP embedding. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.
num_comp: An integer for the number of UMAP components. Defaults to 2.
neighbors: An integer for the number of nearest neighbors used to construct the target simplicial set. Defaults to 5.
min_dist: The effective minimum distance between embedded points. Defaults to 0.01.
learn_rate: A positive number for the learning rate of the optimization process. Defaults to 1.
epochs: Number of iterations for the neighbor optimization. Defaults to NULL. See `umap` for details.
verbose: A boolean indicating if run details should be logged to the console. Defaults to FALSE.
n_threads: Number of threads to use during UMAP calculation. Defaults to 1.
return_recipe: A boolean indicating whether to return a prepped recipe object containing the UMAP embedding instead of the UMAP result. Defaults to FALSE. Set this option to TRUE if you want to create the UMAP embedding using one dataset and later project new observations onto the same embedding space.
...: Optional. Other options to be passed as arguments to `umap`.
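
The arguments above can be combined to trade off local against global structure in the embedding. The following is a minimal sketch, assuming the function is available via `library(tidytof)`; the specific values (`neighbors = 30`, `min_dist = 0.5`, `num_comp = 3`) and the object names `local_umap` and `global_umap` are illustrative choices, not recommendations from the package authors.

```r
library(tidytof)

# simulated single-cell data (seed set only so the simulated values repeat)
set.seed(2020)
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )

# small neighborhood and small min_dist: emphasizes local structure,
# computed from a subset of the columns
local_umap <-
    tof_reduce_umap(
        tof_tibble = sim_data,
        umap_cols = c(cd45, cd38, cd34),
        neighbors = 5,
        min_dist = 0.01
    )

# larger neighborhood, looser packing, and a third component
global_umap <-
    tof_reduce_umap(
        tof_tibble = sim_data,
        neighbors = 30,
        min_dist = 0.5,
        num_comp = 3
    )
```

In general, larger values of `neighbors` make UMAP weigh broader structure more heavily, while smaller values of `min_dist` allow points to pack more tightly in the embedding.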

Value

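A tibble with the same number of rows as `tof_tibble`, each representing a single cell. Each of the `num_comp` columns (`.umap1`, `.umap2`, and so on) holds one coordinate of each cell's position in the calculated UMAP space. If `return_recipe = TRUE`, a prepped recipe object is returned instead.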
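
Because the result contains only the embedding columns and preserves the row count of the input, it can be attached back to the original measurements column-wise. A minimal sketch, assuming `library(tidytof)` is loaded and that the embedding rows are in the same order as the input rows; `embedding` and `sim_with_umap` are hypothetical names used for illustration.

```r
library(tidytof)

# simulated single-cell data
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )

# compute the embedding (one row per cell)
embedding <- tof_reduce_umap(tof_tibble = sim_data)

# attach the embedding columns to the original measurements
sim_with_umap <- dplyr::bind_cols(sim_data, embedding)
```

The combined tibble keeps the four marker columns and adds the embedding columns, which is convenient for plotting the embedding colored by a marker.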

Examples

# simulate single-cell data
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )
new_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 50),
        cd38 = rnorm(n = 50),
        cd34 = rnorm(n = 50),
        cd19 = rnorm(n = 50)
    )

# calculate umap
tof_reduce_umap(tof_tibble = sim_data)
#> # A tibble: 200 × 2
#>    .umap1  .umap2
#>     <dbl>   <dbl>
#>  1 -2.73   1.31  
#>  2 -2.70  -2.01  
#>  3 -1.12   3.50  
#>  4 -3.21   1.63  
#>  5  0.790  2.91  
#>  6 -0.980 -1.42  
#>  7  3.51  -1.76  
#>  8 -1.99  -2.63  
#>  9 -2.13   1.15  
#> 10 -1.66  -0.0996
#> # ℹ 190 more rows

# calculate umap using only 2 columns
tof_reduce_umap(tof_tibble = sim_data, umap_cols = c(cd34, cd38))

# return recipe
umap_recipe <- tof_reduce_umap(tof_tibble = sim_data, return_recipe = TRUE)

# apply recipe to new data (baked columns are named UMAP1 and UMAP2)
recipes::bake(umap_recipe, new_data = new_data)
#> # A tibble: 50 × 2
#>     UMAP1    UMAP2
#>     <dbl>    <dbl>
#>  1 -2.40   1.40   
#>  2  2.68   0.413  
#>  3 -1.44   4.48   
#>  4 -0.927 -2.78   
#>  5  0.491  2.00   
#>  6  3.03  -1.40   
#>  7  2.03  -3.10   
#>  8  0.204 -0.00277
#>  9 -0.922  1.25   
#> 10 -0.865 -0.178  
#> # ℹ 40 more rows
