This function calculates a UMAP embedding from single-cell data in a `tof_tibble`.

Usage

tof_reduce_umap(
  tof_tibble,
  umap_cols = where(tof_is_numeric),
  num_comp = 2,
  neighbors = 5,
  min_dist = 0.01,
  learn_rate = 1,
  epochs = NULL,
  verbose = FALSE,
  n_threads = 1,
  return_recipe = FALSE,
  ...
)

Arguments

tof_tibble

A `tof_tbl` or `tibble`.

umap_cols

Unquoted column names indicating which columns in `tof_tibble` to use in computing the UMAP embedding. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.

num_comp

An integer for the number of UMAP components to compute. Defaults to 2.

neighbors

An integer for the number of nearest neighbors used to construct the target simplicial set. Defaults to 5.

min_dist

The effective minimum distance between embedded points. Defaults to 0.01.

learn_rate

A positive number giving the learning rate for the optimization process. Defaults to 1.

epochs

Number of iterations for the neighbor optimization. Defaults to NULL. See umap for details.

verbose

A boolean indicating if run details should be logged to the console. Defaults to FALSE.

n_threads

Number of threads to use during UMAP calculation. Defaults to 1.

return_recipe

A boolean indicating whether to return a prepped recipe object containing the UMAP embedding instead of the UMAP result. Set this option to TRUE if you want to create the UMAP embedding using one dataset and later project new observations onto the same embedding space. Defaults to FALSE.

...

Optional. Other options to be passed as arguments to umap.
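As a rough illustration of how the arguments above combine, the sketch below selects columns with a tidyselect helper and overrides a few defaults. The simulated data, the seed, and the specific values chosen for `num_comp`, `neighbors`, and `min_dist` are assumptions made for this example only, not recommended settings.

```r
library(tidytof)

# simulate single-cell data (hypothetical values, for illustration only)
set.seed(2024)
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )

# select columns with a tidyselect helper and request a
# 3-component embedding with a larger neighborhood
embedding_3d <-
    tof_reduce_umap(
        tof_tibble = sim_data,
        umap_cols = dplyr::starts_with("cd"),
        num_comp = 3,
        neighbors = 15,
        min_dist = 0.1
    )

# one row per cell, one column per requested component
dim(embedding_3d)
```

Larger `neighbors` values make each cell's neighborhood span more of the dataset, while `min_dist` controls how tightly points may be packed in the embedding, so it is usually worth trying a few combinations.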

Value

A tibble with the same number of rows as `tof_tibble`, each representing a single cell. The tibble has `num_comp` columns, each holding the cells' coordinates along one dimension of the calculated UMAP space. If `return_recipe = TRUE`, a prepped recipe object is returned instead.
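Because the result has one row per input cell, one possible way to keep the embedding next to the original measurements is to bind the two tibbles column-wise. This is a minimal sketch that assumes the output rows are in the same order as the input rows; the simulated data and the name `sim_with_umap` are hypothetical.

```r
library(tidytof)

# simulate single-cell data (hypothetical values, for illustration only)
set.seed(2024)
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )

# compute the embedding with default settings
embedding <- tof_reduce_umap(tof_tibble = sim_data)

# attach the embedding columns to the original data;
# this assumes row i of `embedding` corresponds to row i of `sim_data`
sim_with_umap <- dplyr::bind_cols(sim_data, embedding)

colnames(sim_with_umap)
```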

See also

Other dimensionality reduction functions: tof_reduce_dimensions(), tof_reduce_pca(), tof_reduce_tsne()

Examples

# simulate single-cell data
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )
new_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 50),
        cd38 = rnorm(n = 50),
        cd34 = rnorm(n = 50),
        cd19 = rnorm(n = 50)
    )

# calculate umap
tof_reduce_umap(tof_tibble = sim_data)
#> # A tibble: 200 × 2
#>    .umap1  .umap2
#>     <dbl>   <dbl>
#>  1 -2.73   1.31  
#>  2 -2.70  -2.01  
#>  3 -1.12   3.50  
#>  4 -3.21   1.63  
#>  5  0.790  2.91  
#>  6 -0.980 -1.42  
#>  7  3.51  -1.76  
#>  8 -1.99  -2.63  
#>  9 -2.13   1.15  
#> 10 -1.66  -0.0996
#> # ℹ 190 more rows

# calculate umap with only 2 columns
tof_reduce_umap(tof_tibble = sim_data, umap_cols = c(cd34, cd38))

# return recipe
umap_recipe <- tof_reduce_umap(tof_tibble = sim_data, return_recipe = TRUE)

# apply recipe to new data
recipes::bake(umap_recipe, new_data = new_data)
#> # A tibble: 50 × 2
#>     UMAP1    UMAP2
#>     <dbl>    <dbl>
#>  1 -2.40   1.40   
#>  2  2.68   0.413  
#>  3 -1.44   4.48   
#>  4 -0.927 -2.78   
#>  5  0.491  2.00   
#>  6  3.03  -1.40   
#>  7  2.03  -3.10   
#>  8  0.204 -0.00277
#>  9 -0.922  1.25   
#> 10 -0.865 -0.178  
#> # ℹ 40 more rows
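
In the output above, the baked columns are named `UMAP1` and `UMAP2`, whereas a direct call to `tof_reduce_umap()` returns `.umap1` and `.umap2`. If the projected cells need to be combined with the original embedding, one option is to rename the baked columns first. The sketch below assumes the column names shown in the outputs on this page; the object names `baked` and `new_embedding` are hypothetical.

```r
library(tidytof)

# simulate training data and new observations (hypothetical values)
set.seed(2024)
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )
new_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 50),
        cd38 = rnorm(n = 50),
        cd34 = rnorm(n = 50),
        cd19 = rnorm(n = 50)
    )

# fit the embedding on sim_data and keep the prepped recipe
umap_recipe <- tof_reduce_umap(tof_tibble = sim_data, return_recipe = TRUE)

# project the new observations onto the same embedding space
baked <- recipes::bake(umap_recipe, new_data = new_data)

# rename the baked columns to match the names returned by tof_reduce_umap()
new_embedding <- dplyr::rename(baked, .umap1 = UMAP1, .umap2 = UMAP2)

colnames(new_embedding)
```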