Apply uniform manifold approximation and projection (UMAP) to single-cell data
Source: R/dimensionality_reduction.R
tof_reduce_umap.Rd
This function calculates a UMAP embedding from single-cell data in a `tof_tibble`.
Usage
tof_reduce_umap(
  tof_tibble,
  umap_cols = where(tof_is_numeric),
  num_comp = 2,
  neighbors = 5,
  min_dist = 0.01,
  learn_rate = 1,
  epochs = NULL,
  verbose = FALSE,
  n_threads = 1,
  return_recipe = FALSE,
  ...
)
Arguments
- tof_tibble
A `tof_tbl` or `tibble`.
- umap_cols
Unquoted column names indicating which columns in `tof_tibble` to use in computing the UMAP embedding. Defaults to all numeric columns in `tof_tibble`. Supports tidyselect helpers.
- num_comp
An integer for the number of UMAP components.
- neighbors
An integer for the number of nearest neighbors used to construct the target simplicial set.
- min_dist
The effective minimum distance between embedded points.
- learn_rate
A positive number giving the learning rate for the optimization process.
- epochs
Number of iterations for the neighbor optimization. See `umap` for details.
- verbose
A boolean indicating if run details should be logged to the console. Defaults to FALSE.
- n_threads
Number of threads to use during UMAP calculation. Defaults to 1.
- return_recipe
A boolean value indicating if a prepped recipe object containing the UMAP embedding should be returned instead of the UMAP result. Set this option to TRUE if you want to create the UMAP embedding using one dataset but also want to project new observations onto the same embedding space later.
- ...
Optional. Other options to be passed as arguments to `umap`.
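As an illustration of how the tuning arguments above fit together, the following sketch computes a three-component embedding with a larger neighborhood. It assumes the tidytof and dplyr packages are installed. The data are simulated, and the specific values (`neighbors = 15`, `min_dist = 0.1`) are arbitrary choices for illustration, not recommendations.

```r
library(tidytof)

# simulate 200 cells measured on 4 markers (illustrative data only)
set.seed(2023)
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 200),
    cd38 = rnorm(n = 200),
    cd34 = rnorm(n = 200),
    cd19 = rnorm(n = 200)
  )

# request 3 UMAP components instead of the default 2, and use a
# larger neighborhood and minimum distance than the defaults
umap_3d <-
  tof_reduce_umap(
    tof_tibble = sim_data,
    num_comp = 3,
    neighbors = 15,
    min_dist = 0.1
  )

# one row per cell, one column per requested component
dim(umap_3d)
```

In general, a larger `neighbors` value emphasizes global structure, while a smaller `min_dist` lets embedded points pack more tightly. Suitable values depend on the dataset.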
Value
A tibble with the same number of rows as `tof_tibble`, each representing a single cell. It has `num_comp` columns, each holding the cells' coordinates along one dimension of the calculated UMAP space. If `return_recipe = TRUE`, a prepped recipe object is returned instead.
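Because the returned tibble contains only the embedding columns, a common next step is to attach them to the original measurements. The sketch below is one way to do this, assuming the tidytof and dplyr packages are installed and that the rows of the result are in the same order as the rows of the input. The data are simulated for illustration.

```r
library(tidytof)

# simulate 200 cells measured on 4 markers (illustrative data only)
set.seed(2023)
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 200),
    cd38 = rnorm(n = 200),
    cd34 = rnorm(n = 200),
    cd19 = rnorm(n = 200)
  )

# compute the embedding with the default settings
embedding <- tof_reduce_umap(tof_tibble = sim_data)

# attach the embedding columns to the original measurements
# (assumes the row order is unchanged by tof_reduce_umap)
annotated <- dplyr::bind_cols(sim_data, embedding)

colnames(annotated)
```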
See also
Other dimensionality reduction functions: tof_reduce_dimensions(), tof_reduce_pca(), tof_reduce_tsne()
Examples
# simulate single-cell data
sim_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 200),
    cd38 = rnorm(n = 200),
    cd34 = rnorm(n = 200),
    cd19 = rnorm(n = 200)
  )
new_data <-
  dplyr::tibble(
    cd45 = rnorm(n = 50),
    cd38 = rnorm(n = 50),
    cd34 = rnorm(n = 50),
    cd19 = rnorm(n = 50)
  )
# calculate umap
tof_reduce_umap(tof_tibble = sim_data)
#> # A tibble: 200 × 2
#> .umap1 .umap2
#> <dbl> <dbl>
#> 1 -2.73 1.31
#> 2 -2.70 -2.01
#> 3 -1.12 3.50
#> 4 -3.21 1.63
#> 5 0.790 2.91
#> 6 -0.980 -1.42
#> 7 3.51 -1.76
#> 8 -1.99 -2.63
#> 9 -2.13 1.15
#> 10 -1.66 -0.0996
#> # ℹ 190 more rows
# calculate umap with only 2 columns
tof_reduce_umap(tof_tibble = sim_data, umap_cols = c(cd34, cd38))
# (returns a 200-row tibble with columns .umap1 and .umap2)
# return recipe
umap_recipe <- tof_reduce_umap(tof_tibble = sim_data, return_recipe = TRUE)
# apply recipe to new data
recipes::bake(umap_recipe, new_data = new_data)
#> # A tibble: 50 × 2
#> UMAP1 UMAP2
#> <dbl> <dbl>
#> 1 -2.40 1.40
#> 2 2.68 0.413
#> 3 -1.44 4.48
#> 4 -0.927 -2.78
#> 5 0.491 2.00
#> 6 3.03 -1.40
#> 7 2.03 -3.10
#> 8 0.204 -0.00277
#> 9 -0.922 1.25
#> 10 -0.865 -0.178
#> # ℹ 40 more rows