Perform principal component analysis on single-cell data

This function calculates principal components using single-cell data from a `tof_tibble`.

Usage

tof_reduce_pca(
  tof_tibble,
  pca_cols = where(tof_is_numeric),
  num_comp = 5,
  threshold = NA,
  center = TRUE,
  scale = TRUE,
  return_recipe = FALSE
)

Arguments

tof_tibble: A `tof_tbl` or `tibble`.
pca_cols: Unquoted column names indicating which columns in `tof_tibble` to use for computing the principal components. Defaults to all numeric columns. Supports tidyselect helpers.
num_comp: The number of principal components to calculate. Defaults to 5. See `recipes::step_pca()`.
threshold: A double between 0 and 1 representing the fraction of total variance that should be covered by the components returned in the output. Defaults to NA, meaning no variance threshold is applied. See `recipes::step_pca()`.
center: A boolean value indicating if each column should be centered to mean 0 before PCA. Defaults to TRUE.
scale: A boolean value indicating if each column should be scaled to standard deviation 1 before PCA. Defaults to TRUE.
return_recipe: A boolean value indicating whether a prepped recipe object containing the PCA embedding should be returned instead of the PCA result. Set this option to TRUE if you want to create the PCA embedding using one dataset and later project new observations onto the same embedding space. Defaults to FALSE.
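
To make the `center`, `scale`, and `threshold` arguments more concrete, the following is a minimal sketch in base R using `stats::prcomp()`. It is not the implementation of `tof_reduce_pca()`; it only illustrates what centering, scaling, and a variance threshold mean for a PCA. The data and the 0.75 threshold are made up for illustration.

```r
# Illustration only: base R PCA, not the internals of tof_reduce_pca()
set.seed(1)
x <- data.frame(
    cd45 = rnorm(n = 200),
    cd38 = rnorm(n = 200, sd = 5), # deliberately on a larger scale
    cd34 = rnorm(n = 200),
    cd19 = rnorm(n = 200)
)

# center each column to mean 0 and scale it to standard deviation 1
pca <- stats::prcomp(x, center = TRUE, scale. = TRUE)

# fraction of total variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# smallest number of components covering at least 75% of the variance
n_keep <- which(cum_var >= 0.75)[1]

# one row per cell, one column per retained component
embedding <- pca$x[, seq_len(n_keep), drop = FALSE]
```

Without scaling, the high-variance `cd38` column in this sketch would dominate the first component, which is why scaling to unit standard deviation is a common default.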

Value

A tibble with the same number of rows as `tof_tibble`, each representing a single cell. Each column contains the cells' coordinates along one principal component (`.pc1`, `.pc2`, and so on, up to `num_comp` columns). If `return_recipe = TRUE`, a prepped recipe object is returned instead.

Examples

# simulate single-cell data
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )
new_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 50),
        cd38 = rnorm(n = 50),
        cd34 = rnorm(n = 50),
        cd19 = rnorm(n = 50)
    )

# calculate pca
tof_reduce_pca(tof_tibble = sim_data, num_comp = 2)
#> # A tibble: 200 × 2
#>       .pc1    .pc2
#>      <dbl>   <dbl>
#>  1 -1.18    2.40  
#>  2  0.432   0.529 
#>  3  0.0764  0.761 
#>  4  3.09    0.0319
#>  5 -0.351  -1.84  
#>  6  0.619  -0.0466
#>  7  0.561   0.211 
#>  8  1.15    2.01  
#>  9  0.807   0.217 
#> 10 -0.647  -0.278 
#> # ℹ 190 more rows

# return recipe instead of embeddings
pca_recipe <- tof_reduce_pca(tof_tibble = sim_data, return_recipe = TRUE)

# apply recipe to new data
recipes::bake(pca_recipe, new_data = new_data)
#> # A tibble: 50 × 4
#>         PC1    PC2    PC3     PC4
#>       <dbl>  <dbl>  <dbl>   <dbl>
#>  1 -0.260   -1.04  -0.118 -0.598 
#>  2  0.00176 -0.537 -0.146  0.125 
#>  3 -0.119   -0.173 -0.500  1.84  
#>  4 -0.717    1.42   1.63  -1.67  
#>  5 -0.793   -1.08   0.397 -0.773 
#>  6  0.980    0.560 -0.946 -1.37  
#>  7  1.02    -0.372  0.720 -2.35  
#>  8 -0.740   -0.226  1.21   0.0807
#>  9 -1.32     1.68   0.376 -0.573 
#> 10 -0.0621  -0.574  0.941  0.179 
#> # ℹ 40 more rows
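
As a further sketch, `pca_cols` can be used to restrict the PCA to a subset of columns, and because the result has one row per cell in the same order as the input, it can be column-bound back onto the original data. This assumes `tof_reduce_pca()` is loaded from the tidytof package; the object names `pcs` and `annotated` are illustrative only.

```r
# Sketch: assumes the tidytof and dplyr packages are installed
library(tidytof)

sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 200),
        cd38 = rnorm(n = 200),
        cd34 = rnorm(n = 200),
        cd19 = rnorm(n = 200)
    )

# compute 2 components from three of the four columns
pcs <-
    tof_reduce_pca(
        tof_tibble = sim_data,
        pca_cols = c(cd45, cd38, cd34),
        num_comp = 2
    )

# attach the embedding to the original measurements, cell by cell
annotated <- dplyr::bind_cols(sim_data, pcs)
```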
