Downsample high-dimensional cytometry data by randomly selecting a proportion of the cells in each group.
Source: R/downsampling.R
This function downsamples a `tof_tbl` by randomly selecting a proportion `prop_cells` of the cells within each group, where a group is a unique combination of values in `group_cols`.
Arguments
- tof_tibble
A `tof_tbl` or a `tibble`.
- group_cols
Unquoted names of the columns in `tof_tibble` that define the groups within which cells are sampled. Supports tidyselect helpers. Defaults to `NULL` (no grouping, so all cells are treated as a single group).
- prop_cells
A proportion of cells (between 0 and 1) that should be sampled from each group defined by `group_cols`.
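As a hedged illustration of `group_cols`, the sketch below groups on two columns at once by wrapping them in `c()`, which tidyselect accepts. The `condition` column and the simulated data are hypothetical, made up for this sketch. It assumes tidytof and dplyr are installed, and `set.seed()` is used only to make the random draw repeatable.

```r
library(tidytof)

# Hypothetical data: two marker columns plus two grouping columns.
set.seed(2023) # fixes the random draw so reruns select the same cells
sim_cells <-
    dplyr::tibble(
        cd45 = rnorm(n = 400),
        cd19 = rnorm(n = 400),
        cluster_id = rep(c("a", "b"), each = 200),
        condition = rep(c("control", "treated"), times = 200)
    )

# Sample 25% of the cells within each cluster_id x condition combination.
downsampled <-
    tof_downsample_prop(
        tof_tibble = sim_cells,
        group_cols = c(cluster_id, condition),
        prop_cells = 0.25
    )

# Count the cells kept per group.
dplyr::count(downsampled, cluster_id, condition)
```

Each of the four groups in this simulated data starts with 100 cells, so the per-group counts printed by `dplyr::count()` make it easy to confirm that sampling happened within groups rather than across the whole table.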
Value
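A `tof_tbl` with the same number of columns as the input `tof_tibble`, but fewer rows. Without grouping, the number of rows is `prop_cells` times the number of rows in the input `tof_tibble`. With `group_cols`, the proportion is applied within each group and each group's count must be a whole number, so the total is only approximately that product (the grouped example below returns 87 rows rather than 100).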
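The row count can be checked by hand. The sketch below is not the tidytof implementation; it uses `dplyr::slice_sample()` as a stand-in for per-group proportional sampling, on hypothetical data with fixed group sizes (35, 42, and 23 cells). `slice_sample()` rounds each group's count down, so the grouped result has fewer rows than `prop` times the total.

```r
library(dplyr)

# Hypothetical data: 100 cells in three groups of fixed size.
cells <-
    tibble(
        marker = rnorm(n = 100),
        cluster_id = rep(c("a", "b", "c"), times = c(35, 42, 23))
    )

# Ungrouped: 10% of 100 rows.
ungrouped <- slice_sample(cells, prop = 0.1)
nrow(ungrouped)
#> [1] 10

# Grouped: 10% of each group, with each count rounded down
# (3 + 4 + 2 cells), so the total is below 10.
grouped <-
    cells |>
    group_by(cluster_id) |>
    slice_sample(prop = 0.1) |>
    ungroup()
nrow(grouped)
#> [1] 9
```

The more groups there are, and the smaller they are, the larger this gap becomes, which is worth keeping in mind when a specific number of cells is needed downstream.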
See also
Other downsampling functions: `tof_downsample()`, `tof_downsample_constant()`, `tof_downsample_density()`
Examples
library(tidytof)

# simulate 1000 cells with 4 markers and a random cluster label
sim_data <-
    dplyr::tibble(
        cd45 = rnorm(n = 1000),
        cd38 = rnorm(n = 1000),
        cd34 = rnorm(n = 1000),
        cd19 = rnorm(n = 1000),
        cluster_id = sample(letters, size = 1000, replace = TRUE)
    )
# sample 10% of all cells from the input data
tof_downsample_prop(
    tof_tibble = sim_data,
    prop_cells = 0.1
)
#> # A tibble: 100 × 5
#>       cd45   cd38   cd34    cd19 cluster_id
#>      <dbl>  <dbl>  <dbl>   <dbl> <chr>
#>  1  1.01    0.520 -2.18  -0.531  s
#>  2  0.542   1.50  -0.890 -0.815  a
#>  3 -0.166   0.513 -0.812  1.31   e
#>  4  0.116  -1.27   0.359 -0.412  f
#>  5  0.765  -1.46  -1.71   0.247  u
#>  6 -0.426  -0.412  1.99   0.465  x
#>  7 -1.10   -0.437  0.284  0.0682 s
#>  8 -0.620  -1.81  -0.928 -1.32   k
#>  9  0.0205  0.386 -0.465 -0.102  m
#> 10 -0.721   0.362 -0.594 -0.848  h
#> # ℹ 90 more rows
# sample 10% of all cells from each cluster in the input data
tof_downsample_prop(
    tof_tibble = sim_data,
    group_cols = cluster_id,
    prop_cells = 0.1
)
#> # A tibble: 87 × 5
#>      cd45    cd38   cd34    cd19 cluster_id
#>     <dbl>   <dbl>  <dbl>   <dbl> <chr>
#>  1 -1.02  -1.06   -1.12   1.63   a
#>  2 -0.959 -1.57    3.55   0.732  a
#>  3  0.542  1.50   -0.890 -0.815  a
#>  4 -0.617  0.0143 -1.23   0.291  a
#>  5 -1.39   2.26   -0.497 -0.166  b
#>  6 -0.396 -0.818  -0.720  0.0744 b
#>  7  0.916 -0.859   1.35   1.37   b
#>  8  0.438  1.33   -1.18   1.19   c
#>  9 -0.748  0.436   1.39   0.398  c
#> 10  0.512 -1.16    0.687 -0.731  c
#> # ℹ 77 more rows