Package index • tidytof

Reading and writing data

Functions for reading and writing high-dimensional cytometry data to and from file storage

tof_read_csv(): Read high-dimensional cytometry data from a .csv file into a tidy tibble.

tof_read_data(): Read data from an .fcs/.csv file or a directory of .fcs/.csv files.

tof_read_fcs(): Read high-dimensional cytometry data from an .fcs file into a tidy tibble.

tof_read_file(): Read high-dimensional cytometry data from a single .fcs or .csv file into a tidy tibble.

tof_write_csv(): Write a series of .csv files from a tof_tbl

tof_write_data(): Write high-dimensional cytometry data to a file or to a directory of files

tof_write_fcs(): Write a series of .fcs files from a tof_tbl

tof_assess_channels(): Detect low-expression (i.e. potentially failed) channels in high-dimensional cytometry data

tof_assess_flow_rate(): Detect flow rate abnormalities in high-dimensional cytometry data

tof_assess_flow_rate_tibble(): Detect flow rate abnormalities in high-dimensional cytometry data (stored in a single data.frame)

tof_calculate_flow_rate(): Calculate the relative flow rates of different timepoints throughout a flow or mass cytometry run.

tof_batch_correct(): Perform groupwise linear rescaling of high-dimensional cytometry measurements

tof_batch_correct_quantile(): Batch-correct a tibble of high-dimensional cytometry data using quantile normalization.

tof_batch_correct_quantile_tibble(): Batch-correct a tibble of high-dimensional cytometry data using quantile normalization.

tof_batch_correct_rescale(): Perform groupwise linear rescaling of high-dimensional cytometry measurements

tidytof_example_data(): Get paths to tidytof example data

new_tof_tibble(): Constructor for a tof_tibble.

as_tof_tbl(): Coerce flowFrames or flowSets into tof_tbls.

as_tof_tbl(<flowSet>): Convert an object into a tof_tbl

tof_get_panel(): Get panel information from a tof_tibble

tof_set_panel(): Set panel information for a tof_tibble

tof_find_panel_info(): Use tidytof's opinionated heuristic for extracting a high-dimensional cytometry panel's metal-antigen pairs from a flowFrame (read from an .fcs file).
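
To make the idea behind the quantile-normalization entries above more concrete, here is a minimal base-R sketch of textbook quantile normalization. It is not tidytof's implementation: the helper name `quantile_normalize` is hypothetical, and the sketch assumes one channel measured in several batches with the same number of cells in each batch.

```r
# Hypothetical helper (not part of tidytof): quantile-normalize the columns
# of a numeric matrix. Each column holds one batch's measurements of the
# same channel; all batches are assumed to have the same number of cells.
quantile_normalize <- function(mat) {
  # Rank of every value within its own column (ties broken by position).
  ranks <- apply(mat, 2, rank, ties.method = "first")
  # Sort each column, then average across columns to obtain one shared
  # reference distribution.
  reference <- rowMeans(apply(mat, 2, sort))
  # Replace each value with the reference value at its rank.
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(mat)
  normalized
}

# Toy data: three cells per batch, two batches.
cd45 <- cbind(batch_1 = c(5, 2, 3), batch_2 = c(4, 1, 9))
normalized <- quantile_normalize(cd45)
```

After normalization, both columns contain the same set of values (the reference distribution), arranged according to each cell's original rank within its batch.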

Single-cell data analysis

Functions for data processing tasks at the single-cell level

tof_preprocess(): Preprocess raw high-dimensional cytometry data.

tof_transform(): Transform raw high-dimensional cytometry data.

tof_postprocess(): Post-process transformed CyTOF data.

tof_downsample(): Downsample high-dimensional cytometry data.

tof_downsample_constant(): Downsample high-dimensional cytometry data by randomly selecting a constant number of cells per group.

tof_downsample_density(): Downsample high-dimensional cytometry data by adjusting for local cell density (density-dependent downsampling).

tof_downsample_prop(): Downsample high-dimensional cytometry data by randomly selecting a proportion of the cells in each group.

tof_reduce_dimensions(): Apply dimensionality reduction to a single-cell dataset.

tof_reduce_pca(): Perform principal component analysis on single-cell data

tof_reduce_tsne(): Perform t-distributed stochastic neighbor embedding on single-cell data

tof_reduce_umap(): Apply uniform manifold approximation and projection (UMAP) to single-cell data

tof_cluster(): Cluster high-dimensional cytometry data.

tof_cluster_ddpr(): Perform developmental clustering on high-dimensional cytometry data.

tof_cluster_flowsom(): Perform FlowSOM clustering on high-dimensional cytometry data

tof_cluster_grouped(): Cluster (grouped) high-dimensional cytometry data.

tof_cluster_kmeans(): Perform k-means clustering on high-dimensional cytometry data.

tof_cluster_phenograph(): Perform PhenoGraph clustering on high-dimensional cytometry data.

tof_cluster_tibble(): Cluster (ungrouped) high-dimensional cytometry data.

tof_estimate_density(): Estimate the local densities for all cells in a high-dimensional cytometry dataset.

tof_apply_classifier(): Perform developmental clustering on CyTOF data using a pre-fit classifier

tof_build_classifier(): Calculate centroids and covariance matrices for each cell subpopulation in healthy CyTOF data.

tof_classify_cells(): Classify each cell (i.e. each row) in a matrix of cancer cells into its most similar healthy developmental subpopulation.
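
The difference between the constant and proportional downsampling entries above can be illustrated with a small base-R sketch. The helpers `downsample_constant()` and `downsample_prop()` and their arguments are hypothetical stand-ins written for this illustration, not tidytof's own code.

```r
# Toy dataset: 100 cells from sample "a" and 40 cells from sample "b".
set.seed(2024)
cells <- data.frame(
  sample = rep(c("a", "b"), times = c(100, 40)),
  cd45 = rnorm(140)
)

# Hypothetical helper: keep at most `num_cells` randomly chosen rows per group.
downsample_constant <- function(df, group_col, num_cells) {
  groups <- split(seq_len(nrow(df)), df[[group_col]])
  keep <- lapply(groups, function(idx) {
    idx[sample.int(length(idx), min(num_cells, length(idx)))]
  })
  df[sort(unlist(keep, use.names = FALSE)), , drop = FALSE]
}

# Hypothetical helper: keep a fixed proportion of randomly chosen rows per group.
downsample_prop <- function(df, group_col, prop_cells) {
  groups <- split(seq_len(nrow(df)), df[[group_col]])
  keep <- lapply(groups, function(idx) {
    idx[sample.int(length(idx), round(prop_cells * length(idx)))]
  })
  df[sort(unlist(keep, use.names = FALSE)), , drop = FALSE]
}

constant_result <- downsample_constant(cells, "sample", num_cells = 20)
prop_result <- downsample_prop(cells, "sample", prop_cells = 0.5)

table(constant_result$sample)  # 20 cells from each sample
table(prop_result$sample)      # 50 cells from "a", 20 cells from "b"
```

Constant downsampling equalizes group sizes, while proportional downsampling preserves the relative sizes of the groups.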

Cluster-level data analysis

Functions for data processing tasks at the cluster or cell subpopulation level

tof_annotate_clusters(): Manually annotate tidytof-computed clusters using user-specified labels

tof_assess_clusters_distance(): Assess a clustering result by calculating the z-score of each cell's Mahalanobis distance to its cluster centroid and flagging outliers.

tof_assess_clusters_entropy(): Assess a clustering result by calculating the Shannon entropy of each cell's Mahalanobis distance to all cluster centroids and flagging outliers.

tof_assess_clusters_knn(): Assess a clustering result by comparing a cell's cluster assignment to that of its K nearest neighbors.

tof_metacluster(): Metacluster clustered CyTOF data.

tof_metacluster_consensus(): Metacluster clustered CyTOF data using consensus clustering

tof_metacluster_flowsom(): Metacluster clustered CyTOF data using FlowSOM's built-in metaclustering algorithm

tof_metacluster_hierarchical(): Metacluster clustered CyTOF data using hierarchical agglomerative clustering

tof_metacluster_kmeans(): Metacluster clustered CyTOF data using k-means clustering

tof_metacluster_phenograph(): Metacluster clustered CyTOF data using PhenoGraph clustering

tof_upsample(): Upsample cells into the closest cluster in a reference dataset

tof_upsample_distance(): Upsample cells into the closest cluster in a reference dataset

tof_upsample_neighbor(): Upsample cells into the cluster of their nearest neighbor in a reference dataset

tof_analyze_abundance(): Perform Differential Abundance Analysis (DAA) on high-dimensional cytometry data

tof_analyze_abundance_diffcyt(): Differential Abundance Analysis (DAA) with diffcyt

tof_analyze_abundance_glmm(): Differential Abundance Analysis (DAA) with generalized linear mixed-models (GLMMs)

tof_analyze_abundance_ttest(): Differential Abundance Analysis (DAA) with t-tests

tof_analyze_expression(): Perform Differential Expression Analysis (DEA) on high-dimensional cytometry data

tof_analyze_expression_diffcyt(): Differential Expression Analysis (DEA) with diffcyt

tof_analyze_expression_lmm(): Differential Expression Analysis (DEA) with linear mixed-models (LMMs)

tof_analyze_expression_ttest(): Differential Expression Analysis (DEA) with t-tests

tof_extract_central_tendency(): Extract the central tendencies of CyTOF markers in each cluster in a `tof_tibble`.

tof_extract_emd(): Extract aggregated features from CyTOF data using earth-mover's distance (EMD)

tof_extract_features(): Extract aggregated, sample-level features from CyTOF data.

tof_extract_jsd(): Extract aggregated features from CyTOF data using the Jensen-Shannon Distance (JSD)

tof_extract_proportion(): Extract the proportion of cells in each cluster in a `tof_tibble`.

tof_extract_threshold(): Extract aggregated features from CyTOF data using a binary threshold
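
As a rough illustration of the distance-based cluster assessment listed above, the following base-R sketch computes each cell's Mahalanobis distance to its own cluster centroid, converts the distances to z-scores, and flags large values. The helper `assess_distance()`, the within-cluster standardization, and the default threshold of 3 are all assumptions made for this example; tidytof's own function may differ in each of these choices.

```r
# Toy data: two well-separated clusters measured on two markers.
set.seed(1)
markers <- rbind(
  matrix(rnorm(200, mean = 0), ncol = 2),
  matrix(rnorm(200, mean = 5), ncol = 2)
)
clusters <- rep(c("c1", "c2"), each = 100)

# Hypothetical helper: z-score of each cell's Mahalanobis distance to its
# own cluster centroid, standardized within each cluster (an assumption).
assess_distance <- function(x, cluster, z_threshold = 3) {
  z <- numeric(nrow(x))
  for (cl in unique(cluster)) {
    in_cl <- cluster == cl
    x_cl <- x[in_cl, , drop = FALSE]
    # stats::mahalanobis() returns squared distances, so take the square root.
    d <- sqrt(mahalanobis(x_cl, center = colMeans(x_cl), cov = cov(x_cl)))
    z[in_cl] <- (d - mean(d)) / sd(d)
  }
  data.frame(cluster = cluster, z_score = z, flagged = z > z_threshold)
}

result <- assess_distance(markers, clusters)
head(result)
```

Cells with a high z-score sit unusually far from their cluster's centroid relative to the other cells in that cluster, which is what makes them candidates for flagging.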

Sample- or patient-level data analysis

Functions for data processing tasks at the whole-sample or whole-patient level

tof_split_data(): Split high-dimensional cytometry data into a training and test set

tof_train_model(): Train an elastic net model to predict sample-level phenomena using high-dimensional cytometry data.

tof_check_model_args(): Check argument specifications for a glmnet model.

tof_predict(): Use a trained elastic net model to predict fitted values from new data

tof_assess_model(): Assess a trained elastic net model

tof_assess_model_new_data(): Compute a trained elastic net model's performance metrics using new_data.

tof_assess_model_tuning(): Access a trained elastic net model's performance metrics using its tuning data.

tof_clean_metric_names(): Rename glmnet's default model evaluation metrics to make them more interpretable

tof_create_grid(): Create an elastic net hyperparameter search grid of a specified size

new_tof_model(): Constructor for a tof_model.

tof_get_model_mixture(): Get a `tof_model`'s optimal mixture (alpha) value

tof_get_model_outcomes(): Get a `tof_model`'s outcome variable name(s)

tof_get_model_penalty(): Get a `tof_model`'s optimal penalty (lambda) value

tof_get_model_training_data(): Get a `tof_model`'s training data

tof_get_model_type(): Get a `tof_model`'s model type

tof_get_model_x(): Get a `tof_model`'s processed predictor matrix (for glmnet)

tof_get_model_y(): Get a `tof_model`'s processed outcome variable matrix (for glmnet)

tof_fit_split(): Fit a glmnet model and calculate performance metrics using a single rsplit object

tof_tune_glmnet(): Tune an elastic net model's hyperparameters over multiple resamples

tof_find_best(): Find the optimal hyperparameters for an elastic net model from candidate performance metrics
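
An elastic net has two hyperparameters, the penalty (lambda) and the mixture (alpha), and a search grid of a specified size is simply every combination of candidate values for the two. The base-R sketch below shows one way such a regular grid could be built. The helper `create_grid()`, its argument names, and the log-spaced penalty range are assumptions for illustration and may not match tidytof's defaults.

```r
# Hypothetical helper: build a regular grid of elastic net hyperparameters.
create_grid <- function(num_penalty_values = 5, num_mixture_values = 5) {
  # Penalty (lambda): log-spaced between 1e-10 and 1 (assumed range).
  penalty <- 10^seq(-10, 0, length.out = num_penalty_values)
  # Mixture (alpha): evenly spaced between 0 (ridge) and 1 (lasso).
  mixture <- seq(0, 1, length.out = num_mixture_values)
  # One row per (penalty, mixture) combination.
  expand.grid(penalty = penalty, mixture = mixture)
}

grid <- create_grid(num_penalty_values = 5, num_mixture_values = 5)
nrow(grid)  # 25 candidate combinations
```

Each row of the grid is one candidate model; tuning then amounts to fitting and scoring every row on resampled data and keeping the best-performing combination.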

Visualization

Functions for visualizing high-dimensional cytometry data

tof_plot_cells_density(): Plot marker expression density plots

tof_plot_cells_embedding(): Plot scatterplots of single-cell data using low-dimensional feature embeddings

tof_plot_cells_layout(): Plot force-directed layouts of single-cell data

tof_plot_cells_scatter(): Plot scatterplots of single-cell data.

tof_plot_clusters_heatmap(): Make a heatmap summarizing cluster marker expression patterns in CyTOF data

tof_plot_clusters_mst(): Visualize clusters in CyTOF data using a minimum spanning tree (MST).

tof_plot_clusters_volcano(): Create a volcano plot from differential expression analysis results

tof_plot_heatmap(): Make a heatmap summarizing group marker expression patterns in high-dimensional cytometry data

tof_plot_model(): Plot the results of a glmnet model fit on sample-level data.

tof_plot_model_linear(): Plot the results of a linear glmnet model fit on sample-level data.

tof_plot_model_logistic(): Plot the results of a two-class glmnet model fit on sample-level data.

tof_plot_model_multinomial(): Plot the results of a multiclass glmnet model fit on sample-level data.

tof_plot_model_survival(): Plot the results of a survival glmnet model fit on sample-level data.

tof_plot_sample_features(): Make a heatmap summarizing sample marker expression patterns in CyTOF data

tof_plot_sample_heatmap(): Make a heatmap summarizing sample marker expression patterns in CyTOF data

Utilities

Utility functions for performing miscellaneous high-dimensional cytometry data processing tasks

get_extension(): Find the extension for a file

rev_asinh(): Reverses arcsinh transformation with cofactor `scale_factor` and a shift of `shift_factor`.

cosine_similarity(): Find the cosine similarity between two vectors

l2_normalize(): L2 normalize an input vector x to a length of 1

dot(): Find the dot product between two vectors.

magnitude(): Find the magnitude of a vector.

where(): Select variables with a function

tof_cosine_dist(): Find the cosine distance between each row of a numeric matrix and a numeric vector.

tof_is_numeric(): Find if a vector is numeric

tof_find_knn(): Find the k-nearest neighbors of each cell in a high-dimensional cytometry dataset.

tof_knn_density(): Estimate cells' local densities using K-nearest-neighbor density estimation

tof_spade_density(): Estimate cells' local densities as done in Spanning-tree Progression Analysis of Density-normalized Events (SPADE)

tof_find_emd(): Find the earth-mover's distance between two numeric vectors

tof_find_jsd(): Find the Jensen-Shannon Divergence (JSD) between two numeric vectors

tof_create_recipe(): Create a recipe for preprocessing sample-level cytometry data for an elastic net model

tof_prep_recipe(): Train a recipe or list of recipes for preprocessing sample-level cytometry data

tof_compute_km_curve(): Compute a Kaplan-Meier curve from sample-level survival data

tof_find_cv_predictions(): Calculate and store the predicted outcomes for each validation set observation during model tuning

tof_find_log_rank_threshold(): Compute the log-rank test p-value for the difference between the two survival curves obtained by splitting a dataset into a "low" and "high" risk group using all possible relative-risk thresholds.

tof_log_rank_test(): Compute the log-rank test p-value for the difference between the two survival curves obtained by splitting a dataset into a "low" and "high" risk group using a given relative-risk threshold.

tof_make_roc_curve(): Compute a receiver operating characteristic (ROC) curve for a two-class or multiclass dataset

tof_generate_palette(): Generate a color palette using tidytof.

tof_make_knn_graph(): Make a K-nearest-neighbor graph

tof_split_tidytof_reduced_dimensions(): Split the dimensionality reduction data that tidytof combines during SingleCellExperiment conversion

make_flowcore_annotated_data_frame(): Make the AnnotatedDataFrame needed for the flowFrame class
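
The vector helpers listed above (dot product, magnitude, L2 normalization, cosine similarity, and cosine distance) are closely related, and a few lines of base R show how they fit together. The `sketch_*` functions below are illustrative re-implementations under the standard textbook definitions, not tidytof's source code, and defining cosine distance as one minus cosine similarity is an assumption.

```r
# Dot product of two numeric vectors of equal length.
sketch_dot <- function(x, y) sum(x * y)

# Euclidean (L2) magnitude of a vector.
sketch_magnitude <- function(x) sqrt(sketch_dot(x, x))

# Rescale a vector so that its magnitude is 1.
sketch_l2_normalize <- function(x) x / sketch_magnitude(x)

# Cosine of the angle between two vectors.
sketch_cosine_similarity <- function(x, y) {
  sketch_dot(x, y) / (sketch_magnitude(x) * sketch_magnitude(y))
}

# Cosine distance between every row of a matrix and a single vector,
# assuming distance = 1 - similarity.
sketch_cosine_dist <- function(mat, v) {
  1 - apply(mat, 1, sketch_cosine_similarity, y = v)
}

sketch_dot(c(1, 2, 3), c(4, 5, 6))                # 32
sketch_magnitude(c(3, 4))                         # 5
sketch_cosine_similarity(c(1, 0), c(0, 1))        # 0 (orthogonal vectors)
sketch_cosine_dist(rbind(c(1, 0), c(0, 1)), c(1, 0))
```

Because cosine similarity depends only on direction, L2-normalizing both vectors first reduces it to a plain dot product.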

Built-in data

Example cytometry datasets built into {tidytof}

ddpr_data: CyTOF data from two samples: 5,000 B-cell lineage cells from a healthy patient and 5,000 B-cell lineage cells from a B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) patient.

ddpr_metadata: Clinical metadata for each patient sample in Good & Sarno et al. (2018).

phenograph_data: CyTOF data from 6,000 healthy immune cells from a single patient.

metal_masterlist: A character vector of metal name patterns supported by tidytof.

Integration with Bioconductor data structures

Adapter functions for interoperability with Bioconductor

as_flowFrame(): Coerce an object into a flowFrame

as_flowSet(): Coerce an object into a flowSet

as_seurat(): Coerce an object into a SeuratObject

as_SingleCellExperiment(): Coerce an object into a SingleCellExperiment