Test for spatial enrichment of gene expression sets in ST data sets

STenrich(
  x = NULL,
  gene_sets = NULL,
  reps = 1000,
  num_sds = 1,
  min_units = 20,
  min_genes = 5,
  pval_adj_method = "BH",
  seed = 12345,
  cores = NULL
)

Arguments

x

an STlist with transformed gene expression

gene_sets

a named list of gene sets to test. The names of the list should identify the gene sets to be tested

reps

the number of random samples to be extracted. Default is 1000 replicates

num_sds

the number of standard deviations above the average expression used to set the minimum gene set expression threshold. Default is one (1) standard deviation

min_units

the minimum number of spots or cells with high expression of a gene set for that gene set to be considered in the analysis. Default is 20 spots or cells

min_genes

the minimum number of genes of a gene set present in the data set for that gene set to be included. Default is 5 genes

pval_adj_method

the method for multiple comparison adjustment of p-values. Options are the same as those of p.adjust. Default is 'BH'

seed

the seed number for the selection of random samples. Default is 12345

cores

the number of cores used during parallelization. If NULL (default), the number of cores is defined automatically
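
As an illustration of the shape expected for gene_sets, the sketch below builds a named list in base R and applies a min_genes-style filter. The gene set names, the gene symbols, and the genes_in_data vector are made up for the example, and the filtering step is only an assumption about how such a check could work, not the package's internal code.

```r
# Hypothetical gene sets: a named list of character vectors of gene symbols
gene_sets <- list(
  SET_A = c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5", "GENE6"),
  SET_B = c("GENE2", "GENE7", "GENE8")
)

# Hypothetical vector of genes present in the expression data
genes_in_data <- paste0("GENE", 1:6)

min_genes <- 5

# Count how many genes of each set are present in the data
n_present <- vapply(gene_sets, function(g) sum(g %in% genes_in_data), integer(1))

# Keep only the sets with at least min_genes genes present
kept_sets <- gene_sets[n_present >= min_genes]
names(kept_sets)
```

In this toy case only SET_A has at least five of its genes in the data, so it is the only set that would go on to be tested.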

Value

a list of data frames with the results of the test

Details

The function performs a randomization test to assess whether the sum of distances between cells/spots with high expression of a gene set is lower than the sum of distances between randomly selected cells/spots. A cell/spot is considered to have high gene set expression if its average expression of the genes in the set is higher than the average expression plus num_sds times the standard deviation. The minimum number of cells or spots (min_units) provides control over the size of the regions with high expression that are tested. This method is a modification of the method devised by Hunter et al. 2021 (zebrafish melanoma study).
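
The logic described above can be sketched in base R on simulated data. This is a minimal illustration under stated assumptions, not the STenrich implementation: the coordinates, the per-spot gene set expression, the helper sum_dist, the empirical p-value formula, and the extra p-values passed to p.adjust are all hypothetical.

```r
set.seed(12345)

# Simulated data (hypothetical): 400 spots with x/y coordinates
n_spots <- 400
coords <- cbind(x = runif(n_spots), y = runif(n_spots))

# Simulated average gene set expression per spot, raised in one corner
set_expr <- rnorm(n_spots)
hotspot <- coords[, "x"] < 0.4 & coords[, "y"] < 0.4
set_expr[hotspot] <- set_expr[hotspot] + 3

num_sds <- 1
min_units <- 20
reps <- 1000

# Spots with high expression: above mean + num_sds * standard deviation
threshold <- mean(set_expr) + num_sds * sd(set_expr)
high <- which(set_expr > threshold)

# Sum of pairwise Euclidean distances among a subset of spots
sum_dist <- function(idx) sum(dist(coords[idx, , drop = FALSE]))

if (length(high) >= min_units) {
  observed <- sum_dist(high)
  # Null distribution: the same number of spots drawn at random, reps times
  null_sums <- replicate(reps, sum_dist(sample(n_spots, length(high))))
  # Empirical p-value: fraction of random sums at or below the observed sum
  p_value <- mean(null_sums <= observed)
} else {
  # Too few high-expression spots: the gene set is not tested
  p_value <- NA_real_
}

# With several gene sets, the p-values can be adjusted with p.adjust;
# the second and third values here are made up for illustration
p.adjust(c(p_value, 0.2, 0.04), method = "BH")
```

Because the simulated high-expression spots are concentrated in one corner, the observed sum of distances falls well below the sums obtained from random draws, which is the pattern the test treats as evidence of spatial enrichment.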