SeuratClusterStats¶
Statistics of the clustering, including the number/fraction of cells in each cluster,
the gene expression values, and dimension reduction plots. It is also possible to
compute statistics on TCR clones/clusters or other metadata for each T-cell cluster.
Environment Variables¶
- mutaters (type=json): Default: {}.
    The mutaters to mutate the metadata to subset the cells.
    The mutaters will be applied in the order specified.
- clustrees_defaults (ns): The parameters for the clustree plots.
    - devpars (ns): The device parameters for the clustree plot.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): Default: 1000. The height of the plots.
        - width (type=int): Default: 800. The width of the plots.
    - prefix: Default: _auto.
        A string indicating the columns containing clustering information.
        The trailing dot is not necessary and will be added automatically.
        When _auto, clustrees will be plotted when there is FindClusters or FindClusters.* in obj@commands.
        The latter is generated by SeuratSubClustering.
        This will be ignored when envs.clustrees is specified.
    - <more>: Other arguments passed to clustree::clustree().
        See https://rdrr.io/cran/clustree/man/clustree.html
- clustrees (type=json): Default: {}.
    The cases for clustree plots.
    Keys are the names of the plots and values are the dicts inherited from envs.clustrees_defaults, except prefix.
    There is no default case for clustrees.
- hists_defaults (ns): The default parameters for histograms.
    This will plot histograms for the number of cells along x.
    For example, you can plot the number of cells along a cell activity score.
    - x: The column name in metadata to plot as the x-axis.
        NA values will be removed.
        It can be either numeric or factor/character.
    - x_order (list): Default: [].
        The order of the x-axis; only works for factor/character x.
        You can also use it to subset x (showing only a subset of the values of x).
    - cells_by: A column name in metadata to group the cells.
        NA values will be removed. It should be a factor/character.
        If not specified, all cells will be used.
    - cells_order (list): Default: [].
        The order of the cell groups for the plots.
        It should be a list of strings. You can also use cells_orderby and cells_n to determine the order.
    - cells_orderby: An expression passed to dplyr::arrange() to order the cell groups.
    - cells_n: Default: 10.
        The number of cell groups to show.
        Ignored if cells_order is specified.
    - ncol (type=int): Default: 2.
        The number of columns for the plots, split by cells_by.
    - subset: An expression to subset the cells; it will be passed to dplyr::filter().
    - each: Whether to plot each group separately.
    - bins: Default: 30.
        The number of bins to use; only works for numeric x.
    - plus (list): Default: [].
        The extra elements to add to the ggplot object.
    - devpars (ns): The device parameters for the plots.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): The height of the plots.
        - width (type=int): The width of the plots.
- hists (type=json): Default: {}.
    The cases for histograms.
    Keys are the names of the plots and values are the dicts inherited from envs.hists_defaults.
    There is no default case.
- stats_defaults (ns): The default parameters for stats.
    This is to do some basic statistics on the clusters. For more comprehensive analysis, see RadarPlots and CellsDistribution.
    The parameters from the cases can overwrite the default parameters.
    - frac (choice): Default: none.
        How to calculate the fraction of cells.
        - group: Calculate the fraction in each group.
            The total fraction of the cells of idents in each group will be 1.
            When group-by is not specified, it will be the same as all.
        - ident: Calculate the fraction in each ident.
            The total fraction of the cells of groups in each ident will be 1.
            Only works when group-by is specified.
        - cluster: Alias of ident.
        - all: Calculate the fraction against all cells.
        - none: Do not calculate the fraction; use the number of cells instead.
    - pie (flag): Default: False. Also output a pie chart?
    - circos (flag): Default: False. Also output a circos plot?
    - table (flag): Default: False.
        Whether to output a table (in tab-delimited format) and include it in the report.
    - transpose (flag): Default: False.
        Whether to transpose the cluster and group, that is, using group as the x-axis and cluster to fill the plot.
        For circos plots, when transposed, the arrows will be drawn from the idents (by ident) to the groups (by group-by).
        Only works when group-by is specified.
    - position (choice): Default: auto.
        The position of the bars. Does not work for pie and circos plots.
        - stack: Use position_stack().
        - fill: Use position_fill().
        - dodge: Use position_dodge().
        - auto: Use stack when there are more than 5 groups, otherwise use dodge.
    - ident: Default: seurat_clusters.
        The column name in metadata to use as the identity.
    - group-by: The column name in metadata to group the cells.
        Not supported for pie charts.
    - split-by: The column name in metadata to split the cells into different plots.
        Not supported for circos plots.
    - subset: An expression to subset the cells; it will be passed to dplyr::filter() on metadata.
    - circos_labels_rot (flag): Default: False.
        Whether to rotate the labels in the circos plot, in case the labels are too long.
    - circos_devpars (ns): The device parameters for the circos plots.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): Default: 600. The height of the plots.
        - width (type=int): Default: 600. The width of the plots.
    - pie_devpars (ns): The device parameters for the pie charts.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): Default: 600. The height of the plots.
        - width (type=int): Default: 800. The width of the plots.
    - devpars (ns): The device parameters for the plots.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): Default: 600. The height of the plots.
        - width (type=int): Default: 800. The width of the plots.
- stats (type=json): Default: {'Number of cells in each cluster': Diot({'pie': True}), 'Number of cells in each cluster by Sample': Diot({'group-by': 'Sample', 'table': True, 'frac': 'group'})}.
    The number/fraction of cells to plot.
    Keys are the names of the plots and values are the dicts inherited from envs.stats_defaults.
    Here are some examples:
    { "nCells_All": {}, "nCells_Sample": {"group-by": "Sample"}, "fracCells_Sample": {"frac": "group", "group-by": "Sample"} }
- ngenes_defaults (ns): The default parameters for ngenes,
    used to plot the number of genes expressed in each cell.
    - ident: Default: seurat_clusters.
        The column name in metadata to use as the identity.
    - group-by: The column name in metadata to group the cells.
        Dodge position will be used to separate the groups.
    - split-by: The column name in metadata to split the cells into different plots.
    - subset: An expression to subset the cells; it will be passed to tidyseurat::filter().
    - devpars (ns): The device parameters for the plots.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): Default: 800. The height of the plots.
        - width (type=int): Default: 1000. The width of the plots.
- ngenes (type=json): Default: {'Number of genes expressed in each cluster': Diot({})}.
    The number of genes expressed in each cell.
    Keys are the names of the plots and values are the dicts inherited from envs.ngenes_defaults.
- features_defaults (ns): The default parameters for features.
    - features: The features to plot.
        It can be either a string with comma-separated features, a list of features, a file path with the file:// prefix containing features (one per line), or an integer to use the top N features from VariableFeatures(srtobj).
    - ident: Default: seurat_clusters.
        The column name in metadata to use as the identity.
        If it is from subclustering (reduction sub_umap_<ident> exists), the reduction will be used.
    - cluster_orderby (type=auto): The order of the clusters to show on the plot.
        An expression passed to dplyr::summarise() on the grouped data frame (by seurat_clusters).
        The summary stat will be passed to dplyr::arrange() to order the clusters. It's applied on the whole meta.data before grouping and subsetting.
        For example, you can order the clusters by the activation score of the cluster: desc(mean(ActivationScore, na.rm = TRUE)), supposing you have a column ActivationScore in the metadata.
        You may also specify the literal order of the clusters by a list of strings.
    - subset: An expression to subset the cells; it will be passed to tidyseurat::filter().
    - devpars (ns): The device parameters for the plots. Does not work for table.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): The height of the plots.
        - width (type=int): The width of the plots.
    - plus: The extra elements to add to the ggplot object. Does not work for table.
    - group-by: Group cells in different ways (for example, orig.ident). Works for ridge, vln, and dot.
        It also works for feature, as shape.by being passed to Seurat::FeaturePlot.
    - split-by: The column name in metadata to split the cells into different plots.
        It works for vln, feature, and dot.
    - assay: The assay to use.
    - layer: The layer to use.
    - reduction: The reduction to use. Only works for feature.
    - section: The section to put the plot in the report.
        If not specified, the case title will be used.
    - ncol (type=int): Default: 2.
        The number of columns for the plots.
    - kind (choice): The kind of the plot or table.
        - ridge: Use Seurat::RidgePlot.
        - ridgeplot: Same as ridge.
        - vln: Use Seurat::VlnPlot.
        - vlnplot: Same as vln.
        - violin: Same as vln.
        - violinplot: Same as vln.
        - feature: Use Seurat::FeaturePlot.
        - featureplot: Same as feature.
        - dot: Use Seurat::DotPlot.
        - dotplot: Same as dot.
        - bar: Bar plot on an aggregated feature.
            The features must be a single feature, which will be either an existing feature or an expression passed to dplyr::summarise() (grouped by ident) on the existing features to create a new feature.
        - barplot: Same as bar.
        - heatmap: Use Seurat::DoHeatmap.
        - avgheatmap: Plot the average expression of the features in each cluster as a heatmap.
        - table: The table for the features; only gene expressions are supported
            (supported keys: ident, subset, and features).
- features (type=json): Default: {}.
    The plots for features, including gene expressions and columns from metadata.
    Keys are the titles of the cases and values are the dicts inherited from envs.features_defaults. A case can also have other parameters from the Seurat function used by its kind. Note that for argument names containing ., you should use - instead (for example, pt-size for pt.size).
- dimplots_defaults (ns): The default parameters for dimplots.
    - ident: Default: seurat_clusters.
        The identity to use.
        If it is from subclustering (reduction sub_umap_<ident> exists), this reduction will be used if reduction is set to dim or auto.
    - group-by: Same as ident if not specified; defines how the points are colored.
    - na_group: The group name for NA values; use None to ignore NA values.
    - split-by: The column name in metadata to split the cells into different plots.
    - shape-by: The column name in metadata to use as the shape.
    - subset: An expression to subset the cells; it will be passed to tidyseurat::filter().
    - devpars (ns): The device parameters for the plots.
        - res (type=int): Default: 100. The resolution of the plots.
        - height (type=int): Default: 800. The height of the plots.
        - width (type=int): Default: 1000. The width of the plots.
    - reduction (choice): Default: dim.
        Which dimensionality reduction to use.
        - dim: Use Seurat::DimPlot.
            First searches for umap, then tsne, then pca.
            If ident is from subclustering, sub_umap_<ident> will be used.
        - auto: Same as dim.
        - umap: Use Seurat::UMAPPlot.
        - tsne: Use Seurat::TSNEPlot.
        - pca: Use Seurat::PCAPlot.
    - <more>: See https://satijalab.org/seurat/reference/dimplot
- dimplots (type=json): Default: {'Dimensional reduction plot': Diot({'label': True, 'label-box': True, 'repel': True}), 'TCR presence': Diot({'ident': 'TCR_Presence', 'order': 'TCR_absent', 'cols': ['#FF000066', 'gray']})}.
    The dimensional reduction plots.
    Keys are the titles of the plots and values are the dicts inherited from envs.dimplots_defaults. A case can also have other parameters from Seurat::DimPlot.
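Every plot family above follows the same pattern: a `*_defaults` namespace holds shared settings, and each case in the matching json variable inherits them and may override any key. The sketch below illustrates this with histograms. It assumes the metadata has a numeric column named `ActivationScore` and a column named `Sample`; both names are placeholders, not columns the process creates.

```toml
[SeuratClusterStats.envs.hists_defaults]
# Shared by every case under envs.hists
bins = 50
ncol = 3

[SeuratClusterStats.envs.hists]
# Inherits bins = 50; ActivationScore is a hypothetical metadata column
"Activation score" = { x = "ActivationScore" }
# Overrides bins for this case only and groups the cells by Sample (assumed column)
"Activation score by sample" = { x = "ActivationScore", cells_by = "Sample", bins = 20 }
```

Case names containing spaces must be quoted in TOML, as shown.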
Examples¶
Number of cells in each cluster¶
[SeuratClusterStats.envs.stats]
# suppose you have nothing set in `envs.stats_defaults`
# otherwise, the settings will be inherited here
nCells_All = { }
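A variation on the case above, as a sketch that uses only the `pie` and `table` flags documented under `stats_defaults`. It requests a pie chart and a tab-delimited table alongside the default plot. `group-by` is left out because it is not supported for pie charts.

```toml
[SeuratClusterStats.envs.stats]
# Cell counts per cluster, plus a pie chart and a table in the report
nCells_All = { pie = true, table = true }
```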
Number of cells in each cluster by groups¶
[SeuratClusterStats.envs.stats]
nCells_Sample = { group-by = "Sample" }
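To plot fractions instead of counts, set `frac` to one of the documented choices. The sketch below contrasts `group` and `ident`, assuming the metadata has a `Sample` column as in the case above.

```toml
[SeuratClusterStats.envs.stats]
# Fractions of clusters within each Sample; they sum to 1 per Sample
fracCells_Sample = { group-by = "Sample", frac = "group" }
# Fractions of Samples within each cluster; they sum to 1 per cluster
fracCells_Cluster = { group-by = "Sample", frac = "ident", table = true }
```

If these cases go into the same configuration file as the ones above, place them under a single `[SeuratClusterStats.envs.stats]` header, since TOML does not allow a table to be defined twice.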
Violin plots for the gene expressions¶
[SeuratClusterStats.envs.features_defaults]
# Default genes inherited by the cases below
features = "CD4,CD8A"

[SeuratClusterStats.envs.features]
# Remove the dots in the violin plots
vlnplots = { pt-size = 0, kind = "vln" }
# Don't use the default genes
vlnplots_1 = { features = ["FOXP3", "IL2RA"], pt-size = 0, kind = "vln" }
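Other values of `kind` are configured the same way. The following sketch uses only keys documented under `features_defaults`. The case titles are arbitrary, and the `Sample` column is an assumption about the metadata.

```toml
[SeuratClusterStats.envs.features]
# Dot plot of four marker genes across clusters
"Marker dot plot" = { features = ["CD4", "CD8A", "FOXP3", "IL2RA"], kind = "dot" }
# Ridge plots grouped by a metadata column (group-by works for ridge, vln, and dot)
"Ridge by sample" = { features = "CD4,CD8A", kind = "ridge", group-by = "Sample" }
# Average-expression heatmap of the top 10 variable features
"Top variable features" = { features = 10, kind = "avgheatmap" }
```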
Dimension reduction plot with labels¶
[SeuratClusterStats.envs.dimplots.Idents]
label = true
label-box = true
repel = true
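A case can also pin the reduction and split the cells into panels. This sketch assumes a `Sample` column in the metadata and an existing `umap` reduction. The `devpars` values are illustrative and are all set explicitly, so the example does not rely on partial inheritance from the defaults.

```toml
[SeuratClusterStats.envs.dimplots."Clusters split by sample"]
# One panel per value of the Sample metadata column (assumed to exist)
split-by = "Sample"
# Use UMAP directly instead of the dim search order (umap, then tsne, then pca)
reduction = "umap"
label = true
# A wider canvas to fit several panels side by side
devpars = { res = 100, height = 800, width = 1600 }
```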