MarkersFinder¶
Find markers between different groups of cells.

`MarkersFinder` is a process that wraps the `Seurat::FindMarkers()` function and performs enrichment analysis for the markers found.
Input¶
- `srtobj`: The Seurat object loaded by `SeuratPreparing`.
    If you prepared the `Seurat` object yourself, you can also use it here, but make sure that the object has been processed by `PrepSCTFindMarkers` if the data is normalized using `SCTransform`.
Output¶
- `outdir`: Default: `{{in.srtobj | stem0}}.markers`.
    The output directory for the markers and plots.
Environment Variables¶
- `ncores` (type=int): Default: `1`.
    Number of cores to use for parallel computing for some `Seurat` procedures.
    - Used in `future::plan(strategy = "multicore", workers = <ncores>)` to parallelize some Seurat procedures.
    - See also: https://satijalab.org/seurat/articles/future_vignette.html
- `mutaters` (type=json): Default: `{}`.
    The mutaters to mutate the metadata.
    You can also use the clone selectors to select the TCR clones/clusters.
    See https://pwwang.github.io/scplotter/reference/clone_selectors.html.
    See also: mutating the metadata.
- `group_by`: The column name in metadata to group the cells.
    If only `group_by` is specified, and `ident_1` and `ident_2` are not specified, markers will be found for all groups in this column in the manner of a "group vs rest" comparison.
    The `NA` group will be ignored.
    If `None`, `Seurat::Idents(srtobj)` will be used, which is usually `"seurat_clusters"` after unsupervised clustering.
- `ident_1`: The first group of cells to compare.
    When this is empty, the comparisons will be expanded to each group vs. the rest of the cells in `group_by`.
- `ident_2`: The second group of cells to compare.
    If not provided, the rest of the cells are used for `ident_2`.
- `each`: The column name in metadata to separate the cells into different cases.
    When this is specified, the case will be expanded for each value of the column in metadata. For example, when you have `envs.cases."Cluster Markers".each = "Sample"`, the case will be expanded as `envs.cases."Cluster Markers - Sample1"`, `envs.cases."Cluster Markers - Sample2"`, etc.
    You can specify `allmarker_plots` and `overlaps` to plot the markers for all cases in the same plot and to plot the overlaps of the markers between different cases by values in this column.
- `dbs` (list): Default: `['KEGG_2021_Human', 'MSigDB_Hallmark_2020']`.
    The dbs to do enrichment analysis for significant markers.
    See https://maayanlab.cloud/Enrichr/#libraries for all libraries.
- `sigmarkers`: Default: `p_val_adj < 0.05`.
    An expression passed to `dplyr::filter()` to filter the significant markers for enrichment analysis.
    Available variables are `p_val`, `avg_log2FC`, `pct.1`, `pct.2` and `p_val_adj`. For example, `"p_val_adj < 0.05 & abs(avg_log2FC) > 1"` selects markers with adjusted p-value < 0.05 and absolute log2 fold change > 1.
- `enrich_style` (choice): Default: `enrichr`.
    The style of the enrichment analysis.
    The enrichment analysis will be done by `EnrichIt()` from `enrichit`.
    Two styles are available:
    - `enrichr`: `enrichr` style enrichment analysis (Fisher's exact test will be used).
    - `clusterprofiler`: `clusterProfiler` style enrichment analysis (hypergeometric test will be used).
    - `clusterProfiler`: alias for `clusterprofiler`
- `assay`: The assay to use.
- `error` (flag): Default: `False`.
    Error out if no/not enough markers are found or no pathways are enriched.
    If `False`, empty results will be returned.
- `subset`: An expression to subset the cells for each case.
- `cache` (type=auto): Default: `/tmp`.
    Where to cache the results.
    If `True`, cache to `outdir` of the job. If `False`, don't cache.
    Otherwise, specify the directory to cache to.
- `rest` (ns): Rest arguments for `Seurat::FindMarkers()`.
    Use `-` to replace `.` in the argument name. For example, use `min-pct` instead of `min.pct`.
- `allmarker_plots_defaults` (ns): Default options for the plots for all markers when `ident_1` is not specified.
    - `plot_type`: The type of the plot.
        See https://pwwang.github.io/scplotter/reference/FeatureStatPlot.html.
        Available types are `violin`, `box`, `bar`, `ridge`, `dim`, `heatmap` and `dot`.
    - `more_formats` (type=list): Default: `[]`.
        The extra formats to save the plot in.
    - `save_code` (flag): Default: `False`.
        Whether to save the code to generate the plot.
    - `devpars` (ns): The device parameters for the plots.
        - `res` (type=int): Default: `100`.
            The resolution of the plots.
        - `height` (type=int): The height of the plots.
        - `width` (type=int): The width of the plots.
    - `order_by`: Default: `desc(abs(avg_log2FC))`.
        An expression to order the markers, passed to `dplyr::arrange()`.
    - `genes`: Default: `10`.
        The number of top genes to show, or an expression passed to `dplyr::filter()` to filter the genes.
    - `<more>`: Other arguments passed to `scplotter::FeatureStatPlot()`.
- `allmarker_plots` (type=json): Default: `{}`.
    All marker plot cases.
    The keys are the names of the cases and the values are the dicts inherited from `allmarker_plots_defaults`.
- `allenrich_plots_defaults` (ns): Default options for the plots to generate for the enrichment analysis.
    - `plot_type`: Default: `heatmap`.
        The type of the plot.
    - `devpars` (ns): The device parameters for the plots.
        - `res` (type=int): Default: `100`.
            The resolution of the plots.
        - `height` (type=int): The height of the plots.
        - `width` (type=int): The width of the plots.
    - `<more>`: See https://pwwang.github.io/scplotter/reference/EnrichmentPlot.html.
- `allenrich_plots` (type=json): Default: `{}`.
    Cases of the plots to generate for the enrichment analysis.
    The keys are the names of the cases and the values are the dicts inherited from `allenrich_plots_defaults`.
    The cases under `envs.cases` can inherit these options.
- `marker_plots_defaults` (ns): Default options for the plots to generate for the markers.
    - `plot_type`: The type of the plot.
        See https://pwwang.github.io/scplotter/reference/FeatureStatPlot.html.
        Available types are `violin`, `box`, `bar`, `ridge`, `dim`, `heatmap` and `dot`.
        Two additional types are available: `volcano_pct` and `volcano_log2fc`.
    - `more_formats` (type=list): Default: `[]`.
        The extra formats to save the plot in.
    - `save_code` (flag): Default: `False`.
        Whether to save the code to generate the plot.
    - `devpars` (ns): The device parameters for the plots.
        - `res` (type=int): Default: `100`.
            The resolution of the plots.
        - `height` (type=int): The height of the plots.
        - `width` (type=int): The width of the plots.
    - `order_by`: Default: `desc(abs(avg_log2FC))`.
        An expression to order the markers, passed to `dplyr::arrange()`.
    - `genes`: Default: `10`.
        The number of top genes to show, or an expression passed to `dplyr::filter()` to filter the genes.
    - `<more>`: Other arguments passed to `scplotter::FeatureStatPlot()`.
        If `plot_type` is `volcano_pct` or `volcano_log2fc`, they will be passed to `scplotter::VolcanoPlot()`.
- `marker_plots` (type=json): Default: `{'Volcano Plot (diff_pct)': Diot({'plot_type': 'volcano_pct'}), 'Volcano Plot (log2FC)': Diot({'plot_type': 'volcano_log2fc'}), 'Dot Plot': Diot({'plot_type': 'dot'})}`.
    Cases of the plots to generate for the markers.
    The keys are the names of the cases and the values are the dicts inherited from `marker_plots_defaults`.
    The cases under `envs.cases` can inherit these options.
- `enrich_plots_defaults` (ns): Default options for the plots to generate for the enrichment analysis.
    - `plot_type`: The type of the plot.
        See https://pwwang.github.io/scplotter/reference/EnrichmentPlot.html.
        Available types are `bar`, `dot`, `lollipop`, `network`, `enrichmap` and `wordcloud`.
    - `more_formats` (type=list): Default: `[]`.
        The extra formats to save the plot in.
    - `save_code` (flag): Default: `False`.
        Whether to save the code to generate the plot.
    - `devpars` (ns): The device parameters for the plots.
        - `res` (type=int): Default: `100`.
            The resolution of the plots.
        - `height` (type=int): The height of the plots.
        - `width` (type=int): The width of the plots.
    - `<more>`: See https://pwwang.github.io/scplotter/reference/EnrichmentPlot.html.
- `enrich_plots` (type=json): Default: `{'Bar Plot': Diot({'plot_type': 'bar', 'ncol': 1, 'top_term': 10})}`.
    Cases of the plots to generate for the enrichment analysis.
    The keys are the names of the cases and the values are the dicts inherited from `enrich_plots_defaults`.
    The cases under `envs.cases` can inherit these options.
- `overlaps_defaults` (ns): Default options for investigating the overlapping of significant markers between different cases or comparisons.
    This requires either `ident_1` to be empty, so that the comparison is expanded to multiple comparisons, or `each` to be specified, so that the case is expanded to multiple cases.
    - `sigmarkers`: The expression to filter the significant markers for each case.
        If not provided, `envs.sigmarkers` will be used.
    - `plot_type` (choice): Default: `venn`.
        The type of the plot to generate for the overlaps.
        - `venn`: Use `plotthis::VennDiagram()`.
        - `upset`: Use `plotthis::UpsetPlot()`.
    - `more_formats` (type=list): Default: `[]`.
        The extra formats to save the plot in.
    - `save_code` (flag): Default: `False`.
        Whether to save the code to generate the plot.
    - `devpars` (ns): The device parameters for the plots.
        - `res` (type=int): Default: `100`.
            The resolution of the plots.
        - `height` (type=int): The height of the plots.
        - `width` (type=int): The width of the plots.
    - `<more>`: More arguments passed to `plotthis::VennDiagram()` (https://pwwang.github.io/plotthis/reference/venndiagram1.html) or `plotthis::UpsetPlot()` (https://pwwang.github.io/plotthis/reference/upsetplot1.html).
- `overlaps` (type=json): Default: `{}`.
    Cases for investigating the overlapping of significant markers between different cases or comparisons.
    The keys are the names of the cases and the values are the dicts inherited from `overlaps_defaults`.
    There are two situations in which overlaps can be performed:
    - If `ident_1` is not specified, the overlaps can be performed between different comparisons.
    - If `each` is specified, the overlaps can be performed between different cases, where in each case, `ident_1` must be specified.
- `cases` (type=json): Default: `{}`.
    If you have multiple cases for marker discovery, you can specify them here. The keys are the names of the cases and the values are the above options. If some options are not specified, the default values specified above (under `envs`) will be used.
    If no cases are specified, the default case will be added with the default values under `envs` with the name `Marker Discovery`.
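As a rough illustration of how several of the options above might be combined, the following configuration sketch sets a stricter significance filter, switches the enrichment style, and forwards extra arguments to `Seurat::FindMarkers()`. `<Proc>` is the same placeholder for the process name used in the examples in this page, and all values are illustrative assumptions rather than recommendations. `logfc-threshold` is assumed to map to the `logfc.threshold` argument of `Seurat::FindMarkers()`, following the `-` for `.` rule described for `rest`.

```toml
[<Proc>.envs]
# Illustrative value; the default is 1
ncores = 4
group_by = "seurat_clusters"
# Keep markers with adjusted p-value < 0.05 and |log2FC| > 1 for enrichment
sigmarkers = "p_val_adj < 0.05 & abs(avg_log2FC) > 1"
# Same libraries as the documented default, written out explicitly
dbs = ["KEGG_2021_Human", "MSigDB_Hallmark_2020"]
# Use the hypergeometric test instead of Fisher's exact test
enrich_style = "clusterprofiler"

# Extra arguments for Seurat::FindMarkers(); "." in names is written as "-"
[<Proc>.envs.rest]
min-pct = 0.25
logfc-threshold = 0.1
```

Because `ident_1` is left unset here, the comparisons would be expanded to each group in `group_by` versus the rest of the cells.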
Examples¶
The examples are for more general use of `MarkersFinder`, in order to demonstrate how the final cases are constructed.
Suppose we have a metadata like this:
id | seurat_clusters | Group |
---|---|---|
1 | 1 | A |
2 | 1 | A |
3 | 2 | A |
4 | 2 | A |
5 | 3 | B |
6 | 3 | B |
7 | 4 | B |
8 | 4 | B |
Default¶
By default, `group_by` is `seurat_clusters`, and `ident_1` and `ident_2` are not specified, so markers will be found for all clusters in the manner of a "cluster vs rest" comparison.
- Cluster
    - 1 (vs 2, 3, 4)
    - 2 (vs 1, 3, 4)
    - 3 (vs 1, 2, 4)
    - 4 (vs 1, 2, 3)
Each case will have the markers and the enrichment analysis for the
markers as the results.
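For comparison with the later examples, the default behavior could be written out explicitly as below. This is a sketch that assumes `Seurat::Idents(srtobj)` corresponds to the `seurat_clusters` column, which is the usual situation after unsupervised clustering; `<Proc>` is the placeholder for the process name.

```toml
[<Proc>.envs]
# Assumes the identities of the object are the "seurat_clusters" column;
# ident_1 and ident_2 are left unset to get "cluster vs rest" comparisons
group_by = "seurat_clusters"
```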
With `each` group¶
`each` is used to separate the cells into different cases. `group_by` is still `seurat_clusters`.
[<Proc>.envs]
group_by = "seurat_clusters"
each = "Group"
- A:Cluster
    - 1 (vs 2)
    - 2 (vs 1)
- B:Cluster
    - 3 (vs 4)
    - 4 (vs 3)
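Since `ident_1` is not specified in this setup, `allmarker_plots` and `overlaps` cases can be added on top of it. The sketch below shows one possible shape; the case names `Top Markers` and `Marker Overlaps` are hypothetical, and the option values are illustrative choices among those documented under `allmarker_plots_defaults` and `overlaps_defaults`.

```toml
[<Proc>.envs]
group_by = "seurat_clusters"
each = "Group"

# Hypothetical case name; plot_type and genes are documented
# under allmarker_plots_defaults
[<Proc>.envs.allmarker_plots."Top Markers"]
plot_type = "heatmap"
genes = 5

# Hypothetical case name; plot_type is documented under overlaps_defaults
[<Proc>.envs.overlaps."Marker Overlaps"]
plot_type = "venn"
```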
With `ident_1` only¶
`ident_1` is used to specify the first group of cells to compare. The rest of the cells in the case are then used for `ident_2`.
[<Proc>.envs]
group_by = "seurat_clusters"
ident_1 = "1"
- Cluster
    - 1 (vs 2, 3, 4)
With both `ident_1` and `ident_2`¶
`ident_1` and `ident_2` are used to specify the two groups of cells to compare.
[<Proc>.envs]
group_by = "seurat_clusters"
ident_1 = "1"
ident_2 = "2"
- Cluster
    - 1 (vs 2)
Multiple cases¶
[<Proc>.envs.cases]
c1_vs_c2 = {ident_1 = "1", ident_2 = "2"}
c3_vs_c4 = {ident_1 = "3", ident_2 = "4"}
- DEFAULT:c1_vs_c2
    - 1 (vs 2)
- DEFAULT:c3_vs_c4
    - 3 (vs 4)
The `DEFAULT` section name will be ignored in the report. You can specify a section name other than `DEFAULT` for each case to group them in the report.
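Options that a case does not set are taken from the values under `envs`. The following sketch illustrates this with the two cases above: both inherit `group_by` and `sigmarkers` from `envs`, and the second case overrides `sigmarkers`. The thresholds are illustrative assumptions, and `<Proc>` is the placeholder for the process name.

```toml
[<Proc>.envs]
group_by = "seurat_clusters"
# Used by every case that does not set its own sigmarkers
sigmarkers = "p_val_adj < 0.05 & abs(avg_log2FC) > 1"

[<Proc>.envs.cases]
# Inherits group_by and sigmarkers from envs
c1_vs_c2 = {ident_1 = "1", ident_2 = "2"}
# Inherits group_by, but overrides sigmarkers for this case only
c3_vs_c4 = {ident_1 = "3", ident_2 = "4", sigmarkers = "p_val_adj < 0.01"}
```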