Metabolic landscape analysis for scRNA-seq data

Classes
Bases
biopipen.core.proc.Proc pipen.proc.Proc

This process calculates the pathway activities in different groups and subsets.

The cells are first grouped by subsets, and then the metabolic activities are examined for each group in the different subsets.

For each subset, a heatmap and a violin plot will be generated. The heatmap shows the pathway activities for each group and each metabolic pathway.

[Figure: MetabolicPathwayActivity_heatmap]

The violin plot shows the distribution of the pathway activities for each group.

[Figure: MetabolicPathwayActivity_violin]

Envs
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity, typically the clusters. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. For example, if we have grouping_prefix = "cluster" and we have 1 and 2 in the grouping column, the groups will be named as cluster_1 and cluster_2. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • heatmap_devpars (ns) Device parameters for the heatmap
    • width (type=int): Width of the heatmap
    • height (type=int): Height of the heatmap
    • res (type=int): Resolution of the heatmap
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • ntimes (type=int) Number of times to do the permutation
  • subsetting (type=auto;pgarg;readonly) How to subset the data: other columns in the metadata to use for comparisons. For example, "TimePoint" or ["TimePoint", "Response"]. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. For example, if we have subsetting_prefix = "timepoint" and we have pre and post in the subsetting column, the subsets will be named as timepoint_pre and timepoint_post. If subsetting is a list, then this should also be a list of the same length. If a single string is given, it will be repeated to form a list with the same length as subsetting. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
  • violin_devpars (ns) Device parameters for the violin plot
    • width (type=int): Width of the violin plot
    • height (type=int): Height of the violin plot
    • res (type=int): Resolution of the violin plot
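The naming rules described for grouping_prefix and subsetting_prefix can be illustrated with a minimal Python sketch. The helper names prefixed_names and expand_prefixes are hypothetical and not part of biopipen; the underscore separator is taken from the cluster_1 and timepoint_pre examples above.

```python
from typing import List, Sequence, Union


def prefixed_names(prefix: str, values: Sequence[object]) -> List[str]:
    """Join a prefix and each value with an underscore (e.g. cluster_1)."""
    return [f"{prefix}_{value}" for value in values]


def expand_prefixes(
    subsetting: Union[str, Sequence[str]],
    subsetting_prefix: Union[str, Sequence[str]],
) -> List[str]:
    """Return one prefix per subsetting column (hypothetical helper)."""
    columns = [subsetting] if isinstance(subsetting, str) else list(subsetting)
    if isinstance(subsetting_prefix, str):
        # A single string is repeated to match the number of columns.
        return [subsetting_prefix] * len(columns)
    prefixes = list(subsetting_prefix)
    if len(prefixes) != len(columns):
        raise ValueError(
            "subsetting_prefix must have the same length as subsetting"
        )
    return prefixes


print(prefixed_names("cluster", [1, 2]))
print(prefixed_names("timepoint", ["pre", "post"]))
print(expand_prefixes(["TimePoint", "Response"], "subset"))
```

The first two calls reproduce the names given in the option descriptions; the third shows a single prefix being repeated for a two-column subsetting.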
Requires
  • r-complexheatmap
    • check: {{proc.lang}} <(echo "library(ComplexHeatmap)")
  • r-ggplot2
    • check: {{proc.lang}} <(echo "library(ggplot2)")
  • r-ggprism
    • check: {{proc.lang}} <(echo "library(ggprism)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
  • r-rcolorbrewer
    • check: {{proc.lang}} <(echo "library(RColorBrewer)")
  • r-reshape2
    • check: {{proc.lang}} <(echo "library(reshape2)")
  • r-scater
    • check: {{proc.lang}} <(echo "library(scater)")
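The gmtfile option refers to a file in the GMT gene-set format, where each line is tab-separated: a pathway name, a description, and then the member genes. As a rough illustration of what such a file contains, here is a small Python sketch of a parser. The function name parse_gmt and the two toy pathway lines are made up for this example and do not come from the pipeline.

```python
from typing import Dict, Iterable, List


def parse_gmt(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each pathway name to its list of genes."""
    pathways: Dict[str, List[str]] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # Skip blank or malformed lines (name, description, >= 1 gene).
            continue
        name, _description, *genes = fields
        pathways[name] = [gene for gene in genes if gene]
    return pathways


# Toy input for illustration only; real GMT files are read from disk.
example_lines = [
    "Glycolysis\tna\tHK1\tPFKM\tPKM\n",
    "TCA cycle\tna\tCS\tIDH1\n",
]
pathways = parse_gmt(example_lines)
print(pathways["Glycolysis"])
```

To read an actual file, the same function can be called with an open file handle, for example parse_gmt(open(path)), where path is whatever is passed as gmtfile.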
Bases
biopipen.core.proc.Proc pipen.proc.Proc

This process performs enrichment analysis for the metabolic pathways for each group in each subset.

The enrichment analysis is done with either the fgsea package or the GSEA_R package.

Envs
  • fgsea (flag) Whether to do fast GSEA analysis using the fgsea package. If False, the GSEA_R package will be used.
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • prerank_method (choice) Method to use for gene preranking. Signal to noise: the difference of the class means scaled by the standard deviations; the larger the value, the more distinct the gene expression is in each phenotype and the more the gene acts as a "class marker". Absolute signal to noise: the absolute value of the signal to noise. T test: the difference of means scaled by the standard deviation and number of samples. Ratio of classes: the ratio of class means, used to calculate fold change for natural scale data. Diff of classes: the difference of class means, used to calculate fold change for log scale data. Log2 ratio of classes: the log2 ratio of class means, used to calculate fold change for natural scale data; this is the recommended statistic for calculating fold change for natural scale data.
    • signal_to_noise: Signal to noise
    • s2n: Alias of signal_to_noise
    • abs_signal_to_noise: Absolute signal to noise
    • abs_s2n: Alias of abs_signal_to_noise
    • t_test: T test
    • ratio_of_classes: Also referred to as fold change
    • diff_of_classes: Difference of class means
    • log2_ratio_of_classes: Log2 ratio of class means
  • subsetting (type=auto;pgarg;readonly) How to subset the data: one or more other columns in the metadata. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
  • top (type=int) Number of top enriched pathways to show
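The prerank_method choices listed above correspond to simple per-gene statistics computed from the expression values of two classes. The following Python sketch shows the textbook form of each one, assuming sample standard deviations and omitting any variance adjustments that a particular GSEA implementation may apply. The function names and the PRERANK_METHODS lookup are illustrative only and are not the code used by this process.

```python
import math
from statistics import mean, stdev
from typing import Callable, Dict, Sequence

Values = Sequence[float]


def signal_to_noise(a: Values, b: Values) -> float:
    # Difference of means scaled by the sum of standard deviations.
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def abs_signal_to_noise(a: Values, b: Values) -> float:
    return abs(signal_to_noise(a, b))


def t_test(a: Values, b: Values) -> float:
    # Difference of means scaled by standard deviation and sample size.
    return (mean(a) - mean(b)) / math.sqrt(
        stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)
    )


def ratio_of_classes(a: Values, b: Values) -> float:
    return mean(a) / mean(b)


def diff_of_classes(a: Values, b: Values) -> float:
    return mean(a) - mean(b)


def log2_ratio_of_classes(a: Values, b: Values) -> float:
    return math.log2(mean(a) / mean(b))


# Option values (including the aliases) mapped to the functions above.
PRERANK_METHODS: Dict[str, Callable[[Values, Values], float]] = {
    "signal_to_noise": signal_to_noise,
    "s2n": signal_to_noise,
    "abs_signal_to_noise": abs_signal_to_noise,
    "abs_s2n": abs_signal_to_noise,
    "t_test": t_test,
    "ratio_of_classes": ratio_of_classes,
    "diff_of_classes": diff_of_classes,
    "log2_ratio_of_classes": log2_ratio_of_classes,
}

# Made-up expression values of one gene in two classes.
class_a = [4.0, 6.0]
class_b = [1.0, 3.0]
for name, method in PRERANK_METHODS.items():
    print(name, round(method(class_a, class_b), 4))
```

Genes would then be ranked by the chosen statistic before the enrichment analysis is run.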
Requires
  • r-fgsea
    • check: {{proc.lang}} <(echo "library(fgsea)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
Bases
biopipen.core.proc.Proc pipen.proc.Proc

Intra-subset metabolic features - Enrichment analysis in details

Similar to the MetabolicFeatures process, this process performs enrichment analysis for the metabolic pathways for each subset in each group, instead of each group in each subset.

Envs
  • fgsea (flag) Whether to do fast GSEA analysis
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • prerank_method (choice) Method to use for gene preranking. Signal to noise: the difference of the class means scaled by the standard deviations; the larger the value, the more distinct the gene expression is in each phenotype and the more the gene acts as a "class marker". Absolute signal to noise: the absolute value of the signal to noise. T test: the difference of means scaled by the standard deviation and number of samples. Ratio of classes: the ratio of class means, used to calculate fold change for natural scale data. Diff of classes: the difference of class means, used to calculate fold change for log scale data. Log2 ratio of classes: the log2 ratio of class means, used to calculate fold change for natural scale data; this is the recommended statistic for calculating fold change for natural scale data.
    • signal_to_noise: Signal to noise
    • s2n: Alias of signal_to_noise
    • abs_signal_to_noise: Absolute signal to noise
    • abs_s2n: Alias of abs_signal_to_noise
    • t_test: T test
    • ratio_of_classes: Also referred to as fold change
    • diff_of_classes: Difference of class means
    • log2_ratio_of_classes: Log2 ratio of class means
  • subsetting (type=auto;pgarg;readonly) How to subset the data: one or more other columns in the metadata. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_comparison (type=json;pgarg;readonly) How to compare the subsets. Defaults to ScrnaMetabolicLandscape.subsetting_comparison
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
  • top (type=int) Number of top enriched pathways to show
Requires
  • r-fgsea
    • check: {{proc.lang}} <(echo "library(fgsea)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
  • r-scater
    • check: {{proc.lang}} <(echo "library(scater)")
Bases
biopipen.core.proc.Proc pipen.proc.Proc

Calculate Metabolic Pathway heterogeneity.

For each subset, the normalized enrichment score (NES) of each metabolic pathway is calculated for each group. The NES is calculated by comparing the enrichment score of the subset to the enrichment scores of the same subset in the permutations. The p-value is calculated by comparing the NES to the NESs of the same subset in the permutations. The heterogeneity can be reflected by the NES values and the p-values in different groups for the metabolic pathways.
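The permutation-based normalization described above can be sketched in a few lines of Python. This follows a common GSEA-style convention (scale the observed enrichment score by the mean magnitude of same-sign permuted scores, then count how often a permuted NES is at least as extreme) and is an assumption made for illustration; the function name nes_and_pvalue and the numbers are invented, and this is not the code used by this process.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def nes_and_pvalue(
    observed_es: float, permuted_es: Sequence[float]
) -> Tuple[float, float]:
    """Return (NES, p-value) for one pathway from permuted enrichment scores."""
    # Only permutations with the same sign as the observed score are used.
    same_sign: List[float] = [
        es for es in permuted_es if (es >= 0) == (observed_es >= 0)
    ]
    if not same_sign:
        raise ValueError("no permuted scores share the sign of the observed score")
    scale = mean([abs(es) for es in same_sign])
    if scale == 0:
        raise ValueError("cannot normalize: all same-sign permuted scores are zero")
    nes = observed_es / scale
    permuted_nes = [es / scale for es in same_sign]
    # Fraction of permuted NES values at least as extreme as the observed NES.
    extreme = sum(1 for value in permuted_nes if abs(value) >= abs(nes))
    return nes, extreme / len(permuted_nes)


# Invented scores: one observed ES and five permuted ES values.
nes, pvalue = nes_and_pvalue(0.6, [0.2, 0.3, -0.1, 0.4, 0.7])
print(round(nes, 3), pvalue)
```

In practice many more permutations are used, so that the p-value has a useful resolution.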

[Figure: MetabolicPathwayHeterogeneity]

Envs
  • bubble_devpars (ns) The devpars for the bubble plot
    • width (type=int): The width of the plot
    • height (type=int): The height of the plot
    • res (type=int): The resolution of the plot
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • pathway_pval_cutoff (type=float) The p-value cutoff to select the enriched pathways
  • select_pcs (type=float) Select the PCs to use for the analysis.
  • subsetting (type=auto;pgarg;readonly) How to subset the data: one or more other columns in the metadata. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
Requires
  • r-data.table
    • check: {{proc.lang}} <(echo "library(data.table)")
  • r-dplyr
    • check: {{proc.lang}} <(echo "library(dplyr)")
  • r-enrichr
    • check: {{proc.lang}} <(echo "library(enrichR)")
  • r-fgsea
    • check: {{proc.lang}} <(echo "library(fgsea)")
  • r-ggplot2
    • check: {{proc.lang}} <(echo "library(ggplot2)")
  • r-ggprism
    • check: {{proc.lang}} <(echo "library(ggprism)")
  • r-gtools
    • check: {{proc.lang}} <(echo "library(gtools)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
  • r-tibble
    • check: {{proc.lang}} <(echo "library(tibble)")
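Each check entry in the Requires lists is a shell command template in which {{proc.lang}} stands for the interpreter of the process. The sketch below shows how such a template could be filled in and run by hand. It assumes the interpreter is Rscript and uses a plain string replacement instead of the pipeline's own template rendering; render_check and run_check are hypothetical helpers.

```python
import subprocess


def render_check(template: str, lang: str) -> str:
    """Substitute the interpreter into a check template."""
    return template.replace("{{proc.lang}}", lang)


def run_check(command: str) -> bool:
    """Run a rendered check; True means the command exited with status 0."""
    # The `<(...)` process substitution requires bash rather than plain sh.
    result = subprocess.run(["bash", "-c", command], capture_output=True)
    return result.returncode == 0


template = '{{proc.lang}} <(echo "library(fgsea)")'
command = render_check(template, "Rscript")  # "Rscript" is an assumption
print(command)
```

Calling run_check(command) on a machine with bash and R installed would then report whether the fgsea library can be loaded.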
Bases
pipen_args.procgroup.ProcGroup pipen.procgroup.ProcGroup

Metabolic landscape analysis for scRNA-seq data

Abstracted from https://github.com/LocasaleLab/Single-Cell-Metabolic-Landscape

See the docs for more details: https://pwwang.github.io/biopipen/pipelines/scrna_metabolic_landscape

Reference: Xiao, Zhengtao, Ziwei Dai, and Jason W. Locasale. "Metabolic landscape of the tumor microenvironment at single cell resolution." Nature Communications 10.1 (2019): 1-12.