Metabolic landscape analysis for scRNA-seq data

Classes
Bases
biopipen.core.proc.Proc pipen.proc.Proc

This process calculates the pathway activities in different groups and subsets.

The cells are first grouped by subset, and the metabolic activities are then examined for each group within each subset.

For each subset, a heatmap and a violin plot will be generated. The heatmap shows the pathway activities for each group and each metabolic pathway.

[Figure: MetabolicPathwayActivity_heatmap]

The violin plot shows the distribution of the pathway activities for each group.

[Figure: MetabolicPathwayActivity_violin]

Attributes
  • cache Should we detect whether the jobs are cached?
  • desc The description of the process. Will use the summary from the docstring by default.
  • dirsig When checking the signature for caching, whether we should walk through the contents of the directory. This is sometimes time-consuming if the directory is big.
  • envs The arguments that are job-independent, useful for common options across jobs.
  • envs_depth How deep to update the envs when subclassed.
  • error_strategy How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • export When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • forks How many jobs to run simultaneously?
  • input The keys for the input channel
  • input_data The input data (will be computed for dependent processes)
  • lang The language for the script to run. Should be the path to the interpreter if lang is not in $PATH.
  • name The name of the process. Will use the class name by default.
  • nexts Computed from requires to build the process relationships
  • num_retries How many times to retry the jobs once an error occurs
  • order The execution order for this process. The bigger the number is, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined by Pipen.set_starts()
  • output The output keys for the output channel (the data will be computed)
  • output_data The output data (to pass to the next processes)
  • plugin_opts Options for process-level plugins
  • requires The dependency processes
  • scheduler The scheduler to run the jobs
  • scheduler_opts The options for the scheduler
  • script The script template for the process
  • submission_batch How many jobs to be submitted simultaneously
  • template Define the template engine to use. This could be either a template engine or a dict with key engine indicating the template engine and the rest the arguments passed to the constructor of the pipen.template.Template object. The template engine could be either the name of the engine (currently jinja2 and liquidpy are supported) or a subclass of pipen.template.Template. You can subclass pipen.template.Template to use your own template engine.
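The dict form of template can be read as "engine plus constructor arguments". A minimal sketch of that split, assuming a hypothetical helper split_template_spec that is not part of pipen, and a made-up constructor argument (mode) used only for illustration:

```python
def split_template_spec(template):
    """Split a template setting into (engine, constructor_args).

    Hypothetical helper for illustration; not pipen's implementation.
    """
    if isinstance(template, dict):
        opts = dict(template)        # copy so the caller's dict is untouched
        engine = opts.pop("engine")  # the "engine" key selects the engine
        return engine, opts          # the rest would go to the constructor
    # A bare engine name or a Template subclass carries no extra arguments
    return template, {}


print(split_template_spec("jinja2"))
print(split_template_spec({"engine": "liquidpy", "mode": "wild"}))
```

The copy keeps the original setting reusable, which matters if the same dict is shared by several processes.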
Envs
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity, typically the clusters. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. For example, if we have grouping_prefix = "cluster" and we have 1 and 2 in the grouping column, the groups will be named as cluster_1 and cluster_2. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • heatmap_devpars (ns) Device parameters for the heatmap
    • - width (type=int): Width of the heatmap
    • - height (type=int): Height of the heatmap
    • - res (type=int): Resolution of the heatmap
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • ntimes (type=int) Number of times to do the permutation
  • subsetting (type=auto;pgarg;readonly) How to subset the data: other columns in the metadata to do comparisons. For example, "TimePoint" or ["TimePoint", "Response"]. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. For example, if we have subsetting_prefix = "timepoint" and we have pre and post in the subsetting column, the subsets will be named as timepoint_pre and timepoint_post. If subsetting is a list, then this should also be a list of the same length. If a single string is given, it will be repeated to a list with the same length as subsetting. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
  • violin_devpars (ns) Device parameters for the violin plot
    • - width (type=int): Width of the violin plot
    • - height (type=int): Height of the violin plot
    • - res (type=int): Resolution of the violin plot
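The naming rules for grouping_prefix and subsetting_prefix described above can be sketched as follows. The helper names prefixed_names and expand_prefixes are hypothetical and only mirror the documented behavior; the process itself does this in its R script.

```python
def prefixed_names(prefix, values):
    """Name each value as "<prefix>_<value>"."""
    return [f"{prefix}_{value}" for value in values]


def expand_prefixes(subsetting, subsetting_prefix):
    """Repeat a single prefix so that there is one per subsetting column."""
    columns = [subsetting] if isinstance(subsetting, str) else list(subsetting)
    if isinstance(subsetting_prefix, str):
        return [subsetting_prefix] * len(columns)
    prefixes = list(subsetting_prefix)
    if len(prefixes) != len(columns):
        raise ValueError("subsetting_prefix must be as long as subsetting")
    return prefixes


print(prefixed_names("cluster", [1, 2]))             # ['cluster_1', 'cluster_2']
print(prefixed_names("timepoint", ["pre", "post"]))  # ['timepoint_pre', 'timepoint_post']
print(expand_prefixes(["TimePoint", "Response"], "subset"))  # ['subset', 'subset']
```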
Requires
  • r-complexheatmap
    • check: {{proc.lang}} <(echo "library(ComplexHeatmap)")
  • r-ggplot2
    • check: {{proc.lang}} <(echo "library(ggplot2)")
  • r-ggprism
    • check: {{proc.lang}} <(echo "library(ggprism)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
  • r-rcolorbrewer
    • check: {{proc.lang}} <(echo "library(RColorBrewer)")
  • r-reshape2
    • check: {{proc.lang}} <(echo "library(reshape2)")
  • r-scater
    • check: {{proc.lang}} <(echo "library(scater)")
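Each check above is a shell command template in which {{proc.lang}} stands for the interpreter. A small sketch of how such a template could be rendered, assuming the interpreter is Rscript (an assumed value, not stated above) and using hypothetical helpers render_check and bash_invocation:

```python
def render_check(template, lang):
    """Fill the {{proc.lang}} placeholder with the interpreter path."""
    return template.replace("{{proc.lang}}", lang)


def bash_invocation(command):
    """Wrap the command for bash, since <(...) is bash process substitution."""
    return ["bash", "-c", command]


check = '{{proc.lang}} <(echo "library(ggplot2)")'
command = render_check(check, "Rscript")  # "Rscript" is an assumed interpreter
print(command)                            # Rscript <(echo "library(ggplot2)")
print(bash_invocation(command))
```

The returned list could be handed to subprocess.run; a non-zero exit status would then indicate that the R package is missing.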
Classes
Methods
  • __init_subclass__() Infer the requirements, since we need them to build up the process relationships
  • from_proc(proc, name, desc, envs, envs_depth, cache, export, error_strategy, num_retries, forks, input_data, order, plugin_opts, requires, scheduler, scheduler_opts, submission_batch) (Type) Create a subclass of Proc using another Proc subclass or Proc itself
  • gc() GC process for the process to save memory after it's done
  • init() Init all other properties and jobs
  • log(level, msg, *args, logger) Log message for the process
  • run() Run the process
class

pipen.proc.ProcMeta(name, bases, namespace, **kwargs)

Bases
abc.ABCMeta

Meta class for Proc

Methods
  • __call__(cls, *args, **kwds) (Proc) Make sure Proc subclasses are singletons
  • __instancecheck__(cls, instance) Override for isinstance(instance, cls).
  • __repr__(cls) (str) Representation for the Proc subclasses
  • __subclasscheck__(cls, subclass) Override for issubclass(subclass, cls).
  • register(cls, subclass) Register a virtual subclass of an ABC.
staticmethod
register(cls, subclass)

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.
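Because register returns the subclass, it can be applied as a class decorator. The sketch below uses the standard library's abc module directly; Plottable and Heatmap are hypothetical names chosen for the example.

```python
from abc import ABC


class Plottable(ABC):
    """Hypothetical abstract base class used only for this example."""


@Plottable.register
class Heatmap:
    """Not derived from Plottable, but registered as a virtual subclass."""


print(issubclass(Heatmap, Plottable))    # True
print(isinstance(Heatmap(), Plottable))  # True
print(Plottable in Heatmap.__mro__)      # False
```

The last line shows what "virtual" means: the checks succeed, yet Plottable never enters the method resolution order, so nothing is inherited from it.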

staticmethod
__instancecheck__(cls, instance)

Override for isinstance(instance, cls).

staticmethod
__subclasscheck__(cls, subclass)

Override for issubclass(subclass, cls).

staticmethod
__repr__(cls) → str

Representation for the Proc subclasses

staticmethod
__call__(cls, *args, **kwds)

Make sure Proc subclasses are singletons

Parameters
  • *args (Any) Positional arguments for the constructor
  • **kwds (Any) Keyword arguments for the constructor
Returns (Proc)

The Proc instance
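One common way to get singleton behavior from a metaclass is to cache the instance inside __call__. The sketch below illustrates that general technique with a hypothetical SingletonMeta; it is not a copy of pipen's ProcMeta.

```python
from abc import ABCMeta


class SingletonMeta(ABCMeta):
    """Hypothetical metaclass: every class gets at most one instance."""

    _instances = {}

    def __call__(cls, *args, **kwds):
        # Build the instance on the first call, then reuse it
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwds)
        return cls._instances[cls]


class Base(metaclass=SingletonMeta):
    pass


class Child(Base):
    pass


print(Base() is Base())    # True
print(Child() is Child())  # True
print(Base() is Child())   # False
```

Keying the cache by class gives each subclass its own single instance rather than sharing one across the hierarchy.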

classmethod

from_proc(proc, name=None, desc=None, envs=None, envs_depth=None, cache=None, export=None, error_strategy=None, num_retries=None, forks=None, input_data=None, order=None, plugin_opts=None, requires=None, scheduler=None, scheduler_opts=None, submission_batch=None)

Create a subclass of Proc using another Proc subclass or Proc itself

Parameters
  • proc (Type) The Proc subclass
  • name (str, optional) The new name of the process
  • desc (str, optional) The new description of the process
  • envs (Mapping, optional) The arguments of the process, will overwrite the parent's. Items that are not specified will be inherited
  • envs_depth (int, optional) How deep to update the envs when subclassed.
  • cache (bool, optional) Whether we should check the cache for the jobs
  • export (bool, optional) When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • error_strategy (str, optional) How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • num_retries (int, optional) How many times to retry the jobs once an error occurs
  • forks (int, optional) New forks for the new process
  • input_data (Any, optional) The input data for the process. Only used when this process is a start process
  • order (int, optional) The order to execute the new process
  • plugin_opts (Mapping, optional) The new plugin options, unspecified items will be inherited.
  • requires (Sequence, optional) The required processes for the new process
  • scheduler (str, optional) The new scheduler to run the new process
  • scheduler_opts (Mapping, optional) The new scheduler options, unspecified items will be inherited.
  • submission_batch (int, optional) How many jobs to be submitted simultaneously
Returns (Type)

The new process class
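The interplay of envs and envs_depth can be pictured as a depth-limited dictionary update. The sketch below is one plausible reading, not pipen's code: merge_envs is a hypothetical helper, depth is assumed to count how many nesting levels are merged key by key, and the numbers are made up.

```python
def merge_envs(parent, new, depth=1):
    """Overlay `new` on `parent`, merging key by key for `depth` levels.

    Assumed semantics for illustration: below `depth`, a value from `new`
    replaces the parent's value wholesale.
    """
    merged = dict(parent)
    for key, value in new.items():
        both_dicts = isinstance(value, dict) and isinstance(merged.get(key), dict)
        if depth > 1 and both_dicts:
            merged[key] = merge_envs(merged[key], value, depth - 1)
        else:
            merged[key] = value
    return merged


parent = {"ncores": 1, "heatmap_devpars": {"width": 1000, "res": 100}}
child = {"heatmap_devpars": {"res": 300}}

print(merge_envs(parent, child, depth=1))
# {'ncores': 1, 'heatmap_devpars': {'res': 300}}
print(merge_envs(parent, child, depth=2))
# {'ncores': 1, 'heatmap_devpars': {'width': 1000, 'res': 300}}
```

Under this reading, a larger depth lets a subclass change one nested option (here res) without restating its siblings.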

classmethod

__init_subclass__()

Infer the requirements, since we need them to build up the process relationships

method

init()

Init all other properties and jobs

method

gc()

GC process for the process to save memory after it's done

method

log(level, msg, *args, logger=<LoggerAdapter pipen.core (WARNING)>)

Log message for the process

Parameters
  • level (int | str) The log level of the record
  • msg (str) The message to log
  • *args The arguments to format the message
  • logger (LoggerAdapter, optional) The logging logger
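The parameters follow the conventions of the standard logging module: msg is a format string and *args fills its placeholders. A stand-alone sketch using only the standard library (the free function log and the ListHandler class are hypothetical, and this is not the pipen method itself):

```python
import logging


def log(level, msg, *args, logger=None):
    """Log `msg % args` at `level`, given as an int or a level name."""
    logger = logger or logging.getLogger("example")
    if isinstance(level, str):
        level = getattr(logging, level.upper())  # "info" -> logging.INFO
    logger.log(level, msg, *args)


class ListHandler(logging.Handler):
    """Collect formatted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("example")
logger.setLevel(logging.DEBUG)
handler = ListHandler()
logger.addHandler(handler)

log("info", "Finished %d of %d jobs", 3, 10, logger=logger)
print(handler.messages)  # ['Finished 3 of 10 jobs']
```

Passing the arguments separately defers string formatting until a handler actually needs the message.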
method

run()

Run the process

Bases
biopipen.core.proc.Proc pipen.proc.Proc

This process performs enrichment analysis for the metabolic pathways for each group in each subset.

The enrichment analysis is done with fgsea package or the GSEA_R package.

Attributes
  • cache Should we detect whether the jobs are cached?
  • desc The description of the process. Will use the summary from the docstring by default.
  • dirsig When checking the signature for caching, whether we should walk through the contents of the directory. This is sometimes time-consuming if the directory is big.
  • envs The arguments that are job-independent, useful for common options across jobs.
  • envs_depth How deep to update the envs when subclassed.
  • error_strategy How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • export When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • forks How many jobs to run simultaneously?
  • input The keys for the input channel
  • input_data The input data (will be computed for dependent processes)
  • lang The language for the script to run. Should be the path to the interpreter if lang is not in $PATH.
  • name The name of the process. Will use the class name by default.
  • nexts Computed from requires to build the process relationships
  • num_retries How many times to retry the jobs once an error occurs
  • order The execution order for this process. The bigger the number is, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined by Pipen.set_starts()
  • output The output keys for the output channel (the data will be computed)
  • output_data The output data (to pass to the next processes)
  • plugin_opts Options for process-level plugins
  • requires The dependency processes
  • scheduler The scheduler to run the jobs
  • scheduler_opts The options for the scheduler
  • script The script template for the process
  • submission_batch How many jobs to be submitted simultaneously
  • template Define the template engine to use. This could be either a template engine or a dict with key engine indicating the template engine and the rest the arguments passed to the constructor of the pipen.template.Template object. The template engine could be either the name of the engine (currently jinja2 and liquidpy are supported) or a subclass of pipen.template.Template. You can subclass pipen.template.Template to use your own template engine.
Envs
  • fgsea (flag) Whether to do fast GSEA analysis using the fgsea package. If False, the GSEA_R package will be used.
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • prerank_method (choice) Method to use for gene preranking. Signal to noise: the larger the differences of the means (scaled by the standard deviations), the more distinct the gene expression is in each phenotype and the more the gene acts as a “class marker”. Absolute signal to noise: the absolute value of the signal to noise. T test: uses the difference of means scaled by the standard deviation and number of samples. Ratio of classes: uses the ratio of class means to calculate fold change for natural scale data. Diff of classes: uses the difference of class means to calculate fold change for natural scale data. Log2 ratio of classes: uses the log2 ratio of class means to calculate fold change for natural scale data. This is the recommended statistic for calculating fold change for log scale data.
    • - signal_to_noise: Signal to noise
    • - s2n: Alias of signal_to_noise
    • - abs_signal_to_noise: absolute signal to noise
    • - abs_s2n: Alias of abs_signal_to_noise
    • - t_test: T test
    • - ratio_of_classes: Also referred to as fold change
    • - diff_of_classes: Difference of class means
    • - log2_ratio_of_classes: Log2 ratio of class means
  • subsetting (type=auto;pgarg;readonly) How to subset the data: other column(s) in the metadata. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
  • top (type=int) Number of top enriched pathways to show
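The prerank_method choices can be made concrete with a per-gene scoring function. This is a sketch under assumptions: the formulas follow the usual GSEA-style definitions (sample standard deviations, no variance floor), prerank_score is a hypothetical helper, and the R implementation used by the process may differ in detail.

```python
from math import log2, sqrt
from statistics import mean, stdev

ALIASES = {"s2n": "signal_to_noise", "abs_s2n": "abs_signal_to_noise"}


def prerank_score(a, b, method="signal_to_noise"):
    """Score one gene from its expression values in classes a and b."""
    method = ALIASES.get(method, method)
    mu_a, mu_b = mean(a), mean(b)
    sd_a, sd_b = stdev(a), stdev(b)
    if method == "signal_to_noise":
        return (mu_a - mu_b) / (sd_a + sd_b)
    if method == "abs_signal_to_noise":
        return abs(mu_a - mu_b) / (sd_a + sd_b)
    if method == "t_test":
        return (mu_a - mu_b) / sqrt(sd_a**2 / len(a) + sd_b**2 / len(b))
    if method == "ratio_of_classes":
        return mu_a / mu_b
    if method == "diff_of_classes":
        return mu_a - mu_b
    if method == "log2_ratio_of_classes":
        return log2(mu_a / mu_b)
    raise ValueError(f"Unknown prerank method: {method}")


group = [2, 4, 6]  # made-up expression values for one gene
rest = [1, 2, 3]
for name in ("s2n", "t_test", "ratio_of_classes", "log2_ratio_of_classes"):
    print(name, prerank_score(group, rest, name))
```

Genes would then be ranked by this score before the enrichment step; the ratio-based methods assume strictly positive class means.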
Requires
  • r-fgsea
    • check: {{proc.lang}} <(echo "library(fgsea)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
Classes
Methods
  • __init_subclass__() Infer the requirements, since we need them to build up the process relationships
  • from_proc(proc, name, desc, envs, envs_depth, cache, export, error_strategy, num_retries, forks, input_data, order, plugin_opts, requires, scheduler, scheduler_opts, submission_batch) (Type) Create a subclass of Proc using another Proc subclass or Proc itself
  • gc() GC process for the process to save memory after it's done
  • init() Init all other properties and jobs
  • log(level, msg, *args, logger) Log message for the process
  • run() Run the process
class

pipen.proc.ProcMeta(name, bases, namespace, **kwargs)

Bases
abc.ABCMeta

Meta class for Proc

Methods
  • __call__(cls, *args, **kwds) (Proc) Make sure Proc subclasses are singletons
  • __instancecheck__(cls, instance) Override for isinstance(instance, cls).
  • __repr__(cls) (str) Representation for the Proc subclasses
  • __subclasscheck__(cls, subclass) Override for issubclass(subclass, cls).
  • register(cls, subclass) Register a virtual subclass of an ABC.
staticmethod
register(cls, subclass)

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.

staticmethod
__instancecheck__(cls, instance)

Override for isinstance(instance, cls).

staticmethod
__subclasscheck__(cls, subclass)

Override for issubclass(subclass, cls).

staticmethod
__repr__(cls) → str

Representation for the Proc subclasses

staticmethod
__call__(cls, *args, **kwds)

Make sure Proc subclasses are singletons

Parameters
  • *args (Any) Positional arguments for the constructor
  • **kwds (Any) Keyword arguments for the constructor
Returns (Proc)

The Proc instance

classmethod

from_proc(proc, name=None, desc=None, envs=None, envs_depth=None, cache=None, export=None, error_strategy=None, num_retries=None, forks=None, input_data=None, order=None, plugin_opts=None, requires=None, scheduler=None, scheduler_opts=None, submission_batch=None)

Create a subclass of Proc using another Proc subclass or Proc itself

Parameters
  • proc (Type) The Proc subclass
  • name (str, optional) The new name of the process
  • desc (str, optional) The new description of the process
  • envs (Mapping, optional) The arguments of the process, will overwrite the parent's. Items that are not specified will be inherited
  • envs_depth (int, optional) How deep to update the envs when subclassed.
  • cache (bool, optional) Whether we should check the cache for the jobs
  • export (bool, optional) When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • error_strategy (str, optional) How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • num_retries (int, optional) How many times to retry the jobs once an error occurs
  • forks (int, optional) New forks for the new process
  • input_data (Any, optional) The input data for the process. Only used when this process is a start process
  • order (int, optional) The order to execute the new process
  • plugin_opts (Mapping, optional) The new plugin options, unspecified items will be inherited.
  • requires (Sequence, optional) The required processes for the new process
  • scheduler (str, optional) The new scheduler to run the new process
  • scheduler_opts (Mapping, optional) The new scheduler options, unspecified items will be inherited.
  • submission_batch (int, optional) How many jobs to be submitted simultaneously
Returns (Type)

The new process class

classmethod

__init_subclass__()

Infer the requirements, since we need them to build up the process relationships

method

init()

Init all other properties and jobs

method

gc()

GC process for the process to save memory after it's done

method

log(level, msg, *args, logger=<LoggerAdapter pipen.core (WARNING)>)

Log message for the process

Parameters
  • level (int | str) The log level of the record
  • msg (str) The message to log
  • *args The arguments to format the message
  • logger (LoggerAdapter, optional) The logging logger
method

run()

Run the process

Bases
biopipen.core.proc.Proc pipen.proc.Proc

Intra-subset metabolic features - Enrichment analysis in detail

Similar to the MetabolicFeatures process, this process performs enrichment analysis for the metabolic pathways for each subset in each group, instead of each group in each subset.

Attributes
  • cache Should we detect whether the jobs are cached?
  • desc The description of the process. Will use the summary from the docstring by default.
  • dirsig When checking the signature for caching, whether we should walk through the contents of the directory. This is sometimes time-consuming if the directory is big.
  • envs The arguments that are job-independent, useful for common options across jobs.
  • envs_depth How deep to update the envs when subclassed.
  • error_strategy How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • export When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • forks How many jobs to run simultaneously?
  • input The keys for the input channel
  • input_data The input data (will be computed for dependent processes)
  • lang The language for the script to run. Should be the path to the interpreter if lang is not in $PATH.
  • name The name of the process. Will use the class name by default.
  • nexts Computed from requires to build the process relationships
  • num_retries How many times to retry the jobs once an error occurs
  • order The execution order for this process. The bigger the number is, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined by Pipen.set_starts()
  • output The output keys for the output channel (the data will be computed)
  • output_data The output data (to pass to the next processes)
  • plugin_opts Options for process-level plugins
  • requires The dependency processes
  • scheduler The scheduler to run the jobs
  • scheduler_opts The options for the scheduler
  • script The script template for the process
  • submission_batch How many jobs to be submitted simultaneously
  • template Define the template engine to use. This could be either a template engine or a dict with key engine indicating the template engine and the rest the arguments passed to the constructor of the pipen.template.Template object. The template engine could be either the name of the engine (currently jinja2 and liquidpy are supported) or a subclass of pipen.template.Template. You can subclass pipen.template.Template to use your own template engine.
Envs
  • fgsea (flag) Whether to do fast gsea analysis
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • prerank_method (choice) Method to use for gene preranking. Signal to noise: the larger the differences of the means (scaled by the standard deviations), the more distinct the gene expression is in each phenotype and the more the gene acts as a “class marker”. Absolute signal to noise: the absolute value of the signal to noise. T test: uses the difference of means scaled by the standard deviation and number of samples. Ratio of classes: uses the ratio of class means to calculate fold change for natural scale data. Diff of classes: uses the difference of class means to calculate fold change for natural scale data. Log2 ratio of classes: uses the log2 ratio of class means to calculate fold change for natural scale data. This is the recommended statistic for calculating fold change for log scale data.
    • - signal_to_noise: Signal to noise
    • - s2n: Alias of signal_to_noise
    • - abs_signal_to_noise: absolute signal to noise
    • - abs_s2n: Alias of abs_signal_to_noise
    • - t_test: T test
    • - ratio_of_classes: Also referred to as fold change
    • - diff_of_classes: Difference of class means
    • - log2_ratio_of_classes: Log2 ratio of class means
  • subsetting (type=auto;pgarg;readonly) How to subset the data: other column(s) in the metadata. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_comparison (type=json;pgarg;readonly) How to compare the subsets. Defaults to ScrnaMetabolicLandscape.subsetting_comparison
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
  • top (type=int) Number of top enriched pathways to show
Requires
  • r-fgsea
    • check: {{proc.lang}} <(echo "library(fgsea)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
  • r-scater
    • check: {{proc.lang}} <(echo "library(scater)")
Classes
Methods
  • __init_subclass__() Infer the requirements, since we need them to build up the process relationships
  • from_proc(proc, name, desc, envs, envs_depth, cache, export, error_strategy, num_retries, forks, input_data, order, plugin_opts, requires, scheduler, scheduler_opts, submission_batch) (Type) Create a subclass of Proc using another Proc subclass or Proc itself
  • gc() GC process for the process to save memory after it's done
  • init() Init all other properties and jobs
  • log(level, msg, *args, logger) Log message for the process
  • run() Run the process
class

pipen.proc.ProcMeta(name, bases, namespace, **kwargs)

Bases
abc.ABCMeta

Meta class for Proc

Methods
  • __call__(cls, *args, **kwds) (Proc) Make sure Proc subclasses are singletons
  • __instancecheck__(cls, instance) Override for isinstance(instance, cls).
  • __repr__(cls) (str) Representation for the Proc subclasses
  • __subclasscheck__(cls, subclass) Override for issubclass(subclass, cls).
  • register(cls, subclass) Register a virtual subclass of an ABC.
staticmethod
register(cls, subclass)

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.

staticmethod
__instancecheck__(cls, instance)

Override for isinstance(instance, cls).

staticmethod
__subclasscheck__(cls, subclass)

Override for issubclass(subclass, cls).

staticmethod
__repr__(cls) → str

Representation for the Proc subclasses

staticmethod
__call__(cls, *args, **kwds)

Make sure Proc subclasses are singletons

Parameters
  • *args (Any) Positional arguments for the constructor
  • **kwds (Any) Keyword arguments for the constructor
Returns (Proc)

The Proc instance

classmethod

from_proc(proc, name=None, desc=None, envs=None, envs_depth=None, cache=None, export=None, error_strategy=None, num_retries=None, forks=None, input_data=None, order=None, plugin_opts=None, requires=None, scheduler=None, scheduler_opts=None, submission_batch=None)

Create a subclass of Proc using another Proc subclass or Proc itself

Parameters
  • proc (Type) The Proc subclass
  • name (str, optional) The new name of the process
  • desc (str, optional) The new description of the process
  • envs (Mapping, optional) The arguments of the process, will overwrite the parent's. Items that are not specified will be inherited
  • envs_depth (int, optional) How deep to update the envs when subclassed.
  • cache (bool, optional) Whether we should check the cache for the jobs
  • export (bool, optional) When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • error_strategy (str, optional) How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • num_retries (int, optional) How many times to retry the jobs once an error occurs
  • forks (int, optional) New forks for the new process
  • input_data (Any, optional) The input data for the process. Only used when this process is a start process
  • order (int, optional) The order to execute the new process
  • plugin_opts (Mapping, optional) The new plugin options, unspecified items will be inherited.
  • requires (Sequence, optional) The required processes for the new process
  • scheduler (str, optional) The new scheduler to run the new process
  • scheduler_opts (Mapping, optional) The new scheduler options, unspecified items will be inherited.
  • submission_batch (int, optional) How many jobs to be submitted simultaneously
Returns (Type)

The new process class

classmethod

__init_subclass__()

Infer the requirements, since we need them to build up the process relationships

method

init()

Init all other properties and jobs

method

gc()

GC process for the process to save memory after it's done

method

log(level, msg, *args, logger=<LoggerAdapter pipen.core (WARNING)>)

Log message for the process

Parameters
  • level (int | str) The log level of the record
  • msg (str) The message to log
  • *args The arguments to format the message
  • logger (LoggerAdapter, optional) The logging logger
method

run()

Run the process

Bases
biopipen.core.proc.Proc pipen.proc.Proc

Calculate Metabolic Pathway heterogeneity.

For each subset, the normalized enrichment score (NES) of each metabolic pathway is calculated for each group. The NES is calculated by comparing the enrichment score of the subset to the enrichment scores of the same subset in the permutations. The p-value is calculated by comparing the NES to the NESs of the same subset in the permutations. The heterogeneity can be reflected by the NES values and the p-values in different groups for the metabolic pathways.
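The normalization described above can be sketched in the usual GSEA style. That style is an assumption here, as is the helper name normalized_enrichment; the process's R script may differ in detail. The observed enrichment score is divided by the mean magnitude of the same-sign permuted scores, and the p-value is the share of those permuted scores that are at least as extreme. Since every same-sign score is divided by the same constant, comparing raw scores gives the same p-value as comparing normalized ones.

```python
from statistics import mean


def normalized_enrichment(es, permuted_es):
    """Return (NES, p-value) for one pathway, or (None, None) if undefined."""
    # Only permutations with the same sign as the observed score are used
    same_sign = [p for p in permuted_es if (p >= 0) == (es >= 0)]
    if not same_sign:
        return None, None
    scale = mean(abs(p) for p in same_sign)
    if scale == 0:
        return None, None
    nes = es / scale
    # Empirical p-value: share of permuted scores at least as extreme
    extreme = sum(1 for p in same_sign if abs(p) >= abs(es))
    return nes, extreme / len(same_sign)


# Made-up scores for illustration
nes, pval = normalized_enrichment(0.6, [0.2, 0.4, 0.6, -0.5])
print(nes, pval)
```

With more permutations (the ntimes-style setting of such analyses), the empirical p-value gains resolution, since its smallest non-zero value is one over the number of same-sign permutations.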

MetabolicPathwayHeterogeneity

Attributes
  • cache Should we detect whether the jobs are cached?
  • desc The description of the process. Will use the summary from the docstring by default.
  • dirsig When checking the signature for caching, whether we should walk through the contents of the directory. This is sometimes time-consuming if the directory is big.
  • envs The arguments that are job-independent, useful for common options across jobs.
  • envs_depth How deep to update the envs when subclassed.
  • error_strategy How to deal with the errors
    • - retry, ignore, halt
    • - halt to halt the whole pipeline, no submitting new jobs
    • - terminate to just terminate the job itself
  • export When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • forks How many jobs to run simultaneously?
  • input The keys for the input channel
  • input_data The input data (will be computed for dependent processes)
  • lang The language for the script to run. Should be the path to the interpreter if lang is not in $PATH.
  • name The name of the process. Will use the class name by default.
  • nexts Computed from requires to build the process relationships
  • num_retries How many times to retry the jobs once an error occurs
  • order The execution order for this process. The bigger the number is, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined by Pipen.set_starts()
  • output The output keys for the output channel (the data will be computed)
  • output_data The output data (to pass to the next processes)
  • plugin_opts Options for process-level plugins
  • requires The dependency processes
  • scheduler The scheduler to run the jobs
  • scheduler_opts The options for the scheduler
  • script The script template for the process
  • submission_batch How many jobs to be submitted simultaneously
  • template Define the template engine to use. This could be either a template engine or a dict with key engine indicating the template engine and the rest the arguments passed to the constructor of the pipen.template.Template object. The template engine could be either the name of the engine (currently jinja2 and liquidpy are supported) or a subclass of pipen.template.Template. You can subclass pipen.template.Template to use your own template engine.
Envs
  • bubble_devpars (ns) The devpars for the bubble plot
    • - width (type=int): The width of the plot
    • - height (type=int): The height of the plot
    • - res (type=int): The resolution of the plot
  • gmtfile (pgarg) The GMT file with the metabolic pathways. Defaults to ScrnaMetabolicLandscape.gmtfile
  • grouping (type=auto;pgarg;readonly) Defines the basic groups to investigate the metabolic activity. Defaults to ScrnaMetabolicLandscape.grouping
  • grouping_prefix (type=auto;pgarg;readonly) Working as a prefix to group names. Defaults to ScrnaMetabolicLandscape.grouping_prefix
  • ncores (type=int;pgarg) Number of cores to use for parallelization. Defaults to ScrnaMetabolicLandscape.ncores
  • pathway_pval_cutoff (type=float) The p-value cutoff to select the enriched pathways
  • select_pcs (type=float) Select the PCs to use for the analysis.
  • subsetting (type=auto;pgarg;readonly) How do we subset the data. Another column(s) in the metadata. Defaults to ScrnaMetabolicLandscape.subsetting
  • subsetting_prefix (type=auto;pgarg;readonly) Working as a prefix to subset names. Defaults to ScrnaMetabolicLandscape.subsetting_prefix
Requires
  • r-data.table
    • check: {{proc.lang}} <(echo "library(data.table)")
  • r-dplyr
    • check: {{proc.lang}} <(echo "library(dplyr)")
  • r-enrichr
    • check: {{proc.lang}} <(echo "library(enrichR)")
  • r-fgsea
    • check: {{proc.lang}} <(echo "library(fgsea)")
  • r-ggplot2
    • check: {{proc.lang}} <(echo "library(ggplot2)")
  • r-ggprism
    • check: {{proc.lang}} <(echo "library(ggprism)")
  • r-gtools
    • check: {{proc.lang}} <(echo "library(gtools)")
  • r-parallel
    • check: {{proc.lang}} <(echo "library(parallel)")
  • r-tibble
    • check: {{proc.lang}} <(echo "library(tibble)")
Classes
Methods
  • __init_subclass__() Do the requirements inferring since we need them to build up the process relationship
  • from_proc(proc, name, desc, envs, envs_depth, cache, export, error_strategy, num_retries, forks, input_data, order, plugin_opts, requires, scheduler, scheduler_opts, submission_batch) (Type) Create a subclass of Proc using another Proc subclass or Proc itself
  • gc() GC process for the process to save memory after it's done
  • init() Init all other properties and jobs
  • log(level, msg, *args, logger) Log message for the process
  • run() Run the process
class

pipen.proc.ProcMeta(name, bases, namespace, **kwargs)

Bases
abc.ABCMeta

Meta class for Proc

Methods
  • __call__(cls, *args, **kwds) (Proc) Make sure Proc subclasses are singletons
  • __instancecheck__(cls, instance) Override for isinstance(instance, cls).
  • __repr__(cls) (str) Representation for the Proc subclasses
  • __subclasscheck__(cls, subclass) Override for issubclass(subclass, cls).
  • register(cls, subclass) Register a virtual subclass of an ABC.
staticmethod
register(cls, subclass)

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.
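
As a minimal illustration of virtual subclass registration, the sketch below uses only the standard-library `abc` module. The class names `Runner`, `LocalRunner` and `Unrelated` are hypothetical and are not part of pipen.

```python
from abc import ABC


class Runner(ABC):
    """Hypothetical abstract base class, used only for illustration."""


# register() returns the class it receives, so it can be used as a decorator
@Runner.register
class LocalRunner:
    """Does not inherit from Runner; registered as a virtual subclass."""


class Unrelated:
    """Neither inherits from Runner nor is registered with it."""


# The virtual subclass passes the checks without Runner being in its MRO
is_virtual_subclass = issubclass(LocalRunner, Runner)
is_virtual_instance = isinstance(LocalRunner(), Runner)
```

Because registration does not change the method resolution order, a virtual subclass inherits no methods or attributes from the ABC; only the `isinstance` and `issubclass` checks are affected.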

staticmethod
__instancecheck__(cls, instance)

Override for isinstance(instance, cls).

staticmethod
__subclasscheck__(cls, subclass)

Override for issubclass(subclass, cls).

staticmethod
__repr__(cls) → str

Representation for the Proc subclasses

staticmethod
__call__(cls, *args, **kwds)

Make sure Proc subclasses are singletons

Parameters
  • *args (Any) Positional arguments for the constructor
  • **kwds (Any) Keyword arguments for the constructor
Returns (Proc)

The Proc instance
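
One common way to get singleton behaviour from a metaclass `__call__` is sketched below. This is an assumption about the general technique, not pipen's actual implementation; `SingletonMeta`, `BaseProc`, `Align` and `Count` are hypothetical names.

```python
class SingletonMeta(type):
    """Hypothetical metaclass: each class gets at most one instance."""

    _instances = {}

    def __call__(cls, *args, **kwds):
        # Construct the instance only on the first call for this class
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwds)
        return cls._instances[cls]


class BaseProc(metaclass=SingletonMeta):
    """Stand-in for a process base class (not pipen's Proc)."""


class Align(BaseProc):
    pass


class Count(BaseProc):
    pass


# Repeated construction returns the same object for a given class
first = Align()
second = Align()
```

The cache is keyed by class, so every subclass keeps its own single instance rather than sharing one with its parent.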

classmethod

from_proc(proc, name=None, desc=None, envs=None, envs_depth=None, cache=None, export=None, error_strategy=None, num_retries=None, forks=None, input_data=None, order=None, plugin_opts=None, requires=None, scheduler=None, scheduler_opts=None, submission_batch=None)

Create a subclass of Proc using another Proc subclass or Proc itself

Parameters
  • proc (Type) The Proc subclass
  • name (str, optional) The new name of the process
  • desc (str, optional) The new description of the process
  • envs (Mapping, optional) The arguments of the process, which will overwrite the parent's. Items that are not specified will be inherited
  • envs_depth (int, optional) How deep to update the envs when subclassed.
  • cache (bool, optional) Whether we should check the cache for the jobs
  • export (bool, optional) When True, the results will be exported to <pipeline.outdir>. Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processes
  • error_strategy (str, optional) How to deal with the errors
    • retry, ignore, halt
    • halt to halt the whole pipeline, submitting no new jobs
    • terminate to just terminate the job itself
  • num_retries (int, optional) How many times to retry the jobs once an error occurs
  • forks (int, optional) New forks for the new process
  • input_data (Any, optional) The input data for the process. Only when this process is a start process
  • order (int, optional) The order to execute the new process
  • plugin_opts (Mapping, optional) The new plugin options, unspecified items will be inherited.
  • requires (Sequence, optional) The required processes for the new process
  • scheduler (str, optional) The new scheduler to run the new process
  • scheduler_opts (Mapping, optional) The new scheduler options, unspecified items will be inherited.
  • submission_batch (int, optional) How many jobs to be submitted simultaneously
Returns (Type)

The new process class
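
The general idea of deriving a new class from an existing one while overriding selected attributes can be sketched in plain Python. This is a simplified illustration under stated assumptions, not pipen's code: `BaseProc` is a hypothetical stand-in, only three of the parameters are modelled, and the option keys `report` and `verbose` are made up.

```python
class BaseProc:
    """Hypothetical stand-in for a process class (not pipen's Proc)."""

    forks = 1
    plugin_opts = {}

    @classmethod
    def from_proc(cls, proc, name=None, forks=None, plugin_opts=None):
        # Collect only the attributes that were explicitly given
        attrs = {}
        if forks is not None:
            attrs["forks"] = forks
        if plugin_opts is not None:
            # Unspecified keys are inherited from the parent mapping
            attrs["plugin_opts"] = {**proc.plugin_opts, **plugin_opts}
        # Build a new subclass of `proc` carrying the overrides
        return type(name or proc.__name__, (proc,), attrs)


class Align(BaseProc):
    forks = 4
    plugin_opts = {"report": True, "verbose": False}


# Override one option; forks and the other option come from Align
Align2 = BaseProc.from_proc(Align, name="Align2", plugin_opts={"verbose": True})
```

Arguments left at `None` are simply not set on the new class, so normal attribute lookup falls back to the parent's values.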

classmethod

__init_subclass__()

Do the requirements inferring since we need them to build up the process relationship

method

init()

Init all other properties and jobs

method

gc()

GC process for the process to save memory after it's done

method

log(level, msg, *args, logger=<LoggerAdapter pipen.core (WARNING)>)

Log message for the process

Parameters
  • level (int | str) The log level of the record
  • msg (str) The message to log
  • *args The arguments to format the message
  • logger (LoggerAdapter, optional) The logging logger
method

run()

Run the process

Bases
pipen_args.procgroup.ProcGroup pipen.procgroup.ProcGroup

Metabolic landscape analysis for scRNA-seq data

An abstract from https://github.com/LocasaleLab/Single-Cell-Metabolic-Landscape

See docs here for more details https://pwwang.github.io/biopipen/pipelines/scrna_metabolic_landscape

Reference: Xiao, Zhengtao, Ziwei Dai, and Jason W. Locasale. "Metabolic landscape of the tumor microenvironment at single cell resolution." Nature communications 10.1 (2019): 1-12.

Attributes
  • parser Pass arguments to initialize the parser
    The parser is a singleton and is by default initialized at the plugin.on_init() hook, which usually happens after the initialization of a process group.
Classes
Methods
  • __init_subclass__() This method is called when a class is subclassed.
  • add_proc(self_or_method, proc) (Union) Add a process to the proc group
  • as_pipen(name, desc, outdir, **kwargs) (Pipen) Convert the pipeline to a Pipen instance
  • post_init() Load runtime processes
class

pipen.procgroup.ProcGropuMeta(name, bases, namespace, **kwargs)

Bases
abc.ABCMeta

Meta class for ProcGroup

Methods
staticmethod
register(cls, subclass)

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.

staticmethod
__instancecheck__(cls, instance)

Override for isinstance(instance, cls).

staticmethod
__subclasscheck__(cls, subclass)

Override for issubclass(subclass, cls).

staticmethod
__call__(cls, *args, **kwds)

Make sure Proc subclasses are singletons

Parameters
  • *args Positional arguments for the constructor
  • **kwds Keyword arguments for the constructor
Returns

The Proc instance

classmethod

__init_subclass__()

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

staticmethod

add_proc(self_or_method, proc=None)

Add a process to the proc group

It works either as a decorator to the process directly or as a decorator to a method that returns the process.

Parameters
  • self_or_method (Union) The proc group instance or a method that returns the process
  • proc (Optional, optional) The process class if self_or_method is the proc group
Returns (Union)

The process class if self_or_method is the proc group, or a cached property that returns the process class
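
The two calling styles described above can be mimicked with a small pure-Python sketch. It assumes nothing about pipen's internals: `Group`, `MyGroup`, `p_align`, `Align` and `Count` are hypothetical names, and `functools.cached_property` stands in for whatever caching the library uses.

```python
from functools import cached_property


class Group:
    """Hypothetical stand-in for a process group (not pipen's ProcGroup)."""

    def __init__(self):
        self.procs = {}

    def add_proc(self_or_method, proc=None):
        if isinstance(self_or_method, Group):
            # Called as group.add_proc(SomeProc): register and return the class
            self_or_method.procs[proc.__name__] = proc
            return proc

        # Used as a decorator on a method: wrap it in a cached property
        # that registers the returned class on first access
        method = self_or_method

        def getter(self):
            built = method(self)
            self.procs[built.__name__] = built
            return built

        return cached_property(getter)


class MyGroup(Group):
    @Group.add_proc
    def p_align(self):
        class Align:
            pass

        return Align


class Count:
    pass


group = MyGroup()
returned = group.add_proc(Count)  # direct style
align_cls = group.p_align         # decorator style, built on first access
```

In the decorator style the class is created lazily and cached on the instance, so accessing `group.p_align` again returns the same class object.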

method

as_pipen(name=None, desc=None, outdir=None, **kwargs)

Convert the pipeline to a Pipen instance

Parameters
  • name (str | None, optional) The name of the pipeline
  • desc (str | None, optional) The description of the pipeline
  • outdir (str | os.PathLike | None, optional) The output directory of the pipeline
  • **kwargs The keyword arguments to pass to Pipen
Returns (Pipen)

The Pipen instance

method

post_init()

Load runtime processes