biopipen.ns.plot
Plotting data
VennDiagram
(Proc) — Plot Venn diagram</>Heatmap
(Proc) — Plot heatmaps usingComplexHeatmap
</>ROC
(Proc) — Plot ROC curve usingplotROC
.</>Manhattan
(Proc) — Plot Manhattan plot.</>QQPlot
(Proc) — Generate QQ-plot or PP-plot using qqplotr.</>Scatter
(Proc) — Generate scatter plot using ggplot2.</>
biopipen.ns.plot.
VennDiagram
(
*args
, **kwds
)
→ Proc
Plot Venn diagram
Needs ggVennDiagram
cache
— Should we detect whether the jobs are cached?desc
— The description of the process. Will use the summary fromthe docstring by default.dirsig
— When checking the signature for caching, whether should we walkthrough the content of the directory? This is sometimes time-consuming if the directory is big.envs
— The arguments that are job-independent, useful for common optionsacross jobs.envs_depth
— How deep to update the envs when subclassed.error_strategy
— How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
export
— When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processesforks
— How many jobs to run simultaneously?input
— The keys for the input channelinput_data
— The input data (will be computed for dependent processes)lang
— The language for the script to run. Should be the path to theinterpreter iflang
is not in$PATH
.name
— The name of the process. Will use the class name by default.nexts
— Computed fromrequires
to build the process relationshipsnum_retries
— How many times to retry to jobs once error occursorder
— The execution order for this process. The bigger the numberis, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined byPipen.set_starts()
output
— The output keys for the output channel(the data will be computed)output_data
— The output data (to pass to the next processes)plugin_opts
— Options for process-level pluginsrequires
— The dependency processesscheduler
— The scheduler to run the jobsscheduler_opts
— The options for the schedulerscript
— The script template for the processsubmission_batch
— How many jobs to be submited simultaneouslytemplate
— Define the template engine to use.This could be either a template engine or a dict with keyengine
indicating the template engine and the rest the arguments passed to the constructor of thepipen.template.Template
object. The template engine could be either the name of the engine, currently jinja2 and liquidpy are supported, or a subclass ofpipen.template.Template
. You can subclasspipen.template.Template
to use your own template engine.
infile
— The input file for dataIfenvs.intype
is raw, it should be a data frame with row names as categories and only column as elements separated by comma (,
) If it iscomputed
, it should be a data frame with row names the elements and columns the categories. The data should be binary indicator (0, 1
) indicating whether the elements are present in the categories.
outfile
— The output figure file
args
— Additional arguments forggVennDiagram()
devpars
— The parameters forpng()
ggs
— Additional ggplot expression to adjust the plotinopts
— The options forread.table()
to readin.infile
intype
—raw
orcomputed
. Seein.infile
__init_subclass__
(
)
— Do the requirements inferring since we need them to build up theprocess relationship </>from_proc
(
proc
,name
,desc
,envs
,envs_depth
,cache
,export
,error_strategy
,num_retries
,forks
,input_data
,order
,plugin_opts
,requires
,scheduler
,scheduler_opts
,submission_batch
)
(Type) — Create a subclass of Proc using another Proc subclass or Proc itself</>gc
(
)
— GC process for the process to save memory after it's done</>init
(
)
— Init all other properties and jobs</>log
(
level
,msg
,*args
,logger
)
— Log message for the process</>run
(
)
— Run the process</>
pipen.proc.
ProcMeta
(
name
, bases
, namespace
, **kwargs
)
Meta class for Proc
__call__
(
cls
,*args
,**kwds
)
(Proc) — Make sure Proc subclasses are singletons</>__instancecheck__
(
cls
,instance
)
— Override for isinstance(instance, cls).</>__repr__
(
cls
)
(str) — Representation for the Proc subclasses</>__subclasscheck__
(
cls
,subclass
)
— Override for issubclass(subclass, cls).</>register
(
cls
,subclass
)
— Register a virtual subclass of an ABC.</>
register
(
cls
, subclass
)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
__instancecheck__
(
cls
, instance
)
Override for isinstance(instance, cls).
__subclasscheck__
(
cls
, subclass
)
Override for issubclass(subclass, cls).
__repr__
(
cls
)
→ strRepresentation for the Proc subclasses
__call__
(
cls
, *args
, **kwds
)
Make sure Proc subclasses are singletons
*args
(Any) — and**kwds
(Any) — Arguments for the constructor
The Proc instance
from_proc
(
proc
, name=None
, desc=None
, envs=None
, envs_depth=None
, cache=None
, export=None
, error_strategy=None
, num_retries=None
, forks=None
, input_data=None
, order=None
, plugin_opts=None
, requires=None
, scheduler=None
, scheduler_opts=None
, submission_batch=None
)
Create a subclass of Proc using another Proc subclass or Proc itself
proc
(Type) — The Proc subclassname
(str, optional) — The new name of the processdesc
(str, optional) — The new description of the processenvs
(Mapping, optional) — The arguments of the process, will overwrite parent oneThe items that are specified will be inheritedenvs_depth
(int, optional) — How deep to update the envs when subclassed.cache
(bool, optional) — Whether we should check the cache for the jobsexport
(bool, optional) — When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processeserror_strategy
(str, optional) — How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
num_retries
(int, optional) — How many times to retry to jobs once error occursforks
(int, optional) — New forks for the new processinput_data
(Any, optional) — The input data for the process. Only when this processis a start processorder
(int, optional) — The order to execute the new processplugin_opts
(Mapping, optional) — The new plugin options, unspecified items will beinherited.requires
(Sequence, optional) — The required processes for the new processscheduler
(str, optional) — The new shedular to run the new processscheduler_opts
(Mapping, optional) — The new scheduler options, unspecified items willbe inherited.submission_batch
(int, optional) — How many jobs to be submited simultaneously
The new process class
__init_subclass__
(
)
Do the requirements inferring since we need them to build up theprocess relationship
init
(
)
Init all other properties and jobs
gc
(
)
GC process for the process to save memory after it's done
log
(
level
, msg
, *args
, logger=<LoggerAdapter pipen.core (WARNING)>
)
Log message for the process
level
(int | str) — The log level of the recordmsg
(str) — The message to log*args
— The arguments to format the messagelogger
(LoggerAdapter, optional) — The logging logger
run
(
)
Run the process
biopipen.ns.plot.
Heatmap
(
*args
, **kwds
)
→ Proc
Plot heatmaps using ComplexHeatmap
cache
— Should we detect whether the jobs are cached?desc
— The description of the process. Will use the summary fromthe docstring by default.dirsig
— When checking the signature for caching, whether should we walkthrough the content of the directory? This is sometimes time-consuming if the directory is big.envs
— The arguments that are job-independent, useful for common optionsacross jobs.envs_depth
— How deep to update the envs when subclassed.error_strategy
— How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
export
— When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processesforks
— How many jobs to run simultaneously?input
— The keys for the input channelinput_data
— The input data (will be computed for dependent processes)lang
— The language for the script to run. Should be the path to theinterpreter iflang
is not in$PATH
.name
— The name of the process. Will use the class name by default.nexts
— Computed fromrequires
to build the process relationshipsnum_retries
— How many times to retry to jobs once error occursorder
— The execution order for this process. The bigger the numberis, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined byPipen.set_starts()
output
— The output keys for the output channel(the data will be computed)output_data
— The output data (to pass to the next processes)plugin_opts
— Options for process-level pluginsrequires
— The dependency processesscheduler
— The scheduler to run the jobsscheduler_opts
— The options for the schedulerscript
— The script template for the processsubmission_batch
— How many jobs to be submited simultaneouslytemplate
— Define the template engine to use.This could be either a template engine or a dict with keyengine
indicating the template engine and the rest the arguments passed to the constructor of thepipen.template.Template
object. The template engine could be either the name of the engine, currently jinja2 and liquidpy are supported, or a subclass ofpipen.template.Template
. You can subclasspipen.template.Template
to use your own template engine.
>>> pipen run plot Heatmap >>> --in.infile data.txt >>> --in.annofiles anno.txt >>> --envs.args.row_names_gp 'r:fontsize5' >>> --envs.args.column_names_gp 'r:fontsize5' >>> --envs.args.clustering_distance_rows pearson >>> --envs.args.clustering_distance_columns pearson >>> --envs.args.show_row_names false >>> --envs.args.row_split 3 >>> --args.devpars.width 5000 >>> --args.devpars.height 5000 >>> --args.draw.merge_legends >>> --envs.args.heatmap_legend_param.title AUC >>> --envs.args.row_dend_reorder >>> --envs.args.column_dend_reorder >>> --envs.args.top_annotation >>> 'r:HeatmapAnnotation( >>> Mutation = as.matrix(annos[,(length(groups)+1):ncol(annos)]) >>> )' >>> --envs.args.right_annotation >>> 'r:rowAnnotation( >>> AUC = anno_boxplot(as.matrix(data), outline = F) >>> )' >>> --args.globals >>> 'fontsize8 = gpar(fontsize = 12); >>> fontsize5 = gpar(fontsize = 8); >>> groups = c ("Group1", "Group2", "Group3")' >>> --args.seed 8525
annofiles
— The files for annotation datainfile
— The data matrix file
outdir
— Other data of the heatmapIncluding RDS file of the heatmap, row clusters and col clusters.outfile
— The heatmap plot
anopts
— Options forread.table()
to readin.annofiles
args
— Arguments forComplexHeatmap::Heatmap()
devpars
— The parameters for device.draw
— Options forComplexHeatmap::draw()
globals
— Some globals for the expression inargs
to be evaluatedinopts
— Options forread.table()
to readin.infile
seed
— The seed
bioconductor-complexheatmap
—- check: {{proc.lang}} <(echo "library(ComplexHeatmap)")
__init_subclass__
(
)
— Do the requirements inferring since we need them to build up theprocess relationship </>from_proc
(
proc
,name
,desc
,envs
,envs_depth
,cache
,export
,error_strategy
,num_retries
,forks
,input_data
,order
,plugin_opts
,requires
,scheduler
,scheduler_opts
,submission_batch
)
(Type) — Create a subclass of Proc using another Proc subclass or Proc itself</>gc
(
)
— GC process for the process to save memory after it's done</>init
(
)
— Init all other properties and jobs</>log
(
level
,msg
,*args
,logger
)
— Log message for the process</>run
(
)
— Run the process</>
pipen.proc.
ProcMeta
(
name
, bases
, namespace
, **kwargs
)
Meta class for Proc
__call__
(
cls
,*args
,**kwds
)
(Proc) — Make sure Proc subclasses are singletons</>__instancecheck__
(
cls
,instance
)
— Override for isinstance(instance, cls).</>__repr__
(
cls
)
(str) — Representation for the Proc subclasses</>__subclasscheck__
(
cls
,subclass
)
— Override for issubclass(subclass, cls).</>register
(
cls
,subclass
)
— Register a virtual subclass of an ABC.</>
register
(
cls
, subclass
)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
__instancecheck__
(
cls
, instance
)
Override for isinstance(instance, cls).
__subclasscheck__
(
cls
, subclass
)
Override for issubclass(subclass, cls).
__repr__
(
cls
)
→ strRepresentation for the Proc subclasses
__call__
(
cls
, *args
, **kwds
)
Make sure Proc subclasses are singletons
*args
(Any) — and**kwds
(Any) — Arguments for the constructor
The Proc instance
from_proc
(
proc
, name=None
, desc=None
, envs=None
, envs_depth=None
, cache=None
, export=None
, error_strategy=None
, num_retries=None
, forks=None
, input_data=None
, order=None
, plugin_opts=None
, requires=None
, scheduler=None
, scheduler_opts=None
, submission_batch=None
)
Create a subclass of Proc using another Proc subclass or Proc itself
proc
(Type) — The Proc subclassname
(str, optional) — The new name of the processdesc
(str, optional) — The new description of the processenvs
(Mapping, optional) — The arguments of the process, will overwrite parent oneThe items that are specified will be inheritedenvs_depth
(int, optional) — How deep to update the envs when subclassed.cache
(bool, optional) — Whether we should check the cache for the jobsexport
(bool, optional) — When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processeserror_strategy
(str, optional) — How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
num_retries
(int, optional) — How many times to retry to jobs once error occursforks
(int, optional) — New forks for the new processinput_data
(Any, optional) — The input data for the process. Only when this processis a start processorder
(int, optional) — The order to execute the new processplugin_opts
(Mapping, optional) — The new plugin options, unspecified items will beinherited.requires
(Sequence, optional) — The required processes for the new processscheduler
(str, optional) — The new shedular to run the new processscheduler_opts
(Mapping, optional) — The new scheduler options, unspecified items willbe inherited.submission_batch
(int, optional) — How many jobs to be submited simultaneously
The new process class
__init_subclass__
(
)
Do the requirements inferring since we need them to build up theprocess relationship
init
(
)
Init all other properties and jobs
gc
(
)
GC process for the process to save memory after it's done
log
(
level
, msg
, *args
, logger=<LoggerAdapter pipen.core (WARNING)>
)
Log message for the process
level
(int | str) — The log level of the recordmsg
(str) — The message to log*args
— The arguments to format the messagelogger
(LoggerAdapter, optional) — The logging logger
run
(
)
Run the process
biopipen.ns.plot.
ROC
(
*args
, **kwds
)
→ Proc
Plot ROC curve using plotROC
.
cache
— Should we detect whether the jobs are cached?desc
— The description of the process. Will use the summary fromthe docstring by default.dirsig
— When checking the signature for caching, whether should we walkthrough the content of the directory? This is sometimes time-consuming if the directory is big.envs
— The arguments that are job-independent, useful for common optionsacross jobs.envs_depth
— How deep to update the envs when subclassed.error_strategy
— How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
export
— When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processesforks
— How many jobs to run simultaneously?input
— The keys for the input channelinput_data
— The input data (will be computed for dependent processes)lang
— The language for the script to run. Should be the path to theinterpreter iflang
is not in$PATH
.name
— The name of the process. Will use the class name by default.nexts
— Computed fromrequires
to build the process relationshipsnum_retries
— How many times to retry to jobs once error occursorder
— The execution order for this process. The bigger the numberis, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined byPipen.set_starts()
output
— The output keys for the output channel(the data will be computed)output_data
— The output data (to pass to the next processes)plugin_opts
— Options for process-level pluginsrequires
— The dependency processesscheduler
— The scheduler to run the jobsscheduler_opts
— The options for the schedulerscript
— The script template for the processsubmission_batch
— How many jobs to be submited simultaneouslytemplate
— Define the template engine to use.This could be either a template engine or a dict with keyengine
indicating the template engine and the rest the arguments passed to the constructor of thepipen.template.Template
object. The template engine could be either the name of the engine, currently jinja2 and liquidpy are supported, or a subclass ofpipen.template.Template
. You can subclasspipen.template.Template
to use your own template engine.
infile
— The input file for data, tab-separated.The first column should be ids of the records (this is optional ifenvs.noids
is True). The second column should be the labels of the records (1 for positive, 0 for negative). If they are not binary, you can specify the positive label byenvs.pos_label
. From the third column, it should be the scores of the different models.
outfile
— The output figure file
args
— Additional arguments forgeom_roc()
orgeom_rocci()
ifenvs.ci
is True.ci
— Whether to usegeom_rocci()
instead ofgeom_roc()
.devpars
— The parameters forpng()
noids
— Whether the input file has ids (first column) or not.pos_label
— The positive label.style_roc
— Arguments forstyle_roc()
__init_subclass__
(
)
— Do the requirements inferring since we need them to build up theprocess relationship </>from_proc
(
proc
,name
,desc
,envs
,envs_depth
,cache
,export
,error_strategy
,num_retries
,forks
,input_data
,order
,plugin_opts
,requires
,scheduler
,scheduler_opts
,submission_batch
)
(Type) — Create a subclass of Proc using another Proc subclass or Proc itself</>gc
(
)
— GC process for the process to save memory after it's done</>init
(
)
— Init all other properties and jobs</>log
(
level
,msg
,*args
,logger
)
— Log message for the process</>run
(
)
— Run the process</>
pipen.proc.
ProcMeta
(
name
, bases
, namespace
, **kwargs
)
Meta class for Proc
__call__
(
cls
,*args
,**kwds
)
(Proc) — Make sure Proc subclasses are singletons</>__instancecheck__
(
cls
,instance
)
— Override for isinstance(instance, cls).</>__repr__
(
cls
)
(str) — Representation for the Proc subclasses</>__subclasscheck__
(
cls
,subclass
)
— Override for issubclass(subclass, cls).</>register
(
cls
,subclass
)
— Register a virtual subclass of an ABC.</>
register
(
cls
, subclass
)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
__instancecheck__
(
cls
, instance
)
Override for isinstance(instance, cls).
__subclasscheck__
(
cls
, subclass
)
Override for issubclass(subclass, cls).
__repr__
(
cls
)
→ strRepresentation for the Proc subclasses
__call__
(
cls
, *args
, **kwds
)
Make sure Proc subclasses are singletons
*args
(Any) — and**kwds
(Any) — Arguments for the constructor
The Proc instance
from_proc
(
proc
, name=None
, desc=None
, envs=None
, envs_depth=None
, cache=None
, export=None
, error_strategy=None
, num_retries=None
, forks=None
, input_data=None
, order=None
, plugin_opts=None
, requires=None
, scheduler=None
, scheduler_opts=None
, submission_batch=None
)
Create a subclass of Proc using another Proc subclass or Proc itself
proc
(Type) — The Proc subclassname
(str, optional) — The new name of the processdesc
(str, optional) — The new description of the processenvs
(Mapping, optional) — The arguments of the process, will overwrite parent oneThe items that are specified will be inheritedenvs_depth
(int, optional) — How deep to update the envs when subclassed.cache
(bool, optional) — Whether we should check the cache for the jobsexport
(bool, optional) — When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processeserror_strategy
(str, optional) — How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
num_retries
(int, optional) — How many times to retry to jobs once error occursforks
(int, optional) — New forks for the new processinput_data
(Any, optional) — The input data for the process. Only when this processis a start processorder
(int, optional) — The order to execute the new processplugin_opts
(Mapping, optional) — The new plugin options, unspecified items will beinherited.requires
(Sequence, optional) — The required processes for the new processscheduler
(str, optional) — The new shedular to run the new processscheduler_opts
(Mapping, optional) — The new scheduler options, unspecified items willbe inherited.submission_batch
(int, optional) — How many jobs to be submited simultaneously
The new process class
__init_subclass__
(
)
Do the requirements inferring since we need them to build up theprocess relationship
init
(
)
Init all other properties and jobs
gc
(
)
GC process for the process to save memory after it's done
log
(
level
, msg
, *args
, logger=<LoggerAdapter pipen.core (WARNING)>
)
Log message for the process
level
(int | str) — The log level of the recordmsg
(str) — The message to log*args
— The arguments to format the messagelogger
(LoggerAdapter, optional) — The logging logger
run
(
)
Run the process
biopipen.ns.plot.
Manhattan
(
*args
, **kwds
)
→ Proc
Plot Manhattan plot.
Using the ggmanh
package.
Requires ggmanh
v1.9.6 or later.
cache
— Should we detect whether the jobs are cached?desc
— The description of the process. Will use the summary fromthe docstring by default.dirsig
— When checking the signature for caching, whether should we walkthrough the content of the directory? This is sometimes time-consuming if the directory is big.envs
— The arguments that are job-independent, useful for common optionsacross jobs.envs_depth
— How deep to update the envs when subclassed.error_strategy
— How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
export
— When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processesforks
— How many jobs to run simultaneously?input
— The keys for the input channelinput_data
— The input data (will be computed for dependent processes)lang
— The language for the script to run. Should be the path to theinterpreter iflang
is not in$PATH
.name
— The name of the process. Will use the class name by default.nexts
— Computed fromrequires
to build the process relationshipsnum_retries
— How many times to retry to jobs once error occursorder
— The execution order for this process. The bigger the numberis, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined byPipen.set_starts()
output
— The output keys for the output channel(the data will be computed)output_data
— The output data (to pass to the next processes)plugin_opts
— Options for process-level pluginsrequires
— The dependency processesscheduler
— The scheduler to run the jobsscheduler_opts
— The options for the schedulerscript
— The script template for the processsubmission_batch
— How many jobs to be submited simultaneouslytemplate
— Define the template engine to use.This could be either a template engine or a dict with keyengine
indicating the template engine and the rest the arguments passed to the constructor of thepipen.template.Template
object. The template engine could be either the name of the engine, currently jinja2 and liquidpy are supported, or a subclass ofpipen.template.Template
. You can subclasspipen.template.Template
to use your own template engine.
infile
— The input file for dataIt should contain at least three columns, the chromosome, the position and the p-value of the SNPs. Header is required.
outfile
— The output figure file
args
(ns) — Additional arguments formanhattan_plot()
.See https://rdrr.io/github/leejs-abv/ggmanh/man/manhattan_plot.html. Note that-
will be replaced by.
in the argument names.- -
: Additional arguments for manhattan_plot()
- -
chrom_col
— The column for chromosomeAn integer (1-based) or a string indicating the column name.chroms
(auto) — The chromosomes and order to plotA hyphen (-
) can be used to indicate a range. For examplechr1-22,chrX,chrY,chrM
will plot all autosomes, X, Y and M. ifauto
, only the chromosomes in the data will be plotted in the order they appear in the data.devpars
(ns) — The parameters forpng()
- - res (type=int): The resolution
- - width (type=int): The width
- - height (type=int): The height
hicolors
(auto) — The colors for significant and non-significant SNPsIf a single color is given, the non-significant SNPs will be in grey. Set it to None to disable the highlighting.label_col
— The column for label.Once specified, the significant SNPs will be labeled on the plot.pos_col
— The column for positionAn integer (1-based) or a string indicating the column name.pval_col
— The column for p-valueAn integer (1-based) or a string indicating the column name.rescale
(flag) — Whether to rescale the p-valuesrescale_ratio_threshold
(type=float) — Threshold of that triggers the rescalesignif
(auto) — A single value or a list of values to indicate the significance levelsMultiple values should be also separated by comma (,
). The minimum value will be used as the cutoff to determine if the SNPs are significant.thin_bins
(type=int) — Number of bins to partition the data.thin_n
(type=int) — Number of max points per horizontal partitions of the plot.0
orNone
to disable thinning.title
— The title of the plotylabel
— The y-axis labelzoom
(auto) — Chromosomes to zoom inEach chromosome should be separated by comma (,
) or in a list. Single chromosome is also accepted. Ranges are also accepted, seeenvs.chroms
. Each chromosome will be saved in a separate file.zoom_devpars
(ns) — The parameters for the zoomed plot- - width (type=int): The width
- - height (type=int): The height, inherited from
devpars
by default - - res (type=int): The resolution, inherited from
devpars
by default
__init_subclass__
(
)
— Do the requirements inferring since we need them to build up theprocess relationship </>from_proc
(
proc
,name
,desc
,envs
,envs_depth
,cache
,export
,error_strategy
,num_retries
,forks
,input_data
,order
,plugin_opts
,requires
,scheduler
,scheduler_opts
,submission_batch
)
(Type) — Create a subclass of Proc using another Proc subclass or Proc itself</>gc
(
)
— GC process for the process to save memory after it's done</>init
(
)
— Init all other properties and jobs</>log
(
level
,msg
,*args
,logger
)
— Log message for the process</>run
(
)
— Run the process</>
pipen.proc.
ProcMeta
(
name
, bases
, namespace
, **kwargs
)
Meta class for Proc
__call__
(
cls
,*args
,**kwds
)
(Proc) — Make sure Proc subclasses are singletons</>__instancecheck__
(
cls
,instance
)
— Override for isinstance(instance, cls).</>__repr__
(
cls
)
(str) — Representation for the Proc subclasses</>__subclasscheck__
(
cls
,subclass
)
— Override for issubclass(subclass, cls).</>register
(
cls
,subclass
)
— Register a virtual subclass of an ABC.</>
register
(
cls
, subclass
)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
__instancecheck__
(
cls
, instance
)
Override for isinstance(instance, cls).
__subclasscheck__
(
cls
, subclass
)
Override for issubclass(subclass, cls).
__repr__
(
cls
)
→ strRepresentation for the Proc subclasses
__call__
(
cls
, *args
, **kwds
)
Make sure Proc subclasses are singletons
*args
(Any) — and**kwds
(Any) — Arguments for the constructor
The Proc instance
from_proc
(
proc
, name=None
, desc=None
, envs=None
, envs_depth=None
, cache=None
, export=None
, error_strategy=None
, num_retries=None
, forks=None
, input_data=None
, order=None
, plugin_opts=None
, requires=None
, scheduler=None
, scheduler_opts=None
, submission_batch=None
)
Create a subclass of Proc using another Proc subclass or Proc itself
proc
(Type) — The Proc subclassname
(str, optional) — The new name of the processdesc
(str, optional) — The new description of the processenvs
(Mapping, optional) — The arguments of the process, will overwrite parent oneThe items that are specified will be inheritedenvs_depth
(int, optional) — How deep to update the envs when subclassed.cache
(bool, optional) — Whether we should check the cache for the jobsexport
(bool, optional) — When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processeserror_strategy
(str, optional) — How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
num_retries
(int, optional) — How many times to retry to jobs once error occursforks
(int, optional) — New forks for the new processinput_data
(Any, optional) — The input data for the process. Only when this processis a start processorder
(int, optional) — The order to execute the new processplugin_opts
(Mapping, optional) — The new plugin options, unspecified items will beinherited.requires
(Sequence, optional) — The required processes for the new processscheduler
(str, optional) — The new shedular to run the new processscheduler_opts
(Mapping, optional) — The new scheduler options, unspecified items willbe inherited.submission_batch
(int, optional) — How many jobs to be submited simultaneously
The new process class
__init_subclass__
(
)
Do the requirements inferring since we need them to build up theprocess relationship
init
(
)
Init all other properties and jobs
gc
(
)
GC process for the process to save memory after it's done
log
(
level
, msg
, *args
, logger=<LoggerAdapter pipen.core (WARNING)>
)
Log message for the process
level
(int | str) — The log level of the recordmsg
(str) — The message to log*args
— The arguments to format the messagelogger
(LoggerAdapter, optional) — The logging logger
run
(
)
Run the process
biopipen.ns.plot.
QQPlot
(
*args
, **kwds
)
→ Proc
Generate QQ-plot or PP-plot using qqplotr.
See https://cran.r-project.org/web/packages/qqplotr/vignettes/introduction.html.
cache
— Should we detect whether the jobs are cached?desc
— The description of the process. Will use the summary fromthe docstring by default.dirsig
— When checking the signature for caching, whether should we walkthrough the content of the directory? This is sometimes time-consuming if the directory is big.envs
— The arguments that are job-independent, useful for common optionsacross jobs.envs_depth
— How deep to update the envs when subclassed.error_strategy
— How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
export
— When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processesforks
— How many jobs to run simultaneously?input
— The keys for the input channelinput_data
— The input data (will be computed for dependent processes)lang
— The language for the script to run. Should be the path to theinterpreter iflang
is not in$PATH
.name
— The name of the process. Will use the class name by default.nexts
— Computed fromrequires
to build the process relationshipsnum_retries
— How many times to retry to jobs once error occursorder
— The execution order for this process. The bigger the numberis, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined byPipen.set_starts()
output
— The output keys for the output channel(the data will be computed)output_data
— The output data (to pass to the next processes)plugin_opts
— Options for process-level pluginsrequires
— The dependency processesscheduler
— The scheduler to run the jobsscheduler_opts
— The options for the schedulerscript
— The script template for the processsubmission_batch
— How many jobs to be submited simultaneouslytemplate
— Define the template engine to use.This could be either a template engine or a dict with keyengine
indicating the template engine and the rest the arguments passed to the constructor of thepipen.template.Template
object. The template engine could be either the name of the engine, currently jinja2 and liquidpy are supported, or a subclass ofpipen.template.Template
. You can subclasspipen.template.Template
to use your own template engine.
infile
— The input file for dataIt should contain at least one column of p-values or the values to be plotted. Header is required.theorfile
— The file for theoretical values (optional)This file should contain at least one column of theoretical values. The values will be passed toenvs.theor_qfunc
to calculate the theoretical quantiles. Header is required.
outfile
— The output figure file
args
(ns) — The common arguments forenvs.band
,envs.line
andenvs.point
.- - distribution: The distribution of the theoretical quantiles
Whencustom
is used, theenvs.theor_col
should be provided and
values
will be added todparams
automatically. - - dparams (type=json): The parameters for the distribution
- -
: Other shared arguments between stat_*_band
,stat_*_line
andstat_*_point
.
- - distribution: The distribution of the theoretical quantiles
band
(ns) — The arguments forstat_qq_band()
orstat_pp_band()
.See https://rdrr.io/cran/qqplotr/man/stat_qq_band.html and https://rdrr.io/cran/qqplotr/man/stat_pp_band.html. Set toNone
orband.disabled
to True to disable the band.- - disabled (flag): Disable the band
- - distribution: The distribution of the theoretical quantiles
Whencustom
is used, theenvs.theor_col
should be provided and
values
will be added todparams
automatically. - - dparams (type=json): The parameters for the distribution
- -
: Additional arguments for stat_qq_band()
orstat_pp_band()
devpars
(ns) — The parameters forpng()
- - res (type=int): The resolution
- - width (type=int): The width
- - height (type=int): The height
ggs
(list) — Additional ggplot expression to adjust the plot.kind
(choice) — The kind of the plot,qq
orpp
- - qq: QQ-plot
- - pp: PP-plot
line
(ns) — The arguments forstat_qq_line()
orstat_pp_line()
.See https://rdrr.io/cran/qqplot/man/stat_qq_line.html and https://rdrr.io/cran/qqplot/man/stat_pp_line.html. Set toNone
orline.disabled
to True to disable the line.- - disabled (flag): Disable the line
- - distribution: The distribution of the theoretical quantiles
Whencustom
is used, theenvs.theor_col
should be provided and
values
will be added todparams
automatically. - - dparams (type=json): The parameters for the distribution
- -
: Additional arguments for stat_qq_line()
orstat_pp_line()
point
(ns) — The arguments forgeom_qq_point()
orgeom_pp_point()
.See https://rdrr.io/cran/qqplot/man/stat_qq_point.html and https://rdrr.io/cran/qqplot/man/stat_pp_point.html. Set toNone
orpoint.disabled
to True to disable the point.- - disabled (flag): Disable the point
- - distribution: The distribution of the theoretical quantiles
Whencustom
is used, theenvs.theor_col
should be provided and
values
will be added todparams
automatically. - - dparams (type=json): The parameters for the distribution
- -
: Additional arguments for geom_qq_point()
orgeom_pp_point()
theor_col
— The column for theoretical values inin.theorfile
if provided,otherwise inin.infile
. An integer (1-based) or a string indicating the column name. Ifdistribution
ofband
,line
, orpoint
iscustom
, this column must be provided.theor_funs
(ns) — The R functions to generate density, quantile and deviatesof the theoretical distribution base on the theoretical values ifdistribution
ofband
,line
, orpoint
iscustom
.- - dcustom: The density function, used by band
- - qcustom: The quantile function, used by point
- - rcustom: The deviates function, used by line
theor_trans
— The transformation of the theoretical values.Thetheor_funs
have default functions to take the theoretical values. This transformation will be applied to the theoretical values before passing to thetheor_funs
.title
— The title of the plottrans
— The transformation of the valuesYou can use-log10
to transform the values to-log10(values)
. Otherwise you can a direct R function or a custom R function. For examplefunction(x) -log10(x)
.val_col
— The column for values to be plottedAn integer (1-based) or a string indicating the column name.xlabel
— The x-axis labelylabel
— The y-axis label
__init_subclass__
(
)
— Do the requirements inferring since we need them to build up theprocess relationship </>from_proc
(
proc
,name
,desc
,envs
,envs_depth
,cache
,export
,error_strategy
,num_retries
,forks
,input_data
,order
,plugin_opts
,requires
,scheduler
,scheduler_opts
,submission_batch
)
(Type) — Create a subclass of Proc using another Proc subclass or Proc itself</>gc
(
)
— GC process for the process to save memory after it's done</>init
(
)
— Init all other properties and jobs</>log
(
level
,msg
,*args
,logger
)
— Log message for the process</>run
(
)
— Run the process</>
pipen.proc.
ProcMeta
(
name
, bases
, namespace
, **kwargs
)
Meta class for Proc
__call__
(
cls
,*args
,**kwds
)
(Proc) — Make sure Proc subclasses are singletons</>__instancecheck__
(
cls
,instance
)
— Override for isinstance(instance, cls).</>__repr__
(
cls
)
(str) — Representation for the Proc subclasses</>__subclasscheck__
(
cls
,subclass
)
— Override for issubclass(subclass, cls).</>register
(
cls
,subclass
)
— Register a virtual subclass of an ABC.</>
register
(
cls
, subclass
)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
__instancecheck__
(
cls
, instance
)
Override for isinstance(instance, cls).
__subclasscheck__
(
cls
, subclass
)
Override for issubclass(subclass, cls).
__repr__
(
cls
)
→ strRepresentation for the Proc subclasses
__call__
(
cls
, *args
, **kwds
)
Make sure Proc subclasses are singletons
*args
(Any) — and**kwds
(Any) — Arguments for the constructor
The Proc instance
from_proc
(
proc
, name=None
, desc=None
, envs=None
, envs_depth=None
, cache=None
, export=None
, error_strategy=None
, num_retries=None
, forks=None
, input_data=None
, order=None
, plugin_opts=None
, requires=None
, scheduler=None
, scheduler_opts=None
, submission_batch=None
)
Create a subclass of Proc using another Proc subclass or Proc itself
proc
(Type) — The Proc subclassname
(str, optional) — The new name of the processdesc
(str, optional) — The new description of the processenvs
(Mapping, optional) — The arguments of the process, will overwrite parent oneThe items that are specified will be inheritedenvs_depth
(int, optional) — How deep to update the envs when subclassed.cache
(bool, optional) — Whether we should check the cache for the jobsexport
(bool, optional) — When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processeserror_strategy
(str, optional) — How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
num_retries
(int, optional) — How many times to retry to jobs once error occursforks
(int, optional) — New forks for the new processinput_data
(Any, optional) — The input data for the process. Only when this processis a start processorder
(int, optional) — The order to execute the new processplugin_opts
(Mapping, optional) — The new plugin options, unspecified items will beinherited.requires
(Sequence, optional) — The required processes for the new processscheduler
(str, optional) — The new shedular to run the new processscheduler_opts
(Mapping, optional) — The new scheduler options, unspecified items willbe inherited.submission_batch
(int, optional) — How many jobs to be submited simultaneously
The new process class
__init_subclass__
(
)
Do the requirements inferring since we need them to build up theprocess relationship
init
(
)
Init all other properties and jobs
gc
(
)
GC process for the process to save memory after it's done
log
(
level
, msg
, *args
, logger=<LoggerAdapter pipen.core (WARNING)>
)
Log message for the process
level
(int | str) — The log level of the recordmsg
(str) — The message to log*args
— The arguments to format the messagelogger
(LoggerAdapter, optional) — The logging logger
run
(
)
Run the process
biopipen.ns.plot.
Scatter
(
*args
, **kwds
)
→ Proc
Generate scatter plot using ggplot2.
ggpmisc
is used
for the stats and labels.
See also https://cran.r-project.org/web/packages/ggpmisc/vignettes/model-based-annotations.html
cache
— Should we detect whether the jobs are cached?desc
— The description of the process. Will use the summary fromthe docstring by default.dirsig
— When checking the signature for caching, whether should we walkthrough the content of the directory? This is sometimes time-consuming if the directory is big.envs
— The arguments that are job-independent, useful for common optionsacross jobs.envs_depth
— How deep to update the envs when subclassed.error_strategy
— How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
export
— When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processesforks
— How many jobs to run simultaneously?input
— The keys for the input channelinput_data
— The input data (will be computed for dependent processes)lang
— The language for the script to run. Should be the path to theinterpreter iflang
is not in$PATH
.name
— The name of the process. Will use the class name by default.nexts
— Computed fromrequires
to build the process relationshipsnum_retries
— How many times to retry to jobs once error occursorder
— The execution order for this process. The bigger the numberis, the later the process will be executed. Default: 0. Note that the dependent processes will always be executed first. This doesn't work for start processes either, whose orders are determined byPipen.set_starts()
output
— The output keys for the output channel(the data will be computed)output_data
— The output data (to pass to the next processes)plugin_opts
— Options for process-level pluginsrequires
— The dependency processesscheduler
— The scheduler to run the jobsscheduler_opts
— The options for the schedulerscript
— The script template for the processsubmission_batch
— How many jobs to be submited simultaneouslytemplate
— Define the template engine to use.This could be either a template engine or a dict with keyengine
indicating the template engine and the rest the arguments passed to the constructor of thepipen.template.Template
object. The template engine could be either the name of the engine, currently jinja2 and liquidpy are supported, or a subclass ofpipen.template.Template
. You can subclasspipen.template.Template
to use your own template engine.
infile
— The input file for dataIt should contain at least two columns for x and y values. Header is required.
outfile
— The output figure file
args
(ns) — Additional arguments forgeom_point()
See https://ggplot2.tidyverse.org/reference/geom_point.html.- -
: Additional arguments for geom_point()
- -
devpars
(ns) — The parameters forpng()
- - res (type=int): The resolution
- - width (type=int): The width
- - height (type=int): The height
formula
— The formula for the modelggs
(list) — Additional ggplot expression to adjust the plot.mapping
— Extra mapping for all geoms, includingstats
.Should beaes(color = group)
but all these are valid:color = group
or(color = group)
.stats
(type=json) — The stats to add to the plot.A dict with keys available stats inggpmisc
(withoutstat_
). See https://cran.r-project.org/web/packages/ggpmisc/vignettes/model-based-annotations.html#statistics. The values should be the arguments for the stats. If you want a stat to be added multiple times, add a suffix#x
to the key. For example,poly_line#1
andpoly_line#2
will add two polynomial lines.x_col
— The column for x valuesAn integer (1-based) or a string indicating the column name.y_col
— The column for y valuesAn integer (1-based) or a string indicating the column name.
__init_subclass__
(
)
— Do the requirements inferring since we need them to build up theprocess relationship </>from_proc
(
proc
,name
,desc
,envs
,envs_depth
,cache
,export
,error_strategy
,num_retries
,forks
,input_data
,order
,plugin_opts
,requires
,scheduler
,scheduler_opts
,submission_batch
)
(Type) — Create a subclass of Proc using another Proc subclass or Proc itself</>gc
(
)
— GC process for the process to save memory after it's done</>init
(
)
— Init all other properties and jobs</>log
(
level
,msg
,*args
,logger
)
— Log message for the process</>run
(
)
— Run the process</>
pipen.proc.
ProcMeta
(
name
, bases
, namespace
, **kwargs
)
Meta class for Proc
__call__
(
cls
,*args
,**kwds
)
(Proc) — Make sure Proc subclasses are singletons</>__instancecheck__
(
cls
,instance
)
— Override for isinstance(instance, cls).</>__repr__
(
cls
)
(str) — Representation for the Proc subclasses</>__subclasscheck__
(
cls
,subclass
)
— Override for issubclass(subclass, cls).</>register
(
cls
,subclass
)
— Register a virtual subclass of an ABC.</>
register
(
cls
, subclass
)
Register a virtual subclass of an ABC.
Returns the subclass, to allow usage as a class decorator.
__instancecheck__
(
cls
, instance
)
Override for isinstance(instance, cls).
__subclasscheck__
(
cls
, subclass
)
Override for issubclass(subclass, cls).
__repr__
(
cls
)
→ strRepresentation for the Proc subclasses
__call__
(
cls
, *args
, **kwds
)
Make sure Proc subclasses are singletons
*args
(Any) — and**kwds
(Any) — Arguments for the constructor
The Proc instance
from_proc
(
proc
, name=None
, desc=None
, envs=None
, envs_depth=None
, cache=None
, export=None
, error_strategy=None
, num_retries=None
, forks=None
, input_data=None
, order=None
, plugin_opts=None
, requires=None
, scheduler=None
, scheduler_opts=None
, submission_batch=None
)
Create a subclass of Proc using another Proc subclass or Proc itself
proc
(Type) — The Proc subclassname
(str, optional) — The new name of the processdesc
(str, optional) — The new description of the processenvs
(Mapping, optional) — The arguments of the process, will overwrite parent oneThe items that are specified will be inheritedenvs_depth
(int, optional) — How deep to update the envs when subclassed.cache
(bool, optional) — Whether we should check the cache for the jobsexport
(bool, optional) — When True, the results will be exported to<pipeline.outdir>
Defaults to None, meaning only end processes will export. You can set it to True/False to enable or disable exporting for processeserror_strategy
(str, optional) — How to deal with the errors- - retry, ignore, halt
- - halt to halt the whole pipeline, no submitting new jobs
- - terminate to just terminate the job itself
num_retries
(int, optional) — How many times to retry to jobs once error occursforks
(int, optional) — New forks for the new processinput_data
(Any, optional) — The input data for the process. Only when this processis a start processorder
(int, optional) — The order to execute the new processplugin_opts
(Mapping, optional) — The new plugin options, unspecified items will beinherited.requires
(Sequence, optional) — The required processes for the new processscheduler
(str, optional) — The new shedular to run the new processscheduler_opts
(Mapping, optional) — The new scheduler options, unspecified items willbe inherited.submission_batch
(int, optional) — How many jobs to be submited simultaneously
The new process class
__init_subclass__
(
)
Do the requirements inferring since we need them to build up theprocess relationship
init
(
)
Init all other properties and jobs
gc
(
)
GC process for the process to save memory after it's done
log
(
level
, msg
, *args
, logger=<LoggerAdapter pipen.core (WARNING)>
)
Log message for the process
level
(int | str) — The log level of the recordmsg
(str) — The message to log*args
— The arguments to format the messagelogger
(LoggerAdapter, optional) — The logging logger
run
(
)
Run the process