Script
For templating in script
, see templating
Choosing your language
You can specify the path of interpreter to lang
. If the interpreter is in $PATH
, you can directly give the basename of the interpreter (i.e. python
instead of /path/to/python
).
For example, if you have your own perl installed at /home/user/bin/perl
, then you need to tell pipen
where it is: lang = "/home/user/bin/perl"
. If /home/user/bin
is in your $PATH
, you can simply do: lang = "perl"
You can also use shebang to specify the interperter:
#!/home/usr/bin/perl
# You perl code goes here
If you have shebang in your script, the lang
specified in the configuration files and Pipen
constructor will be ignored (but the one specified in process definition is not).
Use script from a file
You can also put the script into a file, and use it with a file://
prefix: script = "file:///a/b/c.pl"
Note
You may also use a script file with a relative path, which is relative to where process is defined. For example: a process with script = "file://./scripts/script.py"
is defined in /a/b/pipeline.py
, then the script file refers to /a/b/scripts/script.py
Hint
Indents are important in python, when you write your scripts, you don't have to worry about the indents in your first empty lines. For example, you don't have to do this:
class P1(Proc):
lang = "python"
script = """
import os
import re
def somefunc ():
pass
"""
You can do this:
class P1(Proc):
lang = "python"
script = """
import os
import re
def somefunc ():
pass
"""
Only the first non-empty line is used to detect the indent for the whole script.
Debugging your script
If you need to debug your script, you just need to find the real running script, which is at: <pipeline-workdir>/<proc-name>/<job.index>/job.script
. The template is rendered already in the file. You can debug it using the tool according to the language you used for the script.
Caching your results
Job results get automatically cached previous run is successful and input/output data are not changed, see caching.
However, there are cases when you want to cache some results even when the job fails. For example, there is a very time-consuming chunk of code in your script that you don't want to run that part each time if it finishes once. In that case, you can save the intermediate results in a directory under <job.outdir>
, where the directory is not specified in output
. This keeps that directory untouched each time when the running data get purged if previous run fails.