Installation

Install the pipeline and the dependencies using conda

Tip

If you plan to use the docker image to run the pipeline locally, you can skip this section.

immunopipe is built upon the pipen framework and a number of packages written in R and Python. Installing these packages manually is not recommended. Instead, you can use the provided environment_base.yml to create a conda environment.

$ conda env create \
    -n immunopipe \
    -f https://raw.githubusercontent.com/pwwang/immunopipe/master/docker/environment_base.yml

Then update the environment with essential R packages:

$ conda env update \
    -n immunopipe \
    -f https://raw.githubusercontent.com/pwwang/immunopipe/master/docker/environment_rpkgs.yml

If the URL doesn't work, you can download the file and create the environment locally.
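
The local fallback can be sketched as a small shell helper. This is only an illustration, assuming `curl` is available: the helper name `create_env_from_url` is hypothetical and not part of immunopipe, and the local file name is simply taken from the last component of the URL.

```shell
# Hypothetical helper: download an environment file, then create the
# conda environment from the local copy instead of from the URL.
create_env_from_url() {
    local env_name="$1"   # environment name, e.g. immunopipe
    local url="$2"        # URL of the environment YAML file
    local file
    file="$(basename "$url")"   # local file name, e.g. environment_base.yml

    # -f: fail on HTTP errors, -L: follow redirects, -o: write to this file
    curl -fL -o "$file" "$url" || return 1
    conda env create -n "$env_name" -f "$file"
}

# Usage:
#   create_env_from_url immunopipe \
#       https://raw.githubusercontent.com/pwwang/immunopipe/master/docker/environment_base.yml
```

The same download step should work for environment_rpkgs.yml, followed by `conda env update -n immunopipe -f environment_rpkgs.yml` on the local file.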

For more detailed instructions on conda env create, please refer to the conda docs.

Note

If you are using celltypist for cell type annotation:

[CellTypeAnnotation.envs]
tool = "celltypist"

Or if you are enabling the TESSA and CDR3Clustering processes, you need to install additional dependencies, including numpy v1, which is incompatible with some other packages in the base environment. You can create a separate conda environment for these processes.

$ conda env create \
    -n python_np1 \
    -f https://raw.githubusercontent.com/pwwang/immunopipe/master/docker/environment_np1.yml

Then in your pipeline configuration file, specify the conda environment for these processes:

[CellTypeAnnotation.envs]
tool = "celltypist"

  [CellTypeAnnotation.envs.celltypist_args]
  model = "data/Immune_All_Low.pkl"
  python = "/path/to/conda/envs/python_np1/bin/python"

[CDR3Clustering.envs]
python = "/path/to/conda/envs/python_np1/bin/python"

[TESSA.envs]
predefined_b = true
python = "/path/to/conda/envs/python_np1/bin/python"
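
The `/path/to/conda/envs/python_np1/bin/python` placeholder has to be replaced with the real interpreter path. A minimal sketch for building that path, assuming the environment was created under conda's default `envs` directory; the helper name `env_python` is hypothetical:

```shell
# Hypothetical helper: build the interpreter path of a named environment
# from the conda base prefix, assuming the default <base>/envs/<name> layout.
env_python() {
    local base="$1"   # conda base prefix, e.g. the output of `conda info --base`
    local name="$2"   # environment name, e.g. python_np1
    # ${base%/} drops a trailing slash so the path has no double slash
    printf '%s/envs/%s/bin/python\n' "${base%/}" "$name"
}

# Usage (requires conda on PATH):
#   env_python "$(conda info --base)" python_np1
```

If the environment lives somewhere else, `conda env list` prints the prefix of every environment, and the interpreter is at `bin/python` under that prefix.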

Attention

The pipeline itself is NOT included in the conda environment. You need to install it separately.

$ conda activate immunopipe
$ pip install -U immunopipe
$ # If you want to create diagram and generate running information
$ # or use the dry run scheduler, install with the extras
$ pip install -U "immunopipe[diagram,runinfo,dry]"
$ # You also need to install the frontend dependencies to generate reports
$ pipen report update

Use the docker image

You can also use the docker image to run the pipeline. The image is built upon miniconda3 and micromamba is used as the package manager. The image is available at Docker Hub.

To pull the image:

$ docker pull justold/immunopipe:<tag>

If you are using singularity or apptainer, you can pull the image and convert it to SIF format with the matching command:

$ singularity pull docker://justold/immunopipe:<tag>
$ apptainer pull docker://justold/immunopipe:<tag>
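
If you want the resulting file to have a predictable name, both tools accept an output file before the image URI. A minimal sketch, where the helper name `pull_sif` and the `immunopipe_<tag>.sif` naming are illustrative assumptions:

```shell
# Hypothetical helper: pull the image into an explicitly named SIF file.
pull_sif() {
    local tag="$1"                     # image tag to pull
    local runtime="${2:-singularity}"  # pass "apptainer" to use apptainer
    "$runtime" pull "immunopipe_${tag}.sif" "docker://justold/immunopipe:${tag}"
}

# Usage:
#   pull_sif <tag>
#   pull_sif <tag> apptainer
```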

To run the pipeline using the image, please refer to Running the pipeline.

The directory structure in the container

The docker image is built upon mambaorg/micromamba:2.3.0. The OS is linux/amd64. Other than the default directories, the following directories are also created or should be mapped during the run:

  • /immunopipe: The directory where the source code of the pipeline is. It is generally a clone of the repository. The pipeline is also installed from this directory.
  • /workdir: The working directory. It is the directory where the pipeline is run. It is recommended to map the current directory (.) to this directory.
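
The recommended mapping for /workdir can be sketched as a small wrapper around `docker run`. This is an illustration under assumptions: the helper name `run_in_container` is hypothetical, and the arguments passed after the image name (forwarded here as `"$@"`) depend on how you invoke the pipeline, which is covered in Running the pipeline.

```shell
# Hypothetical wrapper: run the image with the current directory mapped
# to /workdir and used as the working directory inside the container.
run_in_container() {
    local tag="$1"   # image tag, i.e. the <tag> you pulled
    shift            # remaining arguments are passed to the container
    docker run --rm \
        -v "$PWD":/workdir \
        -w /workdir \
        "justold/immunopipe:$tag" "$@"
}

# Usage:
#   run_in_container <tag> <pipeline arguments>
```

Because the current directory is mounted, files written under /workdir in the container remain in the current directory on the host after the container exits.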

Prepare to run the pipeline via Google Batch Jobs

There are two ways of running the pipeline via Google Batch Jobs: using the gbatch scheduler (provided by xqute) or using pipen-cli-gbatch. See more details in Running the pipeline via Google Batch Jobs.

In addition to preparing the docker image in the Artifact Registry (or Docker Hub, if your Google Cloud project allows pulling from Docker Hub), you also need to install some dependencies locally.

If you choose to use the gbatch scheduler, install the cloud dependencies in addition to the pipeline itself:

$ pip install -U immunopipe

# install cloud dependencies
$ pip install -U "cloudpathlib[gs]"

You also need to install the frontend dependencies for report generation:

$ pipen report update

If you choose to use pipen-cli-gbatch (running the pipeline via immunopipe gbatch), you just need to install the pipeline with the cli-gbatch extra:

$ pip install -U "immunopipe[cli-gbatch]"