TCRClustering
Cluster the TCR clones by their CDR3 sequences.
You can disable this process by removing the whole TCRClustering section from the config file.
This process clusters TCR clones based on their CDR3 sequences.
It uses either GIANA:
Zhang, Hongyi, Xiaowei Zhan, and Bo Li. "GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation." Nature Communications 12.1 (2021): 1-11.
or ClusTCR:
Sebastiaan Valkiers, Max Van Houcke, Kris Laukens, Pieter Meysman. "ClusTCR: a Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity." Bioinformatics, 2021.
Both methods are based on the Faiss Clustering Library for efficient similarity search and clustering of dense vectors, so they yield similar results.
A text file will be generated with the cluster assignments for each cell, together with the immunarch object (in R) that carries the cluster assignments in the TCR_Cluster column. This information will then be merged into a Seurat object for further downstream analysis.
The cluster assignments are prefixed with S_ or M_ to indicate whether a cluster has only one unique CDR3 sequence or multiple CDR3 sequences. Note that a cluster with the S_ prefix may still have multiple cells, as the same CDR3 sequence may be shared by multiple cells.
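The prefix rule can be illustrated with a minimal Python sketch. It is not the pipeline's actual implementation: the function name `label_clusters`, the input dictionaries, and the numeric cluster IDs are hypothetical, and the exact label format produced by the process may differ.

```python
from collections import defaultdict


def label_clusters(cell_to_cdr3, cdr3_to_cluster):
    """Return a cell -> prefixed cluster label mapping (illustrative only)."""
    # Collect the unique CDR3 sequences that belong to each cluster.
    members = defaultdict(set)
    for cdr3, cluster in cdr3_to_cluster.items():
        members[cluster].add(cdr3)

    # S_ for a single unique CDR3 sequence, M_ for multiple sequences.
    prefixed = {
        cluster: ("S_" if len(seqs) == 1 else "M_") + str(cluster)
        for cluster, seqs in members.items()
    }
    return {
        cell: prefixed[cdr3_to_cluster[cdr3]]
        for cell, cdr3 in cell_to_cdr3.items()
    }


# Hypothetical data: cell1 and cell2 share the same CDR3 sequence.
cells = {
    "cell1": "CASSLGQGYEQYF",
    "cell2": "CASSLGQGYEQYF",
    "cell3": "CASSPGTGGYEQYF",
    "cell4": "CASSPGTGGNEQYF",
}
clusters = {
    "CASSLGQGYEQYF": 1,
    "CASSPGTGGYEQYF": 2,
    "CASSPGTGGNEQYF": 2,
}
labels = label_clusters(cells, clusters)
```

Here cluster 1 receives the S_ prefix even though it contains two cells, because both cells carry the same CDR3 sequence, while cluster 2 receives the M_ prefix because it groups two distinct sequences.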
Input
screpfile: The TCR data object loaded by scRepertoire::combineTCR() or scRepertoire::combineExpression()
Output
outfile: Default: {{in.screpfile | stem}}.tcr_clustered.qs.
    The scRepertoire object in qs format with TCR cluster information. Column TCR_Cluster will be added to the metadata.
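As a rough illustration of how the default output name is derived from the input, the sketch below mimics the template in Python. It assumes the `stem` filter behaves like `pathlib.Path.stem` (dropping the directory and the last extension); the function name `default_outfile` and the example path are hypothetical.

```python
from pathlib import Path


def default_outfile(screpfile: str) -> str:
    """Mimic the default `{{in.screpfile | stem}}.tcr_clustered.qs` template."""
    # Assumption: the `stem` filter behaves like pathlib's Path.stem.
    return f"{Path(screpfile).stem}.tcr_clustered.qs"


# Hypothetical input path, for illustration only.
name = default_outfile("/data/run1/screp.qs")
print(name)
```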
Environment Variables
tool (choice): Default: GIANA.
    The tool used to do the clustering, either GIANA or ClusTCR. For GIANA, using TRBV mutations is not supported.
    GIANA: by Li lab at UT Southwestern Medical Center
    ClusTCR: by Sebastiaan Valkiers, etc.
python: Default: python.
    The path of the Python executable with GIANA's dependencies installed or with clusTCR installed, depending on the tool you choose.
within_sample (flag): Default: True.
    Whether to cluster the TCR clones within each sample. When in.screpfile is a Seurat object, the samples are marked by the Sample column in the metadata.
args (type=json): Default: {}.
    The arguments for the clustering tool.
    For GIANA, they will be passed to python GIAna.py. See https://github.com/s175573/GIANA#usage.
    For ClusTCR, they will be passed to clustcr.Clustering(...). See https://svalkiers.github.io/clusTCR/docs/clustering/how-to-use.html#clustering.
chain (choice): Default: both.
    The TCR chain to use for clustering.
    alpha: TCR alpha chain (the first sequence in CTaa, separated by _)
    beta: TCR beta chain (the second sequence in CTaa, separated by _)
    both: Both TCR alpha and beta chains
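To make the chain and within_sample options more concrete, here is a minimal Python sketch of how CTaa values might be split and grouped before clustering. It is an illustration under stated assumptions, not the process's own code: the helper names `select_chain` and `group_for_clustering`, the `all_samples` key, and the example sequences are hypothetical, and how the process actually handles `both` is not specified here (the sketch simply keeps the paired value unchanged).

```python
from collections import defaultdict


def select_chain(ctaa, chain="both"):
    """Pick the sequence to cluster from a CTaa value such as 'ALPHA_BETA'."""
    # The alpha chain is the part before "_", the beta chain the part after.
    alpha, _, beta = ctaa.partition("_")
    if chain == "alpha":
        return alpha
    if chain == "beta":
        return beta
    if chain == "both":
        return ctaa  # assumption: keep the paired value unchanged
    raise ValueError(f"Unknown chain: {chain!r}")


def group_for_clustering(records, chain="both", within_sample=True):
    """Group unique sequences per sample, or pool them across samples."""
    groups = defaultdict(set)
    for sample, ctaa in records:
        # With within_sample=False, every sequence lands in a single pool.
        key = sample if within_sample else "all_samples"
        groups[key].add(select_chain(ctaa, chain))
    return {key: sorted(seqs) for key, seqs in groups.items()}


# Hypothetical (sample, CTaa) records.
records = [
    ("S1", "CAVRDSNYQLIW_CASSLGQGYEQYF"),
    ("S1", "CAASGGSYIPTF_CASSPGTGGYEQYF"),
    ("S2", "CAVRDSNYQLIW_CASSLGQGYEQYF"),
]
per_sample = group_for_clustering(records, chain="beta", within_sample=True)
pooled = group_for_clustering(records, chain="beta", within_sample=False)
```

With within_sample enabled, each sample's sequences would be clustered separately, so the same sequence in S1 and S2 is handled once per sample; with it disabled, the sequences from all samples are pooled into a single clustering run.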