TCellSelection¶
Separate T and non-T cells and select T cells.
If all of your cells are T cells, do not set any configurations for this process.
In that case, `SeuratClusteringOfAllCells` should not be used, and `SeuratClustering` will cluster all of the cells, which are all T cells.
There are two ways to separate T and non-T cells:
- Use an expression indicator directly from the metadata.
- Use the expression values of indicator genes, and the clonotype percentage
of the clusters.
You can also select T cells using only indicator gene expression values by setting `envs.ignore_tcr` to true.
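For instance, a configuration that skips TCR information might look like the sketch below. The gene list is illustrative only; when k-means is used without TCR data, more than one indicator gene is needed and the first one must be a positive marker:

```toml
[TCellSelection.envs]
# Use only indicator gene expression; Clonotype_Pct will not be available
ignore_tcr = true
# Illustrative list: the first gene (CD3E) is a positive T cell marker
indicator_genes = ["CD3E", "CD3D"]
```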
Environment Variables¶
- `ignore_tcr` (flag): Default: `False`.
  Ignore TCR information for T cell selection.
  Use only the expression values of indicator genes.
  In this case, the `Clonotype_Pct` column does not exist in the metadata.
  If you want to use k-means to select T cells, you must have more than one indicator gene, and the first indicator gene in `envs.indicator_genes` must be a positive marker, which will be used to select the cluster with higher expression values as T cells.
- `tcell_selector`: The expression passed to `tidyseurat::mutate(is_TCell = ...)` to indicate whether a cell is a T cell. For example, `Clonotype_Pct > 0.25` indicates that cells with a clonotype percentage above 25% are T cells.
  If `indicator_genes` is provided, the expression values can also be used in the expression. For example, `Clonotype_Pct > 0.25 & CD3E > 0`.
  If `tcell_selector` is not provided, k-means clustering with K=2 will be performed on the expression values of `indicator_genes` and `Clonotype_Pct`, and the cluster with the higher clonotype percentage will be selected as T cells.
- `indicator_genes` (list): Default: `['CD3E']`.
  A list of indicator genes whose expression values and clonotype percentage will be used to determine T cells.
  The markers can be either positive, such as `CD3E`, `CD3D`, `CD3G`, or negative, such as `CD19`, `CD14`, `CD68`.
- `kmeans` (type=json): Default: `{'nstart': 25}`.
  The parameters for k-means clustering.
  Other arguments for `stats::kmeans` can be provided here. If there are dots in the argument names, replace them with `-`.
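As a sketch of how extra k-means arguments might be passed, the `stats::kmeans` argument `iter.max` is written as `iter-max`, following the dot-to-hyphen rule. This assumes (it is not stated above) that the `json`-typed `kmeans` option can be written as a TOML inline table; the value `50` is purely illustrative:

```toml
[TCellSelection.envs]
# nstart = 25 is the documented default
# iter-max stands for the stats::kmeans argument iter.max
# Assumption: an inline table is accepted for this json-typed option
kmeans = { nstart = 25, iter-max = 50 }
```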
Examples¶
Use T cell indicator directly¶
If you have metadata like this:
id | Clonotype_Pct | seurat_clusters |
---|---|---|
1 | 0.1 | 1 |
2 | 0.3 | 2 |
3 | 0.5 | 3 |
With the configuration below:
[TCellSelection.envs]
tcell_selector = "Clonotype_Pct > 0.25"
The T cells will be selected as:
id | Clonotype_Pct | seurat_clusters | is_TCell |
---|---|---|---|
1 | 0.1 | 1 | FALSE |
2 | 0.3 | 2 | TRUE |
3 | 0.5 | 3 | TRUE |
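The selector can also combine the clonotype percentage with an indicator gene, as described for `tcell_selector`. A minimal sketch, assuming `CD3E` is listed in `indicator_genes` so that its expression values are available to the expression:

```toml
[TCellSelection.envs]
# CD3E must be an indicator gene to be usable in the selector
indicator_genes = ["CD3E"]
tcell_selector = "Clonotype_Pct > 0.25 & CD3E > 0"
```

With this selector, a cell is marked as a T cell only when both conditions hold.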
Use indicator genes¶
Let's say we set the indicator genes to `["CD3D", "CD3E", "CD3G"]`.
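A configuration for this scenario might look like the following sketch. `tcell_selector` is deliberately left unset so that the k-means fallback is used:

```toml
[TCellSelection.envs]
# No tcell_selector here, so k-means (K=2) decides which clusters are T cells
indicator_genes = ["CD3D", "CD3E", "CD3G"]
```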
The mean expression values will be calculated for each cluster:
id | Clonotype_Pct | seurat_clusters | CD3D | CD3E | CD3G |
---|---|---|---|---|---|
1 | 0.1 | 1 | 0.1 | 0.0 | 0.1 |
2 | 0.3 | 2 | 1.2 | 1.3 | 0.6 |
3 | 0.5 | 3 | 1.5 | 0.8 | 0.9 |
Then k-means clustering with K=2 will be performed on the mean expression values of the indicator genes, together with `Clonotype_Pct`:
id | Clonotype_Pct | seurat_clusters | CD3D | CD3E | CD3G | is_TCell |
---|---|---|---|---|---|---|
1 | 0.1 | 1 | 0.1 | 0.0 | 0.1 | FALSE |
2 | 0.3 | 2 | 1.2 | 1.3 | 0.6 | TRUE |
3 | 0.5 | 3 | 1.5 | 0.8 | 0.9 | TRUE |
The cluster with the higher clonotype percentage will be selected as T cells (`is_TCell = TRUE`) and sent to `SeuratClustering` for further clustering and downstream analysis.