Load samples into a Seurat object
LoadSeuratAndPerformQC.Rd
Cell QC is performed either per sample or on the whole object.
Usage
LoadSeuratAndPerformQC(
meta,
samples = NULL,
per_sample_qc = FALSE,
cell_qc = NULL,
gene_qc = NULL,
tmpdir = NULL,
log = NULL,
cache = NULL
)
Arguments
- meta
Metadata of the samples. Required columns: Sample, RNAData. The RNAData column should contain the path to the 10X data, either a directory or a file. If the path is a directory, the function will look for barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz, and the directory should be loadable by Seurat::Read10X. File names that carry a prefix, e.g. "'prefix'.barcodes.tsv.gz", are also supported. If the path is a file, it should be an h5 file that can be loaded by Seurat::Read10X_h5.
- samples
Samples to load. If NULL, all samples will be loaded
- per_sample_qc
Whether to perform per-sample cell QC
- cell_qc
Cell QC criteria
- gene_qc
Gene QC criteria. A list containing the following fields:
min_cells: Minimum number of cells a gene should be expressed in to be kept
excludes: A string or strings to exclude certain genes. Regular expressions are supported. Multiple strings can also be separated by commas in a single string.
- tmpdir
Temporary directory to store intermediate files when the file names have a prefix
- log
Logger
- cache
Directory to cache the results. Set to FALSE to disable caching
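As a rough illustration of the gene_qc fields described above, the following base-R sketch shows one way the excludes value could be normalised into regular expressions and matched against gene names. The helpers split_patterns and flag_excluded are hypothetical names introduced here, not functions of the package, and the actual implementation may differ.

```r
# Hypothetical helper: turn `excludes` into a flat vector of regex patterns.
# Accepts a character vector whose elements may themselves be comma-separated.
split_patterns <- function(excludes) {
  if (is.null(excludes)) return(character(0))
  parts <- unlist(strsplit(excludes, ",", fixed = TRUE))
  parts <- trimws(parts)
  parts[nzchar(parts)]
}

# Hypothetical helper: TRUE for every gene that matches at least one pattern.
flag_excluded <- function(genes, excludes) {
  patterns <- split_patterns(excludes)
  if (length(patterns) == 0) return(rep(FALSE, length(genes)))
  grepl(paste(patterns, collapse = "|"), genes)
}

# A gene_qc list using the two documented fields
gene_qc <- list(min_cells = 3, excludes = "^MT-, ^RPS")

# Illustrative gene names (assumed, not taken from the example data)
genes <- c("MT-CO1", "RPS6", "CD3E", "GAPDH")
flag_excluded(genes, gene_qc$excludes)
```

One caveat of splitting on commas this way: a regular expression that itself contains a comma, such as a `{1,2}` quantifier, would be broken apart, so such patterns would need to be passed as separate strings.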
Value
A Seurat object.
In slot misc, there are two fields: cell_qc_df and gene_qc, a list of gene QC results
Examples
# \donttest{
datadir <- system.file("extdata", "scrna", package = "biopipen.utils")
meta <- data.frame(
Sample = c("Sample1", "Sample2"),
RNAData = c(
file.path(datadir, "Sample1"),
file.path(datadir, "Sample2")
)
)
obj <- LoadSeuratAndPerformQC(meta, cache = FALSE, gene_qc = list(min_cells = 3))
#> INFO [2025-01-03 21:07:38] Loading each sample ...
#> INFO [2025-01-03 21:07:38] - Loading sample Sample1 ...
#> INFO [2025-01-03 21:07:38] - Loading sample Sample2 ...
#> INFO [2025-01-03 21:07:38] Merging samples ...
#> INFO [2025-01-03 21:07:39] Performing cell QC ...
#> INFO [2025-01-03 21:07:39] Performing gene QC ...
head(obj@misc$cell_qc_df)
#> Sample nFeature_RNA nCount_RNA percent.mt percent.ribo
#> Sample1_TEINH_1 Sample1 1153 2196 0 0
#> Sample1_TEINH_2 Sample1 1219 2560 0 0
#> Sample1_TEINH_3 Sample1 981 1784 0 0
#> Sample1_TEINH_4 Sample1 1159 2159 0 0
#> Sample1_TEINH_5 Sample1 1111 1841 0 0
#> Sample1_TEINH_6 Sample1 766 1236 0 0
#> percent.hb percent.plat .QC
#> Sample1_TEINH_1 0 0 TRUE
#> Sample1_TEINH_2 0 0 TRUE
#> Sample1_TEINH_3 0 0 TRUE
#> Sample1_TEINH_4 0 0 TRUE
#> Sample1_TEINH_5 0 0 TRUE
#> Sample1_TEINH_6 0 0 TRUE
print(obj@misc$gene_qc)
#> $before
#> [1] 3636
#>
#> $criteria
#> $criteria$min_cells
#> [1] 3
#>
#>
#> $ncells
#> Sample feature ncells
#> 79 Sample1 ENSG00000145708 2
#> 182 Sample1 ENSG00000100097 4
#> 383 Sample1 ENSG00000225968 3
#> 398 Sample1 ENSG00000211445 0
#> 703 Sample1 ENSG00000152822 1
#> 749 Sample1 ENSG00000277586 5
#> 996 Sample1 ENSG00000143479 4
#> 1054 Sample1 ENSG00000072657 2
#> 1547 Sample1 ENSG00000101463 0
#> 1554 Sample1 ENSG00000079102 0
#> 1558 Sample1 ENSG00000172137 2
#> 1560 Sample1 ENSG00000188803 2
#> 1684 Sample1 ENSG00000183773 2
#> 1703 Sample1 ENSG00000150556 3
#> 1785 Sample1 ENSG00000101327 1
#> 1942 Sample1 ENSG00000106852 2
#> 2011 Sample1 ENSG00000006128 2
#> 2040 Sample1 ENSG00000130751 1
#> 2126 Sample1 ENSG00000082074 2
#> 2279 Sample1 ENSG00000188647 1
#> 2316 Sample1 ENSG00000166407 0
#> 2340 Sample1 ENSG00000118733 3
#> 2343 Sample1 ENSG00000151693 3
#> 2446 Sample1 ENSG00000166863 2
#> 2453 Sample1 ENSG00000133101 2
#> 2543 Sample1 ENSG00000185513 1
#> 2561 Sample1 ENSG00000123096 2
#> 2881 Sample1 ENSG00000162631 4
#> 2982 Sample1 ENSG00000163873 2
#> 3028 Sample1 ENSG00000112893 2
#> 3057 Sample1 ENSG00000262655 1
#> 3085 Sample1 ENSG00000164326 0
#> 3172 Sample1 ENSG00000090006 5
#> 3188 Sample1 ENSG00000181234 1
#> 3202 Sample1 ENSG00000113657 4
#> 3386 Sample1 ENSG00000168546 2
#> 3506 Sample1 ENSG00000189409 2
#> 3510 Sample1 ENSG00000165973 2
#> 3594 Sample1 ENSG00000054803 0
#> 3715 Sample2 ENSG00000145708 5
#> 3818 Sample2 ENSG00000100097 2
#> 4019 Sample2 ENSG00000225968 0
#> 4034 Sample2 ENSG00000211445 3
#> 4339 Sample2 ENSG00000152822 3
#> 4385 Sample2 ENSG00000277586 2
#> 4632 Sample2 ENSG00000143479 1
#> 4690 Sample2 ENSG00000072657 3
#> 5183 Sample2 ENSG00000101463 0
#> 5190 Sample2 ENSG00000079102 1
#> 5194 Sample2 ENSG00000172137 1
#> 5196 Sample2 ENSG00000188803 0
#> 5320 Sample2 ENSG00000183773 4
#> 5339 Sample2 ENSG00000150556 2
#> 5421 Sample2 ENSG00000101327 0
#> 5578 Sample2 ENSG00000106852 2
#> 5647 Sample2 ENSG00000006128 1
#> 5676 Sample2 ENSG00000130751 4
#> 5762 Sample2 ENSG00000082074 5
#> 5915 Sample2 ENSG00000188647 3
#> 5952 Sample2 ENSG00000166407 0
#> 5976 Sample2 ENSG00000118733 1
#> 5979 Sample2 ENSG00000151693 2
#> 6082 Sample2 ENSG00000166863 1
#> 6089 Sample2 ENSG00000133101 2
#> 6179 Sample2 ENSG00000185513 5
#> 6197 Sample2 ENSG00000123096 5
#> 6517 Sample2 ENSG00000162631 1
#> 6618 Sample2 ENSG00000163873 1
#> 6664 Sample2 ENSG00000112893 2
#> 6693 Sample2 ENSG00000262655 0
#> 6721 Sample2 ENSG00000164326 0
#> 6808 Sample2 ENSG00000090006 2
#> 6824 Sample2 ENSG00000181234 2
#> 6838 Sample2 ENSG00000113657 2
#> 7022 Sample2 ENSG00000168546 3
#> 7142 Sample2 ENSG00000189409 1
#> 7146 Sample2 ENSG00000165973 2
#> 7230 Sample2 ENSG00000054803 1
#>
#> $after
#> [1] 3597
#>
# }
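The QC results stored in misc can be summarised with base R. The sketch below uses small stand-in objects that mimic the structure printed above (a cell_qc_df with Sample and .QC columns, a gene_qc list with before and after counts) so that it runs without the package. The stand-in cell values are made up for illustration, and the sketch assumes that .QC is TRUE for cells that pass cell QC. With a real object, substitute obj@misc$cell_qc_df and obj@misc$gene_qc.

```r
# Stand-in data mimicking the structure of obj@misc (cell values are illustrative only)
cell_qc_df <- data.frame(
  Sample = c("Sample1", "Sample1", "Sample2", "Sample2"),
  nFeature_RNA = c(1153, 1219, 150, 981),
  .QC = c(TRUE, TRUE, FALSE, TRUE)
)
# before/after counts copied from the example output above
gene_qc <- list(before = 3636, after = 3597)

# Cells failing (FALSE) and passing (TRUE) cell QC, per sample
cells_by_sample <- table(cell_qc_df$Sample, cell_qc_df$.QC)
print(cells_by_sample)

# Number of genes removed by gene QC
genes_removed <- gene_qc$before - gene_qc$after
genes_removed
```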