Installation
Install MicroCAT
MicroCAT runs on Python 3.8 and above. We provide several installation methods:
Conda Installation (Recommended)
conda create -n MICROCAT -c bioconda microcat
If you don’t have conda in your environment, we recommend installing it first. See the official conda documentation for installation instructions.
Pip Installation
Use pip to quickly install microcat from PyPI:
pip install microcat
Then install the software that microcat needs at run time, or pass the --use-conda option during execution to have the runtime environment built automatically (see microcat's official documentation).
Docker Image
The Docker image is still under construction.
To verify the installation, run microcat --help in the terminal. If the following output is displayed, MicroCAT has been installed successfully:
!microcat --help
Usage: microcat [OPTIONS] COMMAND [ARGS]...
███╗ ███╗██╗ ██████╗██████╗ ██████╗ ██████╗ █████╗ ████████╗
████╗ ████║██║██╔════╝██╔══██╗██╔═══██╗██╔════╝██╔══██╗╚══██╔══╝
██╔████╔██║██║██║ ██████╔╝██║ ██║██║ ███████║ ██║
██║╚██╔╝██║██║██║ ██╔══██╗██║ ██║██║ ██╔══██║ ██║
██║ ╚═╝ ██║██║╚██████╗██║ ██║╚██████╔╝╚██████╗██║ ██║ ██║
╚═╝ ╚═╝╚═╝ ╚═════╝╚═╝ ╚═╝ ╚═════╝ ╚═════╝╚═╝ ╚═╝ ╚═╝
Microbiome Identification upon Cell Resolution from Omics-
Computational Analysis Toolbox
Options:
-v, --version Show the version and exit.
-h, --help Show this message and exit.
Commands:
config Quickly adjust microcat's default configurations
debug Execute the analysis workflow on debug mode.
download Download necessary files for running microcat
init Init microcat style analysis project
path Print out microcat install path
run-local Execute the analysis workflow on local computer mode
run-remote Execute the analysis workflow on remote cluster mode
Installation of tools for host read mapping and counting
For the read mapping and UMI counting step, microcat offers predefined rules for either Cell Ranger or STARsolo. Neither tool is available for installation via conda, so the one you choose must be installed separately. Only one of the two is needed, depending on your preferred method.
Cell Ranger: Follow the instructions on the 10x Genomics installation support page to install cellranger and add it to your PATH. Webpage: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/installation
STAR: An open-source alternative to Cell Ranger. To install it, follow the instructions in the STAR documentation and add it to your PATH.
STAR version 2.7.9a or above is recommended (2.7.10a was the latest release as of August 2022). Recent releases can correctly process multi-mapping reads and add many important options and bug fixes.
To use settings that closely mimic those of Cell Ranger v4 or above (see explanations below, particularly the --clipAdapterType CellRanger4 option), STAR needs to be recompiled from source with make STAR CXXFLAGS_SIMD="-msse4.2" (see this issue for more info). If you encounter an Illegal instruction error, recompiling this way is the fix.
You can also use the command line to check whether cellranger has been installed successfully.
!cellranger --version
cellranger cellranger-7.1.0
STARsolo is invoked through the STAR command line, so check the STAR version instead:
!STAR --version
2.7.11b
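The reported version can be compared against the recommended minimum (2.7.9a) programmatically. The following is a minimal sketch, not part of microcat: the function names are hypothetical, and it assumes the version string has the plain form of three numbers plus an optional letter, as in the output above. A plain string comparison would be wrong here, since "2.7.10a" sorts before "2.7.9a" alphabetically.

```python
import re


def parse_star_version(text):
    """Split a STAR version such as '2.7.9a' into (2, 7, 9, 'a').

    Assumes the 'N.N.N' + optional lowercase letter form; other
    formats raise ValueError instead of being guessed at.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised STAR version: {text!r}")
    major, minor, patch, suffix = match.groups()
    return int(major), int(minor), int(patch), suffix


def meets_minimum(installed, minimum="2.7.9a"):
    """Return True if the installed version is at least the minimum."""
    return parse_star_version(installed) >= parse_star_version(minimum)
```

For example, meets_minimum("2.7.11b") is true, while meets_minimum("2.7.3a") is false.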
Note
The microcat workflow assumes that cellranger and STAR have already been added to your PATH, and it invokes them by calling cellranger and STAR on the command line.
In the future, microcat will support user-defined software paths, such as calling the software after using module load in a high-performance computing cluster.
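Since the workflow relies on the executables being found on the PATH, a quick check before launching a run can save time. The sketch below uses only the Python standard library; the helper name find_tools is hypothetical and not part of microcat. Only one of the two tools needs to be present, depending on the mapping method you chose.

```python
import shutil

# Executable names as used in this guide.
REQUIRED_TOOLS = ("cellranger", "STAR")


def find_tools(names=REQUIRED_TOOLS):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path if path else 'NOT FOUND on PATH'}")
```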
Adapting/Integrating rules in Snakemake
Snakemake is a Python-based workflow management system for building and executing pipelines. A pipeline is made up of "rules" that represent single steps of the analysis. In a YAML config file, parameters and rule-specific inputs can be adjusted for a new analysis without changing the rules. In a "master" Snakefile, the desired end points of the analysis are specified. With the input and the desired output defined, Snakemake is able to infer all the steps that have to be performed in between.
To change one of the steps, e.g. to a different software tool, one can create a new rule, insert a new code block into the config file, and include the input/output directory of this step in the master snake file. It is important to make sure that the format of the input and output of each rule is compatible with the previous and the subsequent rule. For more detailed information please have a look at the excellent online documentation of Snakemake.
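The idea of inferring the intermediate steps from the desired end point can be illustrated with a toy example in plain Python. This is only a sketch of the principle, not Snakemake's implementation or Snakefile syntax; the rule names and file names are made up, and cyclic rules are not handled.

```python
# Toy rules: each output file maps to the rule that makes it and the inputs it needs.
# All names here are invented for illustration.
RULES = {
    "counts.tsv": {"rule": "count_umis", "inputs": ["aligned.bam"]},
    "aligned.bam": {"rule": "map_reads", "inputs": ["reads.fastq"]},
}


def plan(target, rules, available):
    """Return the rule names, in execution order, needed to produce target."""
    steps = []

    def visit(output):
        if output in available:
            return  # already on disk, nothing to run
        if output not in rules:
            raise KeyError(f"no rule produces {output!r}")
        for needed in rules[output]["inputs"]:
            visit(needed)  # resolve prerequisites first
        name = rules[output]["rule"]
        if name not in steps:
            steps.append(name)

    visit(target)
    return steps


print(plan("counts.tsv", RULES, available={"reads.fastq"}))
# prints ['map_reads', 'count_umis']
```

Swapping a step for a different tool corresponds to replacing one entry in RULES. As long as the new entry consumes and produces the same file formats, the rest of the plan is unaffected.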
Install snakemake cluster profile
Because Snakemake keeps cluster settings in separate configuration files, users can quickly download the files that match their cluster and use them to configure and automate task scheduling. For details, see the Snakemake documentation.
Note
Starting with Snakemake 8.0, the cluster call interface was changed to a plugin-based model. microcat does not yet support this model, so it currently works only with Snakemake 7.x versions (below 8.0).
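To confirm that the installed Snakemake falls in the supported range, the version can be read from the package metadata. This is a minimal sketch using the standard library (importlib.metadata, available from Python 3.8); the function names are hypothetical, and it interprets the supported range as any 7.x release.

```python
from importlib import metadata


def installed_snakemake_version():
    """Return the installed Snakemake version string, or None if it is not installed."""
    try:
        return metadata.version("snakemake")
    except metadata.PackageNotFoundError:
        return None


def is_supported_snakemake(version):
    """Return True for 7.x releases (assumed reading of the supported range)."""
    major = int(version.split(".")[0])
    return major == 7


if __name__ == "__main__":
    found = installed_snakemake_version()
    if found is None:
        print("snakemake is not installed in this environment")
    else:
        verdict = "supported" if is_supported_snakemake(found) else "not supported"
        print(f"snakemake {found}: {verdict}")
```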
We recommend the generic Snakemake profile, which can configure tasks dynamically according to the resource requirements of each task.
Users can download the corresponding profile with microcat download profile and then configure it.
!microcat download profile -h
Usage: microcat download profile [OPTIONS]
Download profile config from Github
$ microcat download profile --cluster lsf
$ microcat download profile --cluster slurm
$ microcat download profile --cluster sge
Options:
--cluster [slurm|sge|lsf] Cluster workflow manager engine, now support
generic
-h, --help Show this message and exit.