snakepipes_fastqc-multiqc is a standard quality-control Snakemake pipeline for NGS/HTS data


snakepipes_fastqc-multiqc

Repository: https://github.com/hermanzhaozzzz/snakepipes_fastqc-multiqc

Issues are welcome.

about author

author: 赵华男 | ZHAO Hua-nan

email: hermanzhaozzzz@gmail.com

Zhihu | BLOG

doc

snakepipes_fastqc-multiqc is a standard quality-control Snakemake pipeline for NGS/HTS data.

  • input: FASTQ files from NGS sequencing; both single-end (SE) and paired-end (PE) data are supported.
  • output:
    • FastQC reports
    • MultiQC report
  • requirements:
    • raw FASTQ files must be placed in the ../fastq directory
    • only one sequencing type (SE or PE) can be listed in samples.json per run
    • SE files must end with the suffix _SE.fastq.gz
    • PE files must end with the suffixes _R1.fastq.gz and _R2.fastq.gz
    • run the Jupyter notebook to generate the Snakemake config -> samples.json
    • run Snakemake to write the QC results to the directory -> ../qc
      • summary HTML with QC statistics -> ../qc/multiqc/multiqc_report.html
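
The naming rules above can be sketched as a small helper that turns a list of FASTQ file names into a sample table. This is only a minimal sketch, not the notebook's actual code: the function name `build_samples` and the JSON layout (a `seq_mode` key plus a sorted `samples` list) are assumptions made for illustration, and the real config may look different.

```python
import json

SE_SUFFIX = "_SE.fastq.gz"
PE_SUFFIXES = ("_R1.fastq.gz", "_R2.fastq.gz")


def build_samples(file_names):
    """Group FASTQ file names into samples, allowing only one sequencing type.

    The returned layout is hypothetical; the real config file may differ.
    """
    se_samples = set()
    pe_mates = {}  # sample name -> set of mates seen ("R1", "R2")
    for name in file_names:
        if name.endswith(SE_SUFFIX):
            se_samples.add(name[: -len(SE_SUFFIX)])
        elif name.endswith(PE_SUFFIXES):
            for suffix in PE_SUFFIXES:
                if name.endswith(suffix):
                    sample = name[: -len(suffix)]
                    pe_mates.setdefault(sample, set()).add(suffix[1:3])
        else:
            raise ValueError(f"unexpected file name: {name}")

    if se_samples and pe_mates:
        raise ValueError("SE and PE files must not be mixed in one run")
    if not se_samples and not pe_mates:
        raise ValueError("no FASTQ files given")

    if pe_mates:
        incomplete = sorted(s for s, m in pe_mates.items() if m != {"R1", "R2"})
        if incomplete:
            raise ValueError(f"missing R1 or R2 file for: {incomplete}")
        return {"seq_mode": "PE", "samples": sorted(pe_mates)}
    return {"seq_mode": "SE", "samples": sorted(se_samples)}


# Example with made-up file names; in practice the names would come from ../fastq.
example = ["demo_rep1_SE.fastq.gz", "demo_rep2_SE.fastq.gz"]
print(json.dumps(build_samples(example), indent=2))
```

Rejecting mixed SE/PE input and unpaired R1/R2 files up front mirrors the requirements listed above, so naming mistakes surface before Snakemake is started.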

env:

tree .
.
└── fastq


git clone https://github.com/hermanzhaozzzz/snakepipes_fastqc-multiqc.git
cd snakepipes_fastqc-multiqc


conda env create -f conda_env.yml
conda activate snakepipes_fastqc-multiqc

run

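# generate the config for Snakemake from the Jupyter notebook:
# either run this command,
# or open the notebook and run all cells
runipy step.01.GetFileName.ipynb
# dry run to test the workflow
# (the project structure below lists the Snakefile as step.02.Snakefile.smk.py;
#  adjust -s if your copy uses that name)
snakemake -pr -j 10 -s step.02.Snakefile.py -n
# run the actual calculation
snakemake -pr -j 10 -s step.02.Snakefile.py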
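
For scripted use, the dry-run-then-run pattern above could be wrapped in a few lines of Python. This is a sketch under assumptions: the helper name `run_pipeline` and its `runner` parameter are hypothetical and not part of the repository; only the snakemake flags are taken from the commands above.

```python
import subprocess


def run_pipeline(snakefile, jobs=10, runner=subprocess.run):
    """Run a dry run first, then the real run, stopping if the dry run fails.

    `runner` is injectable (hypothetical parameter) so the call sequence
    can be checked without snakemake being installed.
    """
    base = ["snakemake", "-pr", "-j", str(jobs), "-s", snakefile]
    # -n: dry run; with check=True a non-zero exit raises CalledProcessError
    runner(base + ["-n"], check=True)
    # real run, reached only if the dry run succeeded
    runner(base, check=True)
    return base
```

Calling `run_pipeline("step.02.Snakefile.py")` from inside the activated conda environment would then issue the same two commands as shown above.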

project structure

tree -L 2 .
.
├── fastq
│   ├── CTCF_ChIP-seq_CTCF-AID_auxin2days_rep1_SE.fastq.gz
│   ├── CTCF_ChIP-seq_CTCF-AID_auxin2days_rep2_SE.fastq.gz
│   ├── CTCF_ChIP-seq_CTCF-AID_untreated_rep1_SE.fastq.gz
│   ├── CTCF_ChIP-seq_CTCF-AID_untreated_rep2_SE.fastq.gz
│   ├── CTCF_ChIP-seq_CTCF-AID_washoff2days_rep1_SE.fastq.gz
│   ├── CTCF_ChIP-seq_CTCF-AID_washoff2days_rep2_SE.fastq.gz
│   ├── Input_for_CTCF_ChIP-seq_CTCF-AID_auxin2days_rep1_SE.fastq.gz
│   ├── Input_for_CTCF_ChIP-seq_CTCF-AID_auxin2days_rep2_SE.fastq.gz
│   ├── Spike-in-antibody-only_ChIP-seq_CTCF-AID_untreated_rep1_SE.fastq.gz
│   └── Spike-in-antibody-only_ChIP-seq_CTCF-AID_untreated_rep2_SE.fastq.gz
└── snakepipes_fastqc-multiqc
    ├── README.md
    ├── samples.json
    ├── step.01.GetFileName.ipynb
    └── step.02.Snakefile.smk.py
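
Given file names like those in the listing above, the reports to look for after a run can be predicted. A minimal sketch, assuming the FastQC reports land in a `fastqc` subdirectory of ../qc (that subdirectory name and the helper `expected_reports` are assumptions; only ../qc/multiqc/multiqc_report.html is stated in the text). FastQC itself names each report after the input file with the .fastq.gz extension replaced by _fastqc.html.

```python
from pathlib import PurePosixPath

QC_DIR = PurePosixPath("../qc")  # output directory named in the requirements


def expected_reports(fastq_names, fastqc_subdir="fastqc"):
    """List the report paths one would expect after a successful run.

    fastqc_subdir is an assumption; only ../qc/multiqc is given in the text.
    """
    ext = ".fastq.gz"
    reports = []
    for name in sorted(fastq_names):
        if not name.endswith(ext):
            raise ValueError(f"not a .fastq.gz file: {name}")
        stem = name[: -len(ext)]  # FastQC drops the .fastq.gz extension
        reports.append(str(QC_DIR / fastqc_subdir / f"{stem}_fastqc.html"))
    reports.append(str(QC_DIR / "multiqc" / "multiqc_report.html"))
    return reports


for path in expected_reports(["CTCF_ChIP-seq_CTCF-AID_untreated_rep1_SE.fastq.gz"]):
    print(path)
```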

Published on 2023-05-19 20:19

Disclaimer:

This article was originally published by 华男菌 on 生信坑; copyright belongs to the author.
