peak_calling V1.1
The peak_calling workflow follows FAIR principles. It is written in Nextflow DSL2, runs its software dependencies in a Singularity container, and is optimized for computing resources (CPU, memory) and for use on a compute cluster with a Slurm scheduler.
- Peak calling (broad/narrow) is performed with MACS3 [1].
- Normalised bigWig files, scaled to 1 million mapped reads, are produced with BEDTools [2] and bedGraphToBigWig [3].
- Genome-wide enrichment is assessed with deepTools [4].
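The scaling to 1 million mapped reads can be illustrated with a small sketch. It assumes the scale factor is 1,000,000 divided by the number of mapped reads. The helper name `scale_factor` and the file names in the comments are hypothetical, and the commands the workflow actually runs may differ.

```shell
#!/bin/bash
# Hypothetical helper: scale factor that normalises coverage to
# 1 million mapped reads (assumed formula: 1e6 / mapped reads).
scale_factor() {
    local mapped_reads="$1"
    awk -v n="$mapped_reads" 'BEGIN {
        if (n + 0 <= 0) {
            print "mapped read count must be > 0" > "/dev/stderr"
            exit 1
        }
        printf "%.6f\n", 1000000 / n
    }'
}

# Example: a library with 25 million mapped reads.
scale_factor 25000000   # prints 0.040000

# The factor could then be handed to BEDTools and bedGraphToBigWig,
# for instance (placeholder file names, not run here):
#   bedtools genomecov -ibam sample.bam -bg -scale 0.040000 > sample.bedGraph
#   sort -k1,1 -k2,2n sample.bedGraph > sample.sorted.bedGraph
#   bedGraphToBigWig sample.sorted.bedGraph chrom.sizes sample.bw
```

With this convention, a library with fewer than 1 million mapped reads gets a factor above 1, and a deeper library gets a factor below 1.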
The genome_aln workflow is available for the first step (genome alignment).
Install the workflow and build the Singularity image
Clone the peak_calling Git repository and build a local Singularity image (requires system administrator rights) from the provided Singularity definition file.
git clone https://forgemia.inra.fr/lpgp/peak_calling.git
sudo singularity build ./peak_calling/singularity/peak_calling.sif ./peak_calling/singularity/peak_calling.def
Usage example: ATAC-seq
The design.csv file must have the column headers ID and target, and must be comma-separated.
ID | target |
---|---|
A | /path/to/targetA.bam |
B | /path/to/targetB.bam |
C | /path/to/targetC.bam |
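As a sketch, a design file of this shape could be generated from a directory of BAM files. The function name `make_atac_design` and the use of the BAM file name as the sample ID are assumptions made for illustration. They are not part of the workflow.

```shell
#!/bin/bash
# Hypothetical helper: write an ATAC-seq design file (ID,target) listing
# every *.bam file found in a directory.
# Assumption: the sample ID is the BAM file name without its extension.
# Paths are written exactly as given, so pass an absolute directory.
make_atac_design() {
    local bam_dir="$1" out_csv="$2" bam
    printf 'ID,target\n' > "$out_csv"
    for bam in "$bam_dir"/*.bam; do
        [ -e "$bam" ] || continue   # the glob matched nothing
        printf '%s,%s\n' "$(basename "$bam" .bam)" "$bam" >> "$out_csv"
    done
}

# Usage (placeholder paths):
#   make_atac_design /path/to/bams design.csv
```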
narrow peaks
#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--ATAC \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--out_dir "${PWD}/results/"
broad peaks
#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--ATAC \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--broad \
--broad_cutoff 0.1 \
--out_dir "${PWD}/results"
Usage example: ChIP-seq
The design.csv file must have the column headers ID, target and input, and must be comma-separated.
ID | target | input |
---|---|---|
A | /path/to/targetA.bam | /path/to/inputA.bam |
B | /path/to/targetB.bam | /path/to/inputB.bam |
C | /path/to/targetC.bam | /path/to/inputC.bam |
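Before submitting a job, the layout of a ChIP-seq design file can be checked with a short script. This is a minimal sketch: the function name `check_chip_design` is hypothetical, and it only checks the header and the field count described above, not whether the BAM files exist.

```shell
#!/bin/bash
# Hypothetical check: verify that a ChIP-seq design file has the
# ID,target,input header and three non-empty comma-separated fields per row.
# Returns 0 when the file looks valid, 1 otherwise.
check_chip_design() {
    local csv="$1"
    awk -F',' '
        { sub(/\r$/, "") }                 # tolerate Windows line endings
        NR == 1 {
            if ($0 != "ID,target,input") { print "bad header: " $0; bad = 1 }
            next
        }
        NF != 3 || $1 == "" || $2 == "" || $3 == "" {
            print "bad row " NR ": " $0; bad = 1
        }
        END {
            if (NR == 0) { print "empty file"; bad = 1 }
            exit bad + 0
        }
    ' "$csv"
}

# Usage (placeholder path):
#   check_chip_design design.csv && echo "design.csv looks valid"
```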
narrow peaks
#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--out_dir "${PWD}/results/"
broad peaks
#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--broad \
--broad_cutoff 0.1 \
--out_dir "${PWD}/results"
Default parameters
Please refer to the MACS3 and deepTools documentation for a complete explanation of the arguments.
# design file
input = false
# bam coverage
skip_coverage = false
# bam depth
skip_depth = false
# plotfingerprint
numberOfSamples = 500000
skip_plotfingerprint = false
# MACS3 parameters
skip_macs = false
gsize = false
qvalue = 0.05
pvalue = false
broad = false
broad_cutoff = 0.1
# count
min_reps_consensus = 1
# ATAC seq
ATAC = false
# save directory
out_dir = "${PWD}/results"
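Several of these defaults could also be overridden together through a parameters file rather than individual command-line flags. The sketch below uses Nextflow's `-params-file` option. The file name `params.yml` and the chosen values are only examples, and it assumes the workflow reads these names as ordinary Nextflow parameters.

```shell
#!/bin/bash
# Write an example params file overriding a few of the defaults listed above.
# The keys are parameter names from this section; the values are illustrative.
cat > params.yml <<'EOF'
qvalue: 0.01
broad: true
broad_cutoff: 0.1
skip_plotfingerprint: true
EOF

# The file could then be supplied alongside the remaining flags
# (not run here):
#   nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
#     -profile slurm \
#     -params-file params.yml \
#     --input "${PWD}/design.csv" \
#     --gsize 2.34167e+09 \
#     --out_dir "${PWD}/results"
```

Parameters given on the command line take precedence over those in the file, so a single value can still be changed per run.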
References
[1] MACS: Model-based analysis for ChIP-seq [Internet]. Available from: https://github.com/macs3-project/MACS
[2] Bedtools: A powerful toolset for genome arithmetic [Internet]. Available from: https://bedtools.readthedocs.io/en/latest/
[3] bedGraphToBigWig [Internet]. Available from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v385/bedGraphToBigWig
[4] Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–165.