Aurelien Brionne authored

peak_calling V1.1

The peak_calling workflow follows the FAIR principles. It is written in Nextflow DSL2, runs its software from a Singularity container, is optimized for computing resources (CPU, memory), and is designed for use on a computing cluster with a Slurm scheduler.

  • Peak calling (broad/narrow) with MACS3 [1].
  • Normalised bigWig files, scaled to 1 million mapped reads, with BEDTools [2] and bedGraphToBigWig [3].
  • Genome-wide enrichment with deepTools [4].
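
The normalisation in the second item comes down to simple arithmetic: scaling coverage to 1 million mapped reads means multiplying it by 1,000,000 / N, where N is the number of mapped reads in the BAM file. The sketch below shows that calculation. The helper name `scale_factor` is hypothetical, and the samtools/BEDTools commands in the comments are an assumption about how such a step is commonly run, not necessarily what this workflow does internally.

```shell
#!/bin/bash
# Hypothetical helper: factor that scales coverage to 1 million mapped reads.
scale_factor() {
    local mapped_reads="$1"
    # Fail (exit status 1) when the read count is zero or negative.
    awk -v n="$mapped_reads" 'BEGIN { if (n <= 0) exit 1; printf "%.6f\n", 1000000 / n }'
}

# Example: a BAM file with 25 million mapped reads.
scale_factor 25000000   # prints 0.040000

# Assumed downstream usage (not executed here; needs samtools, bedtools, bedGraphToBigWig):
#   n=$(samtools view -c -F 260 sample.bam)   # mapped, primary alignments only
#   bedtools genomecov -ibam sample.bam -bg -scale "$(scale_factor "$n")" \
#       | LC_ALL=C sort -k1,1 -k2,2n > sample.bedGraph
#   bedGraphToBigWig sample.bedGraph chrom.sizes sample.bw
```

With this factor, a position covered by 50 reads in a 25-million-read library is reported as 2 in the bigWig file, which makes tracks from libraries of different depth comparable.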

The genome_aln workflow is available for the first step (read alignment).

Install the workflow and build the Singularity image

Clone the peak_calling repository and build a local Singularity image (requires system administrator rights) from the provided Singularity definition file.

git clone https://forgemia.inra.fr/lpgp/peak_calling.git
sudo singularity build ./peak_calling/singularity/peak_calling.sif ./peak_calling/singularity/peak_calling.def

Usage example: ATAC-seq

The design.csv file must have ID and target headers and be written with a comma separator.

ID,target
A,/path/to/targetA.bam
B,/path/to/targetB.bam
C,/path/to/targetC.bam
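
Because the design file has to be comma-separated with an exact header, a quick check before submitting the job can save a failed run. The following is a minimal sketch, not part of the workflow: the function name `check_design` is hypothetical, and it only verifies the header line and the number of comma-separated fields per row, not that the BAM files exist.

```shell
#!/bin/bash
# Hypothetical checker: verify the header and field count of a design file.
# Usage: check_design <file> <expected_header>
check_design() {
    local file="$1" expected_header="$2"
    awk -F',' -v header="$expected_header" '
        { sub(/\r$/, "") }                 # tolerate Windows line endings
        NR == 1 {
            if ($0 != header) { print "bad header: " $0 > "/dev/stderr"; bad = 1; exit }
            ncol = NF; next
        }
        NF == 0 { next }                   # skip blank lines
        NF != ncol {
            print "line " NR ": expected " ncol " fields, got " NF > "/dev/stderr"
            bad = 1
        }
        END { exit bad + 0 }               # non-zero status when a problem was found
    ' "$file"
}

# Demonstration on a temporary ATAC-seq style design file.
demo="$(mktemp)"
printf 'ID,target\nA,/path/to/targetA.bam\nB,/path/to/targetB.bam\n' > "$demo"
check_design "$demo" "ID,target" && echo "design OK"
rm -f "$demo"
```

For a design file that also carries an input column, the same function can be called with `"ID,target,input"` as the expected header.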

narrow peaks

#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--ATAC \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--out_dir "${PWD}/results/"
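
The `--gsize` value is passed to MACS as the effective genome size and depends on the genome being analysed. As a rough starting point, the total assembly length can be summed from a FASTA index (`.fai`), as sketched below. This is an assumption for illustration only: the helper name `genome_length` is hypothetical, the total length is an upper bound on the mappable genome size that MACS ideally expects, and nothing here states how the value in the example above was obtained.

```shell
#!/bin/bash
# Hypothetical helper: total genome length from a samtools .fai index
# (column 2 holds each sequence length), printed in scientific notation.
genome_length() {
    awk -F'\t' '{ total += $2 } END { printf "%.5e\n", total }' "$1"
}

# Toy index with two made-up sequences
# (columns: name, length, offset, bases per line, bytes per line).
fai="$(mktemp)"
printf 'seq1\t1000\t6\t60\t61\n' > "$fai"
printf 'seq2\t500\t1030\t60\t61\n' >> "$fai"
genome_length "$fai"   # prints 1.50000e+03
rm -f "$fai"
```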

broad peaks

#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--ATAC \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--broad \
--broad_cutoff 0.1 \
--out_dir "${PWD}/results"

Usage example: ChIP-seq

The design.csv file must have ID, target and input headers and be written with a comma separator.

ID,target,input
A,/path/to/targetA.bam,/path/to/inputA.bam
B,/path/to/targetB.bam,/path/to/inputB.bam
C,/path/to/targetC.bam,/path/to/inputC.bam

narrow peaks

#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--out_dir "${PWD}/results/"

broad peaks

#!/bin/bash
#SBATCH -J peak_calling
#SBATCH --mem=10GB
#SBATCH -p unlimitq
module load containers/singularity/3.9.9
module load bioinfo/Nextflow/23.04.3
nextflow run /work/project/lpgp/Nextflow/peak_calling/ \
-profile slurm \
--input "${PWD}/design.csv" \
--gsize 2.34167e+09 \
--qvalue 0.05 \
--broad \
--broad_cutoff 0.1 \
--out_dir "${PWD}/results"

Default parameters

Please refer to the MACS [1] and deepTools [4] documentation for a complete explanation of the arguments.

# design file
input = false

# bam coverage
skip_coverage = false

# bam depth
skip_depth = false

# plotfingerprint
numberOfSamples = 500000
skip_plotfingerprint = false

# MACS2 parameters
skip_macs = false
gsize = false
qvalue = 0.05
pvalue = false
broad = false
broad_cutoff = 0.1

# count
min_reps_consensus = 1

# ATAC seq
ATAC = false

# save directory
out_dir = "${PWD}/results"
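
Any of these defaults can be overridden on the command line, as in the usage examples. When several values change at once, a Nextflow parameters file passed with `-params-file` may be easier to keep under version control. The sketch below assumes the defaults listed above are ordinary Nextflow params; the helper name `write_params` and the chosen values are hypothetical, and the command is only printed, not run.

```shell
#!/bin/bash
# Hypothetical helper: write a YAML params file overriding a few defaults.
write_params() {
    cat > "$1" <<'EOF'
qvalue: 0.01
skip_plotfingerprint: true
EOF
}

workdir="$(mktemp -d)"
write_params "${workdir}/params.yaml"

# Print (rather than run) a command that would use the file.
echo "nextflow run /work/project/lpgp/Nextflow/peak_calling/ -profile slurm" \
     "-params-file ${workdir}/params.yaml" \
     "--input design.csv --gsize 2.34167e+09"
```

Values given directly on the command line take precedence over those in the parameters file, so a single run can still deviate from the shared file.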

References

  1. MACS: Model-based analysis for ChIP-seq [Internet]. Available from: https://github.com/macs3-project/MACS
  2. Bedtools: A powerful toolset for genome arithmetic [Internet]. Available from: https://bedtools.readthedocs.io/en/latest/
  3. bedGraphToBigWig [Internet]. Available from: http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v385/bedGraphToBigWig
  4. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–165.