nextflow-pairwise-blast

This Nextflow pipeline performs pairwise BLAST alignments to support clonal identification analysis. The pipeline aligns assembled FASTA sequences against multiple reference sequences using BLAST, extracts alignment scores, and identifies the best alignment for each sample.

Pipeline Overview

The pipeline creates a Cartesian product of all assembled FASTA files and reference sequences, performing comprehensive BLAST alignment across all combinations. The final output includes alignment scores for each sample-reference combination and a summary of the best alignments.
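The pairing described above can be illustrated with a minimal shell sketch. It is not the pipeline's own code (that lives in main.nf and the module files and may differ); `list_pairs` is a hypothetical helper that only prints which sample-reference combinations would be aligned, assuming assembled files end in `.fasta` and references end in `.fa` or `.fasta`.

```bash
# Print one "sample<TAB>reference" line for every pairing of an assembled
# FASTA file with a reference FASTA file (a Cartesian product).
# list_pairs is a hypothetical helper, not part of the pipeline.
list_pairs() (
    assembled_dir="$1"
    ref_dir="$2"
    for sample in "$assembled_dir"/*.fasta; do
        [ -e "$sample" ] || continue          # glob matched nothing
        for ref in "$ref_dir"/*.fa "$ref_dir"/*.fasta; do
            [ -e "$ref" ] || continue         # glob matched nothing
            printf '%s\t%s\n' "$(basename "$sample")" "$(basename "$ref")"
        done
    done
)
```

Calling `list_pairs tests/assembled_fa tests/ref` would list every combination; with S assembled files and R references, that is S × R lines.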

Pairwise Alignment Workflow

The pipeline processes assembled sequences through the following key steps:

BLAST2: Performs BLAST alignment of each assembled sequence against all reference sequences
BEST_ALIGNMENTS: Identifies the best alignment for each sample based on BLAST scores
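To make the BEST_ALIGNMENTS idea concrete, here is a rough sketch of picking the top-scoring reference per sample. The three-column, tab-separated input (sample, reference, score) and the "highest score wins" rule are assumptions made for illustration; the actual module may use a different score column, file layout, or tie-breaking rule. `best_per_sample` is a hypothetical name.

```bash
# Read "sample<TAB>reference<TAB>score" rows on stdin and print, for each
# sample, the row with the highest score. Ties keep the first row seen.
# Column layout and scoring rule are assumptions, not the pipeline's format.
best_per_sample() {
    awk -F'\t' '
        !($1 in best) || $3 + 0 > best[$1] + 0 {
            if (!($1 in best)) order[++n] = $1   # remember first-seen order
            best[$1] = $3
            row[$1] = $0
        }
        END { for (i = 1; i <= n; i++) print row[order[i]] }
    '
}
```

For example, `printf 'A01\tpBR322\t100\nA01\tpUC19\t250\n' | best_per_sample` keeps only the `pUC19` row for sample `A01`.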

Dependencies

This pipeline requires installation of:

Nextflow: Workflow management system
Docker: Containerization platform for running pipeline processes

Docker Containers

All Docker containers used in this pipeline should be publicly available. Each is specified in its respective module file:

BLAST2: seqwell/fq_assemble:v1.0
BEST_ALIGNMENTS: ubuntu:22.04

How to Run the Pipeline

Required Parameters

The pipeline requires the following parameters:

`--assembled_fa`

Path to a directory containing assembled FASTA files (*.fasta). Each FASTA file represents an assembled sequence to be aligned against references. This can be either a local absolute path or an AWS S3 URI. If using an S3 URI, ensure your AWS credentials are properly configured in the nextflow.config file.

`--ref`

Path to a directory containing reference FASTA files (*.fa or *.fasta). Each FASTA file will be used as a BLAST reference database. This can be either a local absolute path or an AWS S3 URI. If using an S3 URI, ensure your AWS credentials are properly configured in the nextflow.config file.

`--output`

Path to the output directory where results will be saved. This can be either a local absolute path or an AWS S3 URI. If using an S3 URI, ensure your AWS credentials are properly configured in the nextflow.config file.

`--run_id`

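A unique identifier for the sequencing run being analysed. Judging by the expected outputs, it is used as the prefix of the summary file (for example, `test_best_alignments.csv` when `--run_id` is `test`).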
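Before launching a run, it can help to confirm that the required parameters point at usable inputs. The following is a minimal pre-flight sketch, not something the pipeline ships: `has_fasta` and `check_inputs` are hypothetical helpers, and S3 URIs are simply skipped because they cannot be checked with local file tests.

```bash
# Succeed if the local directory holds at least one file with any of the
# given extensions. S3 URIs are accepted without checking.
# Both functions are hypothetical helpers, not part of the pipeline.
has_fasta() (
    dir="$1"
    shift
    case "$dir" in
        s3://*) exit 0 ;;                     # remote URI: nothing to test locally
    esac
    for ext in "$@"; do
        for f in "$dir"/*."$ext"; do
            [ -e "$f" ] && exit 0
        done
    done
    exit 1
)

# Validate the values intended for --assembled_fa, --ref and --run_id.
check_inputs() (
    assembled_fa="$1"
    ref="$2"
    run_id="$3"
    has_fasta "$assembled_fa" fasta || { echo "no *.fasta files in $assembled_fa" >&2; exit 1; }
    has_fasta "$ref" fa fasta || { echo "no *.fa or *.fasta files in $ref" >&2; exit 1; }
    [ -n "$run_id" ] || { echo "--run_id must not be empty" >&2; exit 1; }
)
```

A call such as `check_inputs "${PWD}/tests/assembled_fa" "${PWD}/tests/ref" "test"` returns a non-zero status and prints a message if something is missing.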

Profiles

Profiles are selected with the `-profile` option on the command line. Common profiles include:

docker: Run the pipeline using Docker containers (the default)
test: Run the pipeline using Docker containers, with parameters set to their default values

Example Commands

Basic Execution

A minimal execution might look like:

nextflow run \
    main.nf \
    --assembled_fa "${PWD}/path/to/assembled/directory" \
    --ref "${PWD}/path/to/references" \
    --run_id "test" \
    --output "blast2_out" \
    -resume -bg

Running Test Data

The pipeline can be run using test data with:

nextflow run \
    main.nf \
    --assembled_fa "${PWD}/tests/assembled_fa" \
    --ref "${PWD}/tests/ref" \
    --run_id "test" \
    --output "blast2_out" \
    -resume -bg

Expected Outputs

└── blast2_out
    ├── best_alignments
    │   └── test_best_alignments.csv
    └── blast2
        ├── EP_1002_A01.final.pBR322.blast.besthit.txt                               # best BLAST hit for this sample-reference pair
        ├── EP_1002_A01.final.pBR322.blast.txt                                       # full BLAST results for this sample-reference pair
        ├── EP_1002_A01.final.pUC19.blast.besthit.txt
        ├── EP_1002_A01.final.pUC19.blast.txt
        ├── EP_1002_A01.final.seqWell_DelwithpUCIDT-KanGoldenGate+.blast.besthit.txt
        ├── EP_1002_A01.final.seqWell_DelwithpUCIDT-KanGoldenGate+.blast.txt
        ......
        ├── EP_1002_A06.final.pBR322.blast.besthit.txt
        ├── EP_1002_A06.final.pBR322.blast.txt
        ├── EP_1002_A06.final.pUC19.blast.besthit.txt
        ├── EP_1002_A06.final.pUC19.blast.txt
        ├── EP_1002_A06.final.seqWell_DelwithpUCIDT-KanGoldenGate+.blast.besthit.txt
        └── EP_1002_A06.final.seqWell_DelwithpUCIDT-KanGoldenGate+.blast.txt
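When post-processing these files, the sample and reference can be recovered from each file name. The sketch below assumes the naming pattern visible in the listing above, `<sample>.final.<reference>.blast.txt` or `<sample>.final.<reference>.blast.besthit.txt`; treating `.final.` as the separator is an inference from that listing, and `parse_blast_name` is a hypothetical helper.

```bash
# Split a BLAST2 output file name into "sample<TAB>reference".
# Assumes names of the form <sample>.final.<reference>.blast[.besthit].txt;
# exits non-zero for anything else. Hypothetical helper, not pipeline code.
parse_blast_name() (
    name=$(basename "$1")
    case "$name" in
        *.blast.besthit.txt) stem=${name%.blast.besthit.txt} ;;
        *.blast.txt)         stem=${name%.blast.txt} ;;
        *) exit 1 ;;
    esac
    case "$stem" in
        *.final.*) ;;
        *) exit 1 ;;
    esac
    sample=${stem%%.final.*}      # text before the first ".final."
    reference=${stem#*.final.}    # text after the first ".final."
    printf '%s\t%s\n' "$sample" "$reference"
)
```

For instance, `parse_blast_name blast2/EP_1002_A01.final.pBR322.blast.besthit.txt` prints `EP_1002_A01` and `pBR322` separated by a tab.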
