hybracter
Commands and Input CSV
Commands
hybracter install
: Downloads and installs the required plassembler database.
hybracter hybrid
: Assembles multiple genomes from isolates that have long reads and paired-end short reads.
hybracter hybrid-single
: Assembles a single genome from an isolate with long reads and paired-end short reads. It takes similar parameters to Unicycler.
hybracter long
: Assembles multiple genomes from isolates that have long reads only.
hybracter long-single
: Assembles a single genome from an isolate with long reads only.
hybracter test-hybrid
: Runs hybracter hybrid on a small test dataset.
hybracter test-long
: Runs hybracter long on a small test dataset.
hybracter -h
_ _ _
| |__ _ _| |__ _ __ __ _ ___| |_ ___ _ __
| '_ \| | | | '_ \| '__/ _` |/ __| __/ _ \ '__|
| | | | |_| | |_) | | | (_| | (__| || __/ |
|_| |_|\__, |_.__/|_| \__,_|\___|\__\___|_|
|___/
Usage: hybracter [OPTIONS] COMMAND [ARGS]...
For more options, run: hybracter command --help
Options:
-h, --help Show this message and exit.
Commands:
install Downloads and installs the plassembler database
hybrid Run hybracter with hybrid long and paired end short reads
hybrid-single Run hybracter hybrid on 1 isolate
long Run hybracter with only long reads
long-single Run hybracter long on 1 isolate
test-hybrid Test hybracter hybrid
test-long Test hybracter long
config Copy the system default config file
citation Print the citation(s) for hybracter
version Print the version for hybracter
Input csv
hybracter hybrid and hybracter long require an input csv file to be specified with --input. No other inputs are required.
- This file requires no headers.
- Other than the reads, hybracter requires a lower bound on the chromosome length for each isolate, in base pairs. It must be an integer. hybracter will denote contigs above this value as chromosome(s), and if it can recover a chromosome, it will denote the isolate as complete.
- In practice, I suggest choosing 90% of the estimated chromosome size for this value.
- e.g. for S. aureus, I'd choose 2500000; for E. coli, 4000000; for P. aeruginosa, 5500000.
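The 90% rule of thumb above is plain arithmetic, and a small helper can make it explicit. This is only a sketch: the function name `suggested_min_chrom_length` and the 2.8 Mbp estimate are illustrative assumptions, not part of hybracter.

```python
def suggested_min_chrom_length(estimated_size_bp: int, percent: int = 90) -> int:
    """Return `percent` % of an estimated chromosome size, in whole base pairs."""
    if estimated_size_bp <= 0:
        raise ValueError("estimated_size_bp must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    # Integer arithmetic guarantees an integer result, as the input csv requires.
    return estimated_size_bp * percent // 100


# Assumed estimate of 2.8 Mbp for an S. aureus chromosome (illustrative only).
print(suggested_min_chrom_length(2_800_000))  # 2520000
```

The result is close to the round figure of 2500000 suggested above for S. aureus; rounding down to a convenient number keeps the bound conservative.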
hybracter hybrid
hybracter hybrid requires an input csv file with 5 columns.
- Each row is a sample.
- Column 1 is the sample name you want for this isolate.
- Column 2 is the long read fastq file.
- Column 3 is the minimum chromosome length for that sample.
- Column 4 is the R1 short read fastq file.
- Column 5 is the R2 short read fastq file.
e.g.
s_aureus_sample1,sample1_long_read.fastq.gz,2500000,sample1_SR_R1.fastq.gz,sample1_SR_R2.fastq.gz
p_aeruginosa_sample2,sample2_long_read.fastq.gz,5500000,sample2_SR_R1.fastq.gz,sample2_SR_R2.fastq.gz
s_aureus_sample1 | sample1_long_read.fastq.gz | 2500000 | sample1_SR_R1.fastq.gz | sample1_SR_R2.fastq.gz |
---|---|---|---|---|
p_aeruginosa_sample2 | sample2_long_read.fastq.gz | 5500000 | sample2_SR_R1.fastq.gz | sample2_SR_R2.fastq.gz |
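If you have many isolates, it may be easier to generate the 5-column file than to type it. The following is a minimal sketch using Python's standard `csv` module; `write_hybrid_csv` and `EXAMPLE_SAMPLES` are hypothetical names, and the file names are the placeholders from the example above.

```python
import csv

# (sample name, long reads, minimum chromosome length, R1, R2)
EXAMPLE_SAMPLES = [
    ("s_aureus_sample1", "sample1_long_read.fastq.gz", 2500000,
     "sample1_SR_R1.fastq.gz", "sample1_SR_R2.fastq.gz"),
    ("p_aeruginosa_sample2", "sample2_long_read.fastq.gz", 5500000,
     "sample2_SR_R1.fastq.gz", "sample2_SR_R2.fastq.gz"),
]


def write_hybrid_csv(samples, path):
    """Write one headerless 5-column row per sample."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        for name, long_reads, min_chrom_length, r1, r2 in samples:
            # Column 3 must be an integer number of base pairs.
            writer.writerow([name, long_reads, int(min_chrom_length), r1, r2])
```

Calling `write_hybrid_csv(EXAMPLE_SAMPLES, "hybrid_input.csv")` should reproduce the two rows shown above; the output path is whatever you later pass to --input.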
hybracter long
hybracter long also requires an input csv file with no headers, but with only 3 columns.
- Each row is a sample.
- Column 1 is the sample name you want for this isolate.
- Column 2 is the long read fastq file.
- Column 3 is the minimum chromosome length for that sample.
e.g.
s_aureus_sample1,sample1_long_read.fastq.gz,2500000
p_aeruginosa_sample2,sample2_long_read.fastq.gz,5500000
s_aureus_sample1 | sample1_long_read.fastq.gz | 2500000 |
---|---|---|
p_aeruginosa_sample2 | sample2_long_read.fastq.gz | 5500000 |
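Because the two modes differ only in column count, a quick pre-flight check can catch a stray header row or a non-integer chromosome length before a run starts. This is a sketch under stated assumptions: `check_input_csv` is a hypothetical helper, not a hybracter command, and it only checks the layout described here (it does not test whether the read files exist).

```python
import csv
import re

EXPECTED_COLUMNS = {"hybrid": 5, "long": 3}


def check_input_csv(path, mode):
    """Return a list of problems found in a hybracter-style input csv."""
    expected = EXPECTED_COLUMNS[mode]
    problems = []
    with open(path, newline="") as handle:
        for line_no, row in enumerate(csv.reader(handle), start=1):
            if not row:
                continue  # ignore completely blank lines
            if len(row) != expected:
                problems.append(
                    f"line {line_no}: expected {expected} columns, found {len(row)}"
                )
                continue
            # Column 3 is the minimum chromosome length and must be an integer.
            if not re.fullmatch(r"[0-9]+", row[2].strip()):
                problems.append(
                    f"line {line_no}: column 3 is not an integer: {row[2]!r}"
                )
    return problems
```

An empty list means the file matches the expected layout. A header row is reported as well, since its third field is not a number.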
PacBio Reads
Thanks to Brad Hart (@biobrad) for volunteering this tutorial on how to run Hybracter with PacBio long reads.
Hybracter works with PacBio long reads once they have been extracted from the PacBio BAM file.
To extract the raw fastq long reads from the BAM file:
First, install pbtk: https://github.com/PacificBiosciences/pbtk
conda create -n pacbio -c bioconda pbtk
conda activate pacbio
Next, create the 'pbi' index file that the fastq extraction requires:
pbindex yourpacbiofile.bam
Then to extract the fastq file from the bam:
bam2fastq -o yourfastqfilename yourpacbiofile.bam
The output file will be:
yourfastqfilename.fastq.gz
which you can feed into any of the Hybracter input formats as above.
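For several BAM files, the two pbtk steps above can be scripted. The sketch below only wraps the exact commands shown in this tutorial (`pbindex` and `bam2fastq -o`); the function names are hypothetical, and it assumes the `pacbio` conda environment is active so that both tools are on the PATH.

```python
import subprocess


def pacbio_extraction_commands(bam_path, fastq_prefix):
    """Return the two pbtk commands from the tutorial and the expected output name."""
    commands = [
        ["pbindex", bam_path],                        # creates the 'pbi' index
        ["bam2fastq", "-o", fastq_prefix, bam_path],  # writes <prefix>.fastq.gz
    ]
    return commands, f"{fastq_prefix}.fastq.gz"


def extract_pacbio_fastq(bam_path, fastq_prefix):
    """Run the commands in order, stopping at the first failure."""
    commands, fastq_path = pacbio_extraction_commands(bam_path, fastq_prefix)
    for command in commands:
        # check=True raises CalledProcessError if a tool exits with an error.
        subprocess.run(command, check=True)
    return fastq_path
```

The returned path is what would go in column 2 of the input csv for that sample.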