We specialize in long-read and long-range technologies
To generate high-quality reference genome assemblies (contig N50 > 1 Mb, scaffold N50 > 10 Mb, and at least QV40), we combine long-read PacBio HiFi sequences for generating contigs, Bionano optical maps and Hi-C for scaffolding, and ultra-long Oxford Nanopore reads for gap-filling. We host several sequencing instruments and generate most of the data in our lab.
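The quality targets above can be checked with a few lines of Python. This is a minimal sketch, not part of our pipeline: the function names (`n50`, `qv_to_error_rate`, `meets_targets`) are hypothetical, and it assumes the contig lengths, scaffold lengths, and QV estimate have already been computed by other tools. It uses the standard definitions: N50 is the length of the sequence at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total assembly size, and QV is the Phred-scaled error rate, QV = -10 log10(error rate), so QV40 corresponds to one error per 10,000 bases.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the cumulative sum to half
        # of the total assembly size defines the N50.
        if running >= half_total:
            return length
    return 0


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error rate."""
    return 10 ** (-qv / 10)


def meets_targets(contig_lengths, scaffold_lengths, qv):
    """Check the three thresholds quoted in the text (hypothetical helper)."""
    return (
        n50(contig_lengths) > 1_000_000          # contig N50 > 1 Mb
        and n50(scaffold_lengths) > 10_000_000   # scaffold N50 > 10 Mb
        and qv >= 40                             # at least QV40
    )
```

For example, `n50([2, 3, 4, 5, 6])` returns `5`: the total is 20, and the two longest sequences (6 + 5 = 11) are the first to reach half of it.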
The Bionano Saphyr instrument generates optical genome maps. These maps are assembled from long molecules ranging from 150 kb to multiple megabases. The assembled maps can then be used to scaffold the PacBio assembly.
We use Arima Hi-C to perform a second round of scaffolding. Because this technology detects long-range interactions, it makes it possible to scaffold entire chromosomes. We have also started to use Hi-C for haplotype phasing of the PacBio contigs.
The Oxford Nanopore Technologies (ONT) PromethION is our latest acquisition. This technology can sequence ultra-long DNA fragments (>100 kb), albeit at lower accuracy than HiFi data. Some regions of the genome cannot be spanned by HiFi reads, either because of sequencing bias or because the repeats are too long, and this leaves gaps in the assembly. We have started to use ultra-long ONT data to fill these remaining gaps.
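To illustrate what "remaining gaps" means in practice, here is a minimal sketch that lists the gaps in a scaffold sequence. It assumes the common convention that gaps are written as runs of `N` in the scaffold; the function name `find_gaps` and its `min_len` parameter are hypothetical and not taken from our pipeline.

```python
import re


def find_gaps(sequence, min_len=1):
    """Return (start, end) pairs for runs of N/n at least min_len long.

    Coordinates are 0-based and end-exclusive, like Python slices.
    """
    gaps = []
    for match in re.finditer(r"[Nn]+", sequence):
        if match.end() - match.start() >= min_len:
            gaps.append((match.start(), match.end()))
    return gaps


print(find_gaps("ACGTNNNNACGTnnAC"))  # [(4, 8), (12, 14)]
```

Each reported interval is a candidate region where an ultra-long read that spans both flanks could supply the missing sequence.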
VGP pipeline v. 2.0
This is the latest version of our VGP genome assembly pipeline (v. 2.0), which uses PacBio HiFi data, Bionano optical maps, and Hi-C data. We are actively working on the next version of the pipeline, which will incorporate ONT data and use Hi-C data for haplotype phasing without parental information.