Hi-Plex Sequencing


UNDR ROVER - a fast and accurate variant caller for targeted DNA sequencing

Authors: Daniel J. Park, Roger Li, Edmund Lau, Peter Georgeson, Tú Nguyen-Dumont and Bernard J. Pope

Journal: BMC Bioinformatics, 2016

Topics: PCR-MPS; Hi-Plex; ROVER; Targeted sequencing; Massively parallel sequencing; Variant calling



Previously, we described ROVER, a DNA variant caller which identifies genetic variants from PCR-targeted massively parallel sequencing (MPS) datasets generated by the Hi-Plex protocol. ROVER permits stringent filtering of sequencing chemistry-induced errors by requiring reported variants to appear in both reads of overlapping pairs above certain thresholds of occurrence. ROVER was developed in tandem with Hi-Plex and has been used successfully to screen for genetic mutations in the breast cancer predisposition gene PALB2.
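The filtering rule described above can be illustrated with a minimal sketch. This is not the ROVER source code: the function name, the representation of a variant as a (position, reference base, alternate base) tuple, and the default threshold values are all assumptions made for illustration.

```python
from collections import Counter

def call_variants(read_pairs, min_count=2, min_proportion=0.15):
    """Keep variants confirmed by both reads of enough overlapping pairs.

    read_pairs: list of (variants_in_read1, variants_in_read2) for one amplicon,
    where each element is a set of hypothetical (position, ref, alt) tuples.
    The default thresholds are illustrative values, not ROVER's defaults.
    """
    total = len(read_pairs)
    support = Counter()
    for read1_variants, read2_variants in read_pairs:
        # A variant counts for a pair only if BOTH reads of that pair show it.
        support.update(read1_variants & read2_variants)
    return {
        variant: count
        for variant, count in support.items()
        if count >= min_count and count / total >= min_proportion
    }

# Usage: a true variant seen in both reads of three pairs, plus an
# artefact that appears in only one read of one pair.
variant = (100, "A", "G")
artefact = (57, "C", "T")
pairs = [
    ({variant}, {variant}),
    ({variant}, {variant}),
    ({variant, artefact}, {variant}),
    (set(), set()),
]
calls = call_variants(pairs, min_count=2, min_proportion=0.5)
```

Because the artefact is never present in both reads of a pair, it gains no support and is dropped regardless of the thresholds chosen.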

ROVER is applied to MPS data in BAM format and, therefore, relies on sequence reads being mapped to a reference genome. In this paper, we describe an improvement to ROVER, called UNDR ROVER (Unmapped primer-Directed ROVER), which accepts MPS data in FASTQ format, avoiding the need for a computationally expensive mapping stage. It does so by taking advantage of the location-specific nature of PCR-targeted MPS data.


The UNDR ROVER algorithm achieves the same stringent variant calling as its predecessor with a significant runtime performance improvement. In one indicative sequencing experiment, UNDR ROVER (in its fastest mode) required 8-fold less sequential computation time than the ROVER pipeline and 13-fold less sequential computation time than a variant calling pipeline based on the popular GATK tool.

UNDR ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). It requires as input a tab-delimited format file containing primer sequence information, a FASTA format file containing the reference genome sequence, and paired FASTQ files containing sequence reads. Primer sequences at the 5′ end of reads associate read-pairs with their targeted amplicon and, thus, their expected corresponding coordinates in the reference genome. The primer-intervening sequence of each read is compared against the reference sequence from the same location and variants are identified using the same algorithm as ROVER. Specifically, for a variant to be ‘called’ it must appear at the same location in both of the overlapping reads above user-defined thresholds of minimum number of reads and proportion of reads.
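The primer-directed step can be sketched as follows. This is a simplified illustration rather than the UNDR ROVER implementation: it assumes exact primer matches, reads that span the whole primer-intervening sequence, and substitutions only (no insertions or deletions). The primer table, sequences, and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Hypothetical primer table: forward primer, reverse primer, and the
# forward-strand reference sequence lying between the two primers.
PRIMERS = {
    "amplicon_1": {"fwd": "ACGTAC", "rev": "GGTTCA", "ref": "GGATCCGATT"},
}

def assign_amplicon(read1, read2, primers=PRIMERS):
    """Name the amplicon whose primers start both reads, or None."""
    for name, p in primers.items():
        if read1.startswith(p["fwd"]) and read2.startswith(p["rev"]):
            return name
    return None

def insert_substitutions(read, primer, ref, reverse=False):
    """Single-base differences between the read's insert and the reference."""
    insert = read[len(primer):len(primer) + len(ref)]
    if reverse:
        # Read 2 comes from the opposite strand, so flip it before comparing.
        insert = revcomp(insert)
    return {(i, r, a) for i, (r, a) in enumerate(zip(ref, insert)) if r != a}

def pair_variants(read1, read2, primers=PRIMERS):
    """Return (amplicon name, substitutions present in both reads)."""
    name = assign_amplicon(read1, read2, primers)
    if name is None:
        return None, set()
    p = primers[name]
    v1 = insert_substitutions(read1, p["fwd"], p["ref"])
    v2 = insert_substitutions(read2, p["rev"], p["ref"], reverse=True)
    return name, v1 & v2

# Usage: a T>C change at offset 3 of the insert, present in both reads.
alt = "GGACCCGATT"
read1 = "ACGTAC" + alt + revcomp("GGTTCA")
read2 = "GGTTCA" + revcomp(alt) + revcomp("ACGTAC")
result = pair_variants(read1, read2)
```

Positions here are 0-based offsets into the primer-intervening reference; a real tool would translate them to genome coordinates using the amplicon's known location.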


UNDR ROVER provides the same rapid and accurate genetic variant calling as its predecessor with greatly reduced computational costs.

Evaluation of germline BRCA1 and BRCA2 mutations in a multi-ethnic Asian cohort of ovarian cancer patients

Authors: Hasmad HN, Lai KN, Wen WX, Park DJ, Nguyen-Dumont T, Kang PC, Thirthagiri E, Ma’som M, Lim BK, Southey M, Woo YL, Teo SH

Journal: Gynecologic Oncology, 2015

Topics: Ovarian Cancer, BRCA1, BRCA2, Mutation Screening


Despite the discovery of breast and ovarian cancer predisposition genes BRCA1 and BRCA2 more than two decades ago, almost all the available data relate to women of European ancestry, with only a handful of studies in Asian populations. In this study, we determined the frequency of germline alterations in BRCA1 and BRCA2 in ovarian cancer patients from a multi-ethnic cross-sectional cohort of Asian ovarian cancer patients from Malaysia.

Hi-Plex targeted sequencing is effective using DNA derived from archival dried blood spots

Authors: Nguyen-Dumont T, Mahmoodi M, Hammet F, Tran T, Tsimiklis H, Kathleen Cuningham Foundation Consortium for Research into Familial Breast Cancer (kConFab), Giles GG, Hopper JL, Australian Breast Cancer Family Registry, Southey MC, Park DJ

Journal: Analytical Biochemistry, 2015

Topics: Hi-Plex, Massively parallel sequencing, Targeted sequencing, Dried blood spot, Guthrie Card, Archival DNA


Many genetic epidemiology resources have collected dried blood spots (predominantly as Guthrie Cards) as an economical and efficient means of archiving sources of DNA, conferring great value to genetic screening methods that are compatible with this medium. We applied Hi-Plex to screen the breast cancer predisposition gene PALB2 in 93 Guthrie Card-derived DNA specimens previously characterized for PALB2 genetic variants via DNA derived from lymphoblastoid cell lines, whole blood, and buffy coat. Of the 93 archival Guthrie Card-derived DNAs, 92 (99%) were processed successfully and sequenced using approximately half of a MiSeq run. From these 92 DNAs, all 59 known variants were detected and no false-positive variant calls were yielded. Fully 98.13% of amplicons (5417/5520) were represented within 15-fold of the median coverage (2786 reads), and 99.98% of amplicons (5519/5520) were represented at a depth of 10 read-pairs or greater. With Hi-Plex, we show for the first time that a high-plex amplicon-based massively parallel sequencing (MPS) system can be applied effectively to DNA prepared from dried blood spot archival specimens and, as such, can dramatically increase the scopes of both method and resource.
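The two coverage figures quoted above can be computed from per-amplicon read depths along the following lines. This is a sketch under stated assumptions: "within 15-fold of the median" is interpreted here as a depth between median/15 and median×15, and the function name and example depths are invented for illustration.

```python
from statistics import median

def coverage_summary(depths, fold=15, min_depth=10):
    """Return (fraction within `fold` of the median, fraction at >= min_depth).

    depths: per-amplicon read-pair counts pooled across samples.
    The fold interpretation (median/fold <= depth <= median*fold) is an assumption.
    """
    med = median(depths)
    n = len(depths)
    within = sum(1 for d in depths if med / fold <= d <= med * fold)
    deep = sum(1 for d in depths if d >= min_depth)
    return within / n, deep / n

# Usage with made-up depths: median is 100, so the accepted range is about 6.7 to 1500.
within_fraction, deep_fraction = coverage_summary([100, 100, 100, 5, 2000])

# The percentages reported in the abstract follow directly from the counts.
reported_within = round(5417 / 5520 * 100, 2)
reported_deep = round(5519 / 5520 * 100, 2)
```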

Mutation screening of PALB2 in clinically ascertained families from the Breast Cancer Family Registry

Authors: Tú Nguyen-Dumont, Fleur Hammet, Maryam Mahmoodi, Helen Tsimiklis, Zhi L. Teo, Roger Li, Bernard J. Pope, Mary Beth Terry, Saundra S. Buys, Mary Daly, John L. Hopper, Ingrid Winship, David E. Goldgar, Daniel J. Park, Melissa C. Southey

Journal: Breast Cancer Research and Treatment, 2015

Topics: Breast Cancer, Bioinformatics, Genomics


Loss-of-function mutations in PALB2 are associated with an increased risk of breast cancer, with recent data showing that female breast cancer risks for PALB2 mutation carriers are comparable in magnitude to those for BRCA2 mutation carriers. This study applied targeted massively parallel sequencing to characterize the mutation spectrum of PALB2 in probands attending breast cancer genetics clinics in the USA. The coding regions and proximal intron–exon junctions of PALB2 were screened in probands not known to carry a mutation in BRCA1 or BRCA2 from 1,250 families enrolled through familial cancer clinics by the Breast Cancer Family Registry. Mutation screening was performed using Hi-Plex, an amplicon-based targeted massively parallel sequencing platform. Screening of PALB2 was successful in 1,240/1,250 probands and identified nine women with protein-truncating mutations (three nonsense mutations and five frameshift mutations). Four of the 33 missense variants were predicted to be deleterious to protein function by in silico analysis using two different programs. Analysis of tumors from carriers of truncating mutations revealed that the majority were high histological grade, invasive ductal carcinomas. Young onset was apparent in most families, with 19 breast cancers under 50 years of age, including eight under the age of 40 years. Our data demonstrate the utility of Hi-Plex in the context of high-throughput testing for rare genetic mutations and provide additional timely information about the nature and prevalence of PALB2 mutations, to enhance risk assessment and risk management of women at high risk of cancer attending clinical genetic services.

Abridged adapter primers increase the target scope of Hi-Plex

Authors: Tú Nguyen-Dumont, Fleur Hammet, Maryam Mahmoodi, Bernard J. Pope, Graham G. Giles, John L. Hopper, Melissa C. Southey, and Daniel J. Park

Journal: BioTechniques, 2015

Topics: Multiplex PCR, DNA Sequencing

Listed in the top ten BioTechniques peer-reviewed papers of 2015.


Previously, we reported Hi-Plex, an amplicon-based method for targeted massively parallel sequencing capable of generating 60 amplicons simultaneously. In further experiments, however, we found our approach did not scale to higher amplicon numbers. Here, we report a modification to the original Hi-Plex protocol that includes the use of abridged adapter oligonucleotides as universal primers (bridge primers) in the initial PCR mixture. Full-length adapter primers (indexing primers) are included only during latter stages of thermal cycling with concomitant application of elevated annealing temperatures. Using this approach, we demonstrate the application of Hi-Plex across a broad range of amplicon numbers (16-plex, 62-plex, 250-plex, and 1003-plex) while preserving the low amount (25 ng) of input DNA required.

ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets

Authors: B. Pope, T. Nguyen-Dumont, F. Hammet and D. Park.

Journal: Source Code for Biology and Medicine, 2014

Topics: Bioinformatics, Genomics, DNA Sequencing, Variant Calling


We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage.

Hi-Plex for high-throughput mutation screening: application to the breast cancer susceptibility gene PALB2

Authors: T. Nguyen-Dumont, Z.L. Teo, B.J. Pope, F. Hammet, M. Mahmoodi, H. Tsimiklis, N. Sabbaghian, M. Tischkowitz, W.D. Foulkes, Kathleen Cuningham Foundation Consortium for Research into Familial Breast Cancer (kConFab), G.G. Giles, J.L. Hopper, Australian Breast Cancer Family Registry, M.C. Southey, D.J. Park.

Journal: BMC Medical Genomics, 2013

Topics: Bioinformatics, Genomics, DNA Sequencing, Breast Cancer, Multiplex PCR


Massively parallel sequencing (MPS) has revolutionised biomedical research and offers enormous capacity for clinical application. We previously reported Hi-Plex, a streamlined highly-multiplexed PCR-MPS approach, allowing a given library to be sequenced with both the Ion Torrent and TruSeq chemistries. Comparable sequencing efficiency was achieved using material derived from lymphoblastoid cell lines and formalin-fixed paraffin-embedded tumour. Here, we report high-throughput application of Hi-Plex by performing blinded mutation screening of the coding regions of the breast cancer susceptibility gene PALB2 on a set of 95 blood-derived DNA samples that had previously been screened using Sanger sequencing and high-resolution melting curve analysis (n = 90), or genotyped by Taqman probe-based assays (n = 5). Hi-Plex libraries were prepared simultaneously using relatively inexpensive, readily available reagents in a simple half-day protocol followed by MPS on a single MiSeq run. We observed that 99.93% of amplicons were represented at ≥10X coverage. All 56 previously identified variant calls were detected and no false positive calls were assigned. Four additional variant calls were made and confirmed upon re-analysis of previous data or subsequent Sanger sequencing. These results support Hi-Plex as a powerful approach for rapid, cost-effective and accurate high-throughput mutation screening. They further demonstrate that Hi-Plex methods are suitable for and can meet the demands of high-throughput genetic testing in research and clinical settings.

Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing

Authors: T. Nguyen-Dumont, B.J. Pope, F. Hammet, M. Mahmoodi, H. Tsimiklis, M.C. Southey, D.J. Park.

Journal: Analytical Biochemistry, 2013

Topics: Bioinformatics, Genomics, DNA Sequencing, Multiplex PCR


While per-base sequencing costs have decreased in recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility and protocol complexity. To address these limitations, we previously developed Hi-Plex, a PCR-massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries, and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible and rapid approach for various genetic screening applications.

A high-plex PCR approach for massively parallel sequencing

Authors: T. Nguyen-Dumont, B.J. Pope, F. Hammet, M.C. Southey, and D.J. Park.

Journal: BioTechniques, Vol. 55, No. 2, August 2013, pp. 69–74

Topics: Bioinformatics, Genomics, DNA Sequencing, Multiplex PCR


Current methods for targeted massively parallel sequencing (MPS) have several drawbacks, including limited design flexibility, expense, and protocol complexity, which restrict their application to settings involving modest target size and requiring low cost and high throughput. To address this, we have developed Hi-Plex, a PCR-MPS strategy intended for high-throughput screening of multiple genomic target regions that integrates simple, automated primer design software to control product size. Featuring permissive thermocycling conditions and clamp bias reduction, our protocol is simple, cost- and time-effective, uses readily available reagents, does not require expensive instrumentation, and requires minimal optimization. In a 60-plex assay targeting the breast cancer predisposition genes PALB2 and XRCC2, we applied Hi-Plex to 100 ng LCL-derived DNA, and 100 ng and 25 ng FFPE tumor-derived DNA. Altogether, at least 86.94% of the human genome-mapped reads were on target, and 100% of targeted amplicons were represented within 25-fold of the mean. Using 25 ng FFPE-derived DNA, 95.14% of mapped reads were on-target and relative representation ranged from 10.1-fold lower to 5.8-fold higher than the mean. These results were obtained using only the initial automatically-designed primers present in equal concentration. Hi-Plex represents a powerful new approach for screening panels of genomic target regions.