Duplication hotspots, rare genomic disorders, and common disease
Introduction
The development of cytogenetic techniques, including high-resolution karyotyping and fluorescence in situ hybridization (FISH), in the early 1980s resulted in the identification of microdeletions responsible for Prader–Willi (15q11–q13 deletions) [1] and Smith–Magenis (17p11.2 deletions) [2] syndromes. The term genomic disorder was originally introduced to describe conditions resulting from nonallelic homologous recombination (NAHR) or unequal crossing over between segmental duplications (also known as low copy repeats) [3••]. Over the next decade, continued efforts to fine-map recurrent deletions implicated NAHR for recurrent rearrangements in Charcot–Marie–Tooth disease [4•], hereditary neuropathy with liability to pressure palsies [5•], and Prader–Willi [6], Angelman [7], Smith–Magenis [8•], velocardiofacial [9], Williams–Beuren [10], and Sotos [11] syndromes as well as spinal muscular atrophy [12] and juvenile nephronophthisis type I [13] (Figure 1), to name a few. Molecular diagnosis became possible but relied first on suspecting a specific disorder based on clinical features and second on using a targeted FISH assay for the chromosomal region to confirm the suspected diagnosis: a ‘phenotype first’ approach.
Advances in technology — most notably the introduction of array comparative genomic hybridization (CGH) and single nucleotide polymorphism (SNP) microarrays — now allow rapid evaluation of many targeted loci or the entire genome for submicroscopic deletions and duplications. A significant advantage of these approaches is that a suspected diagnosis is not necessary before performing the diagnostic test. The application of both targeted and whole-genome technologies to large series of patients with mental retardation (MR) or developmental delay [14•, 15, 16, 17, 18•, 19], autism [20, 21, 22, 23•, 24, 25], congenital anomalies [26, 27•, 28, 29], and schizophrenia [30, 31, 32•] has had several important consequences. First, the rate of discovery of novel disorders has increased dramatically. Since 2005, 18 new genomic disorders involving 12 regions of the genome have been described, more than doubling the number of disorders described in the previous 20 years (Table 1). Perhaps more importantly, whole-genome approaches have led to a remarkable shift from a ‘phenotype first’ to a ‘genotype first’ definition of genomic disorders. Whereas previously, disorders were described using clinical features, new disorders are described by their genomic rearrangement and clinical features are compared among patients after a common rearrangement is identified. As the diversity of phenotypes evaluated for pathogenic copy number changes expands, so does the phenotypic diversity associated with at least a subset of recurrent rearrangements — in fact, for some of the rearrangements described below, the ‘phenotype first’ approach would have been nearly impossible.
The underlying genomic architecture in each of the genomic disorders identified to date is similar: a stretch of unique sequence (50 kb–10 Mb) flanked by large (>10 kb), highly homologous (>95%) segmental duplications that provide the substrate for NAHR. In 2002, we used these criteria to identify rearrangement ‘hotspots’ — regions predicted to be susceptible to recurrent rearrangement based on the flanking genomic architecture [33••] — and developed a targeted array CGH assay to evaluate copy number variation in both affected and unaffected individuals. An updated map of predicted hotspots and associated disorders is shown in Figure 1; there are now 21 discrete regions of the genome that undergo recurrent rearrangement, resulting in 33 diseases, and at least 10 additional diseases are the result of NAHR in regions of the genome that are flanked by duplications but do not meet our strict definition of a hotspot.
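The architectural criteria above can be sketched as a simple filter over pairs of duplicated segments. This is a minimal illustration and not the authors' pipeline: the `DupPair` record, its field names, and the coordinates in the usage example are hypothetical, and the sketch assumes that the gap between the two copies stands in for the "unique" interval without checking that the intervening sequence is actually free of other duplications.

```python
from dataclasses import dataclass

# Thresholds taken from the criteria stated in the text.
MIN_DUP_LEN = 10_000       # flanking duplications must be > 10 kb
MIN_IDENTITY = 0.95        # and > 95% identical
MIN_SPACER = 50_000        # intervening sequence of 50 kb ...
MAX_SPACER = 10_000_000    # ... to 10 Mb


@dataclass(frozen=True)
class DupPair:
    """One pairwise alignment between two duplicated segments (hypothetical record)."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int
    identity: float  # fraction of matching bases, 0..1


def is_candidate_hotspot(pair: DupPair) -> bool:
    """Return True if the pair meets the size, identity, and spacing criteria."""
    if pair.chrom_a != pair.chrom_b:
        return False  # only intrachromosomal pairs can flank a single interval
    len_a = pair.end_a - pair.start_a
    len_b = pair.end_b - pair.start_b
    if min(len_a, len_b) <= MIN_DUP_LEN or pair.identity <= MIN_IDENTITY:
        return False
    # Order the two copies along the chromosome and measure the gap between them.
    (_, first_end), (second_start, _) = sorted(
        [(pair.start_a, pair.end_a), (pair.start_b, pair.end_b)]
    )
    spacer = second_start - first_end  # negative if the copies overlap
    return MIN_SPACER <= spacer <= MAX_SPACER


# Usage with made-up coordinates: two 50-kb copies, 98% identical, 1.45 Mb apart.
example = DupPair("chr1", 1_000_000, 1_050_000, "chr1", 2_500_000, 2_550_000, 0.98)
print(is_candidate_hotspot(example))
```

In practice, the input would come from a genome-wide segmental duplication annotation, and overlapping candidate intervals would need to be merged into discrete regions before counting hotspots.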
Mental retardation syndromes
The majority of the genomic disorders identified before 2006 were characterized by developmental delay, learning disability, and/or MR. Interestingly, the genetic basis for MR is still unknown in well over 50% of clinical cases. Therefore, many studies have been aimed at identifying submicroscopic copy number changes in this population [14•, 15, 16, 17, 18•, 19], and it is now estimated that large microdeletions and microduplications underlie >15% of MR. We note that many potential pathogenic
Non-MR genomic disorders
Although neurocognitive and neurobehavioral diseases appear to be enriched for genomic disorders, this may simply be a result of ascertainment bias. Recent investigations of other diseases suggest that recurrent genomic rearrangements also underlie some disorders that do not include cognitive deficits as a primary phenotype. Array CGH studies of individuals with thrombocytopenia-absent radius (TAR) syndrome found that 30/30 affected probands shared a ∼500-kb deletion on chromosome 1q21.1 [27•].
Genomic disorders defying syndromic classification
One of the most intriguing developments over the past two years has been the discovery of at least three new recurrent microdeletions that are enriched in multiple neuropsychiatric diseases but elude syndromic classification. Although each microdeletion was first identified in a series of individuals with similar phenotypes, the application of whole-genome copy number variation analysis to a wider range of neurocognitive disorders has revealed unprecedented phenotypic diversity.
Genomic hotspot model of common and rare disease
A slight majority of the rearrangements that have been shown to be disease-causing are mediated by segmental duplications. This is simply a consequence of the fact that duplicated sequences promote recurrent rearrangements (Figure 3), so far fewer patients and controls need to be tested to prove pathogenicity than for large copy number variants (CNVs) not flanked by segmental duplications. The wide range of phenotypes associated with rearrangements of 16p11.2, 1q21.1, and
Future directions and conclusions
As we forge ahead in this ‘genotype first’ era of rapid CNV discovery, we should anticipate the need to screen large disease cohorts (10 000–50 000 affected individuals) in order to assess the pathogenicity of other rare CNVs, especially those not flanked by segmental duplications. Some of these numbers may be achieved by leveraging CNV datasets from seemingly disparate disease cohorts (e.g. autism, MR, schizophrenia, and epilepsy). Until such large supracollaborations are established, targeting
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgments
We thank Ginger Cheng for assistance in preparation of Figure 1. We apologize to our colleagues whom we could not cite due to the limited number of allowed references. Dr. Eichler is an investigator of the Howard Hughes Medical Institute and is supported in part by NIH grant HD043569. Dr. Mefford is supported in part by NIH grant HD043376.
References (58)
Genomic disorders: structural features of the genome can lead to DNA rearrangements and human disease traits. Trends Genet (1998)
DNA deletion associated with hereditary neuropathy with liability to pressure palsies. Cell (1993)
Chromosome breakage in the Prader–Willi and Angelman syndromes involves recombination between large, transcribed repeats at proximal and distal breakpoints. Am J Hum Genet (1999)
A common molecular basis for rearrangement disorders on chromosome 22q11. Hum Mol Genet (1999)
Molecular definition of the chromosome 7 deletion in Williams syndrome and parent-of-origin effects on growth. Am J Hum Genet (1996)
Targeted genomic microarray analysis for identification of chromosome abnormalities in 1500 consecutive clinical cases. J Pediatr (2006)
Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nat Genet (2006)
Recurrent 16p11.2 microdeletions in autism. Hum Mol Genet (2008)
High frequency of submicroscopic genomic aberrations detected by tiling path array comparative genome hybridisation in patients with isolated congenital heart disease. J Med Genet (2008)
Recurrent reciprocal genomic rearrangements of 17q12 are associated with renal disease, diabetes, and epilepsy. Am J Hum Genet (2007)
Clinical and molecular delineation of the 17q21.31 microdeletion syndrome. J Med Genet
Evolutionary toggling of the MAPT 17q21.31 inversion region. Nat Genet
3q29 microdeletion syndrome: clinical and molecular characterization of a new syndrome. Am J Hum Genet
Recurrent reciprocal 1q21.1 deletions and duplications associated with microcephaly or macrocephaly and developmental and behavioral abnormalities. Nat Genet
Recurrent rearrangements of chromosome 1q21.1 and variable pediatric phenotypes. N Engl J Med
A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures. Nat Genet
Microdeletion/duplication at 15q13.2q13.3 among individuals with features of autism and other neuropsychiatric disorders. J Med Genet
Clinical and cytogenetic survey of 39 individuals with Prader–Labhart–Willi syndrome. Am J Med Genet
Interstitial deletion of (17)(p11.2p11.2) in nine patients. Am J Med Genet
DNA duplication associated with Charcot–Marie–Tooth disease type 1A. Cell
Inter- and intrachromosomal rearrangements are both involved in the origin of 15q11–q13 deletions in Prader–Willi syndrome. Am J Hum Genet
Homologous recombination of a flanking repeat gene cluster is a mechanism for a common contiguous gene deletion syndrome. Nat Genet
Fifty microdeletions among 112 cases of Sotos syndrome: low copy repeats possibly mediate the common deletion. Hum Mutat
De novo and inherited deletions of the 5q13 region in spinal muscular atrophies. Science
Characterization of the NPHP1 locus: mutational mechanism involved in deletions in familial juvenile nephronophthisis. Am J Hum Genet
Diagnostic genome profiling in mental retardation. Am J Hum Genet
A new chromosome 17q21.31 microdeletion syndrome associated with a common inversion polymorphism. Nat Genet
Array CGH in patients with learning disability (mental retardation) and congenital anomalies: updated systematic review and meta-analysis of 19 studies and 13,926 subjects. Genet Med