Background: Friedreich ataxia (FRDA) is typically caused by inheriting an expanded GAA triplet repeat (100–1500 triplets) in intron 1 of the FXN gene from both parents. Using long-read whole-genome sequencing, we recently identified long tracts of GGA triplets within the expanded GAA sequence, constituting a novel class of pathogenic expanded composite repeat in FRDA. Most pathogenic composite alleles are missed by standard PCR-based diagnostic testing for FRDA, leading to mis-genotyping of patients and false-negative heterozygous carrier determination.
Objectives: To determine the true prevalence and variety of pathogenic composite alleles in FRDA by characterizing the precise sequence of the expanded alleles in a large prospective series of patients. To overcome the expense and unwieldy nature of long-read whole-genome sequencing by developing a robust, logistically feasible, and cost-effective testing strategy that accurately detects and characterizes these missing pathogenic alleles in FRDA.
Results: We developed an optimized workflow of long-range PCR plus long-read deep sequencing (Oxford Nanopore; PromethION) of repeat-containing amplicons to detect and sequence all expanded composite alleles. This permitted accurate genotyping and heterozygous carrier identification. In a prospective series of 112 unrelated patients, we found that ~20% of people with FRDA have at least one pathogenic expanded composite allele, most of which had been missed by commercial testing. Among the variety of composite alleles observed, the most common non-GAA repeat sequences included tandem GGA and GAAGGA repeats. Other minor sequence interruptions in the expanded GAA repeat were detected in a further ~10% of patients.
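As an illustration only, the following minimal Python sketch shows one way the triplet composition of a sequenced repeat tract could be summarized to surface non-GAA content such as tandem GGA runs. It is not the analysis pipeline used in this study; the function names (`triplet_composition`, `longest_triplet_run`, `summarize_tract`) are hypothetical, the tract is assumed to be already extracted from a read and in frame with the GAA repeat, and sequencing errors are ignored.

```python
from collections import Counter


def triplet_composition(tract: str) -> Counter:
    """Count in-frame triplets in a repeat tract (a trailing partial triplet is ignored)."""
    tract = tract.upper()
    usable = len(tract) - len(tract) % 3
    return Counter(tract[i:i + 3] for i in range(0, usable, 3))


def longest_triplet_run(tract: str, triplet: str) -> int:
    """Return the longest uninterrupted in-frame run of `triplet`, in triplet units."""
    tract = tract.upper()
    triplet = triplet.upper()
    best = current = 0
    for i in range(0, len(tract) - 2, 3):
        if tract[i:i + 3] == triplet:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def summarize_tract(tract: str) -> dict:
    """Summarize GAA versus non-GAA content of a tract (hypothetical helper)."""
    counts = triplet_composition(tract)
    total = sum(counts.values())
    gaa = counts.get("GAA", 0)
    return {
        "triplets": total,
        "gaa": gaa,
        "non_gaa_fraction": (total - gaa) / total if total else 0.0,
        "longest_gga_run": longest_triplet_run(tract, "GGA"),
    }


if __name__ == "__main__":
    # Toy tract: 5 GAA, then 3 GGA, then 2 GAA (far shorter than a real expansion).
    example = "GAA" * 5 + "GGA" * 3 + "GAA" * 2
    print(summarize_tract(example))
```

A summary of this kind only describes a single tract; deciding whether an allele counts as a pathogenic composite repeat would need criteria that this abstract does not specify.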
Conclusions: Expanded FXN alleles with substantial non-GAA interruptions are pathogenic and prevalent in FRDA. Approximately 15% of FRDA patients have one such pathogenic allele that has been (and continues to be) missed by standard genetic diagnostic testing. A more accurate diagnostic testing strategy is needed for reliable genotyping. We describe a robust and cost-effective workflow.