Background: The muscular dystrophies are inherited, progressive muscle disorders characterized by destruction and wasting of muscle tissue. Standard variant analysis of next-generation sequencing (NGS) data is commonly used to discover pathogenic mutations underlying rare genetic disorders. NGS studies of limb-girdle muscular dystrophies (LGMDs) typically identify pathogenic mutations in approximately 40% of cases. We developed a bioinformatics pipeline that screens existing NGS data for potentially aberrant novel essential splice sites (PANESS), which current standard variant analyses do not identify.
Objective: We expanded the initial pilot study of the PANESS pipeline to evaluate NGS data from a total of 25 families that remained unsolved after standard variant analysis.
Results: In the pilot study of three families, the PANESS pipeline identified a homozygous ATP2A1 variant (NC_000016.9:g.28905928G>A; NM_004320.4:c.1287G>A:p.(Glu429=)) that was predicted to cause skipping of exon 11. In the expanded study of 22 additional families, the pipeline identified an average of 315 PANESS variants per family. We selected a total of 50 variants in 20 families for initial confirmatory studies. Of these, four PANESS variants are potentially pathogenic, and in silico and functional analyses of these variants are underway. One of the four is particularly notable. A heterozygous, in-frame deletion (ClinVar: Pathogenic) in exon 4 of CAPN3 had previously been identified in both the proband and the unaffected father. The PANESS variant in this family is a maternally inherited, heterozygous, intronic variant that is predicted to cause inclusion of 19 bases, resulting in a frameshift and a premature stop codon. Additional PANESS variants may be selected for further study in families in which a potentially pathogenic mutation has not yet been identified.
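The frameshift prediction for the CAPN3 intronic variant follows from simple reading-frame arithmetic: 19 is not a multiple of 3, so including 19 extra bases shifts the downstream codons. The minimal sketch below illustrates that check only. The function name `frame_effect` is hypothetical and is not part of the PANESS pipeline, and locating the resulting premature stop would require the actual transcript sequence, which is not given here.

```python
def frame_effect(inserted_bases: int) -> str:
    """Classify the reading-frame effect of including extra bases in a transcript.

    Hypothetical helper for illustration; not taken from the PANESS pipeline.
    """
    if inserted_bases < 0:
        raise ValueError("inserted_bases must be non-negative")
    # Codons are 3 bases long, so only multiples of 3 preserve the frame.
    return "in-frame" if inserted_bases % 3 == 0 else "frameshift"


# The intronic CAPN3 variant is predicted to include 19 bases (19 % 3 == 1).
print(frame_effect(19))
```

An inclusion of 18 or 21 bases would instead be classified as in-frame by the same check.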
Conclusions: This study demonstrates the utility of the PANESS pipeline for detecting cryptic mutations in existing NGS data across a broad range of phenotypes and inheritance patterns. The list of PANESS variants and affected genes can be used in pathway, gene ontology, and other in silico analyses to prioritize candidate variants before laboratory characterization of their biological effects. The PANESS pipeline increased the yield of standard NGS studies in this cohort, identifying potentially pathogenic splice variants in 20% of the previously unsolved cases evaluated to date.