Preclinical efficacy evaluation in mouse models of human diseases is an important component of drug development. The quality and reproducibility of these evaluations depend on the thoroughness of the preclinical study, including the design, execution, analysis, and reporting of the data. Regulatory agencies such as the US FDA and EMA require robust and reliable data from animal models. The most frequent reasons for trial failure may include 1) lack of rigor in preclinical trial design, 2) poor predictive power of disease models, 3) questionable targets, 4) poor control for potential bias, or 5) variable reporting standards. In the past five years, we have evaluated over 144 interventions in mouse models of Duchenne Muscular Dystrophy and Limb Girdle Muscular Dystrophy 2B using a well-characterized set of functional tests (grip strength measurement (GSM) and in vitro force contraction) and various histological analyses. These studies were conducted with more than 3,000 mice, using preclinical standard operating procedures and appropriate blinding, randomization, and statistical analysis. As a result of this rigorous practice, reproducible data have been achieved year to year in outcomes such as body weight, GSM, and in vitro force. Many of these outcomes were measured by more than one experimenter yet achieved a tight coefficient of variation (%CV). Analysis of data gathered from 2014 to 2019 showed that body weight had a %CV of 10% for wild type (WT) and 9% for MDX mice (n = 243), whereas GSM, an experimenter-dependent technique performed here by more than one experimenter, showed a %CV of 5.9% for WT and 8% for MDX (n = 204). In vitro force measurement, a gold standard technique, also showed a tight %CV of 10% for WT and 8.8% for MDX (n = 202). These results show that, when well-standardized methods are followed, even the most variable in-life functional and behavioral assays can yield acceptable reproducibility and robustness.
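For readers unfamiliar with the metric, %CV is conventionally the standard deviation expressed as a percentage of the mean. The sketch below illustrates that calculation only. The abstract does not say whether the sample or population standard deviation was used, so the sample standard deviation is an assumption here. The function name `percent_cv` and the example weights are hypothetical and are not study data.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Return the coefficient of variation as a percentage: 100 * SD / mean.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; %CV is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical body weights in grams, for illustration only (not study data).
example_weights = [28.1, 30.4, 26.9, 31.2, 29.5, 27.8]
print(f"%CV = {percent_cv(example_weights):.1f}%")
```

Because %CV is normalized by the mean, it allows variability to be compared across outcomes measured in different units, such as body weight, grip strength, and in vitro force.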