Sovaris Aerospace, MetaboLogics, and the Infectious Disease Research Center at Colorado State University recently co-sponsored a presentation by David Broadhurst, Ph.D., entitled “Better Data by Design: The Complex World of Post-Genomic Clinical Biomarker Discovery.”

Many post-genomic clinical studies, including metabolomics, transcriptomics, proteomics, and other high-content or high-throughput ‘omic experiments, are set up with the primary aim of discovering biomarkers that can discriminate, with a given level of certainty, between nominally matched ‘case’ and ‘control’ samples. However, it is unfortunately very easy to find markers that are apparently persuasive but are in fact entirely spurious. Easily avoidable bad practice in experiment design and execution can occur at many levels prior to statistical modeling, including bias in patient selection, inconsistent bio-banking, poor choice of clinical endpoint, lack of understanding of the limitations of analytical platforms, inadequate sample size, inappropriate choice of statistical methods, inadequate model assessment, and minimal or ineffective lab-based quality assurance protocols. Many studies fail to take these key factors into account and thereby fail to discover anything of true significance or, more seriously, report spurious findings that prove impossible to validate. This presentation summarizes these problems from a practical perspective and provides pointers to assist in the improved design and evaluation of biomarker discovery experiments, with an emphasis on clinical metabolomics studies.

Dr. Broadhurst is Assistant Professor of Biostatistics in the Department of Medicine, University of Alberta, Canada. Prior to moving to the U of A, Dr. Broadhurst was part of Dr. Douglas Kell’s Bioanalytical Sciences Group at the University of Manchester, where he was lead biostatistician for the Human Serum Metabolome Project (HUSERMET). In particular, he helped advance the use of untargeted metabolic profiling in understanding human pathology, with a focus on GC-MS and LC-MS platforms. This work included developing and validating bio-analytical methodologies, developing strict biobanking and experimental design protocols, and promoting the need for extremely rigorous statistical analysis.