Machine Learning Applied to Precision Medicine & Human Factors

Machine learning (ML) is a subset of artificial intelligence. It is rooted in statistical models and algorithms that rely on patterns and inference, rather than explicit instructions, to perform a specific task. Typically, ML uses sample data (known as training data or training sets) to construct a mathematical model that is then used to make predictions or decisions. In general, the process governing these “decisions” (and predictions) does not have to be explicitly programmed for the task in question. As such, machine learning can be viewed as a set of techniques that enable artificial intelligence. At Sovaris Aerospace, we use machine learning as one tool to more fully understand the complex dynamics of molecular data, such as the genome, epigenome, transcriptome, proteome, metabolome, microbiome, and others.

We also work to integrate this type of data with physiologic and other phenotypic data. In some cases, the phenotype data may be derived from performance metrics, such as performance in space, performance at altitude, or the performance of a professional athlete. In other cases, the phenotype data may be derived from a clinical condition, such as concussion or cognitive decline. The statistical approaches we apply combine univariate and multivariate analysis, which include unsupervised and supervised methods. Sovaris Aerospace utilizes the various complex tools of machine learning, but recognizes the importance of keeping humans constantly in the loop of interpretation, optimization, and refinement. This becomes especially important when interventions derived from such data are being designed or recommended.

Among the machine learning tools employed by Sovaris are artificial neural networks. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns in data. Their uses include classification (assigning data to predefined classes), clustering (grouping data into categories that are not defined in advance), and prediction (using past information to predict future outcomes). There are the more commonplace single-hidden-layer neural networks and the more complex deep-learning networks. Deep-learning networks are distinguished from single-hidden-layer neural networks by their depth, which means that data must pass through a greater number of node layers in a multi-step process of pattern recognition. Generally, more than three layers (including input and output) qualifies as “deep” learning.
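The difference between classification and clustering can be sketched in a few lines of Python. This is a minimal illustration and not a description of the methods Sovaris uses: the nearest-centroid classifier, the simple k-means loop, the function names, and the two-feature sample values are all assumptions chosen for clarity.

```python
from math import dist

def nearest(point, centers):
    """Return the index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def classify(point, labeled):
    """Classification: the classes are predefined by labeled examples."""
    labels = sorted(labeled)
    centroids = [mean(labeled[name]) for name in labels]
    return labels[nearest(point, centroids)]

def cluster(points, k, rounds=10):
    """Clustering (simple k-means): the groups are discovered, not predefined."""
    centers = list(points[:k])  # naive, deterministic initialization
    for _ in range(rounds):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the previous center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [nearest(p, centers) for p in points]

# Hypothetical two-feature samples (illustrative values only).
labeled = {"low": [(1.0, 1.2), (0.8, 1.0)], "high": [(4.0, 4.2), (4.4, 3.9)]}
unlabeled = [(1.1, 0.9), (4.1, 4.0), (0.9, 1.1), (3.9, 4.3)]

predicted_class = classify((4.2, 4.1), labeled)  # uses known class names
cluster_ids = cluster(unlabeled, k=2)            # uses no class names at all
```

The classifier can only return one of the class names it was given, whereas the clustering routine returns arbitrary group numbers whose meaning still has to be interpreted.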

 

Figure 1  General Structure of an Artificial Neural Network. This represents a simple ANN, described as a Feedforward Artificial Neural Network.  Inputs enter the network. The coefficients, or weights, map that input to a set of estimates that the network makes at the output layer. 
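The mapping from inputs to output estimates described in Figure 1 can be sketched as a short forward pass. This is a minimal illustration under stated assumptions: the sigmoid activation, the layer sizes, and all weight and bias values are hypothetical and are not taken from the figure.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per node, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def feedforward(inputs, layers):
    """Pass the inputs through each (weights, biases) pair in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
network = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                             # output layer
]
estimate = feedforward([1.0, 0.5, -1.0], network)
```

Each row of a weight matrix holds the coefficients for one node, so adding a further `(weights, biases)` pair to `network` adds one more layer for the data to pass through.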

 

Figure 2  The input layer consists of 1) essential inputs, 2) conditionally-essential inputs, 3) non-essential inputs, or 4) a combination of one or more of the preceding. Examples of essential inputs include essential nutrients, which the human body cannot synthesize and which must therefore be obtained from the diet (e.g. methylcobalamin, essential amino acids, essential fatty acids). Examples of conditionally-essential inputs include those that can be synthesized by the human body, though their synthesis can be impaired by a deficit of a limiting precursor (e.g. alpha-linolenic acid to docosahexaenoic acid; cysteine to glutathione). In such cases the end product is not classically considered essential, but a lack of sufficient precursor may make it necessary to provide the end product in its preformed state. Examples of non-essential inputs include drugs (e.g. promethazine) or chemicals (e.g. toluene). In dynamic systems, multiple inputs converge to influence the phenotype.

 

Understanding Biological Meaning

One of the crucial features of our work is the determination of biological meaning from complex data sets.  Complex data analytics generate a wide array of patterns that are produced from molecular data, physiologic data, environmental data, anthropometric data, geospatial data, disease diagnostic data, and many others.  It is imperative that any machine learning application be able to accurately identify patterns of variance and characterize those features that account for such variance.  Identification of these features is followed by the application of pathway analysis and network analysis, which allows us to better describe the biological meaning of a given set of findings.  It is from this assessment of biological meaning that 1) new studies can be designed, 2) new tools of assessment can be envisioned, and 3) novel treatments or countermeasures can be conceived.
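One common way to identify the features that account for variance is principal component analysis (PCA). The sketch below is an assumption made for illustration, not a statement of the specific methods used by Sovaris: it computes the first principal component by power iteration and ranks features by their loadings. The feature names and measurements are hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of rows (one row per sample)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Power iteration: dominant eigenvector and eigenvalue of cov."""
    d = len(cov)
    v = [1.0 / d ** 0.5] * d  # unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

# Hypothetical measurements: one row per sample, one column per feature.
names = ["metabolite_a", "metabolite_b", "metabolite_c"]
rows = [
    (2.0, 1.0, 0.50),
    (4.0, 2.1, 0.52),
    (6.0, 2.9, 0.49),
    (8.0, 4.2, 0.51),
    (10.0, 5.0, 0.50),
]

cov = covariance_matrix(rows)
loadings, eigenvalue = first_component(cov)

# Share of total variance captured by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Features ordered by how strongly they drive that component.
ranked = sorted(names, key=lambda n: abs(loadings[names.index(n)]), reverse=True)
```

A ranking of this kind only flags candidate features. Judging what they mean biologically is the job of the pathway and network analysis that follows. In practice, features measured on different scales are usually standardized before PCA, a step omitted here for brevity.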

This fundamentally requires a human in the loop as part of an iterative process, at least until the data inputs to such a system are sufficiently complete, and the algorithms that process them sufficiently thorough, to provide automated recommendations from the analysis.

 

Disease Diagnosis or Pattern Analysis?

When used in medicine, the terms machine learning, deep learning, and AI are often associated with disease diagnosis and the prescription of medical treatment (drugs, surgery, etc.). Sovaris Aerospace operates in the field of Precision Medicine. However, our focus is less on the diagnosis of disease and more on pattern analysis that can be leveraged to develop tools that facilitate resiliency at the system level. From this complex pattern analysis will emerge precision recommendations aimed at furthering the health and performance of individuals in the extreme environments in which we operate. Even when we examine the patterns in a person with a diagnosis (or disease), our focus is on identifying patterns of variance through which metabolic resiliency can be optimized in the context of that diagnosis.

 

Algorithms: Novel vs. State-of-the-Art Methods

Those who work in the fields of machine learning and artificial intelligence have to balance the novelty of their tests against the current best practices (state-of-the-art) in the appropriate scientific fields. Novelty is what gives a company a perceived advantage in the marketplace, while using current best practices generates confidence and a sense of reliability. First, Sovaris Aerospace and its teams apply the reliable and repeatable methods known to the fields of multivariate analysis and machine learning. Second, we have developed novel insights and capabilities through work under complex field and test conditions, in studies ranging from small (N=2) to large (N > 5 million) populations. These include, but are not limited to, large-scale medical studies, military Special Forces, human spaceflight (low Earth orbit), suborbital space flight, rare diseases, and others.

 

Standards

Our team members also sit on national and international standards committees, which set quality control and quality assurance standards for multi-scale omics, machine learning, bioinformatics, and precision medicine.  Not only are we kept abreast of current best practices through interaction with such expert committees, but our team members are also actively involved in recommending standards for such best practices in their respective fields.

While these standards are always evolving, it is these standards that establish a core of best practices across these very complex disciplines.  We believe that it is important that the corporate teams who build molecular and informatics tools for commercial use have experience in the related professional societies and standard-setting organizations.  As such, Sovaris Aerospace team members have roles in a range of standards committees that establish best practices in the fields of multi-scale omics, bioinformatics, machine learning, health intelligence, and human safety.