Categorizing evolutionary processes using statistical learning techniques and a modification of Tajima's D statistic
Ian P. Kotthoff* and Nathan B. Wikle
Dr. Anton Weisstein and Mr. Pamela J. Ryan, Faculty Mentors
Tajima's D is a statistical test used to infer the evolutionary processes acting in a population from patterns of genetic diversity. It can be used to distinguish among neutral evolution, population subdivision, negative frequency-dependent selection, and purifying selection. Because different evolutionary mechanisms can produce similar values of Tajima's D, additional summary statistics and statistical learning techniques are needed to build an accurate predictive model. This study explores the use of Tajima's Dsyn and Dnon statistics as predictor variables for distinguishing evolutionary mechanisms, with particular emphasis on their relevance to random forests. It seeks to demonstrate that random forests can yield an accurate predictive model for observations influenced by population subdivision, purifying selection, and negative frequency-dependent selection. Further improvements are needed to distinguish between the neutral evolutionary model and a population bottleneck.
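For readers unfamiliar with the statistic, the following is a minimal sketch of the standard Tajima's D calculation, which compares the mean number of pairwise differences with the number of segregating sites scaled by a sample-size constant. It is not the authors' implementation. The function name `tajimas_d`, the string-based alignment input, and the choice to return 0.0 when no sites vary are all assumptions made for illustration. The abstract does not define Dsyn and Dnon; if they are D computed on synonymous and nonsynonymous sites separately (an assumption), the same function could be applied to each subset of alignment columns.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (strings).

    Hypothetical helper for illustration only.
    """
    n = len(sequences)
    if n < 4:
        # The variance term of the statistic is zero for n < 4.
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    s_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s_sites == 0:
        # D is undefined without variation; returning 0.0 is a convention
        # chosen for this sketch.
        return 0.0

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Sample-size constants from the standard definition of the statistic.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D = (pi - S / a1) / sqrt(estimated variance of the numerator).
    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (pi - s_sites / a1) / sqrt(variance)
```

In this sketch, a sample dominated by rare variants gives a negative D and a sample dominated by intermediate-frequency variants gives a positive D, which is the kind of signal a classifier such as a random forest could take as a predictor.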
Keywords: Mathematical Biology, Mathematical modeling, Evolution, Tajima's D, Random forests, Data mining, Genetics, Predictive modeling
Topic(s): Mathematical Biology
Presentation Type: Oral Paper
Session: 112-3
Location: MG 1096
Time: 8:30