DATA SCIENCE WEEK

Department of Mathematical Sciences, Purdue University Fort Wayne

December 4 – 6, 2019

Advances in technology, and specifically the availability of greater-than-ever computational power, have in recent years created the conditions necessary for the birth of the new research area of Data Science.
Data Science is an interdisciplinary field which joins together ideas and tools from Mathematics, Statistics, Computer Science, and Software Engineering, among others, and which bases its most important conclusions on evidence coming from data.
Data Science has revolutionized research methodology in health science, biology, economics, social sciences, climatology, and several other social and hard sciences. The results produced by this new paradigm have been extraordinary and can be seen in our everyday life. Personalized medicine, self-driving cars, fingerprint and image recognition, and GPS navigation are all products of the Data Science revolution.
These exciting results and the remarkably effective techniques of Data Science are largely accessible to non-experts, undergraduate and graduate students, and practitioners. The field also benefits from the interaction of researchers from academia, industry, government organizations, and local communities.
The Data Science Week is intended to bring together the different communities in the metropolitan Fort Wayne region that work in Data Science and to show our students the important role that Data Science plays in everyday life and research. The Data Science Week will be the perfect occasion to launch a new inter-departmental Data Science Institute at Purdue University Fort Wayne.
Registration
Registration is free. Please register at https://purdue.ca1.qualtrics.com/jfe/form/SV_cA3qCmDpeMvovpr no later than December 2, 2019. If you have questions, please email Dr. Alessandro Selvitella (aselvite@pfw.edu) or Dr. Yihao Deng (dengy@pfw.edu). We look forward to seeing you at the Data Science Week!
Wednesday - December 4, 2019
5:00PM – 5:15PM: Launch of Data Science Cluster of Scholars (Kettler Hall 216)
5:15PM: Movie and Pizza (Kettler Hall 216)
Thursday - December 5, 2019
8:30AM – 8:55AM: Coffee
9:00AM – 9:25AM: Faculty Presentation – Dr. Chand Chauhan (Kettler Hall 216)
9:30AM – 9:55AM: Faculty Presentation – Dr. Yihao Deng (Kettler Hall 216)
10:00AM – 10:25AM: Faculty Presentation – Dr. Zesheng Chen (Kettler Hall 216)
10:30AM – 10:55AM: Coffee Break
11:00AM – 11:25AM: Faculty Presentation – Dr. Bin Chen (Kettler Hall 216)
11:30AM – 11:55AM: Faculty Presentation – Dr. Kathleen Foster (Kettler Hall 216)
12:00PM – 12:25PM: Faculty Presentation – Dr. Alessandro Selvitella (Kettler Hall 216)
12:30PM – 2:30PM: Lunch Break
2:30PM – 2:55PM: Coffee, Tea and Pastries (Walb Union G21)
3:00PM – 4:30PM: Keynote Speech – Dr. Bo Li (Walb Union G21)
5:00PM – 7:00PM: Cash Bar and Reception (Walb Union International Ball Room North)
Friday - December 6, 2019
8:30AM – 8:55AM: Coffee
9:00AM – 9:20AM: Student Team Presentation – Justin Asher, Khoa Tan Dang, Peter Klopfenstein, Maxwell Masters, Jucoen Yeater (Kettler Hall 146)
9:20AM – 9:40AM: Graduate Student Presentation – Xiao Yuan (Kettler Hall 146)
9:40AM – 10:00AM: Student Team Presentation – Nguyen Nguyen, Xiao Yuan, Linh Le, Yanxuan (Anna) Liang, Hang Fey (Chris) Chan (Kettler Hall 146)
10:00AM – 10:25AM: Coffee Break
10:30AM – 11:50AM: Student Poster Presentation (Kettler Hall 2nd Floor)
12:00PM – 1:25PM: Panel Discussion (Kettler Hall 146)
1:30PM: Student Award and Closure
Keynote Speech
Climate, Health, and Statistics
Abstract
I will give a brief overview of my research followed by three detailed examples. As my first example, I describe our progress on the reconstruction of paleoclimate using different data sources. We have made a number of methodological developments that have enhanced our understanding of both climate reconstruction and its uncertainty quantification via Bayesian hierarchical models. My second example addresses one-year-ahead prediction of US county-specific HIV incidence rates using publicly available data that are abundant in space but scarce in time. We developed new spatially varying autoregressive models combined with conditional autoregressive spatial correlation structures and compared their predictive ability to that of a number of linear mixed models. Motivated by my climate studies, the last example concerns new statistical methods for comparing two spatio-temporal random fields. We propose a multiple testing procedure for detecting local differences in the characteristics of two spatio-temporal random fields by explicitly taking spatial information into account. Finally, I will briefly discuss some future research interests.
December 5, 2019 @ 3:00PM
Faculty Presentations
A New Percentile Based Test
Abstract
Drawing inferences, such as hypothesis tests, with reasonable accuracy from extremely limited data is a challenge. In many situations, only two sample percentile values of a response are available, with no control over which pair of percentiles is provided. In such situations, one may want to estimate other pertinent information about the population, such as the mean and standard deviation, and eventually derive a test of the mean based on the two available sample percentiles. We propose a test for the mean of a normal distribution based on two sample percentile values. The exact and asymptotic distributions of the proposed test statistic are investigated. The power and the robustness of the proposed test are under review.
December 5, 2019 @ 9:00AM
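As a rough illustration of the estimation step mentioned in the abstract above, the sketch below recovers the mean and standard deviation of a normal distribution from two percentile values by solving x_i = mu + sigma * z_i, where z_i is the standard normal quantile of the i-th percentile level. This is only a minimal sketch under the assumption of exact normality; the function name is hypothetical, and it is not the test statistic proposed in the talk.

```python
from statistics import NormalDist


def normal_params_from_percentiles(p1, x1, p2, x2):
    """Hypothetical helper: solve x_i = mu + sigma * z_i for (mu, sigma).

    p1, p2 are percentile levels in (0, 1); x1, x2 are the observed
    percentile values. Assumes the data are normally distributed.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("percentile levels must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("two distinct percentile levels are required")

    std_normal = NormalDist()        # mean 0, standard deviation 1
    z1 = std_normal.inv_cdf(p1)      # standard normal quantile at level p1
    z2 = std_normal.inv_cdf(p2)      # standard normal quantile at level p2

    sigma = (x2 - x1) / (z2 - z1)    # slope of the line through both points
    mu = x1 - sigma * z1             # intercept, i.e. the value at z = 0
    return mu, sigma


# Usage: build percentiles from a known normal distribution, then recover it.
true_dist = NormalDist(mu=10.0, sigma=2.0)
x25 = true_dist.inv_cdf(0.25)
x90 = true_dist.inv_cdf(0.90)
mu_hat, sigma_hat = normal_params_from_percentiles(0.25, x25, 0.90, x90)
```

With real data the two values would be sample percentiles, so the recovered mean and standard deviation are estimates subject to sampling error.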
Copula Models for Dependent Data Analysis
Abstract
Dependent data appear naturally in statistical analysis. In this presentation, I will introduce a powerful yet flexible tool, namely copula models, to analyze dependent data. Some basic mathematical background and development of copulas will be discussed, and a real-life example using family data will be presented.
December 5, 2019 @ 9:30AM
Detect M Giants in Space Using XGBoost
Abstract
In optical bands, the spectra of M giants often overlap with those of M dwarfs due to their similarities, especially for low or moderate resolution spectra. In this work, we use a machine learning method, eXtreme Gradient Boosting (XGBoost), to discern M giants from M dwarfs for spectroscopic surveys. We found that our XGBoost prediction model achieves 99.79% overall accuracy and 96.87% recognition precision for M giants, outperforming the other three popular machine learning algorithms (i.e., SVM, random forests, and ELM). Moreover, the important feature bands for distinguishing between M giants and M dwarfs are accurately identified by the XGBoost method through evaluating and quantifying the importance of each feature in spectra. This research is joint work with Dr. Zhenping Yi from Shandong University, Weihai.
December 5, 2019 @ 10:00AM
Deep Convolutional Neural Network and its Applications in Image Processing
Abstract
Deep learning has changed the landscape of artificial intelligence in the past five years. It has been quickly embraced by academia and industry, and applied to areas ranging from image classification and object recognition to creative artwork. The presentation will discuss recent progress in convolutional neural networks, one of the most active areas in deep learning, and their applications in industrial and biomedical image processing and analysis.
December 5, 2019 @ 11:00AM
The Data Science revolution in Biomechanics: traditional statistical tests vs modern machine learning methods in the study of lizard locomotion
Abstract
Extraordinary advancements in computing power have facilitated the development and application of sophisticated statistical analyses to biological fields such as genomics, ecology, and evolution. However, even now, when powerful hardware and software tools have never been more accessible and despite significant advancements in statistical theory, physiological branches of biology, like biomechanics, seem to be stuck in the past, with the ubiquitous and almost exclusive use of classical univariate statistics. In this talk, I will discuss how more modern machine learning methods impact and revolutionize the extraction and analysis of biomechanical data. This will be discussed in the context of lizard locomotion and contrasted with the results of classical univariate analyses performed with a team of undergraduate students from Prof. Selvitella's STAT 340 class at PFW.
December 5, 2019 @ 11:30AM
Fantastic Results of a Term of Graduate and Undergraduate Data Science Research at PFW
Abstract
In this talk, I will present the main content of the research done by the group of students I had the pleasure to supervise in my classes Stat 340 and Stat 512 during Fall 2019. It has been a really exciting term for everybody and the results obtained by the students are excellent. The projects done in both classes have the potential to develop into longer term collaborations with faculty members, local companies, and government organizations and deserve the greatest attention.
December 5, 2019 @ 12:00PM
Student Presentations
Brain Age Prediction using Machine Learning Techniques
Abstract
Can your brain tell us how old you are? In this talk, we will discuss the so-called Brain Age problem and how we addressed this fundamental question in neuroscience using machine learning methods.
December 6, 2019 @ 9:00AM
Detecting Pair-copula Dependence using Convolutional Neural Network
Abstract
Modelling dependence in high-dimensional data is often a challenging task due to the multivariate nature and complex dependence structure of the data. Recently, vine copulas have become popular for modelling such data: they use pair-wise copulas as basic "building blocks" to construct a hierarchical structure that models the likelihood function. Since the pair-copula is the fundamental component of the hierarchy, accurate detection of the pair-wise dependence becomes crucial. In this project, a convolutional neural network is used to recognize the dependence pattern in pair copulas for better accuracy, and the neural network architecture is also investigated for optimal performance.
December 6, 2019 @ 9:20AM
Shiny Apps for Healthy Mom + Baby
Abstract
For the Healthy Mom + Baby Datapalooza Competition, we were challenged to provide analyses and tools to support the Indiana governor's goal of reducing the infant mortality rate. The Mastodon team produced two interactive tools using R and the Shiny platform. One tool helps users interact with multiple data columns and gain insights into counties that need improvement, while the other allows users to import case data to predict the risk of infant mortality.
December 6, 2019 @ 9:40AM
Student Posters
Biology
Board Games
Climatology
Computer Science
Economics
Engineering
Health Sciences
Neuroscience
Social Sciences
Presenter
1. Nguyen Nguyen
2. Xiao Yuan
3. Linh Le
4. Christopher Jacobs
5. Hang Fey Chan
6. Jacob Rife
7. Justin Asher, Khoa Tan Dang, Maxwell Masters, Peter Klopfenstein, Jucoen Yeater
8. Zac Kaiser, Kimberly Flores, Kristina Ponce, Jared Grabau, Amber Powers
9. Kent Sellers
10. Dr. Noor Borbieva
11. Ha Le
12. Rachel Gawsyszawski
13. Josh Cripe, Andy Speck, Frank De La O
14. Maxwell Masters
Poster Presenter
Reception
Gluten Free
Vegetarian
Dr. Peter Dragnev
Dr. Adam Coffman
Dr. Kathleen Foster
Dr. Alessandro Selvitella
Dr. Yihao Deng
Nguyen Nguyen
Xiao Yuan
Yanxuan Liang
Linh Le
Christopher Jacobs
Hang Fey Chan
Jacob Rife
Dr. Amal Khalifa
Dr. Mark Jordan
Dr. Nichaya Suntornpithug
Saranya Suntornpithug
Dr. Chao Chen
Dr. Jack Li
Dr. David Liu
Dr. Behin Elahi
Dr. Todor Cooklev
Dr. Daniel Yorgov
Dr. Zesheng Chen
Justin Asher
Zac Kaiser
Kimberly Flores
Kristina Ponce
Jared Grabau
Amber Powers
Dr. Promothes Saha
Dr. Hongli Luo
Kent Sellers
Dr. Yvonne Zubovic
John Osowski
Dr. Noor Borbieva
Liam Carolan
Ha Le
Dr. Chand Chauhan
Rachel Gawsyszawski
Maxwell Masters
Keenan M.L. Mack
Avdhesh Chandra
Adio Abdoulaye
Peter Klopfenstein
Dr. Alan Legg
Kenda Barnes
Nate Olson
Josh Cripe
Andy Speck
Frank De La O
Hsin-Chieh, Wang
Elliot Nesler
Richard Appiah
Ajibola Fakiyesi
Mohamed Khouya
Justin Fisher
Khoa T Dang
Yifei Pan
Adio Badru
Eric Frempong
Stephen Owusu
Enoch Fedah
Hong Phuc Nguyen
Dr. Alessandro Selvitella
Dept. of Mathematical Sciences
Purdue University Fort Wayne
Dr. Yihao Deng
Dept. of Mathematical Sciences
Purdue University Fort Wayne
Last updated: 12/16/2019