Andrea Simeon (born Mihajlović), BioSense Institute, Novi Sad, Serbia
Abstract: The microbiome has been widely associated with various diseases and disorders. Identifying individual microorganisms and their abundances across samples can involve different sampling, sequencing, and preprocessing techniques, which yield different input feature sets (views) for training predictive machine learning (ML) models. ML models help uncover associations between the microbiome and disease. Standard (single-view) ML models cannot handle multiple views at once, so they have been extended to multi-view datasets (e.g. AdaBoost and Multi-view AdaBoost). Moreover, microbiome data come from various sources, and view incompleteness is often inevitable. Existing classifiers, even multi-view ones, cannot be applied directly because they cannot handle incomplete views together with multi-class settings. To the best of our knowledge, there is no multi-view boosting algorithm for multi-class classification with incomplete views.
The proposed algorithm extends an existing multi-view boosting algorithm based on multi-armed bandits so that it works in the multi-class setting and with incomplete views (views in which some samples have no representation). At each iteration, it selects one view as the winner using adversarial multi-armed bandits and uses that view's predictive information to update the final model weights and predictions in a boosting process. Three data sets were created from several microbiome studies and used to evaluate the proposed algorithm. One experiment showed a 7% increase in F1 score over a single-view classifier, while another showed a 54% increase. The application domain is not restricted to microbiome data; further work will examine other domains.
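The view-selection step can be illustrated with a minimal sketch. The abstract does not name the bandit strategy, so this example assumes EXP3, a standard algorithm for adversarial multi-armed bandits; the function `exp3_select_views`, the parameter `gamma`, and the callback `reward_fn` are hypothetical names introduced only for illustration. In a boosting context the reward would presumably reflect how well the weak learner trained on the chosen view performs, but that mapping is also an assumption here.

```python
import math
import random


def exp3_select_views(reward_fn, n_views, n_rounds, gamma=0.1, seed=0):
    """Pick one view per round with EXP3 (assumed strategy, not the author's code).

    reward_fn(view, round) is a hypothetical callback returning a reward in [0, 1].
    Returns the list of chosen views and the final (normalized) arm weights.
    """
    rng = random.Random(seed)
    weights = [1.0] * n_views
    chosen = []
    for t in range(n_rounds):
        total = sum(weights)
        # Mix the weight-proportional distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / n_views for w in weights]
        view = rng.choices(range(n_views), weights=probs, k=1)[0]
        reward = reward_fn(view, t)
        # Importance-weighted estimate: only the pulled arm is updated.
        estimate = reward / probs[view]
        weights[view] *= math.exp(gamma * estimate / n_views)
        # Rescale so the largest weight is 1.0, which avoids overflow.
        largest = max(weights)
        weights = [w / largest for w in weights]
        chosen.append(view)
    return chosen, weights


if __name__ == "__main__":
    # Toy setting: view 2 is always informative, the other views never are.
    picks, final_weights = exp3_select_views(
        lambda view, t: 1.0 if view == 2 else 0.0, n_views=3, n_rounds=200
    )
    print("times each view won:", [picks.count(v) for v in range(3)])
    print("final weights:", final_weights)
```

In this toy run the consistently rewarded view ends up with the largest weight and is selected most often, which is the behavior a bandit-driven choice of the winning view relies on.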
Biography: Andrea Mihajlović is a Junior Research Assistant at BioSense Institute and a PhD student in Computer Science at the Faculty of Sciences, both at the University of Novi Sad, Serbia, with a background in applied mathematics (BSc) and data science (MSc). She is interested in digitalization and Artificial Intelligence (AI) for improved health and disease status assessment based on diverse omics data. Currently, she is focused on applying AI techniques in microbiome studies and exploring different preprocessing pipelines for analyzing amplicon and shotgun sequence data.