Semantic unification and search of bioinformatics databases

Aleksandar Veljković, University of Belgrade, Faculty of Mathematics, Serbia

Abstract: Associating biological data from different sources provides a holistic view of a domain and enables finding patterns in data that are difficult or impossible to observe when analyzing only isolated biological entities. The key obstacles to creating connections between data objects are the variety of biological data formats, data organization schemas, and data access methods. Connecting data from different databases can be challenging, as an entity in one database may not have the same properties or identifiers as the same entity described in another database. While some databases include entity identifiers from various other databases, search is limited to exact property matching, and complex queries using multiple metadata attributes are not possible. To overcome these issues, a novel data framework, BioGraph, enables linking and retrieving information from heterogeneous, interconnected biological data. The model was tested by generating a knowledge graph from the metadata of five distinct public datasets: DisProt, HGNC, Tantigen 2.0, IEDB, and DisGeNET. The resulting graph interconnects more than 17 million nodes, of which 2.5 million are individual biological entity objects, with over 4 million relationships. The software system allows searching for patterns and retrieving matching results from the knowledge graph through a user-friendly interface. To complement the model, a tool and a web interface were developed. The tool and corresponding packages can be deployed locally as a standalone system, enabling offline execution of queries.

Biography: Aleksandar Veljković is a Ph.D. student and teaching assistant at the Faculty of Mathematics, University of Belgrade. He is a member of the faculty's bioinformatics research group and treasurer of the BIRBI organization. His research interests include data science, bioinformatics, and cryptography.