Ontology Development and Text Mining Solutions

A biomedical publisher enriches its ontology to improve indexing, search and accessibility.

Client Overview

The client was a leading scientific publisher of academic journals, magazines and research databases across specialized domains. The publisher approached Molecular Connections to improve the search and accessibility of research articles in a selection of its biomedical journals.


Through user feedback, the client recognized the need to revamp search and improve the accessibility of research articles across a few of its journals. Molecular Connections regards consistent ontologies as key to transforming semantic technologies and accelerating scientific research. Precise modeling of a complex domain is essential for better search, discovery and accessibility. Ontology development remains a critical factor in enabling semantic applications.

The main challenge in an ontology development project, whether automatic or semi-automatic, is not only developing a comprehensive core but also customizing it for the end application. Molecular Connections proposed transforming the search and retrieval strategy by enriching the client's existing ontology and improving its article tagging systems.


Key Components:

  • A data-driven approach was implemented to develop a dynamic ontology.

  • Specificity and sensitivity of the orthogonal terms were improved by interlinking different concept terms.

  • Definitions and synonyms of the candidate terms were added for better navigation through the ontology and improved search.
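The components above can be pictured with a small data structure. The following is a minimal sketch, not the publisher's or Molecular Connections' actual system: the `Concept` and `Ontology` classes, their field names and the sample medical entries are all hypothetical, chosen only to show how definitions, synonyms and links between concepts can feed a search lookup.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """One ontology entry. All field names are illustrative assumptions."""
    preferred: str
    definition: str = ""
    synonyms: set = field(default_factory=set)
    related: set = field(default_factory=set)  # preferred labels of linked concepts


class Ontology:
    def __init__(self) -> None:
        self.concepts = {}  # preferred label -> Concept
        self._index = {}    # lowercased label or synonym -> preferred label

    def add(self, concept: Concept) -> None:
        self.concepts[concept.preferred] = concept
        for label in {concept.preferred, *concept.synonyms}:
            self._index[label.lower()] = concept.preferred

    def resolve(self, term: str) -> Optional[str]:
        """Map any known label or synonym to its preferred term."""
        return self._index.get(term.strip().lower())

    def expand(self, term: str) -> set:
        """Labels a search could match: the concept's own labels plus linked concepts."""
        preferred = self.resolve(term)
        if preferred is None:
            return set()
        concept = self.concepts[preferred]
        return {concept.preferred, *concept.synonyms, *concept.related}


# Hypothetical sample data, for illustration only.
onto = Ontology()
onto.add(Concept(
    preferred="myocardial infarction",
    definition="Necrosis of heart muscle caused by interrupted blood supply.",
    synonyms={"heart attack", "MI"},
    related={"coronary artery disease"},
))
onto.add(Concept(preferred="coronary artery disease", synonyms={"CAD"}))
```

In this sketch, a query for "Heart Attack" resolves to the preferred term "myocardial infarction", and `expand` returns its synonyms along with the linked concept, which is one simple way interlinked terms can widen or narrow what a search matches.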

The project was carried out in two phases. First, text mining was carried out on a small set of articles to identify candidate terms for the ontology. The core ontology was then built by enriching the publisher’s existing ontology and adding the new candidate terms.
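As a rough illustration of candidate-term identification, the sketch below ranks word n-grams by how many articles they appear in and skips terms the ontology already contains. This is a simple frequency heuristic assumed for illustration, not the text mining system used in the project; the function names, the stopword list and the thresholds are all hypothetical, and in practice such a list would still go to subject matter experts for review.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "in", "and", "a", "an", "to", "is", "was",
             "were", "for", "with", "by", "on"}


def ngrams(tokens, n):
    """Return all contiguous n-word phrases from a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_terms(articles, known_terms, min_docs=2, max_n=3):
    """Rank n-grams by document frequency, skipping terms already in the ontology."""
    known = {t.lower() for t in known_terms}
    doc_freq = Counter()
    for text in articles:
        tokens = re.findall(r"[a-z][a-z\-]+", text.lower())
        seen = set()  # count each phrase once per article
        for n in range(1, max_n + 1):
            for gram in ngrams(tokens, n):
                words = gram.split()
                # Phrases that start or end with a stopword are rarely terms.
                if words[0] in STOPWORDS or words[-1] in STOPWORDS:
                    continue
                seen.add(gram)
        doc_freq.update(seen)
    return [(term, count) for term, count in doc_freq.most_common()
            if count >= min_docs and term not in known]


# Hypothetical sample input.
articles = [
    "Gene expression in tumor cells was measured.",
    "Tumor cells showed altered gene expression.",
    "The assay was repeated.",
]
result = candidate_terms(articles, known_terms={"tumor cells"})
```

Here "gene expression" appears in two articles and is not yet in the ontology, so it is returned as a candidate, while "tumor cells" is filtered out as already known and "assay" falls below the `min_docs` threshold.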

The publisher’s content and other relevant resources were mined, and Molecular Connections' team of subject matter experts (SMEs) identified both the synonyms and the preferred terms encompassing the entire scientific literature. The ontology was enriched from ~700 terms to a set of ~2000 basic terms, and it was standardized extensively against public ontologies for better integration. This work was completed within 30 days. Molecular Connections' ontology management system and state-of-the-art text mining system helped accelerate development.


Molecular Connections worked with the SMEs and the publisher to review and redefine the ontology, which was delivered to the publisher for integration into its platform.

Precise and formal definition of select biomedical concepts

Terms were added to improve indexing and searching of the scientific articles

Molecular Connections worked with the publisher to ensure the development of a complex rule base for indexing the entirety of the publisher's content