A deep indexing approach for better search and retrieval of domain-intensive content

Client:

  • One of the world’s largest authorities on chemical information, delivering the most current, complete, secure, and interlinked datasets for scientific discovery

Business Needs Addressed

  • A deep and rich chemistry database where researchers can search by key concepts and substances, chemical names or structures, reactions, procedures, and/or organic structures
  • Indexed key concepts and substances for non-English journals

Challenges:

Building and updating a research discovery tool with enhanced searchability and discoverability through the key concepts and substances involved in research across many scientific disciplines, including biomedical sciences, chemistry, engineering, materials science, agricultural science, and more. The platform also needed to be easy to use and to power the research process by offering current, high-quality scientific information, content indexed by scientists, and speedy access to more than a century of scientific information.

Additionally, the client needed a rich chemical reaction database searchable by topic, chemical name or structure, structure drawing, and functional group transformation. Researchers need access to relevant substances, reactions, and related information in order to determine what is patented in a particular area, find alternative preparation methods, and decide between purchasing and preparing a substance.

Solution:

Molecular Connections (MC) deployed graduates and postgraduates in chemistry, covering both applied and physical sciences, to frame rules for selecting the keywords that represent the concepts and chemical substances relating to the novelty reported in the scientific journals. Doctorates and subject matter experts in areas of chemistry interpreted and correlated the reaction mechanisms and validated the most appropriate reaction analyses from the research articles.

For processing scientific journals in Asian languages such as Chinese and Japanese, as well as in Russian, MC developed a hybrid multilingual indexing solution that combines technology-enabled automation with statistical methods and Natural Language Processing (NLP) rules. The MC team also developed a knowledge repository of unique concepts that recur frequently in scientific journals. Built up through client feedback and queries, the repository also contains ready solutions for the different variants of reaction procedures that authors describe (polymers, peptides, synthesis, etc.).

MC analysed the unified process and split it into two stages: indexing and reaction analysis. The indexing output became the input for reaction analysis, which was integrated into the workflow platform.

Technology Role in Excerption and Curation

MC automated the process by deploying a robust technology platform with three modules: auto-indexing of substances, writing up of reactions for indexed substances, and structures. The platform auto-indexes key concepts and substances based on the thesaurus developed. It also presents the auto-indexed keywords under the headings Title, Abstract, Tables, and Figures so that they are easy to enhance. Additionally, the platform provides an option to add new keywords to the thesaurus so that it keeps learning. The indexing output serves as the input for reaction analysis, which is integrated into the workflow platform. Frequently used Markush structures and reaction participants such as solvents, catalysts, and reagents are built into the platform as drop-down options.
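The thesaurus-driven step described above can be illustrated with a minimal sketch. It is not MC's platform: the thesaurus entries, the function names `auto_index` and `add_term`, and the plain dictionary of sections are all hypothetical, and a simple case-insensitive phrase match stands in for whatever matching rules the real system applies.

```python
import re
from collections import defaultdict

# Hypothetical thesaurus: surface variant (lowercase) -> preferred index term.
THESAURUS = {
    "suzuki coupling": "Suzuki coupling",
    "suzuki-miyaura coupling": "Suzuki coupling",
    "palladium acetate": "Palladium(II) acetate",
    "thf": "Tetrahydrofuran",
}


def build_pattern(thesaurus):
    """Compile one regex over all variants, longest first so longer phrases win."""
    variants = sorted(thesaurus, key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in variants)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def auto_index(sections, thesaurus):
    """Return {section heading: sorted preferred terms found in that section}."""
    pattern = build_pattern(thesaurus)
    hits = defaultdict(set)
    for heading, text in sections.items():
        for match in pattern.finditer(text):
            hits[heading].add(thesaurus[match.group(1).lower()])
    return {heading: sorted(terms) for heading, terms in hits.items()}


def add_term(thesaurus, variant, preferred):
    """Record a new keyword so that later indexing runs pick it up."""
    thesaurus[variant.lower()] = preferred


# Illustrative article content, grouped by the headings named in the text.
article = {
    "Title": "Suzuki-Miyaura coupling of aryl bromides",
    "Abstract": "The reaction used palladium acetate in THF.",
    "Tables": "Yields obtained with each boronic acid",
}

index = auto_index(article, THESAURUS)
```

In this sketch, a curator who notices a missing concept would call `add_term(THESAURUS, "boronic acid", "Boronic acids")` and re-run `auto_index`, after which the Tables section also yields an index term. Grouping the results by heading mirrors the way the keywords are presented for review.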

 

Distributed Workflow:

To handle the high volume, the MC workflow platform was hosted both in-house and in the cloud, allowing seamless, continuous processing without interrupting production. This helped meet deadlines irrespective of volume.

Benefits:

  • MC processes support the client’s need for a fast-to-market paradigm for content.
  • Amid growing scientific content, the client is able to meet the demand for better precision with enhanced searchability and discoverability through key concepts and substances with reactions and structures.
  • A technology-powered indexing platform with content curated by scientists leads to cost and time savings.




© 2022 Molecular Connections Pvt. Ltd.