Abstraction services for articles, patents, books, and news
Academic and scholarly literature has been growing at an exponential rate. Analysts believe the production of literature will double over the next few years, as the COVID-19 pandemic has necessitated more and more research.
Abstraction & Indexing (A&I) services are the answer to managing this explosion of information. It is now even more vital to incorporate A&I services that improve precision in discipline-specific searching for expert researchers and boost the discoverability of journal articles. Indexes and abstracts add a human-curated layer to what computers can do.
Molecular Connections’ A&I services are technology-enabled and are curated by subject-matter experts who understand the subject area, and the resultant metadata captures the essence of the subject. Our proprietary machine-aided indexing technology delivers high-quality results in record time. Molecular Connections has been involved in abstracting & indexing for over two decades, helping publishers and organizations add immense value to well-known science databases.
Advantages of A&I over full-text search:
- The degree of relevance is much higher compared to databases without A&I.
- Summarization is useful to many users who do not have time to read complete articles.
- Can enhance searchability and discoverability through key concepts, materials, substances, and keywords.
- Saves research time and creates greater user engagement.
Molecular Connections’ A&I reach
- Content types: patents, journal articles, books, standards, dissertations
- Input formats: PDF, MS Word, HTML, XML
- Around 6 million patent abstracts, including non-English patents
- Over 100k abstracts of medical case reports
- Over 200k book chapter abstracts
- Over 100k news articles and 20k+ product reviews abstracted
- Non-textual indexing: multimedia, tables, and graphs
- Deep domain indexing: chemical reactions, property data
- Image processing: sequence extraction using OCR technology, parsing the documents and giving structure to the final output
Molecular Connections’ Technology EDGE
- Detailed metadata capture at the document level
- Object extraction & indexing
- Cited reference parsing
- Specialized handling of a variety of input and output formats
- Web retrieval of content for article A&I and metadata creation
- Creation of a standardized, consistent XML format that serves as the final output delivered to the client for ingestion
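To illustrate what a standardized, consistent XML output might look like, here is a minimal Python sketch. The element names (`record`, `title`, `abstract`, `index-terms`, `term`) and the `build_record` function are assumptions made for illustration only; they are not Molecular Connections' actual delivery schema.

```python
import xml.etree.ElementTree as ET


def build_record(doc_id, title, abstract, index_terms):
    """Serialize one abstracted-and-indexed document to a fixed XML layout.

    The element names used here are hypothetical, chosen for illustration.
    """
    record = ET.Element("record", attrib={"id": doc_id})
    ET.SubElement(record, "title").text = title
    ET.SubElement(record, "abstract").text = abstract
    terms = ET.SubElement(record, "index-terms")
    # Sort and de-duplicate so the same input always yields the same output.
    for term in sorted(set(index_terms)):
        ET.SubElement(terms, "term").text = term
    return ET.tostring(record, encoding="unicode")


xml_out = build_record(
    "doc-001",
    "Sample title",
    "Sample abstract.",
    ["catalysis", "aldol reaction", "catalysis"],
)
print(xml_out)
```

Emitting every record through one function like this is one way to keep the output consistent regardless of whether the input arrived as PDF, MS Word, HTML, or XML.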
We support publishers and content aggregators in the following ways through our A&I services:
Indicative Abstracts
Describe what the document covers in terms of topic and methodology, without reproducing the key content of the article.
Informative Abstracts
Provide a condensed view of the entire content of the full-text document, drawing out the key topics and concepts covered.
Structured Abstracts
Created in a structured format with pre-defined headings that reflect the way the full text is organized.
Enhanced Abstracts
Pick out the key knowledge points that are helpful for decision making, using domain expertise and inferences.
Video Transcript Abstracts
Extracting transcripts representing the theme of the video
Conceptual Indexing
Indexing key ideas/substances related to the novelty of the article/journal.
Image Indexing
Extracting keywords from captions that represent the core concept/theme of the image.
Controlled Indexing
Terms are drawn from a controlled vocabulary. They can be either actual keywords extracted from the caption or conceptually related terms assigned from the controlled vocabulary.
Multimedia Indexing
Providing terms for tables, graphs, video, etc.
Chemical Structure Indexing
Indexing chemical structures from journal articles or patents
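Controlled indexing, as described above, can be sketched in a few lines of Python. This is a simplified illustration, not the production method: the `VOCABULARY` mapping and the `assign_controlled_terms` function are hypothetical, and "conceptually related" terms are approximated here by a plain lookup of variants.

```python
import re

# Hypothetical controlled vocabulary: preferred term -> accepted variants.
VOCABULARY = {
    "Myocardial Infarction": ["myocardial infarction", "heart attack"],
    "Electrocardiography": ["electrocardiography", "ecg", "ekg"],
}


def assign_controlled_terms(caption, vocabulary=VOCABULARY):
    """Return the preferred terms whose variants occur in the caption."""
    text = caption.lower()
    assigned = []
    for preferred, variants in vocabulary.items():
        for variant in variants:
            # Whole-word match so "ecg" does not fire inside a longer word.
            if re.search(r"\b" + re.escape(variant) + r"\b", text):
                assigned.append(preferred)
                break  # one hit is enough for this preferred term
    return assigned


terms = assign_controlled_terms(
    "ECG trace recorded two hours after a heart attack"
)
print(terms)
```

Mapping variants to a single preferred term is what keeps the index consistent: two captions that say "heart attack" and "myocardial infarction" end up under the same entry.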
MC has developed the following proprietary A&I Platforms for processing content at scale.
MC Miner™
MC Miner™ is a state-of-the-art text mining engine developed by the data scientists of Molecular Connections. The platform allows mining information from a range of unstructured data sources to create structured outputs which can be used to develop custom knowledgebases.
Key features:
- Can also be used as an indexing engine for developing an ontology
- APIs for interacting with external systems
- Supports concurrent access
- Inbuilt load balancer to manage concurrent access
- Custom MC Miner™ methods / algorithms
- Entity Extraction
- Topic Modelling
- Sentiment analysis
- Text Classification & Summarization
Application areas:
- Biology (Gene, Proteins, Species, Disease, Processes), Chemistry, Physics, Economics, Production Engineering
MC Parse
MC Parse is a parsing engine whose key features are:
Non-XML parse:
- A state-of-the-art non-XML parser for extracting text from PDF, DOC, and other formats
- Various PDF structures (dual column, special header elements)
- Patent office and full-text journal documents
- Inconsistent section markers
- Tables spanning across columns
XML parse:
The XML parse consists of a library of parsers based on the XML formats of all major publishers and on NCBI-based formats such as JATS, NLM XML, PubMed, and SciELO XML. It can be customized for a new format in a very short time.
Areas of Application:
- Research Articles
- Patents
- Reports
- Manuscripts
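As a rough idea of what parsing one of these XML formats involves, the sketch below pulls the title and abstract out of a JATS-style fragment using Python's standard library. It does not represent MC Parse itself: the `parse_jats` function and the sample document are illustrative assumptions, and real publisher XML would also need namespace, entity, and DTD handling.

```python
import xml.etree.ElementTree as ET

# A tiny JATS-style sample, invented for illustration.
JATS_SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>Sample article title</article-title>
      </title-group>
      <abstract><p>First sentence. <italic>Second</italic> sentence.</p></abstract>
    </article-meta>
  </front>
</article>"""


def parse_jats(xml_text):
    """Extract the title and abstract text from a JATS-style document."""
    root = ET.fromstring(xml_text)
    title_el = root.find(".//article-title")
    abstract_el = root.find(".//abstract")
    # itertext() flattens inline markup such as <italic> into plain text.
    title = "".join(title_el.itertext()).strip() if title_el is not None else ""
    abstract = (
        " ".join("".join(abstract_el.itertext()).split())
        if abstract_el is not None
        else ""
    )
    return {"title": title, "abstract": abstract}


parsed = parse_jats(JATS_SAMPLE)
print(parsed)
```

Supporting a new publisher format would then mostly mean supplying different element paths for the same fields.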
MCAPS™
MCAPS™ (Molecular Connections Curation and Annotation Professional System) is a proprietary workflow system evolved through our expertise in life science curation.
Key features:
- Simplifies managing and exchanging heterogeneous data
- Facilitates guideline development & management for various curation and annotation projects
- Provides a standard way to communicate data across projects, solving the data integration problem
- Makes report development easier
- Monitors project development in real time
Application areas:
- Management of indexing and text mining workflows
- Data querying, analytics, and visualization
- Product development and rapid prototyping
Case Study
Deep Indexing of Chemistry Content