Every stage of the publishing process generates data: authors researching topics to publish on, deciding which journals to submit to, the submission and receipt of the manuscript at the publisher, editorial decisions on review and acceptance, and, once published, the tracking of reader behavior and usage. Much of this data is unstructured, locked inside submission platforms, internal workflow tools used by the publisher or vendor, emails, and the online portals that researchers use to search, browse, and retrieve scientific content.
Many in the scholarly and academic publishing industry see themselves as "content intermediaries" rather than data-driven "product" companies. Publishers need to pay attention to the data behind their platforms, technology implementations, and hosting providers. Key questions include:
- How to manage and utilize, across vendors and partners, the data that is central to their business?
- How to analyze that data and serve it up to a new type of customer, one that expects digital, personalized products as standard?
Today, data availability is not the issue; the issue is a holistic view of all the stages of the publishing process that generate data, and of the insights that can be garnered from them. Editorial teams within smaller publishers lack this insight, which can cripple their editorial strategy and deployment of funds, leading to lower ROI. So, does this mean investing in expensive third-party branded tools to manage this data flow and generate analytics for better business outcomes? Not necessarily.
It is possible to create an analytics layer using APIs that works with the existing submission platforms, publishing vendors, and online hosting platforms that publishers already use today. Such an analytics layer would be a "live dashboard" for editors and publishing heads, showing who the authors are (even across preprint servers), which institutions they come from, what happened post submission, where an article was published if rejected by the publisher, and whether it could have been transferred to another journal or open-access platform.
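As one illustration of how such a layer might pull in external metadata, the sketch below queries the public Crossref REST API (`api.crossref.org/works`), which returns JSON with matching works under `message.items`. The helper names and the dashboard fields chosen are assumptions for illustration, not a prescribed design:

```python
# Sketch: one tile of a "live dashboard" that looks up where a manuscript
# (e.g. one rejected earlier) was eventually published, via the public
# Crossref REST API. Function names and field selection are illustrative.
import json
import urllib.parse
import urllib.request


def build_crossref_query(title: str, rows: int = 5) -> str:
    """Build a Crossref /works URL for a free-text title search."""
    params = urllib.parse.urlencode({"query.title": title, "rows": rows})
    return f"https://api.crossref.org/works?{params}"


def summarize_works(payload: dict) -> list[dict]:
    """Reduce a Crossref response to the fields an editor's view needs."""
    items = payload.get("message", {}).get("items", [])
    return [
        {
            "doi": item.get("DOI"),
            "title": (item.get("title") or ["(untitled)"])[0],
            "journal": (item.get("container-title") or ["(unknown)"])[0],
        }
        for item in items
    ]


def search_crossref(title: str) -> list[dict]:
    """Live lookup; requires network access."""
    with urllib.request.urlopen(build_crossref_query(title)) as resp:
        return summarize_works(json.load(resp))
```

Separating `summarize_works` from the network call means the dashboard logic can be exercised offline against cached API responses, a useful property when stitching several vendor APIs together.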
For example, an editor may want to check whether an article has already been published on a given topic and trigger a broad free-text search. Or they may ask: what topics are appearing in incoming manuscripts, and do we need a new journal in that space? So the framework need not be just a dashboard over data generated in the publishing process; it can support editorial strategy and business decisions.
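The "topics in incoming manuscripts" question can be approximated with simple keyword counting over submission titles before investing in heavier text-mining tools. A rough sketch, in which the sample titles and the stop-word list are illustrative assumptions:

```python
# Sketch: surface the most common topic keywords across incoming
# manuscript titles, as a cheap first signal for new-journal decisions.
# The sample titles and stop-word list are illustrative assumptions.
from collections import Counter

STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def top_topics(titles: list[str], n: int = 3) -> list[tuple[str, int]]:
    """Count non-stop-word tokens across titles; return the n most common."""
    words = (
        word
        for title in titles
        for word in title.lower().split()
        if word not in STOP_WORDS
    )
    return Counter(words).most_common(n)


submissions = [  # illustrative manuscript titles
    "Deep learning for genomics",
    "Transfer learning in genomics pipelines",
    "Genomics data sharing and ethics",
]
print(top_topics(submissions))
```

In practice an editorial team would feed this from the submission platform's export or API and swap the tokenizer for proper keyword or subject-code extraction, but even this crude count can reveal a cluster of submissions with no obvious home journal.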
So we can define Data-Driven Publishing as a framework for using data to produce and deliver content in a user-centred way.