We are pleased to announce that we’re teaming up with Hypothes.is to put forward a proposal to the Open Science Prize to mine and annotate the biomedical literature – using and producing loads of open data along the way.
A growing number of open data resources are either cited directly in the biomedical literature or linked indirectly to the content of articles and other research outputs. Unfortunately, these links are often not visible to readers, and if the article is behind a paywall they can be invisible to the vast majority of the population, including many researchers.
We plan to automatically mine the biomedical literature and openly annotate it with intelligent identifiers for data such as genes and species, as well as many dataset citations. ContentMine will extract the facts and Hypothes.is will display them on the online document. Through this, we’ll create an index of facts as open data that can be combined with manual annotations from the community of Hypothes.is and ContentMine users. Developing and linking these two existing early-stage services will give users a powerful way to examine facts in context and to look for connections and correlations centred on identifiers.
In the spirit of openness, we’re discussing the proposal on discuss.contentmine.org and drafting it collaboratively in Google Docs. We welcome any volunteers who would like to help!
You can get involved by:
- Joining the discussion thread on discuss.contentmine.org.
- Contributing to the proposal draft.
- Joining the ContentMine GitHub community.