How ContentMine will extract millions of species

We are now describing our workflow for extracting facts from the scientific literature at http://contentmine.org/blog . Yesterday Ross Mounce and I hacked through what was necessary to extract species from PLOS ONE. Here’s the workflow we came up with:

 

Ross has written it up at http://contentmine.org/blog/AMI-species-workflow and you should read that for the details. The key points are:

  • This is an open project. You can join in, but be aware it’s alpha in places. There’s a discussion list at https://groups.google.com/forum/#!forum/contentmine-community . Its style and content will be determined by what you post!
  • We are soft-launching it. You’ll wake up one day and find that it has a critical mass of people and content (e.g. species). No fanfare and no vapourware.
  • It’s fluid. The diagram above is our best guess today. It will change. I mentioned in the previous post that we are working with WikiData for part of “where it’s going to be put”. If you have ideas please let us know.

Published by

steelgraham

Scotland's (main, but not only) #OpenScience #OpenAccess #OpenData #OpenSource #OpenKnowledge & #PatientAdvocate Loves blogging http://figshare.com/blog Glasgow, Scotland.
