We launch The Content Mine in Vienna: interviews, talks and our first public workshop

Last week was one of the most exciting in my life – but also among the hardest I have worked. I travelled from Budapest to Vienna to be the guest of the Austrian Science Fund (FWF) and to give a lecture ( http://www.fwf.ac.at/en/aktuelles_detail.asp?N_ID=597 ). I changed the title to “Open Notebook Science” in honour of the late Jean-Claude Bradley and to promote his ideas. My talk is on Slideshare: http://www.slideshare.net/petermurrayrust/open-notebook-science

Before that I had given two interviews – one to ORF ( http://en.wikipedia.org/wiki/ORF_%28broadcaster%29 ), the Austrian public broadcasting network Österreichischer Rundfunk. Here’s the interview: http://science.orf.at/stories/1740033/ (I haven’t seen a translation, but web translators give a reasonable version). I explained why science is important beyond the walls of academia and why we need to liberate scientific knowledge.

Then the “launch” of The Content Mine ( http://contentmine.org ), my Shuttleworth Fellowship project, which aims to extract 100,000,000 facts from the scientific literature. The philosophy is not that *I* do this but that *WE* do this. To do that we have to:

  • have reliable, compelling, distributable software. That’s hard. But we’ve got one of the best small teams in the world – it would be hard to think of a better one. That’s because we are developer-scholars – we are not only very experienced in the coding and design of information, but we are also experts in our own right in our fields (Chemistry, Phylogenetics, Plant Genetics, and Informatics/Scholarly Publishing). That means we know where we are going, know what works (or rather what *doesn’t* work!) and know who else in the world is doing similar stuff. And because I’m funded by the Shuttleworth Foundation there’s a guarantee that we won’t get bought by Elsevier or Macmillan or Thomson-Reuters. I wouldn’t swap any of the team for ten million dollars – that’s how important they are to my life.
  • show YOU how to become part of US. The goal is to create a community. We’re in very good touch with Wikimedia, Mozilla, Software Carpentry, OpenStreetMap, Open Knowledge, Blue Obelisk and Apache, so our community will be recognisable in that environment. Think also of the Wellcome Trust, Austrian Science Fund, RCUK and NIH to get a feel for how we relate to science funders. We’ve only been going 3 months, so we want to see a community evolve rather than design it prematurely. When it’s strong and energetic it will start to suggest where we should be going organisationally. We also work closely with domain repositories such as PubChem, Europe PubMed Central, Treebank, Dryad, Crystallography Open Database, etc.
  • reach out. At present we are doing this through workshops. We’re doing several this summer – Edinburgh, Berlin/OKFest, Wikimania, OK Brazil, and one or two more yet to be finalised. We’re informed by the Software Carpentry philosophy, where we run a workshop for a sponsor and, during the workshop, train apprentices. These apprentices will then be able to help run new workshops and later perhaps their own. So although Michelle and I ran this workshop, there will be later ones with different leaders.

So we ran our first public workshop on 2014-06-04 at the Institute of Science and Technology Austria (IST Austria). We advertised it as:

Workshop with Peter Murray-Rust and Michelle Brook: “Can we build an intelligent scientific reader?”

Venue: IST Austria, Am Campus 1, 3400 Klosterneuburg
Time: 4th of June 2014, 10:00 a.m. – 4 p.m. (ballroom)
Participants: 10 places are still available (first come, first serve)
Registration: send an email (incl. first name, surname, institution, email) asap but until 30/5/2014 to falk.reckling@fwf.ac.at

Workshop Description
The workshop will be suitable for anyone interested in biological science and not frightened of installing and running pre-prepared programs and data (following written guidance and with support from those present in the room). The aim is to introduce computational methods for processing scientific papers, enabling analysis of multiple papers in a rapid fashion. These techniques include how to download multiple files, extract concepts and facts from the literature and figures, using Natural Language Processing and Computer Vision.

Technical expertise required
Very little expertise is required beyond general use of a computer. Much more important is a willingness to learn and experiment. However we will ensure options are made available for those who are confident/technically able, including providing opportunities to develop their own tools for analysis.

We got 18 brave people, mainly compsci but also bioscientists, and it went well. Michelle is getting formal feedback. We’re hard at work taking our own criticism on board (Michelle collected a very thorough set of observations). It was hard work, but we now know we can do it and it works. The main emphasis was on understanding the concept (with highlighter pens and paper!), scraping, extraction, and how to work as a community. We’ve got attendees who want to follow up on how they can use it! That’s the philosophy.
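For readers who weren’t there, here is a toy illustration of what “extraction” means: the code equivalent of running a highlighter pen over numbers and their units. It is only a minimal sketch in Python and is NOT the Content Mine software; the sample sentence, the pattern and the function name `extract_quantities` are all made up for illustration.

```python
import re

# Hypothetical sample sentence, invented for this sketch (not from a real paper).
TEXT = "We dissolved 5 mg of the compound in 10 ml of water and stirred for 24 h."

# A deliberately crude pattern: a number followed by one of a few known units.
# Real papers need far more careful handling than this.
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|ml|h)\b")


def extract_quantities(text):
    """Return (value, unit) pairs found in the text, in reading order."""
    return [(float(value), unit) for value, unit in QUANTITY.findall(text)]


if __name__ == "__main__":
    for value, unit in extract_quantities(TEXT):
        print(value, unit)
```

Each pair that comes back is a small candidate “fact”; the hard part, and the reason we want a community, is doing this reliably across many papers and many disciplines.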

Then the next day there was an all-day hack run by OKFN Austria (Stefan Kasberger and Peter Kraker, a Panton Fellow): http://okfn.at/2014/05/19/content-mining-meetup-with-peter-murray-rust/ . It was held in a wonderful hackspace (metalab): couches, soft drinks with honour payment, bits of kit lying around, graffiti – you know the sort of thing.

And then at the end 4 invited speakers (including PMR). We are very impressed by OKFN Austria – the day drew perhaps 25 people. And a lovely city.

But exhausting! At the end I crashed for a long night. (In writing my Shuttleworth quarterly report I was asked “What was your greatest loss during this quarter?” Answer: SLEEP!)

Much more to come – a hackday in Edinburgh next week to be announced later today.



