Shuttleworth Gathering Budapest, Content Mine Dogfood

Twice a year the Shuttleworth Fellowship meets in a Gathering – could be anywhere in the world (subject to a minimum-travel-cost algorithm). This is my first and we are in Budapest – one of Europe’s loveliest cities. (I’ve been here before, luckily, as our programme has been very full and we only got out once formally, for a river cruise.)

It’s Chatham House Rule so no details, but see our web page for the 13 fellows. This is one of the most coherent, inspiring groups I have ever been in. So much is common ground – we agree on doing Open; the questions are why, what and how, and we’ve explored those. I’ve found so much in common – we are in the area of liberating knowledge and inspiring innovation, mixed with democracy and justice. I’m finding out about how to build communities, annotation and education, while being able to help with computer vision, information extraction, metadata, etc.

We each ran a 75-minute slot on “eating our own dogfood”. NOT a lecture. We had to bring the practice of our project and ask the others – everyone – to grok it and hack it. Often this was in small groups, so for mine we had 5 groups of 5. Here’s my rough summary with comments:

  • Why are we doing ContentMining? economics, openness/democracy, innovations, disruption.  Hargreaves

Very useful discussion (as would be expected)

  • Manual markup (highlighters) of two articles

Worked very well. Lots of questions about “should we mark this?”. 

  • Demo (PMR) of semantic content  (chemistry)

  • Crawling exercise (manual)

Good involvement. “Why doesn’t publisher X have an RSS feed?”, etc.

  • Scraping exercise (manual and software)

Again worked very well

  • Extraction (software and manual design)

Mainly concentrated on manual markup but showed chemical tagger, etc.

  • Where are we going?

 

I deliberately put far too much in – so people could test the software worked, etc. But the main idea was to see how non-biologists managed. I chose a paper on the evolutionary biology of lions in Africa and everyone got the point. In fact it reinforced how needlessly exclusive scientific language is. The first part of the introduction could be rewritten without loss to read something like:

“African lions are dying out because of hunting and environmental change. DNA analyses show that lions in different parts of Africa have evolved in different ways. By studying the DNA and historical specimens we can understand the evolution and perhaps use this for conservation.”

There wasn’t enough time for everyone to run the software – deliberately – but we got very useful feedback.  I shall be tweaking it over the weekend to make sure it’s working for our Vienna workshop.

Published by

steelgraham

Scotland's (main, but not only) #OpenScience #OpenAccess #OpenData #OpenSource #OpenKnowledge & #PatientAdvocate Loves blogging http://figshare.com/blog Glasgow, Scotland.
