The ContentMine Workshop at Bath

On Tuesday 28th July, the ContentMine team held an introductory content mining workshop for biologists, at the University of Bath.

Because the course was advertised mainly online, we had signups from all over the world, notably including Sri Lanka and Jordan. So I’m glad that everything we covered is captured on the event Etherpad and the links within it. Much, if not most, of the material may well be followable from afar.

Everyone wants to do content mining

In terms of in-person attendees we had palaeontologists, ecologists, medics, phylogeneticists and biodiversity informaticists. Attendees were also pleasingly diverse in career stage: undergraduates, masters students, postdocs, established scientists, and people working outside academia. The majority, of course, were PhD students, and this was the demographic I was most hoping to target.

Pros and cons of using a virtual machine

Setup on the morning was surprisingly smooth for the majority because I had emailed out the instructions for installing VirtualBox and running the virtual machine (VM) over a week before the actual workshop day. I resolved several enquiries and niggles via email in that period, so almost everyone came to the workshop with a working VM. Two participants had laptops that were only capable of running 32-bit operating systems, so I made an alternate VM just for them. I also had 4 USB sticks pre-loaded with VirtualBox installers and the VMs on hand that morning, so I think we prepared well in that respect.


Socializing after the workshop


The workshop itself went by in a blur. Demonstrating command-line approaches to an audience with mixed levels of experience of bash shell scripting is really fascinating.

In one memorable mistake, I asked everyone to type ami then TAB-complete to check that all the ami-plugins were installed on their machines. Many participants reported seeing very different output from what was projected on my demonstration screen. It quickly transpired that, having seen me hit ‘ENTER’ after typing most of my commands, a few were pressing ENTER rather than TAB. My mistake: as the teacher, I should have made this clearer! I have already thought of a solution we can use in future workshops: a virtual keyboard displayed on-screen (see below for an example) that shows exactly which keys the instructor is pressing. If not for the whole workshop, then at least for running through basic shell scripting at the start, I think this might be a ‘must’.

Virtual Keyboard in Action
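
Another way to sidestep the TAB-versus-ENTER confusion would be a check that is run as an ordinary command. The following is only a minimal sketch, not something we used on the day: it assumes a bash shell, and the helper name list_commands_with_prefix is hypothetical. It relies on the bash builtin compgen to list every command name starting with a given prefix, which is roughly what TAB-completion shows.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash, because compgen is a bash builtin.
# list_commands_with_prefix is a hypothetical helper name.

# Print every command name (executables on PATH, builtins, functions,
# aliases) that starts with the given prefix, one per line, de-duplicated.
list_commands_with_prefix() {
    compgen -c -- "$1" | sort -u
}

# List whatever commands starting with "ami" are available on this machine.
list_commands_with_prefix ami
```

Participants would type the last line and press ENTER as usual, then compare the printed list with the one on the projector. If nothing is printed, no matching commands were found.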


It’s probably too early to tell how useful the participants found the workshop. We have yet to analyse the feedback forms in full but I take great heart from what one participant said:

“I wish I knew about these tools two years ago before I had to write-up my thesis!”

With the new community discussion site now up and running, we hope to spark more eureka moments like this!

My sincere thanks go to Matt Wills, Mark MacGillivray, Chris Kittel, Stefan Kasberger, Graham Steel, Richard Smith-Unna, Jenny Molloy, and everyone else at the ContentMine, including, of course, Peter Murray-Rust, for making this happen. A lot of work went into this workshop. I hope it showed.

