The importance of Figures and Captions

What’s the first thing that you look at in a scientific paper?

The title? Probably.

The abstract? Maybe.

The figures and their captions?

For many readers [1], yes! Many disciplines have a long tradition of figures rich in detail and information. Introductory and summary data are often found (only) in figures, and results are often shown there too. My hypothesis is that, for many readers, a quick glance at the figures and their captions gives a very good indication of whether they want to read the paper.

Here’s a typical figure [from http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0111303]

So how can we find what the message of a figure is? Put yourself in the position of a blind human or a machine and pretend you can’t see the image. The answer is the figure caption. So here are some examples, taken at random from today’s PLoSONE papers [use the links as acknowledgments].

Can you get a feel for whether you would want to look at a given image? In specialist fields it’s even clearer; here are some captions from two Open Access articles from Elsevier’s Phytochemistry:

  1. Analysis of recombinant SvPAL by SDS–PAGE. (SDS-PAGE is a gel electrophoresis technique).
  2. Phylogenetic tree of the phenylalanine ammonia-lyase proteins of Arabidopsis. (Phylogenetic analysis).
  3. qRT-PCR analysis of the level of expression of SvPAL1, SvPAL2, SvPAL3 and SvPAL4 in A) willow young leaves, stem, phloem, xylem, mature leaves and root tissue and B). (Molecular biology of plants: “level of expression”)
  4. Subcellular localisation of SvPAL2. (A)–(F) An overview of the YFP fusion constructs is shown on the left, with the corresponding transient expression in tobacco epidermal cells is shown on the right. (Image very likely to be a photomicrograph).
  5. List of primer sequences used for cloning and semi-qRT-PCR. (Molecular biology).
  6. The main maca glucosinolate and metabolites analysed in this study. (Chemical structures).
  7. Amide and fatty acid HPLC profiles of fresh and dry maca. (HPLC is a chromatographic technique).
  8. Profiles of selected storage and secondary metabolites during traditional open-field drying of whole maca hypocotyls. (Metabolism)

If you are interested in phytochemistry (the chemistry of plants), and I am (see later posts), these papers are worth reading. The captions alone also let us classify the first paper under molecular biology and the second under chemical metabolism.
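The notes in parentheses after each caption above are effectively a hand classification. As a very rough sketch of how a machine might approximate them, here is a keyword lookup in Python. The vocabulary, the category names and the function `classify_caption` are all made up for illustration (they are not part of any ContentMine tool), and a real vocabulary would need to be far larger and agreed communally.

```python
import re

# Hypothetical, hand-made vocabulary: category -> trigger terms (lowercase).
# These terms are assumptions chosen to match the example captions only.
VOCABULARY = {
    "gel electrophoresis": ["sds-page", "electrophoresis"],
    "phylogenetic analysis": ["phylogenetic"],
    "molecular biology": ["qrt-pcr", "primer", "level of expression"],
    "chromatography": ["hplc"],
    "metabolism": ["metabolite", "metabolites"],
}


def classify_caption(caption, vocabulary=VOCABULARY):
    """Return the sorted list of categories whose terms occur in the caption."""
    # Lowercase and turn en-dashes into hyphens so "SDS–PAGE" matches "sds-page".
    text = caption.lower().replace("\u2013", "-")
    matched = []
    for category, terms in vocabulary.items():
        # Whole-word match, so "hplc" does not fire inside a longer token.
        if any(re.search(r"\b" + re.escape(term) + r"\b", text) for term in terms):
            matched.append(category)
    return sorted(matched)


if __name__ == "__main__":
    captions = [
        "Analysis of recombinant SvPAL by SDS\u2013PAGE.",
        "List of primer sequences used for cloning and semi-qRT-PCR.",
        "Amide and fatty acid HPLC profiles of fresh and dry maca.",
    ]
    for caption in captions:
        print(classify_caption(caption), "<-", caption)
```

A caption that matches nothing comes back with an empty list, and a caption can land in several categories at once. Caption 6, for example, would be tagged "metabolism" by this lookup although the figure shows chemical structures, which is exactly why the vocabulary needs discussion.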

So…

CONTENTMINE REQUEST: I WANT TO CLASSIFY FIGURES BY THE TEXT IN THEIR CAPTIONS

What’s this??

We’ve agreed that new software functionality needs communal agreement and that we should therefore blog our suggestions. As Mark MacGillivray puts it “if it’s not worth blogging about it’s not worth doing”. So I think it would be valuable for:

  • all figures and captions to be extracted from all papers. This is technically reasonably easy: most publishers use the word “Figure”, and some enlightened ones tag the figure and its caption explicitly, using JATS-XML (<fig> and <caption>) or HTML5 (<figure> and <figcaption>). Each figure-caption pair will be grouped as a special category in Norma. This is already well under way with our “sectioning” programme.
  • classification of figures by machine-learning and/or humans. This is harder. We need vocabularies before we can classify things, and there is no universal scientific vocabulary (Wikimedia is the best). This is what I’d like us to explore.
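To show why the extraction step is the easier one when publishers tag their figures, here is a minimal Python sketch using only the standard library. It is not Norma's code; the function and class names are hypothetical, the sample documents are invented, and it assumes un-namespaced JATS `<fig>`/`<caption>` elements and HTML5 `<figcaption>` elements.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def jats_figure_captions(xml_text):
    """Return (label, caption text) pairs for every JATS <fig> element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for fig in root.iter("fig"):
        label = (fig.findtext("label") or "").strip()
        caption = fig.find("caption")
        # itertext() flattens inline markup such as <italic>; split/join
        # collapses runs of whitespace and newlines.
        text = " ".join("".join(caption.itertext()).split()) if caption is not None else ""
        pairs.append((label, text))
    return pairs


class FigcaptionParser(HTMLParser):
    """Collect the text of every HTML5 <figcaption> element."""

    def __init__(self):
        super().__init__()
        self.captions = []
        self._depth = 0    # greater than 0 while inside a <figcaption>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "figcaption":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "figcaption" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.captions.append(" ".join("".join(self._chunks).split()))
                self._chunks = []

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def html_figure_captions(html_text):
    """Return the caption text of every <figcaption> in an HTML document."""
    parser = FigcaptionParser()
    parser.feed(html_text)
    parser.close()
    return parser.captions


if __name__ == "__main__":
    # Invented sample documents; the inline markup is an assumption.
    jats = (
        '<article><body><fig id="f2"><label>Figure 2</label>'
        "<caption><p>Phylogenetic tree of the phenylalanine ammonia-lyase "
        "proteins of <italic>Arabidopsis</italic>.</p></caption>"
        "</fig></body></article>"
    )
    html = (
        '<figure><img src="fig7.png"><figcaption>Amide and fatty acid '
        "<b>HPLC</b> profiles of fresh and dry maca.</figcaption></figure>"
    )
    print(jats_figure_captions(jats))
    print(html_figure_captions(html))
```

Papers that only use the word “Figure” in running text, with no tagging, would need heuristics instead, and that case is not covered here.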

How might we do that? This needs to be a communal discussion – not just my guess. And we’d love input from everyone – not just ContentMine colleagues. So:

WOULD YOU LIKE US TO CLASSIFY FIGURES BY THEIR CAPTIONS SO YOU KNOW WHAT PAPERS TO READ?

and

DO YOU KNOW OF PREVIOUS WORK IN THIS AREA THAT WE COULD BUILD ON?

and

WOULD YOU LIKE TO HELP?

[1] And readers are everywhere, NOT just in universities. They are in hospitals, patient groups and small businesses; they are policy makers and curious minds, young and old. And if you have to pay 35 USD to read a paper, as Jack Andraka’s parents did, and then find it’s not relevant, that’s terrible.

Published by

the bear

I have another blog in real life...
