At the moment (August 2015), only two countries (the UK and Japan) have a clear exception in their laws allowing users with bona fide access to digital materials to carry out content mining on those materials for non-commercial purposes without having to pay any fees, ask permission or jump through any bureaucratic hoops. In addition, under the laws of some countries, such as the USA and Israel, the fair dealing/fair use copyright exception is arguably worded broadly enough that content mining will be acceptable under many circumstances. There are plans to introduce a specific content mining exception in Eire, and separately on an EU basis, but these initiatives are at an early stage and indeed may not occur at all.
Thus, the current legal framework is restricting the uptake of content mining in many countries compared with the more permissive regime in the UK, with corresponding implications for research and competitiveness. At a time when many areas of research and other scholarly activity are becoming more collaborative – and a typical research project can be funded by multiple funders in different countries – ideally, copyright law around the world should be harmonised to permit content mining. But that is something for the future.
So what is the legal position for those people who wish to copy materials, and then subject the copies to content mining, in countries other than the UK, the USA, Israel or Japan? As a rule, the copying is likely to be an infringement, as a substantial amount of copyright material owned by third parties (typically, but not invariably, scholarly publishers) is being copied, and there is no specific exception to copyright that applies to the mining activities. It is possible that a fair dealing/fair use type of exception for scholarly research exists in the country, but unless it is extremely broadly worded, large-scale content mining is at best legally risky.
The only way, then, that researchers in these countries can undertake content mining is by relying on an explicit licence from the rights owners that allows them to undertake such activities. These licences can be broken down into three headings:
- Materials that are available under Open Access (whether in a journal or a repository) with a Creative Commons licence can be content mined without any problems and without having to ask for permission. But researchers should make a careful note of the type of Creative Commons licence on offer. Thus, for example, a CC licence with NC means you cannot exploit the results of the mining for commercial purposes. Always ensure you abide by whatever terms are imposed by the CC licence.
- Some publishers offer their materials to subscribers for content mining, typically without the researchers having to ask them for specific permission each time they want to carry out the mining. A good example is Elsevier. Again, if relying on such licences, researchers should make sure they obey the terms imposed by the publisher. It has to be said that some of the publishers' standard content mining licences are highly restrictive and may not suit researchers’ needs.
- In those cases where what is required for content mining is not covered by either of the first two headings above, the researcher will have to approach the rights owner directly, seeking permission to copy and subsequently content mine the materials. The researcher will need to explain fully why they want to undertake the work (the rights owner will be particularly interested in whether the researcher wishes to use the materials for commercial purposes), when the copying will take place and which of the rights owner's materials the researcher wants to subject to analysis. The rights owner may agree without charging, might agree but for a fee, might say “no”, or might not reply at all. If they do agree to the copying and mining, they are likely to impose their own specific terms and conditions. If they say “no”, or simply don’t reply, then the researcher must not copy and mine those materials.
As will be obvious, the situation is far more complicated in those countries without an exception to copyright than in, say, the UK. Not only do researchers have to work out which of the three approaches applies to each set of materials they want to analyse, but they will also have to abide by the no doubt differing terms imposed under each approach. It is therefore difficult to understand the rights owners’ argument that “licences are simpler than an exception to copyright”. Because of the potential complexity (and the associated time and effort in resolving the issues) of undertaking content mining in countries outside those with clear exceptions, one can only hope that exceptions to copyright for content mining become the norm worldwide in the future.