In the United States, the ability to incorporate the works of others in your research is protected by fair use. In other countries, you have to ask permission. A new study indicates that the restrictive laws of many European countries mean they risk falling behind in data mining research.

Presented last week at the annual European Policy for Intellectual Property conference, the study compared each country's share of data mining research with the restrictiveness of its copyright laws. It found, in accordance with common sense, that the more allowances a country's copyright laws made for academic research, the greater its share of data mining research.


Current European Union legislation requires anyone using copyrighted material to get permission first, even if they have legal access to it. Any exception for academic research is optional. In the United States, fair use allows copyrighted material to be used without permission or payment in certain circumstances. Section 107 of the U.S. Copyright Act specifically names “teaching (including multiple copies for classroom use), scholarship, or research” as examples of purposes for which such use can be legal.

The researchers in this study (Christian Handke, of Erasmus University Rotterdam, Lucie Guibault, of the University of Amsterdam, and Joan-Josep Vallbé, of the University of Barcelona) compared each nation’s data mining output to the permissiveness of its copyright law. Using Thomson Reuters’ Web of Science database, they found 18,441 data mining articles published from 1992 to 2014. For each country, they took the ratio of data mining articles to total articles published in that period, and found that countries with more permissive laws, like the United States, had a ratio three times that of countries with more restrictive laws, like Germany.
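To make the measure concrete, here is a minimal sketch of how such a ratio could be computed and compared. The function name, country labels, and article counts are hypothetical placeholders chosen for illustration. They are not figures from the study, and the researchers' actual method may differ.

```python
def data_mining_share(dm_articles: int, total_articles: int) -> float:
    """Return the fraction of a country's articles that are about data mining."""
    if total_articles <= 0:
        raise ValueError("total_articles must be positive")
    return dm_articles / total_articles


# Hypothetical placeholder counts, NOT data from the study.
counts = {
    "permissive_example": {"dm": 300, "total": 100_000},
    "restrictive_example": {"dm": 100, "total": 100_000},
}

# Share of data mining articles in each country's total output.
shares = {
    name: data_mining_share(c["dm"], c["total"]) for name, c in counts.items()
}

# How many times larger the permissive country's share is.
relative = shares["permissive_example"] / shares["restrictive_example"]

for name, share in shares.items():
    print(f"{name}: {share:.4%} of articles are on data mining")
print(f"relative share: {relative:.1f}x")
```

Dividing by total output, rather than comparing raw article counts, keeps a country with a large research sector from looking more active in data mining simply because it publishes more overall.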

The researchers concluded:

Our results suggest that in the case of academic research and DM, the adverse consequences of copyright protection on the creation of new information goods are greater than the benefits. DM research often draws on many input works to which others hold copyrights. Copyright exemptions or limitations could promote this type of research, at least to enable DM of input works that have been publicly financed.

The idea behind copyright is supposedly that it incentivizes creation by allowing creators to profit from the effort they expend. The reason copyright isn’t absolute is so that it doesn’t end up disincentivizing creation by restricting access to material. In the case of data mining, which by necessity often draws on databases created by others, that’s exactly what appears to be happening. The more steps and costs involved, the less research gets done.

