Europe is joining forces with the US and Australia in a bid to change the scientific publication culture and underpin open access through the formation of the Research Data Alliance (RDA), an international body set up to promote the development of new infrastructures, standards and tools for sharing and mining research outputs.
Following the inaugural meeting of the RDA in Gothenburg last week, the Director of Advanced Cyberinfrastructure at the US National Science Foundation, Alan Blatecky, told Science|Business, “The atmosphere at the launch was very positive. The resounding message was ‘Let’s get working on data exchange today.’”
John Wood, the EU Co-Chair of the RDA, was equally impressed with the practical approach. "The aim is to ensure that when scientists want access to the data of their peers, this data is available for them in a format that they can use."
Data exchange benefits all researchers
The RDA has a long and difficult agenda, but at its heart is a mission to unlock the innovation potential of research data. This means enabling greater collaboration between scientists of all disciplines through the use and re-use of data.
At present, the prevailing practice amongst researchers is to publish results but not the underlying data. According to a study published in 2010 by the EU-funded Parse-Insight project, only 25 per cent of researchers share their research data openly.
Similarly, SIM4RDM (Support Infrastructure Models for Research Data Management), another EU-funded project set up to investigate how to improve policies for managing research, concluded that researchers are often hesitant about sharing their data in the belief that others may unfairly benefit from their work. In addition, “The process of preparing data for sharing is also perceived as labour-intensive and, in the absence of clear citation mechanisms, difficult,” according to SIM4RDM.
However, it is also the case that the answer comes to those who share. As Francine Berman, US Co-Chair of the RDA, says, "Many complex problems can only be addressed by combining information from multiple data collections. For example, a question such as 'What areas sustain the most risk in an earthquake?' might involve sensor data, population data, data on neighbourhoods and structures, data from earthquake simulations, etc."
The global nature of modern science also accelerates the need for data exchange: as Blatecky notes, "Science is no longer bound by geographical constraints, it is an international sector. That is why data also needs to be shared on an international level."
The structure of the RDA reflects the recognition within the scientific community of the need for data exchange. “The RDA works from the ground up. It invites scientists from all disciplines to come together to find their own solutions for sharing their data,” Blatecky said.
Open access to publicly-funded research
The objectives of the RDA are in tune with the increasing clamour for open access to research papers from the public R&D funding organisations. The European Commission committed itself to the Open Data movement in 2012, saying, “Information already paid for by the public purse should not be paid for again each time it is accessed or used, and ... it should benefit European companies and citizens to the full.”
In other words, publicly-funded research outputs need to be available online, at no extra cost, via self-sustaining networks.
Neelie Kroes, Vice President of the European Commission responsible for the Digital Agenda, has pledged to progressively open access to research data and called on national funding bodies to do the same.
Similarly, in February, US President Barack Obama directed federal agencies with more than $100 million in R&D funding to develop plans to make the results of federally funded research freely available to the public within one year of publication – and at the same time to require researchers to better account for and manage their data.
The UK government has led the way here, and from 1st April all research funded through the country’s research councils must be made freely and openly available to anyone around the world.
Don’t reduce the commercial value of research
This move towards open access to data is being met with opposition and foot-dragging, not least from the scientific publishers. But it is also stirring emotions amongst companies that take part in collaborative R&D projects that have a mixture of public and private funding.
Klaus Dieter Axt, Director of Public Affairs at DigitalEurope, says that while the industry body has no objection to open access to publicly funded scientific publications, open access to the underlying data is a different story. “A requirement to make research data ‘open’ may remove the commercial value for the company undertaking the research. Projects under Framework Programme 7 or Horizon 2020 are not 100 per cent funded, and companies often invest a significant proportion of their IP rights into the research, especially in the ICT sector,” he said.
It remains to be seen how this conflict between the public's need for access to research it funded through taxes and the commercial motivation for undertaking the research will be resolved.
It is not just about improving the public's access to the research, but also about increasing the return on this investment. This is where the huge financial gains to be made from effective data exchange are relevant.
Data is the new oil
Funding groups now expect a greater financial return from their investment in research and many see open and free access to data as the way to achieve this. Or as Kroes put it in a recent speech, "Data is the new oil."
Much of this financial potential comes from the ability of data exchange to drive innovation, according to Berman. “Many recent innovations are dependent on the availability of data as well as the ability to transform data into useful information, decisions, strategies, and products.”
Not only that, the opportunity cost of not sharing data has economic impacts as well. “Although they are hard to quantify, they are not hard to conceptualise: slower progress on understanding and developing treatments for disease, slower progress on monitoring and addressing environmental and energy challenges, etc,” said Berman.
A recent study on Danish SMEs shows that without speedy access to scientific research results, it takes such firms on average 2.2 years longer to develop or introduce new products. Access to data is of equal importance to the public sector, with a recent report by the consultants McKinsey indicating that big data could generate value of more than $300 billion every year in the US healthcare sector alone.
Data efforts worldwide
The RDA is one part of the international push to secure the maximum economic benefits from the data pouring out of research. Another example is the EU iCordi project, with a brief to chart, demonstrate and drive convergence between emerging data infrastructures. A key component of the EU's internal data efforts is the development of the European Research Area, a version of the common market for research, scientific knowledge and innovation.
In March 2012, the Obama administration announced a $200 million “Big Data Research and Development Initiative,” which aims to advance the technologies needed for effective analysis and use of data.
Task ahead for the RDA
Existing research data infrastructure is mostly institutional or national, and often split by discipline. This will need to change if the full benefits of data are to be realised, according to Wood. Another challenge is posed by the rise of interdisciplinary research. As a result, “Infrastructure frameworks are needed to ensure interoperability between various data sources and to support analysis,” Berman said.
At present, each discipline uses its own standards and tools, and the RDA will aim to make data exchange across disciplines easier by "encouraging researchers and scientists to come together to develop best practices," said Blatecky.
The development of standards will enable biologists to better understand the data produced by physicists and economists, and thus enhance the collaboration needed to tackle social challenges. It is unclear how much this project will cost, with Berman saying, “Exact costs are hard if not impossible to quantify - what does the internet cost? - but it is clear that the benefits will greatly outweigh the costs.”
Information infrastructure (including software, hardware, data professionals, etc.) is not free and the “research data bill” must be part of a sustainable approach to driving innovation. With billions being invested in data worldwide, it is likely that the RDA will form a relatively small part of the expenditure.