A new Science|Business report finds costs may be lower than expected for the EU’s ambitious plans to better connect researchers and their data across the continent – and examines ways to fund them.
How should Europe fund its lofty plans for an open science cloud?
A far-reaching and multi-faceted undertaking, the European Open Science Cloud (EOSC) initiative aims to provide Europe’s 1.7 million researchers and 70 million students and professionals in science and technology with easy access to other researchers’ data, and to a wide range of computing resources.
But bringing about this open science nirvana will cost money and someone will have to pay. And it won’t be the taxpayer: The Commission has promised EU member states that they won’t need to find new money and the EOSC will be self-sustaining by 2020.
As yet, no comprehensive cost-benefit analysis of the EOSC exists. That’s partly because no one actually knows how much Europe spends on managing scientific data today and partly because it is impossible to anticipate the economic impact of the scientific breakthroughs that may or may not be catalysed by the EOSC. Ultimately, the initiative may need to be underpinned by as-yet-unspecified business models that will enable the science cloud to be self-sustaining.
In a paper, Science|Business considers the potential costs and funding sources for this ambitious project – and concludes that data management expenses could ultimately amount to just 1% or 2% of total research costs, and fit inside existing science funding schemes. Previous estimates put the figure at as much as 5%.
The report results from a December 2017 meeting of our Cloud Consultation Group, a gathering of members of the Science|Business university-industry network with a special interest in the science cloud. They include experts from ETH-Zurich, CERN, Microsoft, Amazon, the European Space Agency and others. The report, while drawing on the members’ expertise, is ultimately a statement of Science|Business and does not necessarily reflect the views of individual members. Science|Business published two other reports in 2017, one on the rationale behind the cloud project and one with suggestions for its governance. Further reports will examine the business models, skills and other aspects of the cloud project.
The newest paper starts to address some of the thorny questions of costs and payments. It begins by outlining how European science is employing the cloud today and the different categories of costs involved in establishing the EOSC. It then identifies the potential efficiency benefits associated with a move to cloud computing and open data sharing, before considering some of the longer-term economic benefits that could arise from an open science cloud. The paper concludes by looking at the potential sources of funding and making some recommendations for the EOSC’s many stakeholders.
The costs of building the EOSC
Although the EOSC will initially harness the existing digital infrastructures used by European science, it will require further investment. There are essentially seven significant, but partially overlapping, categories of costs associated with the EOSC:
- Employing cloud-computing services: The cost of getting data into the cloud and storing some of it for decades, and the cost of using cloud computing resources to access and analyse scientific data, including the necessary connectivity.
- Opening up scientific data: The implementation of data management plans to make research data findable, accessible, interoperable and re-usable (FAIR principles).
- Federation of existing scientific data infrastructures with new provisioning schemes, such as the cloud or specialised facilities, and the development of nodes to link existing national data centres, European e-infrastructures, external providers and research infrastructures.
- Development of specifications for application interoperability (APIs), data portability and data sharing: To enable data to be shared across disciplines and infrastructures, more standardisation of metadata and, perhaps, of the data itself will be needed.
- Creation of search tools: New software will be required to enable scientists to search, browse and access research data.
- Creation and maintenance of a secure environment: The European Commission envisions that a suitable certification scheme will be designed at EU level to guarantee security, data portability, and interoperability in compliance with legal requirements. Such a scheme will need to be flexible enough to enable the EOSC to keep pace with the evolution of scientific research.
- The governance of the EOSC process: The EOSC will need a full-time executive body that can oversee federation, long-term funding, sustainability, data preservation and stewardship. See the consultation group’s report, Governing the European Open Science Cloud, for recommendations on how the EOSC could be run.
The economic benefits of the EOSC
By putting cutting-edge computing resources at the fingertips of researchers, the open science cloud could bring about a step change in productivity. The availability of computing resources should no longer be a bottleneck. If, as commercial cloud providers say, only a small percentage of European science is taking advantage of so-called hyper-scale cloud technologies today, there is enormous scope for a transformation in the way in which researchers share and analyse data. The implementation of the EOSC could catalyse widespread adoption of hyper-scale cloud computing by European science.
Ultimately, the EOSC could have a profound impact on European scientists’ capabilities, giving them access to a multitude of platforms, software tools, algorithms and data that they can’t access today. By creating a safe and seamless environment for sharing research data, the EOSC could markedly raise scientists’ productivity. As a result, researchers in both the public and private sectors would be able to conduct new kinds of experiments and research, with a lower level of risk, which could ultimately yield major economic benefits. The net effect would be to breathe new life into existing investments and draw new money into European science, creating a virtuous circle that fuels investment in innovative businesses and new public services. See the group’s report, The Case for the Cloud, for more on how the EOSC could transform European science.
Funding the EOSC
The European Commission has allocated €260 million for the federation of the existing scientific data infrastructures. In theory, that could be supplemented by a further €12 billion per annum, made up of the approximately €10 billion a year spent on data infrastructures for science conducted in European universities and other publicly-owned facilities, plus 1% (about €2 billion) of the €200 billion a year of public money spent on scientific research. Although the EOSC’s high-level expert group estimated that up to 5% of research budgets may have to be allocated to data management, the Science|Business group believes 1% to 2% may ultimately be sufficient.
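The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above (roughly €10 billion a year on data infrastructures and €200 billion a year of public research spending). The function names are illustrative, and the totals for the 2% and 5% shares are an extrapolation that simply applies the same sum to the other percentages mentioned; the report itself states a total only for the 1% case.

```python
# Figures quoted in the article, in billions of euros per year.
INFRASTRUCTURE_SPEND_BN = 10    # approx. spend on data infrastructures for science
PUBLIC_RESEARCH_SPEND_BN = 200  # public money spent on scientific research


def data_management_budget_bn(share_percent):
    """Slice of public research spending set aside for data management."""
    return PUBLIC_RESEARCH_SPEND_BN * share_percent / 100


def potential_annual_funding_bn(share_percent):
    """Infrastructure spend plus the data-management slice (hypothetical helper)."""
    return INFRASTRUCTURE_SPEND_BN + data_management_budget_bn(share_percent)


# 1% is the share used in the article; 2% and 5% are the other
# estimates it mentions, plugged into the same sum as an assumption.
for share in (1, 2, 5):
    slice_bn = data_management_budget_bn(share)
    total_bn = potential_annual_funding_bn(share)
    print(f"{share}% of research budgets: EUR {slice_bn:.0f}bn slice, "
          f"EUR {total_bn:.0f}bn total per year")
```

At a 1% share the slice is €2 billion, which together with the €10 billion infrastructure spend gives the €12 billion a year cited above.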
Research funding agencies could make a small percentage of each grant available as credits that can be spent on any kind of cloud service (so long as it meets the EOSC’s technical, security and privacy criteria). This approach would help drive competition between cloud providers – the researcher’s IT specialists would spend the credits with the provider offering the best value. Although research funders should insist that grantees make their data open and compatible with the EOSC, the grants should be agnostic about which cloud services grantees use to make their data findable, accessible, interoperable and reusable.
To maximise the effectiveness of the money spent on the EOSC, investments in the initiative should be driven by demand, rather than a “build it and they will come” mentality. Demand is likely to be particularly strong for platform-as-a-service capabilities, which can help to significantly reduce the effort required to develop the algorithms and software researchers need for their projects. Where possible, depending on the scientific discipline, the EOSC should not require data to be transferred from one place to another – it is more efficient to store data in a single location and perform analytics in that location, rather than create multiple copies of a large data set.
Monetising the EOSC
Over time, the EOSC could also generate its own income stream by serving the needs of the private sector. Although the EOSC is intended to make research data free at the point of use for scientists, commercial entities could be required to pay to access data within the EOSC framework once their usage rises beyond a specific threshold. A points-based application process, designed to gauge the public value of the project in the broadest sense, could be used to determine the thresholds that apply to each entity.
However, there are many other ways in which the EOSC could be monetised, so the business model will need to be carefully conceived and refined over time. This will be the subject of a future report.
Ultimately, the data within the EOSC could underpin an ecosystem of commercial services, just as the satellite data being captured by the European Space Agency is being used as the basis for commercial offerings. Given the value that the EOSC could bring to private sector research and product development, it could potentially build up a substantial revenue stream over time.
However, another school of thought argues that the EOSC may not need to generate any revenues, as it will become self-sustaining in the same way that open source software is maintained by its community of users (typically with some support from large technology companies). In this scenario, individual researchers, empowered to employ whichever platform makes most sense to them, would do nearly all their work using publicly-developed and widely-shared mobile workloads. As scientists re-use and enhance each other’s workloads, they would improve and expand the EOSC, which would take on a life of its own akin to that of the open source software movement.