To date, the Commission has funded around 50 separate projects to create ‘a world wide web for research data’. Starting in June, a new public-private partnership aims to streamline this system to make it a truly Open Science Cloud.
The European Open Science Cloud (EOSC) is entering a new phase this year as the EU continues its mission to federate the myriad data-sharing systems around the continent and make it easier for researchers to pursue fully open science.
The cloud formally launched in 2018, and since then Horizon 2020 has funded about 50 projects, investing €320 million by 2020.
These bottom-up projects laid the groundwork for the cloud and involved stakeholders from all over Europe. But the EOSC vision has become diffuse and more complex. “We are now trying to enter what we call the convergence phase,” Karel Luyben, the president of the EOSC Association and former Rector Magnificus of the Delft University of Technology, told Science|Business.
The means for doing this will be a co-programmed partnership between the European Commission and the EOSC Association, which brings together research providing organisations, research funders, service providers, and other organisations. Together with many other European stakeholders, the association has set out a roadmap for realising the open science cloud.
While it technically launched three years ago, the EOSC is only operational on paper. The partnership is aiming to fully consolidate and deploy the system by 2030, to serve two million researchers.
The Commission will put €490 million towards the endeavour in the next seven years. Ten times more funding is likely to come from elsewhere.
Luyben says the open science cloud is a continuous process. There is no set date when it will be officially operational or serving a certain number of researchers. It is more like the world wide web, a constantly evolving network of resources. “When was the world wide web there? There is no specific date. And the same is for EOSC,” he said.
The difference is that EOSC will connect data repositories relevant for research around Europe, and later on, around the world. Each will have to be based on the FAIR Data Principles of findability, accessibility, interoperability, and reusability.
The challenge is ensuring that all research data fits the principles and can be accessed through the network. Today, most research data is not online and not easily accessible to researchers, but rather stored away in different systems, under a myriad of divergent standards and methodologies.
Connecting them in a network is a daunting task. The next big step in tackling it is creating the EOSC Core, a means to discover, share, access and re-use data and services across different data infrastructures.
Much as the internet uses standard protocols to exchange and manipulate files, the EOSC needs an open, FAIR standard for combining its data packages.
“At the Core of all this, you can make data into small particles – packages. You encapsulate those, you develop a standard, and then you can send them around, share them and combine them again. If it is done using the FAIR principles, then the right software in our machines is needed to make all this possible,” said Luyben.
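As a rough illustration of the packaging idea Luyben describes, the sketch below wraps data in a small envelope carrying the metadata a machine would need to find, fetch, interpret and reuse it, and then combines two such packages. The `DataPackage` class, its field names and the `combine` function are hypothetical, chosen for illustration only; they are not taken from any EOSC specification.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DataPackage:
    """Hypothetical envelope: a payload plus the metadata needed to reuse it."""
    identifier: str   # findable: a unique, persistent identifier
    access_url: str   # accessible: where the payload can be retrieved
    media_type: str   # interoperable: a declared, standard format
    licence: str      # reusable: the terms under which it may be reused
    records: tuple = ()

    def to_json(self) -> str:
        """Serialise the package so it can be sent to another repository."""
        return json.dumps(asdict(self), sort_keys=True)


def combine(identifier: str, access_url: str,
            first: DataPackage, second: DataPackage) -> DataPackage:
    """Merge two packages, refusing to mix formats or licences."""
    if first.media_type != second.media_type:
        raise ValueError("packages use different formats")
    if first.licence != second.licence:
        raise ValueError("packages use different licences")
    return DataPackage(
        identifier=identifier,
        access_url=access_url,
        media_type=first.media_type,
        licence=first.licence,
        records=first.records + second.records,
    )


# Example with made-up identifiers and URLs.
a = DataPackage("pkg-a", "https://example.org/a", "application/json",
                "CC-BY-4.0", ({"station": "north", "temp_c": 4.1},))
b = DataPackage("pkg-b", "https://example.org/b", "application/json",
                "CC-BY-4.0", ({"station": "south", "temp_c": 9.7},))
merged = combine("pkg-ab", "https://example.org/ab", a, b)
print(merged.to_json())
```

The point of the sketch is that combining only works because both packages declare their format and licence in the same way, which is the role a shared standard would play across repositories.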
A €35 million procurement call for building the EOSC Core is expected to open next year, with the aim of setting it up over the following two years.
Laying down the base
One €30 million project funded under the EOSC umbrella, EOSC-hub, laid part of the foundation for the future EOSC and its core. It mobilised providers from several EU research infrastructures, set common rules for participation and integrated their services under one roof, the EOSC Portal.
The result is a prototype that has already served more than 20,000 researchers. “From the architectural point of view, it’s equivalent in terms of functionality,” Tiziana Ferrari, EOSC-hub coordinator and director at the EGI Foundation, a federation of cloud providers and data centres, told Science|Business.
It functions like a huge digital research community. One side of it is the providers; the other, the researchers. “But the two overlap to a large extent,” says Per Öster, EOSC-hub project director and director of business insights and growth at CSC, the Finnish IT centre for science.
Öster describes the Portal as a shop window. It is a concrete result of the project but much of the action happened behind it, such as finding solutions for federation, defining rules of participation and service management frameworks.
Seemingly trivial tasks, such as defining the metadata for service description, were key milestones for the team. To make services available, they first needed to make them compatible in the way they are described and presented to the user. “This was definitely one of the major hurdles,” said Ferrari. “Now we have a cohesive way to present services to the user.”
The project ended in March but other initiatives in the EOSC universe are already building on its results. “We have been building the pillars of the bridge. Now that we have them, we can build the road on top of it and we can have cars,” said Ferrari.
No borders in science
The partnership aims to serve two million researchers by the end of the decade. Serving European researchers is the first frontier but the network will continue growing indefinitely.
“My prediction is the following: if 20 years from now, 50% of the relevant research data would be as FAIR as possible at that moment in time, I would be happy. I’m talking worldwide here,” Luyben said.
The concept of a Europe-wide open science cloud was first floated in 2015. Since then, the idea has undergone major transformations. Now, all four words making up the acronym are technically wrong, says Luyben.
First, the cloud will not exclusively serve Europe, but rather aim to connect research data around the world once the European structure is set up. Second, it will not be about open data, as some data cannot be open. It will have to be FAIR. Third, it will not be limited to science; it will give access to all sorts of information. Fourth, it is not a cloud storing information in a few big data centres but rather a network connecting repositories and data storage facilities around the world.
It is also a necessary development as the world moves towards more open science. Without the Commission’s efforts and funding, Luyben believes, the science cloud would still come about. But it may be less European and more American.
Similar initiatives are springing up around the world, including in Australia, Canada and the US. “Ultimately, it’s not about Europe,” says Luyben. “It is no use making an EOSC if it is not involved in a network worldwide.”
But at both EU and global level, the success of such projects is contingent on researchers’ willingness to share their data, foster an open science culture and adopt the FAIR principles.
Today, most researchers are hesitant to share their data with no reward and often misinterpret what FAIR means. This can change but it means transforming the way researchers work and are rewarded. “It is not a process that you can do overnight or top-down,” says Luyben. It may prove to be one of the most difficult pieces of the puzzle.