Data is a critical raw material for research, innovation and value creation. Making more data available and tapping into Europe’s unused data resources, while taking ownership of them, not only benefits European researchers and innovators in AI but is also a key factor in reducing Europe’s critical dependencies and strengthening technological sovereignty and security. With the Data Union Strategy, the EU must be determined to foster an ecosystem of data management, computing platforms, skills, and software and services that leads to stronger competitiveness and greater value creation for Europe.
The EuroHPC Joint Undertaking and the AI Factories significantly raise Europe’s computing capacity and are a concrete example of strategic resource pooling for world-class, European horizontal digital infrastructures. But to make Europe a truly competitive AI continent, investing in computing capacity is not enough. Joint investments in common European data platforms are also needed. These must be seen as strategic infrastructures that serve research, industry and SMEs, enabling European R&D and business to grow. Such common platforms can help European companies build their services and products, and commercialise and scale them. A key feature of these platforms is that they keep data in Europe and enable value creation for European researchers, companies and governments. This must be a priority. The same thinking that guides the build-up of Europe’s computing capacity must be applied to data and data infrastructures.
The Data Union Strategy should be considered part of a broader, coherently evolving ecosystem that builds on existing work and initiatives. These include the EuroHPC supercomputers, the European Open Science Cloud (EOSC), thematic research data repositories and the Open Web Search initiative. The aim should be an interoperable European HPC, data and AI ecosystem in which high-quality and open-source data can be fed into EuroHPC supercomputers. In addition, skills, tools and policies for data management should be integrated and promoted at institutional level. The Open Web Search, a European open web index, should be an integral part of the ecosystem: it can channel data for European use in an open and transparent way, reflect the diversity of European cultures and languages, and support the development of European language models in projects such as OpenEuroLLM. The market potential of a European open web index is projected to reach €4.5 billion within a decade (link to the study), demonstrating that this would make a real difference in boosting Europe’s data economy.
In the context of the AI Factories, the Data Labs initiative can complement and boost these efforts to link data to computing, especially for industry innovators. It is crucial that these efforts are developed coherently with other parallel European initiatives. The LUMI AI Factory provides datasets, both as a service and through the Open Web Search, as raw material for research and innovation.
Europe cannot achieve a competitive data economy through infrastructure alone. Ultimately, Europe’s success rests on people: their skills, creativity, and technological understanding. Systemic efforts and structural measures to strengthen Europeans’ skills in data management and curation are crucial, as are programmes like AI Skills Academies and continuous re-skilling, alongside fostering tech literacy as a cross-cutting skill. To build a strong competence base, the opportunities and challenges of technology must be addressed across all sectors. The systemic nature of data must also be reflected at policy level: all AI initiatives should align with the Union of Skills Strategy to ensure a coherent, integrated approach. Robust competence development, supported by infrastructures like AI Factories that are designed and implemented correctly, can boost Europe’s data economy and make Europe a global data talent hub.
The current regulatory framework is fragmented and should be reviewed so that it reflects the systemic nature of data, AI, and digitalisation overall, and aligns with the EU’s long-term strategic objectives rather than relying on sporadic interventions. Guidance on implementation and compliance, as well as clarity on definitions, roles and responsibilities, is needed to avoid unnecessary risk aversion in data-based innovation. Another challenge is inconsistent implementation and national variation, such as the differing enforcement of the GDPR across member states. Harmonising regulatory implementation is urgent to remove uncertainty. To ensure that the Single Market functions better, the national implementation of data regulation should be better coordinated at European level to promote consistent interpretation and application.
In the global context, the EU should drive open and responsible international data standards, e.g. in identification and access protocols and metadata models, and align its own initiatives with them. In parallel, building alliances with like-minded international partners could stimulate trust and data availability for trustworthy, responsible, open and human-centric AI. To prevent data leakage from Europe to third countries, legal safeguards are necessary, but they are not enough. Concrete measures, such as support for sensitive data management, are essential.
It is vital that Europe strengthens its role in the global data economy by ensuring seamless and secure access for European researchers, companies and governments to international data flows, through European infrastructure ecosystems, data and skilled people. By putting a coherent ecosystem of data, computing, and human capital at the centre of digital technology policy, the EU can lay the foundation for stronger research excellence, innovation, and the commercialisation of services, leading to true European alternatives that attract and secure users and data flows.
This article was first published on 4 July by CSC.