The Synthetic hEalthcare dAta goveRnanCe Hub (SEARCH) launches today and aims to accelerate healthcare innovation by generating FAIRified synthetic data for use in AI/ML models.
Today sees the launch of the Synthetic hEalthcare dAta goveRnanCe Hub (SEARCH), a multi-disciplinary initiative focused on creating synthetic healthcare data and facilitating secure data sharing across the biomedical ecosystem.
The launch marks a significant leap in healthcare research, driving advancements in digital health and AI-powered diagnostics through cutting-edge synthetic data generation and federated learning approaches.
SEARCH will address critical challenges in healthcare data access by creating a platform that enables secure, privacy-preserving searching, sharing and analysis of multimodal healthcare data. Co-ordinated by Trinity College Dublin via the Trinity Translational Medicine Institute (TTMI), based at St James’s Hospital, this initiative brings together a consortium of 26 cross-sectoral partners across Europe, including synthetic data experts, healthcare providers, and solution developers, to unlock new opportunities in data-driven healthcare innovation. Funded under the Innovative Health Initiative Joint Undertaking (IHI JU), SEARCH boasts an initial budget of over €15.2 million.
SEARCH aims to accelerate healthcare innovation by generating FAIRified synthetic data for use in AI/ML models, enabling large-scale data collaborations while preserving privacy and compliance with regulatory standards.
SEARCH will deliver reliable methodologies for synthetic data generation (SDG) that meet the highest standards of accuracy and applicability, significantly increasing the availability of interoperable datasets. These datasets will be used to develop AI-based tools that support diagnostics, personalised treatment, and predictive health outcomes, improving patient care while reducing privacy risks.
Professor Aideen Long, Director, Trinity Translational Medicine Institute (TTMI) and Project Lead, SEARCH, said: “SEARCH offers an unparalleled opportunity to accelerate research and clinical innovation. By providing high-quality, FAIR synthetic datasets that mimic real-world healthcare data, we can empower researchers, clinicians, and industry to collaborate like never before. This opens the door for faster drug discovery, more personalised treatments, and the ability to create new, evidence-based healthcare policies—all without compromising patient privacy. As an educator, I see the synthetic datasets as invaluable tools for training, allowing students and professionals alike to sharpen their data analysis skills while maintaining the highest standards of patient confidentiality.”
Professor Dimitris Iakovidis, University of Thessaly, Greece, said: “SEARCH represents a paradigm shift in healthcare by enabling the generation and sharing of robust synthetic data across diverse healthcare use cases. Our approach harnesses the power of federated learning and advanced SDG methods to create synthetic datasets that replicate the statistical properties of real-world data, while ensuring patient privacy. This will empower healthcare providers and researchers with high-quality data to fuel next-generation AI and precision medicine tools.”
Combining data clean rooms and federated learning allows multiple institutions to collaborate, drawing insights from decentralised data sources, while ensuring that patient data remains securely stored at its source. SEARCH’s innovative synthetic data generation techniques will not only democratise access to healthcare data but will also create a foundation for new diagnostic and therapeutic tools powered by AI/ML.
Key Objectives and Innovations
- Next-Generation Synthetic Data: SEARCH leverages deep generative models to create realistic synthetic replicas of healthcare data (EHRs, genomics, medical signals, and radiological imaging), replicating the performance of real-world data while maintaining privacy.
- Federated Learning and Data Clean Rooms for Privacy & Scale: By keeping patient data securely in its original location, SEARCH’s privacy-preserving framework enhances collaboration across healthcare sectors while protecting sensitive information. This fosters AI model development and the wider adoption of new healthcare tools.
- Accelerating AI Innovation: SEARCH will enable the development of cutting-edge AI-powered decision-support tools by providing gold-standard synthetic datasets for benchmarking biomedical AI solutions, fuelling faster diagnostic tools, and creating new personalised healthcare approaches.
Groundbreaking Impact
SEARCH will play a pivotal role in revolutionising healthcare, contributing to faster innovation, shorter time-to-market for new digital health interventions, and better personalised treatments. Through synthetic data methodologies, SEARCH will uncover insights into cardiovascular, gastrointestinal, and gynaecological diseases, while ensuring the protection of patient privacy.
SEARCH’s efforts will complement real-world healthcare by validating synthetic datasets through clinical studies. This unique combination of synthetic data and privacy-preserving architecture has the potential to drive AI/ML innovations that enhance patient outcomes, improve diagnostics, and pave the way for new public-private collaborations in healthcare.
Trinity College Principal Investigators (Prof. John O’Leary, Dr. Cara Martin and Dr. Sharon O’Toole) are validating synthetic data in studies focused on gynaecological cancers. By demonstrating its utility in screening, early diagnosis and precision medicine, this work will showcase the transformative potential of synthetic data while ensuring privacy and accuracy in clinical decision-making.
This article was first published on 1 October by Trinity College Dublin.