The Federated EGA is a global resource for discovery of and access to sensitive human omics and associated data consented for secondary use, through a network of human data repositories to accelerate biomedical research and improve human health. The Federated EGA network was launched in September 2022 with five inaugural nodes, and since 2023 seven operational nodes can share data across national borders in adherence to European and national laws.

A few weeks after FEGA's official launch, in November 2022, the European Genomic Data Infrastructure (GDI) project kicked off. This European Commission co-funded project, coordinated by ELIXIR, aims to deliver a federated, sustainable and secure data infrastructure for accessing genomic and related phenotypic and clinical data across Europe. The project supports the aim of the 1+MG initiative (25 EU countries, Norway and the UK) to enable personalised medicine and health through a shared framework and infrastructure for securely accessing and integrating high-quality genomic data and other health data across borders. 1+MG will be an integral component of the European Health Data Space (EHDS) for secondary use (Healthdata@EU) as an authorised participant.

How are Federated EGA and GDI similar?

Given their shared visions, the Federated EGA and European GDI networks have a lot in common.

To begin, they share the same overall goal: establishing networks of “nodes” that host sensitive human data within a jurisdiction, and connecting these nodes in a global network to support data discovery and promote access to human genome data for research and healthcare. FEGA and GDI are both initially focused on enabling access to human genomic data, while FEGA's ambition is to expand its scope to clinical research data and other omics data in the future.

FEGA and GDI are both built on open and interoperable software solutions, a subset of which are based on the LocalEGA components. Their implementations are based on international community standards, for example those developed by the Global Alliance for Genomics and Health (GA4GH), which helps make them interoperable. Both FEGA and GDI allow nodes to use any solution that is fully compatible.

As illustrated in Figure 1, a final key point is the significant overlap of institutions involved in envisioning and operating FEGA and GDI nodes, with the clear wish to keep them interoperable in the future.

What distinguishes Federated EGA from GDI?

Despite being largely similar, there are some differences between the FEGA and GDI networks, which we aim to clarify in this post.

The first difference is the governance model under which the nodes operate to comply with national laws and GDPR. In the FEGA network, nodes have taken inspiration from the EGA data access model, where the infrastructure is a data processor of the hosted data and data controllership remains with the originating Data Access Committees (i.e. the data controllers) for each dataset. In the GDI network, on the other hand, the controllership of datasets will be transferred to a 1+MG European Digital Infrastructure Consortium (EDIC) legal entity created by the Member States (MS), who will make data access decisions, with data holders having veto powers for their datasets. Importantly, FEGA nodes have the flexibility to choose another model that fits their data protection framework, including becoming data controllers for their datasets.

The second difference is the inclusion criteria for data. FEGA nodes are designed to accept almost any type of omics data in need of controlled access (e.g. genomics, transcriptomics, genotyping, single-cell sequencing, patient-tracked metagenomics). GDI nodes are initially, though not exclusively, focused on accepting whole genome and exome sequencing data and associated data from sources such as (i) the Genome of Europe use case of the 1+MG initiative, which specifically aims to fulfil the mission of building “a European network of national genomic reference cohorts of at least 500,000 citizens”; (ii) collections of other types of genomic data identified by countries through the 1+MG dashboard; and (iii) genomic data coming from data holders, which would need to fulfil EHDS requirements.

The third difference is the maturity of the software stacks. The FEGA network provides a set of software for data and metadata submission, storage, permissions management, and file distribution (the LocalEGA package). GDI is building a complete set of open-source reference implementations for the five functionalities covering the full data life cycle (a few more than LocalEGA covers), including federated processing (analytics, AI/ML), which is still under active development. Notably, one of the GDI functionalities, storage and interfaces, can be satisfied by using the LocalEGA storage solution, which highlights the ability of FEGA and GDI to interoperate. In both FEGA and GDI, nodes are allowed to use alternative solutions as long as they are fully interoperable with the networks. Figure 1 provides a simple overview of the commonalities and differences described here.

Figure 1: Schematic overview of the main commonalities and differences between a FEGA node and a GDI node

Can the same institute run a FEGA and a GDI node at the same time?

The answer is yes! We believe this is entirely possible and encouraged, as long as 1+MG requirements are continuously fulfilled in the implementation. The same trained personnel could operate the infrastructure and leverage the same national funding to run both a FEGA node and a GDI node. Several European nodes, especially GDI vanguard nodes like Norway, Sweden, Finland and Spain, are following this model. The idea is that a lot of the work can be reused, given the overlapping scope and the interoperable technology. The hosted datasets can thus be discovered via the two catalogues, and may be accessible under the same or different governance models. From a FEGA node perspective, the GDI datasets could simply be a subset of the datasets hosted in the FEGA node, each with its own specific governance, like any other dataset.
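To make the "subset" idea concrete, here is a minimal Python sketch of a single node that hosts datasets once and exposes them through two catalogue views. Everything in it is an illustrative assumption: the dataset identifiers, the governance labels and the `catalogue` helper are hypothetical and do not reflect the actual FEGA or GDI software or metadata models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    dataset_id: str       # hypothetical identifier, not a real accession
    governance: str       # illustrative label for the governance model
    networks: frozenset   # catalogues in which the dataset is listed


def catalogue(datasets, network):
    """Return the IDs of the datasets listed in one network's catalogue."""
    return {d.dataset_id for d in datasets if network in d.networks}


# One node, one storage: each dataset is hosted once and tagged with the
# network(s) it is published to and the governance model that applies.
node_datasets = [
    Dataset("DS-001", "dac-controlled", frozenset({"FEGA"})),
    Dataset("DS-002", "dac-controlled", frozenset({"FEGA"})),
    Dataset("DS-003", "edic-controlled", frozenset({"FEGA", "GDI"})),
]

fega_view = catalogue(node_datasets, "FEGA")
gdi_view = catalogue(node_datasets, "GDI")

print(sorted(fega_view))      # datasets discoverable via the FEGA catalogue
print(sorted(gdi_view))       # datasets discoverable via the GDI catalogue
print(gdi_view <= fega_view)  # True when GDI datasets are a subset of FEGA's
```

In this sketch the governance model travels with each dataset rather than with the node, which mirrors the point above: the same infrastructure can serve both networks while access decisions follow whichever model applies to a given dataset.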

What's next?

So, the amazing teams of people building the Federated EGA network and the European GDI network are working together to create infrastructures that provide secure access to human genomic and associated data across Europe, because we all know how much human genomics can improve healthcare and precision medicine. And we all want to collaborate to make it happen.

This article has been reviewed collectively by members of the EGA and the GDI coordination team.