The Beacon Project is a Global Alliance for Genomics & Health (GA4GH) initiative that enables genomic and clinical data sharing across federated networks. The project is working to develop regulatory, ethics and security guidance that ensures proportionate safeguards for the distribution of data, in accordance with the GA4GH-developed “Framework for Responsible Sharing of Genomic and Health-Related Data”. This Nature Biotechnology Correspondence describes the Beacon protocol and how it can serve as a model for the federated discovery and sharing of genomic data.

A Beacon is defined as a web-accessible service that can be queried for information about a specific allele. A user of a Beacon can pose queries of the form “Have you observed this nucleotide (e.g., C) at this genomic location (e.g., position 32,936,732 on chromosome 13)?” to which the Beacon responds with either “yes” or “no.” In this way, a Beacon allows allelic information of interest to be discovered by a remote searcher with no reference to a specific sample or patient, thereby mitigating privacy risks.
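
The yes/no query model described above can be illustrated with a minimal in-memory sketch. It is not the Beacon API or its reference implementation: the class names `AlleleQuery` and `Beacon`, the method names, and the storage layout are all hypothetical, chosen only to show that a Beacon answers about an allele without exposing any sample or patient record.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass(frozen=True)
class AlleleQuery:
    """Hypothetical query: 'have you observed this allele at this location?'"""
    chromosome: str
    position: int
    allele: str


@dataclass
class Beacon:
    """Hypothetical in-memory Beacon; stores alleles only, never sample IDs."""
    name: str
    observed: Set[Tuple[str, int, str]] = field(default_factory=set)

    def add_observation(self, chromosome: str, position: int, allele: str) -> None:
        # Only the allele is recorded; which sample carried it is discarded.
        self.observed.add((chromosome, position, allele.upper()))

    def query(self, q: AlleleQuery) -> bool:
        # The response is a bare yes/no, with no reference to a sample or patient.
        return (q.chromosome, q.position, q.allele.upper()) in self.observed


beacon = Beacon("example-beacon")
beacon.add_observation("13", 32936732, "C")

answer_seen = beacon.query(AlleleQuery("13", 32936732, "C"))
answer_unseen = beacon.query(AlleleQuery("13", 32936732, "G"))
print("yes" if answer_seen else "no")
print("yes" if answer_unseen else "no")
```

Because the stored set contains only (chromosome, position, allele) triples, a searcher learns whether the allele exists in the dataset and nothing else.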

The Beacon API, designed as a RESTful web service, is the technical specification that a Beacon server must implement. The specification is open source and available online. To simplify the process of lighting a Beacon, a free, open-source reference implementation of the latest specification has been developed.

GA4GH is promoting different levels of data access (open, registered, and controlled) for convenience and for compatibility across its projects. Each so-called access tier has distinct visibility and requirements for authorization. For example, ‘open access’ Beacons are accessible to anonymous users of the internet, whereas ‘registered access’ Beacons are accessible to registered users (for example, bona fide researchers and clinicians) who have agreed to a set of conditions of data use.
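
One way to picture the access tiers is as a simple authorization check, sketched below. The `AccessTier`, `User`, and `may_query` names are hypothetical and not part of any GA4GH specification. The open and registered rules follow the description above; the text does not spell out the rule for controlled access, so the per-Beacon approval used here is purely an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class AccessTier(Enum):
    OPEN = "open"
    REGISTERED = "registered"
    CONTROLLED = "controlled"


@dataclass
class User:
    """Hypothetical user record; None is used below for an anonymous caller."""
    registered: bool = False
    agreed_to_conditions: bool = False
    approved_beacons: Set[str] = field(default_factory=set)


def may_query(tier: AccessTier, beacon_id: str, user: Optional[User]) -> bool:
    if tier is AccessTier.OPEN:
        # Open access: anonymous users of the internet may query.
        return True
    if user is None:
        return False
    if tier is AccessTier.REGISTERED:
        # Registered access: a registered user who agreed to the data-use conditions.
        return user.registered and user.agreed_to_conditions
    # Controlled access: ASSUMPTION, modelled here as explicit per-Beacon approval.
    return beacon_id in user.approved_beacons


researcher = User(registered=True, agreed_to_conditions=True)
print(may_query(AccessTier.OPEN, "beacon-a", None))
print(may_query(AccessTier.REGISTERED, "beacon-a", None))
print(may_query(AccessTier.REGISTERED, "beacon-a", researcher))
print(may_query(AccessTier.CONTROLLED, "beacon-a", researcher))
```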

Many of the largest genomic archives, such as dbGaP, the European Genome-phenome Archive and the European Variation Archive, have provided access to variation data through Beacons for some or all of their datasets. Beacons can be interconnected through the Beacon Network, a directory and search engine for Beacons. Although individual Beacons answer the question “Have you observed this allele?”, the Beacon Network answers the question “Who has observed this allele?”. Beacons can be freely registered to the Beacon Network and can be searched independently or in aggregate with other connected Beacons.
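
The difference between the two questions can be sketched as follows. This is an illustrative toy, not the Beacon Network's actual interface: the `BeaconNetwork` class, its method names, and the Beacon names and alleles in the usage lines are all hypothetical. A single Beacon returns a yes/no answer, whereas the network fans the same query out to every registered Beacon and returns the names of those that answered yes.

```python
from typing import Dict, List, Set, Tuple

# An allele is identified here by (chromosome, position, nucleotide).
Allele = Tuple[str, int, str]


class BeaconNetwork:
    """Hypothetical directory of Beacons that can be searched alone or in aggregate."""

    def __init__(self) -> None:
        self._beacons: Dict[str, Set[Allele]] = {}

    def register(self, name: str, observed: Set[Allele]) -> None:
        # Registering a Beacon makes it discoverable through the network.
        self._beacons[name] = set(observed)

    def query_one(self, name: str, allele: Allele) -> bool:
        # Individual Beacon: "Have you observed this allele?"
        return allele in self._beacons[name]

    def who_has_observed(self, allele: Allele) -> List[str]:
        # Network: "Who has observed this allele?"
        return sorted(name for name, seen in self._beacons.items() if allele in seen)


network = BeaconNetwork()
network.register("archive-a", {("13", 32936732, "C")})
network.register("archive-b", {("13", 32936732, "C"), ("1", 100, "T")})
network.register("archive-c", {("2", 200, "G")})

target = ("13", 32936732, "C")
single_answer = network.query_one("archive-c", target)
responders = network.who_has_observed(target)
print(single_answer)
print(responders)
```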