Open and Computable Bioinformation

Phenopackets Are …

A Vision

An open standard for sharing disease and phenotype information will improve our ability to understand, diagnose, and treat both rare and common diseases. A Phenopacket links detailed phenotype descriptions with disease, patient, and genetic information, enabling clinicians, biologists, and disease and drug researchers to build more complete models of disease. The standard is designed to encourage wide adoption and synergy among the people, organizations and systems that make up the joint effort to address human disease and advance biological understanding.

A Technology

Phenopackets are represented as PXF (Phenotype Exchange Format) files, which may be encoded in JSON or YAML. Each packet associates a list of phenotypic abnormalities with a disease and a patient, including details about age, sex, onset, and evidence. PXF uses standard ontologies to ensure interoperability between diverse sources and consumers, to simplify text mining, and to enable machine reasoning. Software libraries supporting PXF have been written for Java, Python, and JavaScript, and the open standard makes it easy to adapt to other languages, systems and applications.
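As a rough illustration of the idea, the sketch below builds one packet as a plain Python dictionary and round-trips it through JSON. It is not the official PXF schema: the function build_packet and every field name (patient, disease, phenotypes, term, onset, evidence) are assumptions chosen for readability, and the ontology identifiers are included only as examples of the kind of standard terms a packet would reference.

```python
import json


def build_packet(patient_id, sex, age_years, disease, phenotypes):
    """Assemble a packet-like dictionary.

    Hypothetical layout for illustration only; the real PXF field
    names and nesting may differ.
    """
    return {
        "patient": {"id": patient_id, "sex": sex, "age_years": age_years},
        "disease": disease,
        "phenotypes": phenotypes,
    }


packet = build_packet(
    patient_id="patient-1",
    sex="female",
    age_years=12,
    # Disease and phenotype terms are referenced by ontology identifier
    # plus a human-readable label (example values).
    disease={"id": "OMIM:154700", "label": "Marfan syndrome"},
    phenotypes=[
        {
            "term": {"id": "HP:0001166", "label": "Arachnodactyly"},
            "onset": "childhood",
            "evidence": "clinical examination",
        }
    ],
)

# Encode to JSON text and decode it again; the structure is unchanged.
encoded = json.dumps(packet, indent=2, sort_keys=True)
decoded = json.loads(encoded)
```

Because each phenotype is recorded as an ontology identifier rather than free text, two packets from different sources can be compared by matching identifiers, which is the interoperability the paragraph above describes.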

A Community

Phenopackets is an evolving standard jointly developed by researchers, clinicians, curators and authors. Journals, model organism databases, medical data repositories and commercial efforts are encouraged to adopt Phenopackets as a way to improve their use and publication of detailed and computable characterizations of disease. This will enable new modes of treatment and drug discovery, including translational research, precision medicine, and automated pipelines revealing knowledge in existing publications and databases.

The Vision of Phenopackets

Using Phenopackets to communicate bioinformation ensures that knowledge is liberated and usable by existing and nascent computational pipelines, databases and journals. This enables new possibilities for research, diagnosis and treatment. Some of the features of Phenopackets are listed below.

  • Searchable and Comparable
  • Accessible outside paywalls and private data sources
  • Attributable and Citable
  • Interoperable and Computable
  • Exchangeable across contexts and disciplines

The Technology of Phenopackets

Phenopackets are defined using a protobuf schema that allows implementations to be automatically generated for many languages. The phenopacket-schema build process automatically produces language bindings for Java, Python and C++.

Source code and examples of Phenopackets technology may be found in the project's GitHub repositories.

The Community of Phenopackets

Phenopackets were designed by, and are intended for use by, the diverse community of researchers, data modelers, computer scientists, bioinformaticians, environmental scientists, and clinicians dedicated to maximizing the value of existing and new data.


Documentation and References

  • Processing phenotype data using Phenopackets-API and PXFTools
         BOSC 2016 (slides, video)
  • Phenopackets: Making phenotype profiles FAIR++ for disease diagnosis and discovery
         Force2016 (slides)