A knowledge graph to interpret clinical proteomics data

Research output: Contribution to journal › Journal article › Research › peer-review

Documents

  • Fulltext

    Final published version, 3.91 MB, PDF document

Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.

Original language: English
Journal: Nature Biotechnology
Volume: 40
Pages (from-to): 692–702
ISSN: 1087-0156
DOI
Status: Published - 2022

Bibliographical note

Funding Information:
We thank all members of the Proteomics and Signal Transduction Group (Max Planck Institute) and the Clinical Proteomics Group (Novo Nordisk Foundation Center for Protein Research), especially E. Voytik, L. Schweizer, L. Drici, N. Skotte and P. Treit, as well as A. Deshmukh and D. Samodova from the Novo Nordisk Foundation Center for Basic Metabolic Research, and P. Charles (Oxford Big Data Institute) for their help testing the code and providing feedback. Data for the COVID-19 study were provided by the Massachusetts General Hospital Emergency Department COVID-19 Cohort (Filbin, Goldberg and Hacohen) with Olink Proteomics: https://www.olink.com/mgh-covid-study/. Data used in the glioblastoma study were provided by the Clinical Proteomic Tumor Analysis Consortium (NCI/NIH). This project was supported by Novo Nordisk Foundation grants (NNF14CC0001 and NNF15CC0001). F.C. acknowledges the European Union’s Horizon 2020 Research and Innovation Program (Marie Skłodowska-Curie individual fellowship under grant agreement 846795). We thank the Python community for the excellent scientific libraries developed and maintained and also Neo4j for providing a community version of their graph database and its community for helping improve the platform. The authors thank Life Science Editors for their help with the editing of the manuscript.

Publisher Copyright:
© 2022, The Author(s).

ID: 292072660