NetworkHub: a one-stop-shop to retrieve and use protein-protein interaction network data in Bioconductor


Author(s): Lotta Wagner, Federico Marini

Affiliation(s): Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany



Proteins are crucial for regulating and maintaining cellular functions, often acting in a concerted manner in physiological and pathological contexts. Cell interactomes can be influenced by timing, by spatial relationships between interaction partners, and by various external factors. They have been the object of many initiatives (including IntAct, STRING, BioGRID and many more) that aim to detect, collect and curate large sets of protein-protein interactions (PPI). Although these databases can play a critical role in understanding, interpreting, and contextualizing a wide spectrum of omics data, the Bioconductor ecosystem currently lacks a unified approach to efficiently access, process, and make use of them. The NetworkHub package, which we are currently developing, aims to provide users with the functionality to retrieve (and cache), process and prepare such PPI networks (and their accessory information), so that they can be readily incorporated into many downstream operations for transcriptome and proteome data analyses. Moreover, NetworkHub will include functions to efficiently visualize these network data objects, leveraging interactive widgets that can also embed information directly from commonly used containers in Bioconductor (e.g. SummarizedExperiment and its derivative classes). By providing a one-stop-shop to access up-to-date network-level information, NetworkHub can empower researchers to obtain more comprehensive insights into the processes under investigation. At the same time, it can serve as a data resource for package developers, who can access the latest versions of these databases instead of shipping static snapshots of them (e.g. when designing a method that requires such data as input). It can also become the foundation layer for interfacing R/Bioconductor with larger frameworks for biomedical knowledge graphs, such as BioCypher.