

Overview

dbparser is an rOpenSci peer-reviewed R package that parses and integrates major pharmacological databases into standardized, analysis-ready R objects called dvobjects (drugverse objects).

Pharmacological databases use incompatible formats and structures, forcing researchers to write custom parsing scripts, a process that can consume 60–80% of analysis time. dbparser removes this bottleneck with unified parsing functions, chainable merge operations, and a consistent output structure that enables reproducible, cross-database analyses.

With recent updates, dbparser has evolved into an integration engine, allowing you to merge mechanistic data (DrugBank) with real-world phenotypic data (OnSIDES) and drug-drug interaction risks (TWOSIDES).

Installation

# From CRAN (stable)
install.packages("dbparser")

# From GitHub (development)
# install.packages("pak")
pak::pak("ropensci/dbparser")

Supported Databases

DrugBank (The Mechanistic Hub)

DrugBank is a comprehensive database containing detailed drug, pharmacological, and target information. As both a bioinformatics and a cheminformatics resource, DrugBank combines detailed drug data (chemical, pharmacological, pharmaceutical) with comprehensive drug target information (sequence, structure, pathway). More information is available on the DrugBank website.

If you find errors when parsing any DrugBank version, please submit an issue on the dbparser GitHub repository.

  • Parser: parseDrugBank()
  • Input: DrugBank XML file

OnSIDES (Adverse Drug Events)

OnSIDES provides adverse drug events extracted from thousands of FDA drug labels using machine learning.

  • Parser: parseOnSIDES()
  • Input: Directory containing OnSIDES CSV files

TWOSIDES (Drug-Drug Interactions)

TWOSIDES provides data on adverse events that arise when two drugs are taken together.

  • Parser: parseTWOSIDES()
  • Input: TWOSIDES CSV file (e.g., TWOSIDES.csv.gz)

Quick Start

Parse a Single Database

library(dbparser)

# Parse DrugBank
drugbank_db <- parseDrugBank("data/drugbank.xml")

# Parse OnSIDES
onsides_db <- parseOnSIDES("data/onsides/")

# Parse TWOSIDES
twosides_db <- parseTWOSIDES("data/TWOSIDES.csv.gz")
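
Each parser expects its input to be on disk already, so it can help to verify the paths before parsing. The following is a minimal base R sketch: `check_inputs` is a hypothetical helper (not part of dbparser), and the paths are simply the ones used in the example above.

```r
# Hypothetical helper (not part of dbparser): report which input paths exist.
check_inputs <- function(paths) {
  # Strip trailing slashes: on Windows, file.exists() can fail for
  # directory names that end in a slash.
  cleaned <- sub("/+$", "", paths)
  found <- file.exists(cleaned)  # TRUE for files and for directories
  names(found) <- names(paths)
  found
}

# Paths taken from the example above; adjust them to your own layout.
inputs <- c(
  drugbank = "data/drugbank.xml",
  onsides  = "data/onsides/",
  twosides = "data/TWOSIDES.csv.gz"
)

status <- check_inputs(inputs)
print(status)

# Report anything missing before calling the parsers.
if (!all(status)) {
  message("Missing inputs: ", paste(names(status)[!status], collapse = ", "))
}
```

Running this first turns a failed parse into a short, readable message about which file or directory is absent.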

Integration Pipeline

The power of dbparser lies in its ability to chain parsers and mergers together. Here is how you can build a complete pharmacovigilance dataset:

library(dbparser)
library(dplyr)

# 1. Parse the raw databases
drugbank_db <- parseDrugBank("data/drugbank.xml")
onsides_db  <- parseOnSIDES("data/onsides/")
twosides_db <- parseTWOSIDES("data/TWOSIDES.csv.gz")

# 2. Build the Integrated Knowledge Graph
#    DrugBank serves as the hub. Chain the merges.
final_db <- drugbank_db %>%
  merge_drugbank_onsides(onsides_db) %>%
  merge_drugbank_twosides(twosides_db)

# 3. Analyze Results
head(final_db$integrated_data$drug_drug_interactions)

For a detailed case study, see the Integrated Pharmacovigilance Vignette.
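
To make the hub idea concrete, here is a conceptual sketch in base R of what joining spoke tables to a hub table on a shared drug identifier looks like. It is not dbparser's implementation: the tables, column names, IDs, and values are all made-up assumptions, and the real merge functions may work differently.

```r
# Mock tables. All column names and values are illustrative assumptions,
# not the actual DrugBank, OnSIDES, or TWOSIDES schemas.
drugs <- data.frame(
  drugbank_id = c("DB_A", "DB_B", "DB_C"),
  name        = c("drug_a", "drug_b", "drug_c"),
  stringsAsFactors = FALSE
)

adverse_events <- data.frame(
  drugbank_id = c("DB_A", "DB_A", "DB_B"),
  event       = c("nausea", "headache", "rash"),
  stringsAsFactors = FALSE
)

interactions <- data.frame(
  drug_1_id = c("DB_A", "DB_B"),
  drug_2_id = c("DB_B", "DB_C"),
  event     = c("dizziness", "fatigue"),
  stringsAsFactors = FALSE
)

# Hub-and-spoke join: attach drug names to each adverse event via the shared ID.
# merge() performs an inner join by default, so drugs without events are dropped.
drug_events <- merge(drugs, adverse_events, by = "drugbank_id")

# For pairwise data, look up the hub table once for each side of the pair.
name_of <- setNames(drugs$name, drugs$drugbank_id)
interactions$drug_1_name <- unname(name_of[interactions$drug_1_id])
interactions$drug_2_name <- unname(name_of[interactions$drug_2_id])

print(drug_events)
print(interactions)
```

The point of the sketch is only that one table of stable identifiers lets single-drug and drug-pair data be enriched with the same lookup.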

The dvobject Structure

A dvobject is a unified, compressed format for pharmacological data: an R list that preserves complex relational hierarchies while enabling consistent access patterns.

For a single database (e.g., DrugBank):

  • drugs: list of data frames containing drug information (synonyms, classifications, etc.); this is the only mandatory component
  • salts: data frame of drug salt information
  • products: data frame of commercially available drug products worldwide
  • references: data frame of articles, links, and textbooks about drugs or CETT data
  • cett: list of data frames containing targets, enzymes, carriers, and transporters information
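
The layout above can be pictured with a small mock object. This is a sketch under stated assumptions: it reproduces only a subset of the top-level names from the list (drugs, salts, cett), while the inner table names, columns, and values are invented for illustration and may not match what parseDrugBank() returns.

```r
# Mock object using a subset of the component names listed above.
# Inner table names, columns, and values are illustrative assumptions.
mock_dvobject <- list(
  drugs = list(
    general_information = data.frame(
      drugbank_id = c("DB_A", "DB_B"),
      name        = c("drug_a", "drug_b"),
      stringsAsFactors = FALSE
    ),
    synonyms = data.frame(
      drugbank_id = c("DB_A", "DB_A"),
      synonym     = c("alpha", "a-one"),
      stringsAsFactors = FALSE
    )
  ),
  salts = data.frame(
    drugbank_id = "DB_B",
    salt_name   = "drug_b hydrochloride",
    stringsAsFactors = FALSE
  ),
  cett = list(
    targets = data.frame(
      drugbank_id = "DB_A",
      target_name = "target_x",
      stringsAsFactors = FALSE
    )
  )
)

# Top-level components
print(names(mock_dvobject))

# Nested access with `$`
print(mock_dvobject$drugs$synonyms)

# Classify each component. A data frame is also a list, so test for it first.
component_kind <- vapply(
  mock_dvobject,
  function(x) if (is.data.frame(x)) "data frame" else "list of data frames",
  character(1)
)
print(component_kind)
```

Because the object is an ordinary list, standard tools such as names(), str(), and `$` are enough to explore it.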

For a merged database (Integrated Pharmacovigilance):

When databases are merged using merge_drugbank_onsides or merge_drugbank_twosides, the dvobject becomes a nested structure:

  • drugbank: The mechanistic hub
  • onsides: Side-effect data (from FDA labels)
  • twosides: Drug-drug interaction data
  • integrated_data: Enriched tables bridging databases (e.g., linking DrugBank IDs to OnSIDES adverse events)
  • metadata: Detailed provenance for all contained datasets
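
Since the merged object gains components as databases are added, a quick check of which ones are present can be useful before analysis. The sketch below uses a mock list in which only TWOSIDES has been merged. The top-level names and `drug_drug_interactions` come from this README; everything else, including the assumption that the other components exist in that state, is illustrative.

```r
# Mock merged object: top-level names follow the list above.
# Contents are illustrative assumptions, not real dbparser output.
mock_merged <- list(
  drugbank = list(
    drugs = list(
      general_information = data.frame(
        drugbank_id = c("DB_A", "DB_B"),
        stringsAsFactors = FALSE
      )
    )
  ),
  twosides = list(),
  integrated_data = list(
    drug_drug_interactions = data.frame(
      drug_1_id = c("DB_A", "DB_B"),
      drug_2_id = c("DB_B", "DB_A"),
      event     = c("dizziness", "fatigue"),
      stringsAsFactors = FALSE
    )
  ),
  metadata = list()
)

# Which of the documented components are present?
expected <- c("drugbank", "onsides", "twosides", "integrated_data", "metadata")
present <- expected %in% names(mock_merged)
names(present) <- expected
print(present)

# Row counts for every integrated table
table_rows <- vapply(mock_merged$integrated_data, nrow, integer(1))
print(table_rows)
```

A check like this makes it explicit whether, for example, OnSIDES data was merged before code tries to read tables that depend on it.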

Research Impact

dbparser has enabled 10+ peer-reviewed publications in leading journals:

| Domain | Journal | Reference |
| --- | --- | --- |
| Alzheimer’s Drug Repurposing | Nature Scientific Reports | Parolo et al. (2023) |
| COVID-19 Therapeutics | Pharmaceutics | Pérez-Moraga et al. (2021) |
| Pan-Cancer Biomarkers | Briefings in Bioinformatics | Mercatelli et al. (2022) |
| Pathway Modeling | Computer Methods and Programs in Biomedicine | Hammoud et al. (2025) |
| Clinical Trial Analysis | Frontiers in Pharmacology | Namiot et al. (2023) |

📊 50,000+ CRAN downloads | Featured in the CRAN Epidemiology Task View

For the full list, see our JOSS paper.

Ecosystem

| Package | Description | Links |
| --- | --- | --- |
| dbdataset | Pre-parsed DrugBank datasets ready for analysis | GitHub |
| covid19dbcand | COVID-19 drug candidate datasets | GitHub |
| periscope2 | Shiny framework for interactive dashboards | CRAN |

Citation

If you use dbparser in published research, please cite our JOSS paper:

Ali et al., (2026). dbparser: An R Package for Parsing and Integrating
Pharmacological Databases. Journal of Open Source Software, 11(118),
9950, https://doi.org/10.21105/joss.09950

# Retrieve the citation from within R
citation("dbparser")

If you find dbparser useful, consider ⭐ starring the GitHub repository and sharing it with colleagues.

Enterprise Support

For custom database integrations, enterprise support, training, or deployment assistance, contact Interstellar Consultation Services, which maintains dbparser.


Contributing

We welcome contributions! Please review our Contributing Guide.

Please note that the dbparser project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.