
Case study: gget’s new Open Targets module

Case Studies Jan 30, 2025

This blog post is part of a series that explores applications and expansions of the Open Targets informatics ecosystem, particularly the Open Targets Platform and Open Targets Genetics, through conversations with our users.

Sam Wagenaar is a high school senior who, during his internship last year with the Pachter Lab at the California Institute of Technology (Caltech), developed the gget opentargets module under the mentorship of Laura Luebbert and Joe Rich.

Next to two profile photos, text reads: Samuel Wagenaar is a high school senior. During a summer internship in 2024 with the Pachter Lab, Sam developed the gget opentargets module under the mentorship of Laura and Joe. Joseph M Rich is a fourth-year USC-Caltech MD-PhD student in the Pachter Lab working on a novel algorithm to detect carcinogenic variants in RNA sequencing data.

“Sam has an exceptional talent in software engineering, and his contributions reflect a deep understanding of both the technical and biological aspects required for bioinformatics tool development,” says Laura Luebbert, now a postdoctoral fellow in the Sabeti lab at the Broad Institute of MIT and Harvard and Harvard University. Laura developed and first published gget during her PhD in the Pachter Lab, and continues to serve as its primary developer and maintainer.

We chatted to the team about the module, and how it facilitates access to Open Targets Platform data.

A photo of a man and a woman holding a sword and smiling at the camera, captioned: Lior presents Laura with her Doctor's Sword after the defense of her PhD thesis (which included gget) in March 2024. Text next to the image reads: Dr Laura Luebbert recently completed her PhD in computational biology in the Pachter Lab at Caltech. She is now a Postdoctoral Fellow in the Sabeti lab at the Broad Institute of MIT and Harvard and Harvard University. Laura developed and first published gget during her PhD and continues to serve as its primary developer and maintainer. Prof. Lior Pachter is the Bren Professor of Computational Biology at Caltech.

What is gget?

gget (https://www.gget.bio) is a free, open-source command-line tool and Python package designed to enable efficient querying of large genomic databases, such as Ensembl, UniProt, and NCBI. Since its initial release in May 2022, gget has evolved to also support more complex workflows, such as sequence alignments and running protein prediction models like AlphaFold.

gget consists of a collection of modules, each enabling researchers to perform common tasks in genomics, transcriptomics, and proteomics data analysis in just one line of code without exceeding the computational capabilities of a laptop. 

Through its ease of use and minimal requirements, gget was designed to increase the reproducibility as well as accessibility of genomic data queries and workflows. Each gget module requires minimal arguments, provides clear output and operates from both the command line and Python environments, such as JupyterLab, maximising ease of use and accommodating novice programmers.

What can the new Open Targets module be used for?

The new gget opentargets module allows users to communicate directly with the Open Targets database from a Python or command line environment. Amongst other tasks, gget opentargets can quickly find diseases and drugs associated with a specific gene.

For example, to find drugs associated with the human IL13 gene (Ensembl ID ENSG00000169194*), which encodes a cytokine that plays an important role in allergic inflammation and the immune response to parasite infection, you can use:

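# Install gget (the leading "!" runs a shell command from a Jupyter notebook; omit it in a terminal)
!pip install gget

# Python
import gget
gget.opentargets('ENSG00000169194', resource='drugs', limit=10)

# Command line (as above, drop the "!" outside a notebook)
!gget opentargets ENSG00000169194 -r drugs -l 10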

* You can use the gget info and gget search modules to convert between gene names and Ensembl IDs.
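The conversion mentioned in the footnote could be combined with the drug query roughly as follows. This is a minimal sketch, not part of gget: the helper names (is_human_ensembl_gene_id, strip_version, drugs_for_gene) are hypothetical, and the column names gene_name and ensembl_id in the gget search output are assumptions that should be checked against the gget manual. The network-dependent part is kept inside one function so the ID checks can be used on their own.

```python
import re

# Human Ensembl gene IDs are "ENSG" followed by 11 digits, optionally with a
# ".<version>" suffix (for example ENSG00000169194 or ENSG00000169194.10).
_HUMAN_GENE_ID = re.compile(r"ENSG\d{11}(\.\d+)?")


def is_human_ensembl_gene_id(text):
    """Return True if `text` looks like a human Ensembl gene ID."""
    return _HUMAN_GENE_ID.fullmatch(text.strip()) is not None


def strip_version(ensembl_id):
    """Drop an optional '.<version>' suffix from an Ensembl ID."""
    return ensembl_id.strip().split(".")[0]


def drugs_for_gene(gene, limit=10):
    """Hypothetical helper: accept a gene symbol or Ensembl ID and query drugs.

    Requires gget to be installed and network access.
    """
    import gget  # imported lazily so the helpers above work without gget

    if is_human_ensembl_gene_id(gene):
        ensembl_id = strip_version(gene)
    else:
        hits = gget.search([gene], species="homo_sapiens")
        # Assumption: the result is a data frame with 'gene_name' and
        # 'ensembl_id' columns. Keep only exact symbol matches, because
        # gget search also matches gene descriptions.
        exact = hits[hits["gene_name"].str.upper() == gene.upper()]
        if exact.empty:
            raise ValueError(f"No Ensembl gene found for {gene!r}")
        ensembl_id = exact["ensembl_id"].iloc[0]

    return gget.opentargets(ensembl_id, resource="drugs", limit=limit)
```

Under these assumptions, drugs_for_gene('IL13') would be expected to run the same query as the example above without looking up the Ensembl ID by hand.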

The gget opentargets module has many applications. For example, Joe, a fourth-year USC-Caltech MD-PhD student in the Pachter Lab, is working on a novel algorithm to detect carcinogenic variants in RNA sequencing data, and the gget opentargets module plays a key role in helping researchers interpret the variants it identifies.
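As an illustration of this kind of use, the sketch below runs one Open Targets query per gene and collects the results by gene ID, for instance for a list of genes that carry variants of interest. It is a hypothetical helper, not the algorithm described above and not part of gget. The query function is passed in as an argument, so gget.opentargets can be supplied when gget and a network connection are available.

```python
def associations_for_genes(gene_ids, query, resource="diseases", limit=5):
    """Run one Open Targets query per gene and collect the results by gene ID.

    `query` is any callable with the same call shape as gget.opentargets:
    the gene ID first, then `resource` and `limit` keyword arguments.
    """
    results = {}
    for gene_id in gene_ids:
        results[gene_id] = query(gene_id, resource=resource, limit=limit)
    return results
```

With gget installed, associations_for_genes(['ENSG00000169194'], gget.opentargets) would request up to five disease associations for IL13. Passing resource='drugs' would switch to the drug query shown earlier.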

Was there anything particularly difficult or unexpected in the process of creating gget?

The process of creating gget involved several challenges and surprises. One of the main difficulties was adapting to frequently changing database structures, as well as handling the different APIs and data organisations for each database. 

An unexpected positive outcome, however, was the tremendous user interest—gget has been downloaded over 150,000 times since its first release—and the overwhelmingly positive response from the bioinformatics community.

Where do you think the limits of this application are? Are you planning to do any additional work on this?

To keep the user interface simple, gget does not expose some of the advanced functionality available through the Open Targets web interface. Future work may include expanding database support and adding functionality based on user feedback, while ensuring that the tool remains accessible and easy to use.

The complete manual of the gget opentargets module, including additional examples, is available here:
English: https://pachterlab.github.io/gget/en/opentargets.html
Spanish: https://pachterlab.github.io/gget/es/opentargets.html
