Case study: gget’s new Open Targets module
This blog post is part of a series that explores applications and expansions of the Open Targets informatics ecosystem, particularly the Open Targets Platform and Open Targets Genetics, through conversations with our users.
Sam Wagenaar is a high school senior who, during his internship last year with the Pachter Lab at the California Institute of Technology (Caltech), developed the gget opentargets module under the mentorship of Laura Luebbert and Joe Rich.
“Sam has an exceptional talent in software engineering, and his contributions reflect a deep understanding of both the technical and biological aspects required for bioinformatics tool development,” says Laura Luebbert, now a postdoctoral fellow in the Sabeti lab at the Broad Institute of MIT and Harvard and Harvard University. Laura developed and first published gget during her PhD in the Pachter Lab, and continues to serve as its primary developer and maintainer.
We chatted to the team about the module, and how it facilitates access to Open Targets Platform data.
What is gget?
gget (https://www.gget.bio) is a free, open-source command-line tool and Python package designed to enable efficient querying of large genomic databases, such as Ensembl, UniProt, and NCBI. Since its initial release in May 2022, gget has evolved to also support more complex workflows, such as sequence alignments and running protein prediction models like AlphaFold.
gget consists of a collection of modules, each enabling researchers to perform common tasks in genomics, transcriptomics, and proteomics data analysis in just one line of code without exceeding the computational capabilities of a laptop.
Through its ease of use and minimal requirements, gget was designed to increase the reproducibility as well as accessibility of genomic data queries and workflows. Each gget module requires minimal arguments, provides clear output and operates from both the command line and Python environments, such as JupyterLab, maximising ease of use and accommodating novice programmers.
What can the new Open Targets module be used for?
The new gget opentargets module allows users to communicate directly with the Open Targets database from a Python or command line environment. Amongst other tasks, gget opentargets can quickly find diseases and drugs associated with a specific gene.
For example, to find drugs associated with the human IL13 gene (Ensembl ID ENSG00000169194*), which encodes a cytokine that plays an important role in allergic inflammation and the immune response to parasite infection, you can use:
    # Install gget
    !pip install gget

    # Python
    import gget
    gget.opentargets('ENSG00000169194', resource='drugs', limit=10)

    # Command line
    !gget opentargets ENSG00000169194 -r drugs -l 10
* You can use the gget info and gget search modules to convert between gene names and Ensembl IDs.
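The conversion mentioned in the footnote can be chained with the drug query. The following is a minimal sketch, not part of gget itself: the helper names `exact_symbol_matches` and `drugs_for_symbol` are hypothetical, and the column names `ensembl_id` and `gene_name` are assumptions about the table returned by gget search that should be checked against the installed version.

```python
def exact_symbol_matches(rows, symbol):
    """Return Ensembl IDs of rows whose gene name equals `symbol` (case-insensitive).

    `rows` is a list of dicts; the keys "ensembl_id" and "gene_name" are
    assumed to match the columns of the gget search result.
    """
    wanted = symbol.upper()
    return [
        row["ensembl_id"]
        for row in rows
        if str(row.get("gene_name", "")).upper() == wanted
    ]


def drugs_for_symbol(symbol, species="homo_sapiens", limit=10):
    """Resolve a gene symbol with gget search, then query Open Targets for drugs."""
    import gget  # imported here so the helper above works without gget installed

    # gget search also matches related genes (e.g. receptors), so the hits
    # are filtered down to exact gene-name matches before querying.
    hits = gget.search([symbol], species)
    ids = exact_symbol_matches(hits.to_dict("records"), symbol)
    if not ids:
        raise ValueError(f"No exact match for {symbol!r} in {species}")
    return gget.opentargets(ids[0], resource="drugs", limit=limit)


# Example call (requires gget and network access):
# drugs_for_symbol("IL13")
```

Filtering for an exact gene-name match is a design choice for this sketch; if a symbol maps to several Ensembl IDs, only the first is queried, which may not suit every use case.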
The gget opentargets module has many applications. For example, Joe Rich, a fourth-year USC-Caltech MD-PhD student in the Pachter Lab, is working on a novel algorithm to detect carcinogenic variants in RNA sequencing data. The gget opentargets module plays a key role in helping researchers interpret the variants the algorithm identifies.
Was there anything particularly difficult or unexpected in the process of creating gget?
The process of creating gget involved several challenges and surprises. One of the main difficulties was adapting to frequently changing database structures, as well as handling the different APIs and data organisations for each database.
An unexpected positive outcome, however, was the tremendous user interest (gget has been downloaded over 150,000 times since its first release) and the overwhelmingly positive response from the bioinformatics community.
Where do you think the limits of this application are? Are you planning to do any additional work on this?
To maintain a simple user interface, gget does not expose some of the advanced functionality available through the Open Targets web interface. Future work may include expanding database support and adding functionality based on user feedback, while ensuring that the tool remains accessible and easy to use.
The complete manual of the gget opentargets module, including additional examples, is available here:
English: https://pachterlab.github.io/gget/en/opentargets.html
Spanish: https://pachterlab.github.io/gget/es/opentargets.html