How to add a new data source to the Open Targets Platform: an update

Open Targets Platform Mar 29, 2019

In parts 1 and 2 of my series of Open Targets blog posts, I described how to add a new data source to the Open Targets Platform. If you need a refresher, go back and read them now!

Due to some improvements made by the Open Targets Platform development team in release 19.02, the process of adding a new data source is now more straightforward: it is much simpler to configure both the pipeline and the REST API to recognise a new source, without modifying any code.

As before, I will use my data source, genomics_england_tiering, as an example. As you may remember from my previous posts, its data type is genetic_association.

Configuring the pipeline

To configure the pipeline, you need to make a couple of changes to the mrtarget.data.yml config file.

First, add an entry to the input-file: list that points to the compressed JSON file containing your evidence. In my case this was:

- /usr/src/app/evidence/genomics_england_tiering.json.gz

Note that the leading - is part of the YAML list syntax.
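Before pointing the pipeline at a new evidence file, it can be worth checking that the file really is readable gzip-compressed JSON. The following is a minimal sketch, not part of the pipeline: the function name count_evidence_records is hypothetical, and it assumes the file holds one JSON object per line, which this post does not state.

```python
import gzip
import json
from pathlib import Path


def count_evidence_records(path):
    """Count JSON records in a gzip-compressed evidence file.

    Assumption (not from the post): one JSON object per line.
    Raises an error if the file is not valid gzip or a line is not valid JSON.
    """
    count = 0
    with gzip.open(Path(path), mode="rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # ignore blank lines
                json.loads(line)  # raises json.JSONDecodeError on bad input
                count += 1
    return count
```

In my case the call would be count_evidence_records("/usr/src/app/evidence/genomics_england_tiering.json.gz"); if it returns without raising, the file is at least well-formed under the assumption above.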

Next, you will need to declare the data type of your new data source by adding it to the datasources_to_datatypes: mapping at the end of the file.

My list ended up looking like this:

datasources_to_datatypes:
  expression_atlas: rna_expression
  phenodigm: animal_model
  chembl: known_drug
  europepmc: literature
  ...
  phewas_catalog: genetic_association
  progeny: affected_pathway
  sysbio: affected_pathway
  genomics_england_tiering: genetic_association

Note: I have truncated the list to save space.
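A typo in either the source name or the data type is easy to make here, so a quick sanity check may help. This is a rough sketch only: the dictionary mirrors the truncated excerpt above rather than the full config file, KNOWN_DATATYPES contains just the data types visible in that excerpt (the pipeline may well accept others), and check_datasource is a hypothetical helper, not something the pipeline provides.

```python
# Data types visible in the excerpt above; an assumption, not an exhaustive list.
KNOWN_DATATYPES = {
    "rna_expression",
    "animal_model",
    "known_drug",
    "literature",
    "genetic_association",
    "affected_pathway",
}


def check_datasource(mapping, source):
    """Return the data type configured for `source`, or raise if it looks wrong."""
    if source not in mapping:
        raise KeyError(f"{source!r} is missing from datasources_to_datatypes")
    datatype = mapping[source]
    if datatype not in KNOWN_DATATYPES:
        raise ValueError(f"{source!r} maps to unrecognised data type {datatype!r}")
    return datatype


# The mapping as a plain dict, copied from the (truncated) excerpt.
datasources_to_datatypes = {
    "expression_atlas": "rna_expression",
    "phenodigm": "animal_model",
    "chembl": "known_drug",
    "europepmc": "literature",
    "phewas_catalog": "genetic_association",
    "progeny": "affected_pathway",
    "sysbio": "affected_pathway",
    "genomics_england_tiering": "genetic_association",
}

datatype = check_datasource(datasources_to_datatypes, "genomics_england_tiering")
```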

Save your data config file and you are done.
Now you can run the pipeline as normal.

Configuring the REST API

You can add your new data source to the REST API simply by setting an environment variable, CUSTOM_DATASOURCE. In my case I set it via:

export CUSTOM_DATASOURCE=genomics_england_tiering:genetic_association

Again, no code changes needed.

Note: if you are running the API via Docker, you will need to pass this environment variable to docker run with the -e option.
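To make the shape of the value explicit, here is a small sketch that splits a source:datatype pair and builds the matching -e arguments for docker run. It only illustrates the format used in the export line above; it is not how the REST API itself reads the variable, and both function names are hypothetical.

```python
def parse_custom_datasource(value):
    """Split a 'source:datatype' string into a (source, datatype) tuple."""
    source, separator, datatype = value.partition(":")
    if not separator or not source or not datatype:
        raise ValueError(f"expected 'source:datatype', got {value!r}")
    return source, datatype


def docker_env_args(value):
    """Build the '-e' arguments that pass CUSTOM_DATASOURCE to 'docker run'."""
    parse_custom_datasource(value)  # fail early on a malformed value
    return ["-e", f"CUSTOM_DATASOURCE={value}"]


custom = "genomics_england_tiering:genetic_association"
source, datatype = parse_custom_datasource(custom)
env_args = docker_env_args(custom)
```

The resulting env_args list can be spliced into whatever docker run command you already use to start the API.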

And that is it!

Note: the configuration of the web application is unchanged, and is described in the original blog posts, How to add a new data source to the Open Targets Platform - part 1 and How to add a new data source to the Open Targets Platform - part 2.

Please get in touch with your comments; my contact details are below. For any Open Targets Platform related queries, please email their Support team.

Glenn Proctor
