Adding a data source to the Open Targets Platform: an update

In parts 1 and 2 of my series of Open Targets blog posts, I described how to add a new data source to the Open Targets Platform. If you need a refresher, go back and read them now!

Thanks to improvements made by the Open Targets Platform development team in release 19.02, adding a new data source is now more straightforward: both the pipeline and the REST API can be configured to recognise a new source without modifying any code.

As before, I will use my data source, genomics_england_tiering, as an example. You may remember from my previous posts that its data type is genetic_association.

Configuring the pipeline

To configure the pipeline, you need to make a couple of changes to the mrtarget.data.yml config file.

First, add an entry to the input-file: list that points to the compressed JSON file containing your evidence. In my case this was:

- /usr/src/app/evidence/genomics_england_tiering.json.gz

Note: the leading - is part of the YAML list syntax.
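Before pointing the pipeline at a new file, it can be worth checking that the file is readable. The following is a minimal sketch, not part of the Open Targets tooling. It assumes the evidence file holds one JSON object per line, which is my assumption about the file layout rather than something stated in this post. The helper name count_evidence_records is hypothetical.

```python
import gzip
import json


def count_evidence_records(path):
    """Count JSON records in a gzip-compressed file, one record per line.

    Assumes a JSON Lines layout (an assumption, not confirmed by the post).
    Raises ValueError naming the first line that is not valid JSON.
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as err:
                raise ValueError(
                    f"{path}: line {line_number} is not valid JSON: {err}"
                ) from err
            count += 1
    return count
```

For example, count_evidence_records("/usr/src/app/evidence/genomics_england_tiering.json.gz") would return the number of records, or raise an error that points at the first unreadable line.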

Next, define the data type of your new data source by adding a source: type entry to the datasources_to_datatypes: section at the end of the file.

My list ended up looking like this:

datasources_to_datatypes:
  expression_atlas: rna_expression
  phenodigm: animal_model
  chembl: known_drug
  europepmc: literature
  ...
  phewas_catalog: genetic_association
  progeny: affected_pathway
  sysbio: affected_pathway
  genomics_england_tiering: genetic_association

Note: I have truncated the list to save space; the ... marks the entries I left out.
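Conceptually, this section is a lookup table from data source name to data type. The sketch below illustrates that idea with a plain Python dictionary holding a subset of the entries shown above. It is not the pipeline's actual code, and the helper name datatype_for is hypothetical.

```python
# Subset of the mapping shown above; the real config file lists more sources.
datasources_to_datatypes = {
    "phewas_catalog": "genetic_association",
    "progeny": "affected_pathway",
    "sysbio": "affected_pathway",
    "genomics_england_tiering": "genetic_association",
}


def datatype_for(source):
    """Return the data type for a source, or raise if it is not mapped."""
    try:
        return datasources_to_datatypes[source]
    except KeyError:
        raise KeyError(
            f"no data type configured for data source {source!r}"
        ) from None
```

A source that is missing from the mapping fails loudly here, which is the kind of mistake worth catching before a long pipeline run.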

Save your data config file and you are done.
Now you can run the pipeline as normal.

Configuring the REST API

You can add your new data source to the REST API by setting a single environment variable, CUSTOM_DATASOURCE. In my case I set it via:

export CUSTOM_DATASOURCE=genomics_england_tiering:genetic_association

Again, no code changes needed.

Note: if you are running the API via Docker, you will need to pass this environment variable to the container with the -e option.
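The value in my example has the form source:datatype. A typo in that value is easy to miss, so here is a minimal sketch that checks the format before the API is started. How the API itself parses the variable is not described in this post; the source:datatype format is inferred from my example value, and parse_custom_datasource is a hypothetical helper.

```python
import os


def parse_custom_datasource(value):
    """Split a 'source:datatype' string into a (source, datatype) tuple."""
    source, sep, datatype = value.partition(":")
    if not sep or not source or not datatype:
        raise ValueError(f"expected 'source:datatype', got {value!r}")
    return source, datatype


if __name__ == "__main__":
    # Read the variable from the environment, as the API would see it.
    raw = os.environ.get("CUSTOM_DATASOURCE", "")
    if raw:
        print(parse_custom_datasource(raw))
    else:
        print("CUSTOM_DATASOURCE is not set")
```

Running this in the same shell, or the same container, as the API shows whether the variable is actually visible there.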

And that is it!

Note: the configuration of the web application is as described in the original blog posts, How to add a new data source to the Open Targets Platform - part 1 and How to add a new data source to the Open Targets Platform - part 2.

Please get in touch with your comments; my contact details are below. For any Open Targets Platform related queries, please email their Support team.

Glenn Proctor

Glenn is an independent consultant. He worked on the Ensembl project, then led software development at Eagle Genomics. He helps clients with product advice, cloud strategy & software implementations.
