Due to some improvements made by the Open Targets Platform development team in release 19.02, the process of adding a new data source is now more straightforward: it is much simpler to configure both the pipeline and the REST API to recognise a new source, without modifying any code.
As before, I will use my data source, genomics_england_tiering, as an example. As you may remember from my previous posts, its data type is genetic_association.
Configuring the pipeline
To configure the pipeline, you need to make a couple of changes to the mrtarget.data.yml config file.
First, add an entry to the input-file: list that points to the compressed JSON file containing your evidence. In my case this was:
Note that the leading - is part of the YAML list syntax.
Next, you will need to define what data type your new data source is by adding it to the datasources_to_datatypes: list at the end of the file.
My list ended up looking like this:
    datasources_to_datatypes:
      expression_atlas: rna_expression
      phenodigm: animal_model
      chembl: known_drug
      europepmc: literature
      ...
      phewas_catalog: genetic_association
      progeny: affected_pathway
      sysbio: affected_pathway
      genomics_england_tiering: genetic_association

Note: I have truncated the list to save space.
Save your data config file and you are done.
Now you can run the pipeline as normal.
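Before kicking off a run, it can be worth sanity-checking the edited mapping, since a typo in a data type name is easy to miss. The sketch below is only an illustration: check_new_source is a hypothetical helper of mine, not part of the pipeline, and it works on a plain dict such as a YAML parser would return for the datasources_to_datatypes section. It also assumes that a new source should reuse a data type that another source already uses, which is true in my case but may not hold for yours.

```python
# Sketch only: check_new_source is a hypothetical helper, not part of the pipeline.
def check_new_source(mapping, source):
    """Return the data type configured for `source`.

    Raises KeyError if the source is missing, and ValueError if its data type
    is not used by any other source (assumed here to indicate a typo).
    """
    if source not in mapping:
        raise KeyError(f"{source!r} is missing from datasources_to_datatypes")
    datatype = mapping[source]
    # Data types used by every *other* source in the mapping.
    known = {dt for name, dt in mapping.items() if name != source}
    if datatype not in known:
        raise ValueError(f"{datatype!r} is not used by any existing source")
    return datatype


# The entries from the truncated list in this post, written as a plain dict.
datasources_to_datatypes = {
    "expression_atlas": "rna_expression",
    "phenodigm": "animal_model",
    "chembl": "known_drug",
    "europepmc": "literature",
    "phewas_catalog": "genetic_association",
    "progeny": "affected_pathway",
    "sysbio": "affected_pathway",
    "genomics_england_tiering": "genetic_association",
}

print(check_new_source(datasources_to_datatypes, "genomics_england_tiering"))
# prints: genetic_association
```

If the new source were missing, or mapped to something like genetic_asociation, the helper would raise instead of letting the mistake surface later in the run.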
Configuring the REST API
You can add your new data source to the REST API by simply adding an environment variable, CUSTOM_DATASOURCE. In my case I set it via:
Again, no code changes needed.
Note: if you are running the API via Docker, you will need to pass this environment variable into the container.
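Because a forgotten environment variable fails silently, a small pre-flight check can save some head-scratching, especially inside a container. The following is a minimal sketch, not part of the REST API itself: require_custom_datasource is a hypothetical helper, and it makes no assumption about the format of the value, only that CUSTOM_DATASOURCE is set and non-empty.

```python
import os


# Sketch only: require_custom_datasource is a hypothetical helper, not part of the REST API.
def require_custom_datasource(environ=os.environ):
    """Return the value of CUSTOM_DATASOURCE, or raise if it is unset or blank."""
    value = environ.get("CUSTOM_DATASOURCE", "").strip()
    if not value:
        raise RuntimeError(
            "CUSTOM_DATASOURCE is not set; the REST API will not know about "
            "the new data source"
        )
    return value


# Example with an explicit mapping standing in for the real environment.
# The value below is a placeholder, not the real setting.
fake_env = {"CUSTOM_DATASOURCE": "<your value here>"}
print(require_custom_datasource(fake_env))
```

Called with no arguments, the helper reads the real process environment, so running it inside the container confirms that the variable actually made it through.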
And that is it!
Note: the configuration of the web application is unchanged from what is described in the original "How to add a new data source to the Open Targets Platform - part 1" and "How to add a new data source to the Open Targets Platform - part 2" blog posts.
Please get in touch with your comments; my contact details are below. For any Open Targets Platform related queries, please email their Support team.