Open Targets in the time of Covid-19

Allow me to give my best wishes to all our Open Targets users, collaborators and colleagues in this extraordinary time of the Covid-19 pandemic. As I write this, the Wellcome Genome Campus is closed to all but essential workers and the UK is in lockdown.

It's hard to think about science in the middle of an international emergency where people are fighting for their lives. However, science is a major part of the solution for the future, and it is heartening to see the research community come together to seek avenues for treating Covid-19 (for instance, see Figure 4 of this report). Colleagues on the Genome Campus have been involved in some of these efforts (see, for example, this preprint) and are actively pursuing other ways in which we can contribute, including developing a platform to share Covid-19 research data. EMBL-EBI was also involved in developing this explainer of the current Covid-19 epidemic for a general audience. From an Open Targets perspective, we are thinking hard about where our research programme and approaches could be useful. We have already seen that some researchers have used the Open Targets Platform as the starting point to identify drugs interacting with host targets. The rapid release of these analyses as preprints is appropriate to the seriousness of the pandemic, but it is worth remembering that formal peer review is still required.

This flurry of activity got us thinking about how well the Open Targets Platform deals with coronavirus disease and specifically Covid-19. As the Platform concentrates on human targets, this use case is outside our usual main focus, and the first thing we identified is that we don't yet have a term for Covid-19 because the disease is so new. This is something we will fix through our collaboration with the Experimental Factor Ontology team (see this GitHub ticket). While viral genes are not represented in our Platform, there is useful information. There are several disease terms related to coronavirus diseases. For example, data are present for earlier coronavirus diseases, including severe acute respiratory syndrome (SARS), but there is no term for Middle East Respiratory Syndrome (MERS). There are also data for a broader ontology term, "coronavirus infectious disease". As an aside, it is worth noting that disease terms are brought into the Platform primarily if data exist from our data providers for that term, so these data have arisen naturally through our data acquisition process. We will also need to improve the structure of the ontology in this area (see above).

We can next think about which human targets are associated with these diseases. The data we have are primarily driven by associations of host proteins with SARS and come from clinical trials of drugs or from text mining. Looking at the handful of drugs, these clinical trials are primarily investigating improvements to patient management while under ventilation, so while clearly important, they are not about treating the viral disease directly. The text mining is more comprehensive and includes papers on ACE2, the receptor that SARS-CoV and SARS-CoV-2 (the Covid-19 virus) use to infect human cells, and TMPRSS2, a protease that facilitates virus-cell membrane fusion. Both of these targets have been proposed as possible places to intervene with drugs in viral infection. However, a host of other proteins are also listed, which is useful for surveying the previous literature but doesn't offer much prioritisation. It is worth noting that the use of text mining in the Platform helps to provide coverage of the literature in a systematic way, as was intended, picking up useful papers as they appear, even though we have not focussed on this disease area in the past.

Finally, we can look specifically at some of the host proteins that have been suggested as potential intervention points. ACE2 has GWAS evidence as a potential target for lung function and smoking behaviour. Target tractability assessment suggests there is already a compound in clinical trials, but unfortunately the clinical trial evidence doesn't make it into the Platform. As it turns out, this is because of our recent changes to the ontology, which dropped a general term for cardiovascular disease in favour of a subdivision into cardio and vascular, leading to the trial for Cilazapril being dropped. We will correct this. Baseline gene expression data give no indication that ACE2 is expressed in lung, which is puzzling (there is no relevant data for nose or throat). Conversely, TMPRSS2 is shown as expressed in the lung and has somatic mutation and genetic evidence for a role in prostate cancer. Tractability assessment favours an antibody approach. The Open Targets Platform also provides data from GTEx, which does show some low expression of ACE2 in the lung but more expression of TMPRSS2. A more detailed analysis of GTEx data has been blogged about elsewhere. A comprehensive analysis would also include single cell expression data, which is not currently available in the Open Targets Platform. However, the Human Cell Atlas project has recently described cellular profiles of the expression of these genes in very relevant tissues (see preprint). Incorporation of these single cell expression profiles in healthy and disease samples would definitely enhance the Platform.

In summary, although infectious disease is not an area that we have prioritised in Open Targets, since we focus on host targets, we recognise the importance of our Platform to inform drug discovery, and we are working to incorporate new information and use our tools to support the community in the fight against this pandemic. As the resources in EMBL-EBI and elsewhere curate the rapidly increasing evidence around Covid-19 disease and host targets, this will give us the opportunity to improve our coverage and provide more detailed data. We know our colleagues at the Sanger Institute are working hard to contribute towards the Covid-19 efforts, and we will work with them closely to integrate relevant data into the Platform. Genetic evidence from biobanks on the roles of host genes in clinical outcomes in Covid-19 would be particularly relevant here and could be processed through our Open Targets Genetics Portal. Finally, this exploration has reiterated to me that the Open Targets Platform is a great way to rapidly browse the data associated with targets in disease and to act as a starting point for more in-depth consideration of specific targets.

Main image source: https://www.bing.com/covid/local/unitedkingdom