Meet Yakov Tsepilov, our new Genetics team leader

Life at Open Targets Aug 7, 2023

Open Targets recently appointed Yakov Tsepilov as the new Genetic Analysis team leader. Yakov joined the team last year as a statistical geneticist. We chat to him about what led him to Open Targets and what’s in store for Open Targets Genetics.

What is your background?

I’m from Russia; I was born in the north of Siberia, a region that is a major producer of oil. Following my interest in natural sciences, I moved to Novosibirsk (central Siberia) as soon as I was able, to study general biology, with a focus on statistical genetics and bioinformatics.

I am an academic scientist through and through: I’ve been working in academia since university. I am excited about the opportunity Open Targets offers to interact with science outside of academia.

What’s a genetics problem you’d love to solve?

My PhD was about non-additive effects of genes in the human metabolome (the collection of metabolites present in human blood). My aim was to develop new statistical frameworks for analysing non-additive models of how genes affect human traits, such as recessive, dominant or overdominant models.

The vast majority of results from genome-wide association studies (GWAS), including everything in Open Targets Genetics, are based on an additive model, in which we assume that the risk of the disease is proportional to the number of risk alleles in an individual.
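As a rough illustration of the difference between these models, here is a minimal Python sketch of how a genotype, expressed as the number of risk alleles an individual carries (0, 1 or 2), is commonly coded under each one. The function and variable names are hypothetical and chosen for this example only; this is not code from Open Targets Genetics or from the methods mentioned above.

```python
# Illustrative sketch only: textbook genotype codings for common genetic models.
# All names here (encode_genotype, coding_table, MODELS) are hypothetical.

MODELS = ("additive", "dominant", "recessive", "overdominant")


def encode_genotype(risk_allele_count: int, model: str = "additive") -> int:
    """Return the predictor value for a genotype under the given model."""
    if risk_allele_count not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1 or 2")
    if model == "additive":
        # Effect scales with the number of risk alleles.
        return risk_allele_count
    if model == "dominant":
        # One copy of the risk allele is enough for the full effect.
        return int(risk_allele_count >= 1)
    if model == "recessive":
        # Two copies are needed for any effect.
        return int(risk_allele_count == 2)
    if model == "overdominant":
        # Only heterozygotes (exactly one copy) differ from the rest.
        return int(risk_allele_count == 1)
    raise ValueError(f"unknown model: {model}")


def coding_table() -> dict:
    """Map each model to its coding of genotypes with 0, 1 and 2 risk alleles."""
    return {m: [encode_genotype(g, m) for g in (0, 1, 2)] for m in MODELS}


if __name__ == "__main__":
    for model, codes in coding_table().items():
        print(f"{model:>12}: {codes}")
```

Under the additive coding the predictor increases by one with each extra risk allele, which is the proportionality assumption described above; the other codings collapse the three genotypes into two groups in different ways.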

We know very little about non-additivity in complex traits — how alleles interact — and gene interactions, probably because the additive model works fairly well to explain what we observe. Studying other genetic models could help us fill in the gaps of what we can’t explain with the classic model.

For me, it’s not just about the statistics or methodology, it’s about the biological nature of non-additivity: monogenic diseases are easy to interpret in this context, but as soon as we talk about more complex diseases, it becomes problematic. Ronald Fisher was the first to explore this problem in the 1920s, and he, Sewall Wright, and John Haldane debated the nature of dominance in complex traits throughout their careers, probably until the end of their lives. So it’s a debate that has been going on for a hundred years now.

During my PhD I developed two new methods (and published the two best papers of my career so far), but after that I stopped working with non-additivity and switched to the canonical model. To be clear, I haven’t solved the problem of the nature of dominance; I contributed some information to this debate, and I hope someone else who is smarter than me can handle this problem — and good luck to them!

What led you to Open Targets?

A combination of circumstances: I was looking to relocate because of the war, and I saw an ad for a statistical geneticist which perfectly fitted my skills and scientific interests.

The role was contributing to Open Targets Genetics, which I was familiar with, and I was excited to join because I had some ideas of how to improve it… It’s like that famous anecdote of the person who wants to fix an issue so he joins the company, fixes the issue, then quits (laughs). But I aim to stay!

Not only haven’t you left — you’re now leading the team! What is on your to-do list in this new role?

The aim of the Open Targets informatics ecosystem is to aggregate the world’s genetic association data, and analyse it systematically to better understand disease biology, and make those insights available to the community. The challenge for us is — within the increasingly vast range of available data and rapidly evolving analysis methods — deciding how and where to implement the most relevant frameworks to further this aim.

Under Maya Ghoussaini’s leadership, Open Targets Genetics has already made large strides in this direction, for example implementing the Locus-to-Gene (L2G) score. For me, this new role is a chance for us to evaluate and refresh our pipelines to improve the platform, and our ability to develop it for the future.

Our current focus is to streamline our pipelines to optimise clumping, colocalisation, and fine-mapping. This will speed up our processes, and allow us to experiment with improvements to L2G.

What can people come talk to you about?

Genetics, statistical genetics, and mountains.

Any mountains in particular?

All mountains, but preferably those higher than 1,000m.

Mountains are my passion. I know that Cambridge, one of the flattest places in the UK, probably wasn’t the best choice for me, but it just shows my dedication and commitment to Open Targets (laughs).

I love everything to do with mountains — hiking, alpinism, skiing, everything.

What are the best mountains?

My favourite mountains were the Pamir mountain range in Kyrgyzstan, probably the most famous mountains in post-Soviet countries (pictured in the cover image). It has some stunning peaks, such as Lenin Peak, Khan Tengri and Jengish Chokusu, with a number of them over 7,000m. It’s a famous region for people who love alpinism, especially if you want to climb Everest: a 7,000m peak is good practice for an 8,000m one, and Kyrgyzstan is one of the easiest, safest, and cheapest choices for such practice. Moreover, Kyrgyzstan has the best samsa (a type of savoury pastry) in the world, so it’s definitely worth the trip!