promise big data

A network diagram showing protein interactions inside a cell. (Credit: Dr. Bissan Al-Lazikani / The ICR)

An international team led by researchers from the University of Cambridge and Merck & Co has built the first detailed genetic map of human proteins. The work, published in Nature today, was carried out in the hope of improving understanding of many different diseases and identifying possible new drug targets to enable better medical treatments. The study identified almost 2,000 genetic associations with nearly 1,500 proteins present in human blood plasma.

In the past, it has been difficult to study the proteome in this manner because the techniques needed to robustly study many blood proteins simultaneously were not available. Now, the researchers were able to use a novel technology called SOMAscan, which was developed by SomaLogic and is capable of interrogating many different proteins at the same time. Through this approach, the team was able to measure 3,600 proteins in blood samples from 3,300 participants, before combining that proteomic information with the participants' genetic data.

“Compared to genes, proteins have been relatively understudied in human blood, even though they are the ‘effectors’ of human biology, are disrupted in many diseases, and are the targets of most medicines,” said Adam Butterworth, PhD, senior author of the study and a University Lecturer at the University of Cambridge. “Novel technologies are now allowing us to start addressing this gap in our knowledge.”

By combining data from both the genome and the proteome, the team was able to identify previously unknown links between genes and protein expression, potentially helping us to build a stronger idea of how diseases may develop.

“Thanks to the genomics revolution over the past decade, we’ve been good at finding statistical associations between the genome and disease, but the difficulty has been then identifying the disease-causing genes and pathways,” said James Peters, PhD, another author of the study. “Now, by combining our database with what we know about associations between genetic variants and disease, we are able to say a lot more about the biology of disease.”

The collaboration with researchers at Merck & Co was fundamental in demonstrating the utility of the data in drug development. For example, a better understanding of links between the genome and proteins present in the blood may enable drug developers to identify or avoid potential side effects of new medications by determining which biological pathways they will influence. Further, by linking specific proteins to selected genes, it may be possible to develop drugs that target proteins originating from known disease-linked genes.

“We are so pleased to participate in this collaboration, as it is a great example of how a public-private partnership can be leveraged for research use in the broader scientific community,” said Caroline Fox, MD, Vice President and Head of Genetics and Pharmacogenomics at Merck & Co.

The data and results generated in the study are being made freely available for the global community. In time, the team hopes that others will be able to use this work to improve their own investigations into the proteome.

“Our database is really just a starting point,” said Benjamin Sun, first author of the paper. “We’ve given some examples in this study of how it might be used, but now it’s over to the research community to begin using it and finding new applications.”