Our guest contributor Dr Neil Lamb continues his fortnightly Shareable Science Blog with an extract from the annual guidebook ‘A Year in Genetics as Told by Tomorrow’s Textbooks’. Neil is the Vice President for Educational Outreach at the HudsonAlpha Institute for Biotechnology, and Shareable Science explores how genetics is relevant to people in their everyday lives.

One of the first things you learn when studying DNA is that it’s made up of four letters — GCAT. The letters stand for guanine, cytosine, adenine and thymine, the four chemicals that link and pair to create the genetic code of every living thing on earth.

But how did all of life come from pairs of G-C and A-T? Was it a random occurrence, or is there some fundamental reason that only these four nucleotides build all of life as we know it?

An interdisciplinary group of scientists led by Steven Benner says they’ve now got a definitive answer to that question. In a paper published in Science, researchers argue that G, C, A and T are not the only possibilities for constructing a genetic code, but represent what was assembled from the chemicals available at the time. The scientists have successfully created four new nucleotides that function just as well — B, S, P and Z. With all the letters together, they call their expanded genetic language hachimoji, stemming from “hachi” which means eight in Japanese and “moji” which means letter.

These additions to the genetic code may lead to new diagnostic tools, tailor-made therapies and novel ways to solve our data storage challenges.

What do we know about these new nucleotides?

It’s important to emphasize these nucleotides are synthetic. They weren’t discovered in nature; they were made in a lab.

Don’t let that diminish the significance of their creation. Researchers put a lot of effort into rigorously testing B, S, P and Z for functionality.

For example, if you want to use DNA to store information, then the nucleotides need to follow predictable rules. These new nucleotides needed to pair reliably. They do: B pairs with S, and P pairs with Z. Researchers confirmed this across hundreds of molecules of the synthetic DNA.
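The pairing rules described above can be written down as a simple lookup table. The short Python sketch below is only an illustration of those rules, not anything taken from the paper; the names PAIRS and complement are made up for this example.

```python
# Pairing rules as described in the article: G-C and A-T from nature,
# plus the synthetic pairs B-S and P-Z.
# PAIRS and complement are hypothetical names used only for illustration.
PAIRS = {
    "G": "C", "C": "G",
    "A": "T", "T": "A",
    "B": "S", "S": "B",
    "P": "Z", "Z": "P",
}


def complement(strand: str) -> str:
    """Return the partner letter for every letter in a strand."""
    letters = []
    for base in strand.upper():
        if base not in PAIRS:
            raise ValueError(f"not one of the eight letters: {base!r}")
        letters.append(PAIRS[base])
    return "".join(letters)


if __name__ == "__main__":
    print(complement("GATTACA"))
    print(complement("BSPZ"))
```

Because every letter has exactly one partner, applying complement twice returns the original strand, which is the kind of predictability an information store needs.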

They also wanted to illustrate that this synthetic DNA could hold up in the sequences of living things. The scientists demonstrated that the famous double helix structure of DNA would remain stable when the synthetic bases were incorporated, something that had stymied earlier attempts at expanding the genetic alphabet.

The team also showcased the synthetic DNA’s ability to perform one of DNA’s most essential tasks: serving as a template for RNA, a transient copy of the DNA instructions that cells use to build proteins. Storing information is one thing, but for DNA to serve its natural purpose, it needs to be able to transfer that information. The new nucleotides can be transcribed into RNA, just like GCAT before them.

Does this mean new kinds of life?

If you let yourself indulge in the science fiction-y side of genomics, you can quickly arrive at hachimoji creating a whole new kind of life. Of course, if there were to be life on other planets, it could very well have evolved from a different set of chemicals than what we have on Earth. But those premises are as distant as the planets where they’d originate. This synthetic DNA only functions in a test tube and isn’t incorporated inside any living organism.

However, hachimoji still has important implications for the life all around us.

For example, the group that published this paper on hachimoji has previously shown that strands of DNA that included the Z and P bases were better at binding to cancer cells than comparable sequences created from the four natural nucleotides.

The ability to create DNA-based detectors could lead to major breakthroughs for both diagnostics — by making cancerous cells easier to identify — and treatment — by making those cells easier to attack.

The researcher at the head of this group has already created a company that commercializes synthetic DNA for diagnostics.

Separately, scientists have long dreamed of ways to create customized proteins that execute a specialized task. Having twice as many genetic building blocks at their disposal could significantly improve the design options for tailor-made, biologically based drugs and therapies.

What else could we do with more DNA letters?

One aspect of DNA that frequently gets overlooked by the public is its potential use as a storage tool. After all, nature already uses DNA to store all of the information that makes up life on Earth. If we could harness that power, it would come with extraordinary benefits.

One challenge that goes underappreciated is how we will continue to keep up with ever-growing demands for digital storage space. Humanity generated more data between 2015 and 2017 than in all of human history before it. DNA could help address that problem.

In 2017, scientists developed an approach that could store 215 petabytes (215 million gigabytes, if that gives you some perspective) in a single gram of DNA. That’s about the weight of a paperclip. Using that system, we could theoretically store all of humanity’s data — all of it, as in all of human history — in a container the size and weight of a pair of pickup trucks.

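That efficiency could improve further when you expand the language of DNA. With eight letters to choose from instead of four, each position in a strand can carry three bits of information rather than two, so the same message fits in about two-thirds as many letters.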
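One rough way to gauge the gain is to count how many bits a single letter can carry, which is the base-2 logarithm of the alphabet size. The Python sketch below works through that arithmetic. It is an idealized upper bound that ignores error correction and any chemical constraints on real sequences, and the function names are invented for this example.

```python
import math


def bits_per_letter(alphabet_size: int) -> float:
    """Idealized information carried by one letter, in bits."""
    return math.log2(alphabet_size)


def letters_needed(num_bytes: int, alphabet_size: int) -> int:
    """Fewest letters that can encode num_bytes of data (no error correction)."""
    total_bits = 8 * num_bytes
    return math.ceil(total_bits / bits_per_letter(alphabet_size))


if __name__ == "__main__":
    natural = bits_per_letter(4)    # G, C, A, T
    hachimoji = bits_per_letter(8)  # G, C, A, T, B, S, P, Z
    print(natural, hachimoji, hachimoji / natural)
    print(letters_needed(1000, 4), letters_needed(1000, 8))
```

By this simple measure, each hachimoji letter carries 3 bits instead of 2, so a kilobyte of data would need roughly 2,667 letters rather than 4,000.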

Not only does this have appeal for storage space reasons, it also offers greater permanence. DNA doesn’t fade or go obsolete like other storage methods. As long as it’s kept in a cool dry place, it can last for hundreds of thousands of years.

[If you’d like to learn more about DNA storage, check out my Biotech 201 talk on the subject.]

Our ever-evolving language

We’re constantly learning new things about DNA and the secrets it holds. Our study of DNA creates opportunities for medical treatments tailored specifically to individuals, targeting their unique health ailments using solutions that work more effectively for their genetic makeup. It also opens the doors for crops that can survive a changing climate to meet the world’s increasing demands.

It’s incredible what nature has achieved with just these four letters, and we’re just scratching the surface in understanding how that language contributes to the amazing diversity of life on Earth. It’s mind-boggling to ponder the implications of a genetic code that is twice as large.

A Year in Genetics as Told by Tomorrow’s Textbooks

This article is an extract from the annual guidebook produced by the HudsonAlpha Institute for Biotechnology’s Educational Outreach team. It contains the genetic developments of the last year that should be in a student’s textbook but likely won’t make it there because of the speed with which the field of genetics moves and the slow process of textbook creation and adoption. The guidebook is HudsonAlpha’s way to keep educators and their students current.

The annual guidebook tells short, compelling narratives about what I consider to be the most fascinating field of research. Every year, geneticists make discoveries that change the way we understand the world around us, and sometimes they even change the world itself.

Without further ado, please enjoy this year’s Guidebook.