By 2017 human beings will generate more than 16 trillion gigabytes of digital data. Legal files, medical records, financial transactions, multimedia files: most of it will need to be archived, and current physical storage devices – hard drives, optical discs, gigantic data centres – are simply not big enough. Companies working in big data are looking in some unusual places for alternative solutions.

One of those places is DNA itself. As an idea, storing digital data in DNA is not all that far-fetched: the 1s and 0s are encoded in a sequence of A, C, G and T bases. Reduced to the scale of a DNA molecule, millions of bytes of data take up a minute amount of space. And DNA degrades far more slowly than a server or a hard drive, which helps preserve data integrity for the long haul. This is why companies like Microsoft are looking long and hard at this new storage frontier.
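
To make the idea of turning 1s and 0s into bases concrete, here is a minimal Python sketch. It assumes the simplest possible mapping, two bits per base (00 to A, 01 to C, 10 to G, 11 to T). That mapping and the function names `encode` and `decode` are illustrative assumptions, not the scheme used by Microsoft, Twist Bioscience or any of the researchers mentioned here; practical schemes typically add error correction and avoid sequences that are hard to synthesise or read.

```python
# Illustrative only: a naive two-bits-per-base mapping (an assumption,
# not the encoding used by any company or lab named in this article).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A, C, G and T, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Under this toy mapping every byte becomes four bases, so the two-character message "Hi" needs only eight bases.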

“Today, the vast majority of digital data is stored on media that has a finite shelf life and periodically needs to be re-encoded,” explained Emily M. Leproust, Ph.D., CEO of California-based biotech company Twist Bioscience. “DNA is a promising storage media, as it has a known shelf life of several thousand years.”

Last month Microsoft Research purchased 10 million DNA strands from Twist Bioscience, designed to encode digital data. “As our digital data continues to expand exponentially, we need new methods for long-term, secure data storage,” said Doug Carmean, a Microsoft partner architect within the company’s Technology and Research organisation.

“The initial test phase with Twist demonstrated that we could encode and recover 100 percent of the digital data from synthetic DNA. We’re still years away from a commercially viable product, but our early tests with Twist demonstrate that in the future we’ll be able to substantially increase the density and durability of data storage.”

In 2012 George Church encoded 70 billion copies of his book Regenesis into a cubic millimetre of DNA as a proof of concept for DNA data storage. A year later Ewan Birney and a team from the European Bioinformatics Institute showed that data could be retrieved from DNA storage with 100 percent accuracy.
