It’s no surprise that the amount of data humanity produces is increasing at a very rapid rate. Last year, the research group IDC even estimated that by 2025 we will be producing over 160 zettabytes a year.

As you can imagine, it’s all well and good producing such a large amount of data, but you need somewhere to store it. This is the exact problem scientists have been trying to resolve for a while now. 

One solution which has been discussed involves exploiting the molecular structure of DNA. Researchers have known for a long while that DNA can be used for data storage, but nobody has yet come forward with a realistic system of storing data in a DNA library and then retrieving it again when required. 

For computer scientists, such a system would be a huge benefit, as a single gram of DNA has the potential to store roughly a zettabyte.
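
To put those two figures side by side, here is a rough back-of-the-envelope calculation. It simply divides the projected data volume by the approximate storage density quoted above; the names are illustrative, and the density is a theoretical round number rather than something achieved in practice.

```python
# Back-of-the-envelope estimate using the round figures quoted in this article.
ZETTABYTES_PER_GRAM = 1.0      # approximate theoretical density of DNA (assumed round number)
PROJECTED_ZETTABYTES = 160.0   # IDC projection for 2025


def grams_of_dna_needed(zettabytes, density=ZETTABYTES_PER_GRAM):
    """Return the mass of DNA, in grams, needed to hold the given amount of data."""
    return zettabytes / density


print(f"{grams_of_dna_needed(PROJECTED_ZETTABYTES):.0f} g of DNA")
```

On these assumptions, the whole projected volume would fit in roughly 160 grams of DNA, which is why the idea attracts so much interest.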

According to MIT Technology Review, that’s all going to change now that Federico Tavella and his team at the University of Padua in Italy have revealed that they have designed and tested a technique based on bacterial nanonetworks.

The principle is actually quite simple. Bacteria often carry genetic information in the form of tiny circular rings of double-stranded DNA called plasmids. The idea is to store data in plasmids inside bacterial cells that are trapped in a specific location. To retrieve this information, the researchers send motile bacteria to this site, where they conjugate with the trapped bacteria and capture the data-carrying plasmids. Finally, the motile bacteria carry this information to a device that extracts the plasmids and reads the data they carry.

The team has even managed to perform a proof-of-principle experiment using two different strains of E. coli, HB101 and Novablue, that are resistant to different antibiotics. HB101 is resistant to streptomycin, while Novablue carries tetracycline-resistant plasmids. Novablue can pass on this resistance to HB101 by transferring these plasmids during conjugation.

This effectively gives the team control over where the bacteria can grow. The prototype memory consists of a data storage area, a data reader, and a data transfer channel that connects them. In order to store the data, the researchers encode a simple message into the tetracycline-resistant plasmids carried by the Novablue bacteria. 
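
The article does not say how the team mapped bits to DNA bases, so the following is only a minimal sketch of one common textbook approach: two bits per nucleotide. The mapping table and the function names `encode` and `decode` are assumptions made for illustration, not the researchers' scheme.

```python
# Hypothetical 2-bits-per-base mapping; the encoding actually used by the team is not given here.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(message: str) -> str:
    """Turn a text message into a string of nucleotide letters."""
    bits = "".join(f"{byte:08b}" for byte in message.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(sequence: str) -> str:
    """Recover the original text from a nucleotide string produced by encode()."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


sequence = encode("Hello World")
print(sequence)
print(decode(sequence))
```

Under this toy mapping each byte becomes four bases, so an 11-character message needs 44 bases. A real scheme would also have to deal with error correction and with sequences that are hard to synthesise, which this sketch ignores.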

To start this process, the Novablue bacteria are placed in the data storage area, where they are unable to escape. This is a flat surface of hard agar that is not suitable for bacterial motility. The team surrounds it with streptomycin, which kills Novablue.

The data transfer channel runs from a source of HB101 bacteria across the data storage area and then on toward the data reader. This is made up of soft agar that is suitable for bacterial motility, and since HB101 is resistant to streptomycin, it can move through this channel with relative ease. 

However, the area between the data storage area and the data reader is rich in tetracycline as well as streptomycin, which prevents either strain from crossing it on its own.

The next step is crucial: HB101 bacteria travel to the data storage area, conjugate with the Novablue bacteria, and pick up the data-carrying plasmids. These plasmids are also what give the HB101 bacteria tetracycline resistance. Therefore, once they’ve picked up the data, they can travel on through the channel to the data reader. The researchers then extract the plasmids and read the data. They are also able to watch the way information flows across this network thanks to a fluorescent dye.
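
The antibiotic "gates" described above can be captured in a small toy model. This is only a sketch of the logic as reported in this article: the zone names, the set-based representation, and the functions `can_survive` and `conjugate` are all made up for illustration.

```python
# Toy model of the antibiotic gating logic; zone and function names are illustrative.
ZONES = {
    "channel_to_storage": {"streptomycin"},                  # assumed from the description above
    "channel_to_reader": {"streptomycin", "tetracycline"},
}


def can_survive(resistances, zone):
    """A strain survives a zone only if it resists every antibiotic present there."""
    return ZONES[zone] <= set(resistances)


def conjugate(recipient, donor_plasmid_resistances):
    """Return the recipient's resistances after it acquires the donor's plasmid."""
    return set(recipient) | set(donor_plasmid_resistances)


hb101 = {"streptomycin"}
novablue = {"tetracycline"}

print(can_survive(hb101, "channel_to_storage"))     # True: HB101 can reach the storage area
print(can_survive(novablue, "channel_to_storage"))  # False: Novablue stays trapped
print(can_survive(hb101, "channel_to_reader"))      # False: no data plasmid yet

hb101_with_data = conjugate(hb101, novablue)
print(can_survive(hb101_with_data, "channel_to_reader"))  # True: plasmid acquired
```

The point of the design is that only bacteria that have actually picked up a data-carrying plasmid can complete the journey to the reader.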

It’s worth mentioning that this isn’t a quick process: the HB101 bacteria take around 72 hours to travel across the agar channel. Although the data rate is slow, the experiment does show how a DNA archive could work in principle.

Addressing becomes very important in an archive like this. Because there would be many data storage locations, each of which must be addressable, the data transfer bacteria need a way to find each one.

Tavella and his team have taken that on board and suggested their own possible answer: a molecular positioning system that is analogous to the Global Positioning System. It relies on beacons that each release a chemical that attracts the bacteria. The researchers say that in simulations this process works well, but they have yet to try it in a wet lab.
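
To give a feel for how a chemical beacon could guide bacteria, here is a deliberately simplified sketch, not the team's simulation. It assumes a single beacon on a grid, an attractant level that falls off with squared distance, and a "bacterium" that always moves to the neighbouring cell with the highest level. The formula, the movement rule, and all names are assumptions for illustration.

```python
# Toy chemotaxis sketch: a "bacterium" on a grid climbs the attractant gradient of one beacon.
# The concentration formula and the greedy movement rule are illustrative assumptions.

def concentration(position, beacon):
    """Attractant level, falling off with squared distance from the beacon."""
    dx, dy = position[0] - beacon[0], position[1] - beacon[1]
    return 1.0 / (1.0 + dx * dx + dy * dy)


def step(position, beacon):
    """Stay put or move to a neighbouring cell, whichever has the highest concentration."""
    x, y = position
    candidates = [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return max(candidates, key=lambda p: concentration(p, beacon))


def navigate(start, beacon, max_steps=1000):
    """Follow the gradient until no neighbouring cell is better; return the path taken."""
    path = [start]
    for _ in range(max_steps):
        next_position = step(path[-1], beacon)
        if next_position == path[-1]:
            break
        path.append(next_position)
    return path


path = navigate(start=(0, 0), beacon=(5, 3))
print(path[-1], len(path) - 1)  # final position and number of moves
```

Real bacterial chemotaxis is noisy and closer to a biased random walk, and an archive would need many distinguishable beacons, so this only illustrates the basic idea of homing in on an attractant source.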

Despite this, the work is still a very interesting step towards practical DNA-based data storage. “Our solution allows digitally encoded information to be stored into non-motile bacteria, which compose an archival architecture of clusters, and to be later retrieved by engineered motile bacteria, whenever reading operations are needed,” said Tavella and his team.

They say that their proof-of-principle experiment shows how this could work. “We have conducted wet lab experiments that show how bacteria nanonetworks can effectively retrieve a simple message, such as ‘Hello World,’ by conjugation with non-motile bacteria, and finally mobilise towards a final point,” they explained.

Yes, there are a number of challenges ahead, but it is definitely a step in the right direction towards dealing with increasing data generation.