Big Data – A big problem?

All of our thoughts and actions these days, captured in photos, videos and even fitness tracking, are stored as digital data. Aside from running out of space on our phones, we rarely think about our digital footprint. However, humanity has collectively generated more data in the last few years than in all of preceding human history. As a result, Big Data has become a big problem.

We have come an incredibly long way in data storage. IBM released the first ever hard drive in 1956, which held roughly the equivalent of one MP3 song, a reminder of how dramatically devices have evolved. But it is argued that all media will eventually become obsolete, and no device will really stand the test of time. For example, if someone handed you a floppy disk today to back up your presentation, would you be able to use it? Every advance in writing data has required a new way to read it. Therefore, storage is not just about how many bytes we can hold, but how reliably we can store the data and recover it.

DNA Data Storage

Think of compressing all the information on the accessible Internet into a shoebox. With DNA data storage, that is possible.

DNA is nature’s oldest storage device. After all, it contains all the information necessary to build and maintain a human being. But can it really be used to store digital data? In short, yes. The information density of DNA is extraordinary. Just one gram can store 215 million gigabytes of data. For context, the average hard drive in a laptop can house just one millionth of that.
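To put those numbers side by side, here is a quick back-of-the-envelope check in Python. It only restates the figures quoted above in different units (assuming decimal units, where one petabyte is one million gigabytes); it is not a measurement of any real device.

```python
# Figure quoted in the text: 215 million gigabytes per gram of DNA.
GB_PER_GRAM = 215_000_000

# Assumption: decimal units, so 1 petabyte (PB) = 1,000,000 gigabytes (GB).
petabytes_per_gram = GB_PER_GRAM / 1_000_000

# "One millionth of that", expressed in gigabytes.
one_millionth_gb = GB_PER_GRAM / 1_000_000

print(f"{petabytes_per_gram:.0f} PB per gram")
print(f"{one_millionth_gb:.0f} GB is one millionth of a gram's capacity")
```

In other words, a gram of DNA at this density holds 215 petabytes, and one millionth of that is 215 gigabytes, which is in the range of a typical laptop drive.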

How can Digital Data be stored in DNA?

Simply put, DNA digital data storage is the process of encoding and decoding binary data to and from synthesised strands of DNA.

Storing data in DNA sounds hopelessly complex, but the technologies are well established and understood. DNA is made from four organic bases: adenine (A), thymine (T), guanine (G) and cytosine (C).

First, the binary data from digital content, which consists of zeros and ones, is compressed and mapped to those four organic bases in DNA. Those strands are then copied millions of times, to make reading the data easier when it is extracted from its storage container. When the data needs to be read, the opposite process occurs in order to convert the DNA strands back into digital content.
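The mapping step can be sketched in a few lines of Python. This is a minimal illustration, not the scheme any particular lab uses: the two-bits-per-base table (`BITS_TO_BASE`) and the function names `encode` and `decode` are assumptions made for this example, and the compression step is left out. Real systems use more elaborate codes, for instance to avoid long runs of the same base and to correct errors.

```python
# Hypothetical mapping: each pair of bits becomes one of the four bases.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse of encode: turn a string of bases back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"hello"
strand = encode(message)
print(strand)                      # four bases per byte of input
assert decode(strand) == message   # the round trip recovers the original bytes
```

With this toy mapping every byte becomes exactly four bases, so the five-byte message turns into a 20-base strand that decodes back to the same bytes.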

So anything that can be stored as binary data, as zeros and ones, can be stored in DNA. From an Amazon gift card to the oldest film, it is really just a process of recovering enough zeros and ones to put the data back together.
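One way to picture why having many copies helps with recovery is a simple majority vote across several imperfect reads of the same strand. The sketch below is a deliberate simplification: the `consensus` function and the example reads are hypothetical, and it assumes every read has the same length and contains only substituted bases. Real sequencing pipelines also have to deal with missing and extra bases, and typically rely on error-correcting codes.

```python
from collections import Counter


def consensus(reads):
    """Majority vote per position across equal-length reads of one strand."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*reads)             # walk the reads position by position
    )


# Hypothetical example: the true strand is "CGGACGCC", and each of the
# three reads below contains one misread base at a different position.
reads = [
    "CGTACGCC",  # error at position 2
    "CGGACGCA",  # error at position 7
    "AGGACGCC",  # error at position 0
]

recovered = consensus(reads)
print(recovered)
assert recovered == "CGGACGCC"
```

No single read is correct here, yet the vote recovers the original strand because the errors fall in different places.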

Are Big Tech playing in this space?

Microsoft has recognised this impending crisis of not being able to store information as we move forward, and has chosen to invest in experimentation that could revolutionise the way that we think about data storage.

Microsoft has been looking at using DNA for data storage for several years, but the process has typically been incredibly manual. The only way to make DNA data storage scale up and be usable is by automating the process: from bits to molecules, and back to bits. In 2019, Microsoft undertook a project with the University of Washington to demonstrate the first fully automated system to store and retrieve data in manufactured DNA, showing that the prospect of DNA data storage is not merely theoretical.

So, what does this mean for us?

It is crucial to acknowledge that we are producing a lot more data than we are capable of storing today, and that we need a radical new solution. It is true that DNA could be one such solution, but there is still a long way to go between a really cool, innovative idea, and something we can actually use. 

While there are still advancements to be made in overcoming the practical challenges, it is fascinating to consider the transformational benefits this could bring to the business world and society. For example, we know that DNA can accurately store massive amounts of data without requiring much energy, and such efficiency could make long-term data storage far more sustainable.

This intersection between biotech and computing is extremely exciting, and if technology continues to progress the way we see it now, it’s conceivable that we may see DNA storage as commonplace in the future.