The Future of Data Storage: DNA

DNA stores the genetic information of a living organism. It encodes the instructions for everything in a living being.
That’s why it makes sense for corporations like Microsoft to invest in research that studies how DNA can be used to store data.
Unlike most existing data storage devices, DNA degrades very slowly when stored properly, and it is extremely compact.
For example, just four grams of DNA could hold a year’s worth of the information produced by all of humanity combined.
By some estimates, 1 gram of DNA can store 1,000,000,000 terabytes of data for more than 1,000 years.

In a study published in the journal Science, researchers Yaniv Erlich and Dina Zielinski demonstrated how DNA may be the answer to our data storage needs.

Erlich and Zielinski stored six files in 72,000 DNA strands, each 200 bases long. The files included a full computer operating system, an 1895 French film, an Amazon gift card, a computer virus, a Pioneer plaque, and a study by information theorist Claude Shannon.

“We mapped the bits of the files to DNA nucleotides. Then, we synthesized these nucleotides and stored the molecules in a test-tube,” Erlich told ResearchGate. “To pack the information, we devised a strategy, called DNA Fountain, that uses mathematical concepts from coding theory. It was this strategy that allowed us to achieve optimal packing, which was the most challenging aspect of the study.”
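To get a feel for what "packing" means here, the following back-of-the-envelope sketch computes an upper bound on how much data 72,000 strands of 200 bases could hold. It assumes each base carries at most 2 bits (four possible letters); that figure is an assumption for illustration, and real strands also spend bases on barcodes and error correction, so the usable payload is smaller.

```python
# Figures taken from the article: 72,000 strands, each 200 bases long.
STRANDS = 72_000
BASES_PER_STRAND = 200

# Assumption: a base is one of four letters, so it carries at most 2 bits.
BITS_PER_BASE = 2


def raw_capacity_bytes(strands: int, bases: int, bits_per_base: int) -> float:
    """Upper bound on stored bytes if every single base carried data."""
    return strands * bases * bits_per_base / 8


capacity = raw_capacity_bytes(STRANDS, BASES_PER_STRAND, BITS_PER_BASE)
print(f"Raw upper bound: {capacity / 1e6:.1f} MB")  # → Raw upper bound: 3.6 MB
```

The article later puts the stored data at roughly 2 MB, which sits below this ceiling, as expected once overhead is accounted for.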

To retrieve the data, the researchers used DNA sequencing technology and software to translate the genetic code back into binary. “To retrieve the information, we sequenced the molecules. This is the basic process,” Erlich said.

Remarkably, the recovered files were error-free.

Erlich also believes that it’s time to move to a better technology. “[T]raditional media suffers from digital obsoleteness. My parents have 8 mm tapes that are basically useless now,” he added. “DNA has been around for 3 billion years, and humanity is unlikely to lose its ability to read these molecules. If it does, we will have much bigger problems than data storage.”

Asked when this technology could be made available, Erlich replied with a cautious estimate. “I would guess more than a decade,” he said. “We are still in early days, but it also took magnetic media years of research and development before it became useful.”

Ultimately, research like Erlich and Zielinski’s opens up opportunities to explore a future of biological computers. “This opens the possibility of using molecular biology tools to assist computing,” Erlich said. “Usually, it is the other way around!”

Just last year, Microsoft purchased 10 million strands of synthetic DNA from Twist Bioscience, a San Francisco DNA synthesis startup, and collaborated with researchers from the University of Washington on using DNA as a data storage medium.

In their latest experiments, Erlich and Zielinski, who work at Columbia University and the New York Genome Center (NYGC), came up with a new technique to store massive amounts of data on DNA, and the results are impressive.

The duo achieved a storage density of about 215 petabytes per gram of DNA, encoding a total of six files:

  • A full computer operating system
  • An 1895 French movie “Arrival of a Train at La Ciotat”
  • A $50 Amazon gift card
  • A computer virus
  • A Pioneer plaque
  • A 1948 study by information theorist Claude Shannon

But How Did the Researchers Store Digital Data on DNA?

Calling their process a “DNA Fountain,” the researchers first compressed all the data into a single master archive and split it into short strings of binary digits, made up of ones and zeros.
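The first step, compressing everything into one archive and cutting it into short binary strings, might look roughly like the sketch below. The 32-byte segment size, the use of zlib, and the `make_segments` helper are assumptions chosen for illustration, not details given in the article. A real archive would also use a format such as tar to keep file boundaries; plain concatenation is a simplification.

```python
import zlib

SEGMENT_BYTES = 32  # assumption: illustrative segment size


def make_segments(files: list[bytes], segment_bytes: int = SEGMENT_BYTES) -> list[bytes]:
    """Compress the files into one archive and split it into equal segments."""
    # Simplification: concatenate the files, then compress the result.
    archive = zlib.compress(b"".join(files))

    # Pad with zero bytes so the archive divides evenly into segments.
    remainder = len(archive) % segment_bytes
    if remainder:
        archive += b"\x00" * (segment_bytes - remainder)

    return [archive[i:i + segment_bytes]
            for i in range(0, len(archive), segment_bytes)]


segments = make_segments([b"operating system " * 40, b"gift card " * 40])
print(len(segments), "segments of", SEGMENT_BYTES, "bytes each")
```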

Next, the duo used an erasure-correcting algorithm called fountain codes to randomly package the strings into “droplets.”

Each droplet contained a barcode in its sequence that helped the researchers reassemble the file.
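The droplet idea can be illustrated with a toy fountain code. In this sketch the barcode is a random seed: it determines which segments are XORed together to form the droplet, so a decoder that reads the seed can work out the same choice and peel the segments apart again. The function names, the uniform degree of 1 to 3, and the simple peeling decoder are all assumptions for illustration; real fountain codes draw the degree from a carefully tuned distribution, and this is not the researchers' actual implementation.

```python
import random


def segment_indices(seed: int, n_segments: int) -> list[int]:
    """Derive which segments a droplet combines from its seed (the barcode)."""
    rng = random.Random(seed)
    degree = rng.randint(1, min(3, n_segments))  # assumption: toy distribution
    return rng.sample(range(n_segments), degree)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_droplet(seed: int, segments: list[bytes]) -> tuple[int, bytes]:
    """XOR the chosen segments together and tag the result with the seed."""
    payload = bytes(len(segments[0]))
    for i in segment_indices(seed, len(segments)):
        payload = xor_bytes(payload, segments[i])
    return seed, payload


def peel_decode(droplets: list[tuple[int, bytes]], n_segments: int) -> list:
    """Peeling decoder: resolve single-segment droplets, then substitute."""
    recovered: dict[int, bytes] = {}
    pending = [(set(segment_indices(seed, n_segments)), payload)
               for seed, payload in droplets]
    progress = True
    while progress and len(recovered) < n_segments:
        progress = False
        for pos, (indices, payload) in enumerate(pending):
            # Remove every segment we already know from this droplet.
            for i in list(indices):
                if i in recovered:
                    payload = xor_bytes(payload, recovered[i])
                    indices.discard(i)
            pending[pos] = (indices, payload)
            # A droplet reduced to one unknown segment reveals that segment.
            if len(indices) == 1:
                recovered[indices.pop()] = payload
                progress = True
    return [recovered.get(i) for i in range(n_segments)]


segments = [b"DNA ", b"Foun", b"tain", b"demo"]
droplets = [make_droplet(seed, segments) for seed in range(200)]
print(peel_decode(droplets, len(segments)) == segments)
```

Because any sufficiently large set of droplets can rebuild the file, losing some strands during synthesis or sequencing does not by itself prevent recovery.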

The researchers then “mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T,” and ended up with a digital list of 72,000 DNA strands that contained the encoded data.
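Mapping ones and zeros to the four bases can be sketched as a two-bits-per-base lookup. The specific assignment used here (00 to A, 01 to C, 10 to G, 11 to T) and the helper names are assumptions for illustration; the article does not state which mapping the researchers chose.

```python
# Assumption: 00 -> A, 01 -> C, 10 -> G, 11 -> T (index in this string = 2-bit value).
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna: read four bases at a time back into one byte."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


print(bytes_to_dna(b"Hi"))       # → CAGACGGC
print(dna_to_bytes("CAGACGGC"))  # → b'Hi'
```

The same lookup run in reverse is what turns sequenced strands back into binary during retrieval.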

This code was then sent in a text file to Twist Bioscience, the same DNA synthesis startup from which Microsoft purchased 10 million strands of synthetic DNA last year, which turned the digital information into biological DNA.

“Two weeks later, they received a vial holding a speck of DNA molecules. To retrieve their files, they used modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary. They recovered their files with zero errors,” the journal reads.

‘Highest-Density Data-Storage Device Ever Created’

The researchers believe that DNA is the perfect storage medium – as it is ultra-compact and can last hundreds of thousands of years if kept cool and dry – and suggest this is the “highest-density data-storage device ever created.”

The digital universe is growing fast: by 2020 it is expected to contain nearly as many digital bits as there are stars in the universe, reaching 44 zettabytes, or 44 trillion gigabytes.

So, DNA data storage could help big organizations store an enormous amount of information in a way that can still be read a hundred years from now.
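As a rough sanity check on these numbers, the sketch below converts 44 zettabytes to gigabytes and estimates how much DNA would be needed to hold it. The density of about 215 petabytes per gram is the figure reported for the DNA Fountain experiment and is treated here as an assumption; scaling a lab result linearly to the whole digital universe is also an assumption.

```python
# Decimal (SI) units.
GIGABYTE = 10**9
PETABYTE = 10**15
ZETTABYTE = 10**21

digital_universe_bytes = 44 * ZETTABYTE

# 44 zettabytes expressed in gigabytes: 44 trillion.
print(f"{digital_universe_bytes // GIGABYTE:,} GB")  # → 44,000,000,000,000 GB

# Assumption: about 215 petabytes per gram, scaled linearly.
DENSITY_BYTES_PER_GRAM = 215 * PETABYTE

grams = digital_universe_bytes / DENSITY_BYTES_PER_GRAM
print(f"about {grams / 1000:.0f} kg of DNA")  # → about 205 kg of DNA
```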

However, cost is still an issue.

The researchers spent around $7,000 to synthesize the 2 MB of data and another $2,000 to read it back.

This will change with time, but do not expect the technique to go mainstream anytime soon.
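To put the cost figures above in perspective, this small calculation works out the price per megabyte and extrapolates it to a gigabyte. The linear extrapolation is an assumption for illustration only; actual costs would depend on scale and on how synthesis and sequencing prices develop.

```python
# Figures from the article.
SYNTHESIS_COST_USD = 7_000
READ_COST_USD = 2_000
DATA_MB = 2

synthesis_per_mb = SYNTHESIS_COST_USD / DATA_MB
read_per_mb = READ_COST_USD / DATA_MB
print(f"Write: ${synthesis_per_mb:,.0f}/MB, read: ${read_per_mb:,.0f}/MB")
# → Write: $3,500/MB, read: $1,000/MB

# Assumption: costs scale linearly with data size (1 GB = 1,000 MB).
total_per_gb = (synthesis_per_mb + read_per_mb) * 1_000
print(f"Write and read 1 GB at these rates: ${total_per_gb:,.0f}")
# → Write and read 1 GB at these rates: $4,500,000
```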


