In the last few years, storing digital data in DNA has generated considerable buzz, gaining momentum in both research institutes and industry as they try to capitalize on the vast opportunities of the information technology age. In 2013, Nick Goldman's team at the European Molecular Biology Laboratory encoded audio, image, and text files into synthesized DNA about the size of a dust grain, and then retrieved the information from the DNA with 99.99 percent accuracy. Last year, Microsoft purchased ten million DNA sequences from a tech startup called Twist Bioscience, specifically for research into data storage. Microsoft is known to be working on DNA storage in collaboration with the University of Washington.

Although information had been encoded into DNA sequences before, in those earlier experiments the DNA was synthesized and all the information remained outside the realm of living organisms. In a recent study, scientists Seth Shipman and Jeff Nivala from Harvard University developed a technique to store data in the genetic code of living bacterial cells. This stored information is also passed on to the cells' descendants and can later be decoded by genotyping the bacteria.

The Harvard scientists are using a gene-editing tool known as CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated proteins) to mimic nature's own way of storing data in genomes. In the past few years, CRISPR/Cas technology has made stunning progress, from an intriguing prokaryotic defense system to a powerful and versatile set of biomolecular tools with applications ranging from genome editing to molecular imaging. The concept behind it is that when bacteria are infected by a virus, they cut out part of the attacking virus's DNA, and CRISPR records the event in the bacterial DNA. In this way the bacteria can recognize the virus if it tries to attack again. CRISPR/Cas does this by storing tiny sequences of the viral DNA itself, called spacers. New spacers are added in the order in which the infections occur, so the spacers form a chronological record.

The researchers realized that this temporal ordering of spacers could form the basis of a molecular storage system. Using this technique, they introduced digital data, in the form of viral DNA, into a colony of E. coli bacteria. The bacteria take up this information by storing it in their own genetic code, and the process continues until all the information is stored. This turns the bacteria into small, efficient memory storage devices, much like hard drives, whose contents can be genetically passed on to the next generation.
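To make the idea of turning digital data into DNA more concrete, here is a minimal sketch in Python. It is not the encoding used in the Harvard study. It assumes a simple mapping of two bits per nucleotide (00 to A, 01 to C, 10 to G, 11 to T), and the names `encode`, `decode`, `split_into_spacers` and the chunk length `SPACER_LEN` are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: the bit-to-base mapping and the chunk length
# are assumptions, not the scheme used in the Harvard study.
BASES = "ACGT"      # assumed mapping: 00->A, 01->C, 10->G, 11->T
SPACER_LEN = 32     # hypothetical length of one spacer-like chunk


def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, two bits per nucleotide."""
    out = []
    for byte in data:
        # Read the byte from its highest two bits down to its lowest two.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(dna: str) -> bytes:
    """Turn a DNA string produced by encode() back into bytes."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def split_into_spacers(dna: str, length: int = SPACER_LEN) -> list[str]:
    """Cut the DNA string into ordered, fixed-length chunks."""
    return [dna[i:i + length] for i in range(0, len(dna), length)]


message = b"Data Storage in Bacteria is completed"
dna = encode(message)
spacers = split_into_spacers(dna)
assert decode("".join(spacers)) == message
```

Each chunk plays the role of one spacer. Because the chunks are kept in order, joining them and decoding returns the original message.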

Scattering of Data

One problem the scientists face is that no single bacterium stores all the information. For example, suppose you want to store the text "Data Storage in Bacteria is completed". Some bacteria would store "Data", some would store "Storage in Bacteria", and some would store "is completed".

However, the scientists suggest that millions of bacteria can be genotyped quickly, and because the data is stored sequentially, the exact message can be reconstructed successfully.
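The reconstruction step can be sketched as an ordering problem: every bacterium yields a partial but correctly ordered list of fragments, and many such lists together fix the full order. The Python sketch below is an illustration under stated assumptions, not the procedure used in the study. It assumes every fragment is distinct and read without errors, and the function name `reconstruct` is hypothetical. It applies a standard topological sort to the "comes before" relations seen in the reads.

```python
from collections import defaultdict


def reconstruct(reads):
    """Merge partial, ordered reads into one full sequence.

    Assumes each fragment is unique and each read lists its fragments
    in the order they were stored (an assumption for this sketch).
    """
    successors = defaultdict(set)   # fragment -> fragments seen right after it
    indegree = {}                   # fragment -> number of distinct predecessors
    for read in reads:
        for fragment in read:
            indegree.setdefault(fragment, 0)
        for earlier, later in zip(read, read[1:]):
            if later not in successors[earlier]:
                successors[earlier].add(later)
                indegree[later] += 1

    ready = [f for f, degree in indegree.items() if degree == 0]
    order = []
    while ready:
        if len(ready) > 1:
            raise ValueError("reads do not determine a unique order")
        current = ready.pop()
        order.append(current)
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(indegree):
        raise ValueError("reads contain contradictory orderings")
    return order


# Each inner list stands for what one bacterium happened to store.
reads = [
    ["Data"],
    ["Storage", "in", "Bacteria"],
    ["is", "completed"],
    ["Data", "Storage"],
    ["Bacteria", "is"],
]
print(" ".join(reconstruct(reads)))  # prints: Data Storage in Bacteria is completed
```

Note that the three fragments from the example alone would not be enough here, since nothing links "Data" to "Storage in Bacteria". The sketch needs additional reads that bridge the fragments, which is why sampling a large number of bacteria matters.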

In the near future, further advances in this technology could mean that we can securely store our confidential data in our own genetic code.


Mohit Gupta: Team IEBS

Dr. Sudhanshu Das: Team IEBS