Daily Roundup: DNA as Storage Device

(Late to the party, or what? Unfortunately, my day job every now and then interrupts my fantasy of becoming a science writer.)

Amid the building excitement around quantum computing, storage and transmission, researchers in Baltimore and Boston have managed to encode an entire book (on biology, what else?) into DNA. As Nature News puts it, it’s the largest piece of non-biological information ever encoded in a biological molecule like DNA.

The concept isn’t difficult to comprehend, although the execution is delicate, for obvious reasons. The team decided to treat the information as digital bits of ones and zeros, and encoded every character of the textbook (and, note, this includes images and a JavaScript function as well) into its ASCII representation. Then, they decided on a biological-binary mapping scheme: A(denine) and C(ytosine) would map to 0’s, and G(uanine) and T(hymine) would map to 1’s. The DNA strands would then be synthesized from scratch. To read the information back again, the researchers first multiplied the DNA using PCR, or polymerase chain reaction, which ensured that the data would be preserved with very few flaws. Then they sequenced the DNA, “reading” it back to determine the sequence of bases, and decoded it first to 0’s and 1’s and then to characters.
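
To make the mapping concrete, here is a minimal Python sketch of the round trip: text to ASCII bits, bits to bases, and back again. It only illustrates the idea described above and is not the researchers' actual pipeline. The function names are my own, and the rule for choosing between the two bases available for each bit (pick whichever one differs from the previous base) is an assumption I made up so the example has something deterministic to do.

```python
# Mapping described in the post: A and C stand for 0, G and T stand for 1.
ZERO_BASES = "AC"
ONE_BASES = "GT"


def text_to_bits(text):
    """Turn ASCII text into a string of 0s and 1s, 8 bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def bits_to_dna(bits):
    """Map each bit to one of its two allowed bases.

    Assumption (mine, for illustration only): take the first allowed base
    unless it would repeat the previous base, in which case take the second.
    """
    bases = []
    for bit in bits:
        first, second = ZERO_BASES if bit == "0" else ONE_BASES
        bases.append(second if bases and bases[-1] == first else first)
    return "".join(bases)


def dna_to_bits(dna):
    """Read bases back into bits: A or C gives 0, G or T gives 1."""
    return "".join("0" if base in ZERO_BASES else "1" for base in dna)


def bits_to_text(bits):
    """Group bits into bytes and decode them as ASCII characters."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


message = "DNA"
strand = bits_to_dna(text_to_bits(message))
recovered = bits_to_text(dna_to_bits(strand))
print(strand, recovered)
```

Because each bit has two possible bases, many different strands decode to the same text. Decoding only cares which pair a base belongs to, so it works no matter how the encoder chose between them.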

There’s a remarkable graph in their abstract here that places their work in context, when it comes to information density.

Screenshot from the online paper. Copyright George M. Church, Yuan Gao, Sriram Kosuri and http://www.sciencemag.org.

We see that the information density of DNA is a few orders of magnitude higher than the next most efficient method of encoding information. This is also a good place to point out that DNA itself has evolved to be incredibly high fidelity: it must be capable of preserving information about an entire organism through its lifespan, so information stored in it is very stable indeed. In fact, in this experiment, the researchers recovered their data with only 10 bit errors in 5.27 million, which gives us an error rate of about 0.0002%.
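
For anyone who wants to check the percentage, the arithmetic is short. The variable names below are mine; the two input numbers are the ones quoted above.

```python
bit_errors = 10
total_bits = 5_270_000  # 5.27 million bits

error_rate = bit_errors / total_bits     # fraction of bits that came back wrong
error_percent = error_rate * 100         # same figure as a percentage
bits_per_error = total_bits / bit_errors # average number of bits per error

print(f"{error_percent:.4f}%")  # prints 0.0002%
print(f"one error per {bits_per_error:,.0f} bits")
```

Put another way, that is roughly one wrong bit for every half a million bits read back.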

Of course, this isn’t going to be immediately useful for anything we need in our daily lives. The sequencing and coding will take equipment that is far too specialized at the moment, and will take too long for it to be viable as, say, an alternative to an external hard drive. But think of long-term storage. Any highly sensitive, crucial information of international import could be stored long term, in very little space. Or, if you’d like to think futuristically, it could be part of a planetary colonizing ship’s cargo centuries into the future. All of humanity’s data, in one place!

One interesting thing to note is that DIY genetic tools have been around on the fringe of homegrown labs for a couple of years now, and have definitely been coming into their own in the biohacking community. I wrote about genetic hackers previously, and a little web surfing has turned up a couple of articles (published in both Wired and Nature, interestingly enough) that highlight this growing community. If anyone’s going to come up with a way to create nano biobots that you can program through a simple desktop tool, inject into your body, and then wait for repairs or enhancements to be made… it would be the biohackers.

