Think the memory card in your camera is high-capacity? It's got nothing on DNA. With data accumulating at a faster rate now than at any other point in human history, scientists and engineers are looking to genetic code as a form of next-generation digital information storage.
Now, a team of Harvard and Johns Hopkins geneticists has developed a new method of DNA encoding that makes it possible to store more digital information than ever before. We spoke with lead researcher Sriram Kosuri to learn why the future of archival data storage is in genetic code, and why his team's novel encoding scheme represents such an important step toward harnessing DNA's vast storage potential.
Humanity has a storage problem. Recent surveys conducted by IDC Digital Universe suggest that the perfusion of technology throughout society has triggered an explosion in the volume of information that we as a species produce on a daily basis. Between photos, video, texts, tweets, Facebook updates, unsolicited FarmVille requests, Instagram posts and various other forms of digital data production, the world's information is doubling every two years, and that raises some important questions, chief among them being: where the hell do we put it all?
"In 2011 we had 1.8 * 1021 bytes of information stored and replicated" explains Sriram Kosuri, a Harvard geneticist and member of the Wyss Institute's synthetic biology platform, in an email to io9. "By 2020 it will be 50 times that. That's an astounding number; and doesn't include a much larger set of data that's thrown away (e.g., video feeds)."
As Kosuri points out, not all of this information needs to be stored, but — being the diligent little hoarders that we are — a good deal of it will be cached away somewhere for posterity; and at the rate we're generating information, we'll need to find new storage solutions if we want to have any hope of keeping up with our demand for space. "Our ability to store, manage, and archive such information is being constantly strained already," notes Kosuri. "Archival storage is also a large problem."
The (Theoretical) Solution: The Advantages of DNA Storage
Archival storage is where DNA comes in. As storage media go, it's hard to compete with the universal building blocks of life. In an article published in today's issue of Science, Kosuri — in co-authorship with geneticist Yuan Gao and synthetic biology pioneer George Church — describes a new technique for using DNA to encode digital information in unprecedented quantities. We'll get to their novel storage method in the next section, but for now let's look at some numbers that help contextualize what Kosuri identifies as the two major advantages of DNA storage: information density and stability.
At its theoretical maximum, one gram of single-stranded genetic code can encode 455 exabytes of information. That's almost half a billion terabytes, or 4.9 * 10^11 GB. (As a point of reference, the latest iPad tops out at 64 GB of storage space.) DNA strands also like to fold over on top of themselves, meaning that, unlike most other digital storage media, data needn't be restricted to two dimensions; and being able to store data in three-space translates to more free space.
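For readers who want to check the unit conversions, here is a rough sketch in Python. It assumes binary prefixes (1 GB = 2^30 bytes, 1 EB = 2^60 bytes), which is our assumption rather than anything stated by Kosuri's team; we picked it because it reproduces the gigabyte figure quoted above, whereas decimal prefixes would give 4.55 * 10^11 GB instead. The function and variable names are ours, purely for illustration.

```python
# Back-of-the-envelope check of the storage figures quoted in the article.
# ASSUMPTION: binary prefixes (powers of 1024), not decimal ones.
BYTES_PER_UNIT = {
    "GB": 2**30,
    "TB": 2**40,
    "EB": 2**60,
}

def convert(value, from_unit, to_unit):
    """Convert a quantity between the storage units defined above."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]

DNA_EB_PER_GRAM = 455   # theoretical maximum quoted in the article
IPAD_GB = 64            # top iPad capacity quoted in the article

gb_per_gram = convert(DNA_EB_PER_GRAM, "EB", "GB")
tb_per_gram = convert(DNA_EB_PER_GRAM, "EB", "TB")
ipads_per_gram = gb_per_gram / IPAD_GB

print(f"{gb_per_gram:.1e} GB per gram")         # about 4.9e+11
print(f"{tb_per_gram:,.0f} TB per gram")        # just under half a billion
print(f"{ipads_per_gram:,.0f} iPads per gram")  # roughly 7.6 billion
```

Under those assumptions, a single gram of DNA at theoretical maximum would hold about as much as 7.6 billion top-end iPads.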
DNA is also incredibly robust, and is often readable even after being exposed to unfavorable conditions for thousands of years. Every time researchers recover genetic information from a woolly mammoth specimen, or sequence the genome of a 5,300-year-old human mummy, it's a testament to DNA's durability and data life. Just try recovering files from a 5,000-year-old CD or DVD. Hell, try it with a 20-year-old disc; odds are it just isn't going to happen.