Robert Service in Science:
Humanity has a data storage problem: More data were created in the past 2 years than in all of preceding history. And that torrent of information may soon outstrip the ability of hard drives to capture it. Now, researchers report that they’ve come up with a new way to encode digital data in DNA to create the highest-density large-scale data storage scheme ever invented. Capable of storing 215 petabytes (215 million gigabytes) in a single gram of DNA, the system could, in principle, store every bit of datum ever recorded by humans in a container about the size and weight of a couple of pickup trucks. But whether the technology takes off may depend on its cost.
DNA has many advantages for storing digital data. It’s ultracompact, and it can last hundreds of thousands of years if kept in a cool, dry place. And as long as human societies are reading and writing DNA, they will be able to decode it. “DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete,” says Yaniv Erlich, a computer scientist at Columbia University. And unlike other high-density approaches, such as manipulating individual atoms on a surface, new technologies can write and read large amounts of DNA at a time, allowing it to be scaled up.
Scientists have been storing digital data in DNA since 2012. That was when Harvard University geneticists George Church, Sri Kosuri, and colleagues encoded a 52,000-word book in thousands of snippets of DNA, using strands of DNA’s four-letter alphabet of A, G, T, and C to encode the 0s and 1s of the digitized file. Their particular encoding scheme was relatively inefficient, however, and could store only 1.28 petabytes per gram of DNA. Other approaches have done better. But none has been able to store more than half of what researchers think DNA can actually handle, about 1.8 bits of data per nucleotide of DNA.
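To make the idea of encoding 0s and 1s in a four-letter alphabet concrete, here is a minimal Python sketch. It assumes a naive fixed mapping of two bits per nucleotide, which is the most a four-letter alphabet can carry per position (log2 of 4 is 2). The mapping table and the function names `encode` and `decode` are hypothetical and chosen only for illustration; this is not the scheme used by any of the groups named in the article, and it ignores the biochemical constraints and error correction a real system would need.

```python
# Hypothetical two-bits-per-nucleotide mapping (an illustrative assumption,
# not the encoding used by any research group mentioned in the article).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of nucleotides, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a nucleotide string produced by encode() back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four nucleotides under this toy mapping, so the two-byte input yields an eight-base strand. Practical schemes generally give up some of that density to avoid sequences that are hard to synthesize or read and to recover from errors.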