This multidisciplinary study in synthetic biology both proposes and demonstrates a system for the DNA-based storage of digital information. Digital information is being produced at an ever-growing rate, and keeping it in archives requires an increasing commitment to the ongoing maintenance of digital media. Surprisingly, this provides a niche for DNA, which can serve as a dense and stable information-storage medium. Nick Goldman et al. report an efficient and scalable strategy with robust error correction for encoding a record amount of information (including images, text and audio files) in DNA strands. A 'DNA archive' was synthesized, shipped from California to Germany and sequenced, and the information was read back. At the current rate of DNA synthesis cost reduction, DNA-based information storage is expected to become cost-effective within a decade for archives that are accessed only rarely and have archiving horizons shorter than about 50 years.

Digital production, transmission and storage have revolutionized how we access and use information but have also made archiving an increasingly complex task that requires active, continuing maintenance of digital media. This challenge has focused some interest on DNA as an attractive target for information storage [1] because of its capacity for high-density information encoding, longevity under easily achieved conditions [2–4] and proven track record as an information bearer. Previous DNA-based information storage approaches have encoded only trivial amounts of information [5–7] or were not amenable to scaling-up [8]; they also used no robust error correction and did not examine their cost-efficiency for large-scale information archival [9].
Here we describe a scalable method that can reliably store more information than has been handled before. We encoded computer files totalling 739 kilobytes of hard-disk storage, with an estimated Shannon information [10] of 5.2 × 10⁶ bits, into a DNA code, synthesized this DNA, sequenced it and reconstructed the original files with 100% accuracy. Theoretical analysis indicates that our DNA-based storage scheme could be scaled far beyond current global information volumes and offers a realistic technology for large-scale, long-term and infrequently accessed digital archiving. In fact, current trends in technological advances are reducing DNA synthesis costs at a pace that should make our scheme cost-effective for sub-50-year archiving within a decade.
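
To make the idea of writing a file into a "DNA code" and reading it back concrete, the following is a minimal sketch of a naive mapping that stores two bits per nucleotide. It is an illustration only and is not the encoding used in the study. The function names `bytes_to_dna` and `dna_to_bytes`, the base ordering in `BASES`, and the two-bits-per-base layout are all assumptions introduced here.

```python
# Hypothetical illustration: a naive 2-bits-per-base code.
# This is NOT the scheme described in the study; it has no error correction.
BASES = "ACGT"  # assumed ordering: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; raises ValueError on malformed input."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # str.index raises ValueError for any character outside BASES
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


message = b"DNA archive"
strand = bytes_to_dna(message)
# Reading the strand back recovers the original bytes exactly.
assert dna_to_bytes(strand) == message
```

A mapping this simple can produce long runs of a single base (a zero byte becomes `AAAA`) and carries no redundancy, so one misread base corrupts the data. It shows only the round trip from bytes to bases and back, not the robust error correction that the abstract reports.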