Every time you consume digital information, whether you are reading a document via Google Drive, browsing a website, streaming music, or watching a YouTube video, you are accessing digital data. This data is stored as sequences of binary digits (bits), each in a state of 0 or 1. In storage nomenclature, the standard unit for quantifying storage is the byte (typically, 1 byte = 8 bits), and storage capacity is expressed in multiples of bytes, such as kilobytes (KB; 1,000 bytes), megabytes (MB; 1,000,000 bytes), gigabytes (GB; 1,000,000,000 bytes), and terabytes (TB; 1,000,000,000,000 bytes), with each step up multiplying the previous unit by 1,000.
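To make the unit ladder concrete, here is a minimal Python sketch that renders a raw byte count in the decimal units described above. The function name `format_bytes` and the unit list are illustrative choices, not part of any standard library.

```python
# Decimal (power-of-1,000) storage units, smallest to largest.
SI_UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def format_bytes(n: int) -> str:
    """Render a byte count using the largest unit that keeps the value below 1,000."""
    value = float(n)
    for unit in SI_UNITS:
        # Stop when the value fits this unit, or when we run out of larger units.
        if value < 1000 or unit == SI_UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(format_bytes(1_000_000))      # one megabyte
print(format_bytes(175 * 10**21))   # the 2025 estimate quoted later in this post
```

Note that this follows the decimal convention used in this post; some operating systems report sizes in powers of 1,024 instead, which gives slightly different numbers.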
The amount of existing digital storage is huge and continually increasing. Seagate, a leading data-storage company, has estimated that global digital storage will increase from 16 zettabytes (ZB; 1,000,000,000,000,000,000,000 bytes) in 2016 to 175 ZB by 2025.
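As a rough sanity check on what that estimate implies, the sketch below computes the compound annual growth rate between the two figures. It assumes smooth exponential growth between 2016 and 2025, which is a simplification; the helper name `cagr` is ours.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate needed to go from start to end in the given years."""
    return (end / start) ** (1 / years) - 1


# 16 ZB in 2016 growing to 175 ZB by 2025 (figures from the estimate above).
rate = cagr(16, 175, 2025 - 2016)
print(f"Implied growth: about {rate:.0%} per year")
```

Under that assumption, the estimate works out to roughly 30% more stored data every year.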
To appreciate the economics of current storage needs, let’s go back in history. In 1960, the cost of storing one GB was an astounding $2 million. For the same amount of storage, costs decreased to $200,000 in the 1980s, $7.70 by the early 2000s, and, incredibly, just two cents by 2017. How did storage costs fall so much since 1960 that you could argue storage is now, effectively, free?
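One way to get a feel for this decline is to ask how often the cost per GB halved. The sketch below uses only the 1960 and 2017 endpoints quoted above and assumes a constant exponential decline in between, which the intermediate figures show is only an approximation.

```python
import math


def halving_time_years(start_cost: float, end_cost: float, years: int) -> float:
    """Years for the cost to halve, assuming a constant exponential decline."""
    decay_rate = math.log(start_cost / end_cost) / years  # continuous rate per year
    return math.log(2) / decay_rate


# $2,000,000 per GB in 1960 down to $0.02 per GB in 2017.
t = halving_time_years(2_000_000, 0.02, 2017 - 1960)
print(f"Cost per GB halved roughly every {t:.1f} years")
```

On these numbers the price of a gigabyte halved about every two years, sustained for more than half a century.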
To understand the answer to this question, let’s start with the basics of digital storage. Every computing device (e.g., your computer or smartphone) has two types of storage: primary and secondary. Primary storage, most commonly random access memory (RAM) and also referred to as system memory or volatile memory, offers temporary storage within a device and enables quick access to data while the device is powered on. Secondary storage, often just called “storage” (as we will designate it henceforth), stores data permanently and retains it without a power supply. It is a repository for all kinds of digital information, including operating system software, applications, and multimedia for smartphones, personal computers, data centers, and the cloud. The rest of this blog post examines different media for digital storage and the drivers of the exponential reduction in storage costs over time.
One of the earliest and most basic storage media is magnetic tape. As with the audio cassette tapes of a few decades ago, information is stored as a series of binary 1s or 0s on a thin polymer substrate, encoded in the magnetic polarity of each region of the tape. Tape makers have pursued a “denser, faster, cheaper” approach that has increased information density per unit area of polymer by 30% every year for the last few decades, leading to a rapid reduction in storage costs. Today, tape is the cheapest way to archive large amounts of information; IBM recently demonstrated an ability to “record data at . . . a density of 201 [gigabits] per square inch on magnetic tape . . . [which] translates to 330 TB of data stored in a tape cartridge about the same size as the palm of your hand.”
One notable benefit of magnetic tape is enhanced security. Since tapes are stored offline, they are relatively immune to cyberattacks and software bugs; hence, they are widely used as archives and backups. For example, in 2011, Gmail experienced a software bug that affected all online copies of some users’ data. Fortunately, Google had backed up user data on tapes and, even though it took over 30 hours, was able to restore full account access to customers.
In the mid-1950s, IBM introduced hard disc drives (HDDs), which replaced the polymer tape substrate with rigid spinning discs. As with tapes, digital information is stored by magnetizing regions of a disc in one of two polarities. A head mounted on a mechanical arm moves back and forth to read or write information at the right location on a disc. HDDs became increasingly popular because they were sturdier than tapes and required less time to retrieve information. With massive research and development investments in miniaturization and nanophysics, the read/write mechanism became progressively smaller, and disc rotation speeds, measured in rotations per minute (RPM), increased. For example, today the gap between the head and the disc is just a few nanometers, and discs in high-performance drives can spin at 10,000 RPM or more. This has increased information density per unit area of a disc and led to lower costs: HDDs cost just a few cents per GB of storage.
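Rotation speed matters because the head has to wait for the right part of the disc to spin underneath it. As a back-of-the-envelope sketch, the average wait is the time for half a rotation; the function name below is ours, and the calculation ignores the time the arm needs to move between tracks.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half of one full rotation, in milliseconds."""
    ms_per_rotation = 60_000 / rpm  # 60,000 ms in a minute
    return ms_per_rotation / 2


for rpm in (5_400, 7_200, 10_000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms average rotational latency")
```

At 10,000 RPM the average rotational wait comes to 3 ms, which is why faster spindles translate directly into quicker retrieval.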
Another variant of the same approach is optical storage, where instead of a magnetic head, laser beams and optical reflection are used to store and retrieve information from a spinning disc. Over the last few decades, optical storage has become especially popular for multimedia applications, including music CDs, DVDs for multimedia content, and Blu-ray discs for high-definition videos.
However, HDDs have one major disadvantage: the moving mechanical parts not only suffer wear and tear but also, with prolonged usage, generate heat and vibration. Solid-state drives (SSDs), also known as flash drives, address this challenge by storing data in flash memory chips that have no moving parts, with a processor (called a controller) managing reads and writes. Compared to HDDs, flash drives are faster, lighter, quieter, and more durable, yet they are also 10–50 times more expensive. SSD technology is widely used in today’s smartphones, USB sticks, navigation devices, tablets, laptops, and video game consoles.
Currently, from a consumer perspective, storage is offered primarily through the cloud. Major players have taken a freemium approach: Dropbox offers 2 GB of free storage to consumers, with tiered pricing levels for 1 TB and 2 TB of storage available for a monthly fee. Similarly, Google Drive offers 15 GB of free storage to anyone with a Google account, and offers monthly pricing options for 100 GB and 200 GB storage. From an enterprise perspective, storage has historically been handled by on-site data centers, but enterprises are increasingly using multiple hybrid storage solutions to maximize capabilities, security, and access (see our previous blog, Can Enterprise Data Centers Co-Exist with Cloud?). Data centers have mainly relied on HDDs for storage in the past, but a shift to SSDs has recently occurred.
For all the ongoing improvements in storage density and cost, current technologies might not be enough to support the increased demand for storage. With an expected exponential increase in data generation, it is estimated that all microchip-grade silicon in the world will be consumed by 2040. This raises the question as to whether there is a non-silicon innovation that could be used instead of traditional storage methods.
Fascinatingly, this storage breakthrough might come from DNA, the molecule that has been storing genetic information reliably for billions of years. Just as digital information is expressed in a series of 0s and 1s, DNA stores information in combinations of four bases: adenine (A), cytosine (C), guanine (G), and thymine (T). With four possible bases, each position in a DNA strand can in principle carry two bits. Researchers are testing ways to translate binary digital code into combinations of DNA bases. Once the translation of data into DNA is complete, the DNA can be stored safely at cold temperatures, and when needed, the digital binary can be retrieved from the stored DNA. According to the Financial Times, “we need [only] about 10 tons of DNA to store all of the world’s data[;] . . . that’s something you could fit on a semi-trailer.” DNA storage is currently both slow and expensive, but based on the history of technology, this might not be a problem for long.
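To illustrate the idea of translating binary code into DNA bases, here is a toy Python sketch that maps every pair of bits to one base and back again. The mapping (00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen for simplicity, not the scheme any particular research group uses; practical encodings also add error correction and avoid sequences that are hard to synthesize or read.

```python
# Illustrative mapping only: index 0..3 corresponds to bit pairs 00, 01, 10, 11.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Translate bytes into a string of bases, two bits per base."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # walk the byte from its highest bit pair down
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Translate a string of bases back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases (one byte)")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # ValueError on an unknown base
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so the two-character message above needs only eight bases, which hints at why DNA is so dense as a storage medium.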