According to IDC, the volume of data worldwide will grow to an unfathomable 40,000 exabytes by 2020 -- or approximately 5,247 gigabytes of data per human on the planet. Furthermore, according to the DatacentreDynamics 2012 Global Census, the worldwide data center energy requirements to store and process this accumulating data grew 63% in just one year to reach 38 GW in 2012, or roughly the equivalent of 30 nuclear power plants.
This is why financial institutions, which maintain some of the world's largest collections of analytic and compliance-mandated information, are reexamining their data center operations to find cost-controlling and energy-efficient approaches to managing this growth.
For example, a new metric, Power Usage Effectiveness (PUE), is being used to identify wasteful energy practices in data center infrastructure. According to data center specialists at IO, "PUE calculates the ratio of total energy entering the data center in relation to the energy that is actually consumed by IT equipment." The energy that reaches the computing equipment is considered productive, while energy used for infrastructure (e.g., cooling, lighting) is viewed as waste. New data centers, such as Facebook's Prineville Data Center, are now achieving 91% efficiency, or a PUE of roughly 1.1, approaching the ideal of 1.0.
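The ratio IO describes can be written down directly. The following is a minimal sketch in Python; the function names and the sample energy figures are illustrative assumptions, not measurements from any facility. It assumes that "efficiency" means the share of incoming energy that reaches IT equipment, which makes it the inverse of PUE.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total energy in / energy used by IT gear."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def efficiency(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Share of incoming energy that reaches IT equipment (inverse of PUE)."""
    return 1.0 / pue(total_facility_kwh, it_equipment_kwh)


# Hypothetical figures: 100 kWh enters the building, 91 kWh reaches IT equipment.
example_pue = pue(100.0, 91.0)
example_efficiency = efficiency(100.0, 91.0)
print(f"PUE = {example_pue:.2f}, efficiency = {example_efficiency:.0%}")
```

With these sample inputs, the sketch reports a PUE of about 1.10 for a facility in which 91% of incoming energy reaches the IT equipment.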
However, a survey from Digital Realty Trust indicates that there is more work to be done. A drawback of PUE is that it does not account for the energy efficiency of the computer and storage hardware itself, the primary consumer of all this power. Just as we educate homeowners on the importance of turning off the lights when they are not home, we need to examine whether the data center hardware, and the data being stored, really need all that power all the time.
Matching Costs To Data Use
While PUE is the new measure of infrastructure efficiency, ROB (Return on Byte) -- or the measure of a data set's use in relation to its storage cost -- is a better KPI to consider for data center hardware efficiency. In this regard, it is important to note that not all data is needed all the time. Recent studies have recognized that a growing proportion of data stored -- as much as 92% according to Facebook's own estimates -- is "inactive," or archival, data (i.e., data that has been captured but quickly becomes very infrequently used).
Clearly, storing this seldom-used data on the same expensive, always-on hardware that supports day-to-day operations does not make sense from an ROB perspective. Inactive data is a big deal: IDC estimates that capacity-optimized storage growth for inactive data will far exceed that of active data, and that by 2016 nearly 90 exabytes of storage will be required for inactive data alone.
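No formula for ROB is given here, so the sketch below assumes one plausible reading: reads served per dollar of monthly storage cost. The data set names, prices, and access counts are all hypothetical, chosen only to illustrate why inactive data scores poorly on always-on hardware.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    size_tb: float            # stored volume in terabytes
    reads_per_month: int      # how often the data is actually used
    cost_per_tb_month: float  # hypothetical storage price for its tier


def return_on_byte(ds: DataSet) -> float:
    """Assumed ROB definition: monthly reads per dollar of monthly storage cost."""
    monthly_cost = ds.size_tb * ds.cost_per_tb_month
    if monthly_cost <= 0:
        raise ValueError("storage cost must be positive")
    return ds.reads_per_month / monthly_cost


# Hypothetical data sets; none of these numbers come from the studies cited.
active = DataSet("trading-ledger", 10, 50_000, 100.0)
archive_hot = DataSet("old-statements", 10, 20, 100.0)   # archive on hot disk
archive_cold = DataSet("old-statements", 10, 20, 10.0)   # same archive, cheaper tier

for ds in (active, archive_hot, archive_cold):
    print(f"{ds.name:15s} ROB = {return_on_byte(ds):.3f} reads per dollar")
```

Under this assumed definition, moving the same archive to a tier that costs one tenth as much raises its ROB tenfold, even though its usage has not changed.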
Between A Rock And A Hard Place
To date, IT professionals have had two choices for data storage when it comes to ROB: storing data on systems that are optimized for millisecond retrieval times and high I/O, or parking data on tape.
The advantages of tape are well known. Tapes are portable, are much less expensive than high-performance storage and don't use energy all the time. Yet the use of tape is declining because, as data grows beyond petabytes, the physical and financial burden of maintaining extensive tape libraries increases. More importantly, data on tape takes a significant amount of time to be written and recovered in a useful way, making it difficult for IT professionals to meet backup windows. Furthermore, once data is written to tape, it is difficult to delete completely when no longer required, a compliance concern for the financial industry. For many archival applications, the need for flexible, disk-based retrieval is driving a move to disk-to-disk and VTL (Virtual Tape Library) architectures, despite the additional cost.
Cold Storage On Disk
If archival data does not warrant a real-time response, why not just turn off the storage disks when they are not in use? In fact, this is exactly the approach that has been adopted by Facebook and shared via the Open Compute Project under the new name, Cold Storage. By storing inactive data using disks that are turned off at least 50% of the time, and by accepting a 30-second spin-up delay for retrieval, three good things happen:
- Open-standards hardware with commodity desktop drives, easily the cheapest storage medium around today, can be safely used.
- The life of the disks (the most expensive component of storage) is doubled, as drives are powered down nearly 90% of the time.
- The power required to operate a rack of powered-down disks can be substantially reduced to under 3 kW, an energy savings of at least half.
In this way, Cold Storage economically fills the gap between hot disks and tape for much of the storage that is currently kept on disk but not ready for offload to tape.
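The energy argument in the list above reduces to a duty-cycle calculation. A rough sketch follows; the drive count, the per-drive wattages, the fixed overhead, and the function name are assumptions for illustration and are not taken from the Open Compute Project specification.

```python
def average_rack_power_kw(drives: int, active_w: float, standby_w: float,
                          powered_fraction: float, overhead_kw: float = 0.0) -> float:
    """Mean rack power when each drive spins only part of the time."""
    if not 0.0 <= powered_fraction <= 1.0:
        raise ValueError("powered_fraction must be between 0 and 1")
    # Time-weighted average draw of a single drive, in watts.
    per_drive_w = powered_fraction * active_w + (1.0 - powered_fraction) * standby_w
    return drives * per_drive_w / 1000.0 + overhead_kw


# Hypothetical rack: 480 drives, 8 W spinning, 1 W in standby, 0.5 kW fixed overhead.
always_on = average_rack_power_kw(480, 8.0, 1.0, powered_fraction=1.0, overhead_kw=0.5)
cold = average_rack_power_kw(480, 8.0, 1.0, powered_fraction=0.1, overhead_kw=0.5)
saving = 1.0 - cold / always_on
print(f"always on: {always_on:.2f} kW, cold: {cold:.2f} kW, saving: {saving:.0%}")
```

With these assumed inputs, powering drives only 10% of the time cuts average rack power by more than half. The real figure depends on the drive model, how often data is retrieved, and the energy spent on each spin-up, which this sketch ignores.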
Some may remember Copan and its early efforts at Cold Storage with MAID (Massive Array of Idle Disks). Although Copan was ahead of its time, recent advances in network speeds to support higher throughput rates, along with evolutions in disk drive technology and SATA connections, make this a perfect time to revisit the benefits of Cold Storage technology. The result will be significant cost savings and energy efficiency in the fastest-growing area of data storage, where an impact is sure to be felt.
About The Author: Jeff Flowers is a seasoned technology executive with deep IT and technical expertise. As the former Co-Founder and CTO of consumer backup company, Carbonite, Jeff experienced first-hand the challenges of effectively managing nearly 100 petabytes of data storage under the constraints of a cost-driven business model. In 2009, while at Carbonite, Jeff was named CTO of the Year by the Mass Technology Leadership Council. Carbonite went public in the fall of 2011. Jeff left Carbonite in April 2012 to found SageCloud, a data storage software company, but remains an active member of Carbonite's Board of Directors. Jeff holds a B.S. and M.S. in Information and Computer Science from the Georgia Institute of Technology and attended the Northeastern University School of Business.