The demand for data storage seems insatiable. It has been growing by 50 percent every year, which is faster than the cost of storage is falling. And every new big data strategy pushes that demand even higher.
Eighty percent of big data developers expect their data storage needs to increase in the next year, and 75 percent of organizations using big data will buy more storage. According to IDC, big data is driving storage growth at a compound annual growth rate (CAGR) of 53 percent:
"Storage will be one of the biggest areas of infrastructure spending for Big Data and analytics environments over the forecast period," said Ashish Nadkarni, Research Director, Storage Systems. "Revenue from storage consumed by BD&A environments will increase from a mere $379.9 million in 2011 to nearly $6 billion in 2016. This growth will come largely from capacity-optimized systems (including dense enclosures), however, software-based distributed storage systems with internal disks to store post-processed data will also be embraced by some users."
As Nadkarni notes, some organizations will turn to software-based distributed storage, and the benefits of software defined storage will shape more and more big data strategies. As big data analytics demands more data from highly distributed storage sources, software defined storage may prove to be the only practical way to support real-time big data analytics.
Software Defined Storage Supports Big Data Volume
The ability to manage the storage, data flow, and access of vast quantities of unstructured data is just one of the software defined storage benefits. Big data analytics increasingly relies on unstructured data from rich media, wireless devices, email, social media conversations, and other sources. The volume of such data makes it virtually impossible to store it all within an enterprise, so any big data strategy has to include cloud storage as well as disk arrays. Using software defined storage is the most effective way to manage these massive data stores.
Storage virtualization, in the form of software defined storage, is an efficient way to manage big data resources. Software defined storage abstracts the data from the actual storage hardware. By creating a virtual representation of physical storage, it makes access to big data resources easier to configure. You can even pool dissimilar storage systems, for example those holding structured and unstructured data, as one virtual data repository, without concern for the underlying hardware, its physical location, or the storage platforms used.
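To make the pooling idea concrete, here is a minimal sketch in Python. It is not any vendor's API: the `Backend`, `InMemoryBackend`, and `VirtualPool` names are hypothetical, the backends are simulated in memory, and the placement rule (write to whichever backend has the most free space) is just one assumption among many possible ones.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical interface that each physical storage system exposes."""

    @abstractmethod
    def put(self, key, data): ...

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def free_bytes(self): ...


class InMemoryBackend(Backend):
    """Stand-in for a disk array or a cloud bucket; capacity is in bytes."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self._items = {}

    def put(self, key, data):
        self._items[key] = data

    def get(self, key):
        return self._items[key]

    def free_bytes(self):
        return self.capacity - sum(len(v) for v in self._items.values())


class VirtualPool:
    """Presents several dissimilar backends as one repository."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._placement = {}  # key -> backend that holds it

    def put(self, key, data):
        if key in self._placement:
            raise ValueError(f"{key!r} already stored (write-once in this sketch)")
        # Assumed placement rule: pick the backend with the most free space.
        target = max(self._backends, key=lambda b: b.free_bytes())
        if target.free_bytes() < len(data):
            raise OSError("no backend has room for this item")
        target.put(key, data)
        self._placement[key] = target

    def get(self, key):
        # The caller never names a backend; the pool resolves it.
        return self._placement[key].get(key)

    def total_free(self):
        return sum(b.free_bytes() for b in self._backends)


pool = VirtualPool([
    InMemoryBackend("disk-array", 100),
    InMemoryBackend("cloud-bucket", 1000),
])
pool.put("clickstream-2016-01", b"x" * 50)
print(pool.get("clickstream-2016-01") == b"x" * 50, pool.total_free())
```

The caller only ever talks to the pool, so a backend can be added or swapped without touching application code.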
Object-based Storage Offers Scalability and Performance
Big data requires bigger data sets spread across disparate data sources, and as the data sets get bigger, object-based storage becomes the best way to feed big data analytics. Breaking data away from conventional file-based hierarchies and using unique identifiers for data objects is the last step in virtualizing data storage.
By using objects as part of software defined storage, you gain virtually unlimited scalability. Object stores run on storage clusters and commodity servers rather than proprietary appliances, and they are ideal for cloud storage. Using a unique identifier for each object makes it possible to retrieve the object without having to know the physical location of the data. When dealing with an increasingly large number of data sets, as with big data, object-based storage is the only practical way to manage them.
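As an illustration of retrieval by identifier, the Python sketch below keeps objects in a flat namespace and derives the storage node from the object ID itself. The `ObjectStore` class and the node names are hypothetical, and the modulo placement is a deliberate simplification; production object stores typically use more elaborate placement schemes.

```python
import hashlib
import uuid


class ObjectStore:
    """Flat namespace: every object is addressed by ID, never by path."""

    def __init__(self, node_names):
        # Each node is simulated as a dictionary of object_id -> (data, metadata).
        self._nodes = {name: {} for name in node_names}
        self._names = sorted(self._nodes)

    def _node_for(self, object_id):
        # Derive the node from a hash of the ID, so no lookup table is needed.
        digest = hashlib.sha256(object_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def put(self, data, metadata=None):
        object_id = uuid.uuid4().hex  # unique identifier handed back to the caller
        node = self._node_for(object_id)
        self._nodes[node][object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        # The caller supplies only the ID; the physical location is computed.
        return self._nodes[self._node_for(object_id)][object_id]


store = ObjectStore(["node-a", "node-b", "node-c"])
oid = store.put(b"sensor readings", {"source": "wireless-device"})
data, meta = store.get(oid)
print(data, meta)
```

Because metadata travels with the object, analytics jobs can filter on it without consulting a separate file hierarchy.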
And you can run software defined storage on everyday hardware. You can use an object-based storage system to handle outages and rebalance data access based on the state of each storage server, and since you are moving objects rather than blocks of data, performance and scalability become far less of a constraint.
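The outage handling described above can be sketched as follows. This is an assumption-laden toy, not how any particular product works: the `ReplicatedStore` class is hypothetical, nodes are simulated as dictionaries, and placement uses rendezvous (highest-random-weight) hashing, chosen here because removing a node only moves the objects that node held.

```python
import hashlib


def _score(node, object_id):
    """Deterministic per-(node, object) weight used to rank candidate nodes."""
    digest = hashlib.sha256(f"{node}:{object_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ReplicatedStore:
    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.healthy = set(nodes)
        self.data = {node: {} for node in nodes}

    def _owners(self, object_id):
        # The highest-scoring healthy nodes hold the replicas.
        ranked = sorted(self.healthy, key=lambda n: _score(n, object_id), reverse=True)
        return ranked[: self.replicas]

    def put(self, object_id, payload):
        for node in self._owners(object_id):
            self.data[node][object_id] = payload

    def get(self, object_id):
        for node in self._owners(object_id):
            if object_id in self.data[node]:
                return self.data[node][object_id]
        raise KeyError(object_id)

    def mark_down(self, failed):
        """Take a node out of service and restore the replica count elsewhere."""
        self.healthy.discard(failed)
        affected = list(self.data[failed])  # IDs only; payloads there count as lost
        self.data[failed] = {}
        for object_id in affected:
            survivor = next(
                (n for n in self.healthy if object_id in self.data[n]), None
            )
            if survivor is None:
                continue  # no surviving replica to copy from
            payload = self.data[survivor][object_id]
            for node in self._owners(object_id):
                self.data[node].setdefault(object_id, payload)


store = ReplicatedStore(["n1", "n2", "n3", "n4"], replicas=2)
store.put("report-42", b"quarterly totals")
store.mark_down("n1")
print(store.get("report-42"))
```

Whole objects are copied to their new owners, so the rebalancing step needs no knowledge of block layouts on the failed server.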
Policy-based Storage Access
Automating data access using policy-based management is perhaps the greatest of the software defined storage benefits. Programmers can provision storage for big data projects without having to think about hardware attributes. The result is a shared pool of data on commodity hardware, operating in its own open storage architecture. It also makes it easier to handle larger workloads for big data analytics. And the entire infrastructure is accessible from a single control point.
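A rough sketch of policy-based provisioning might look like the Python below. Everything in it is illustrative: the `Pool` and `Policy` fields, the pool names, and the capacity and cost figures are invented for the example, and the selection rule (cheapest pool that satisfies the policy) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str
    media: str          # e.g. "ssd", "hdd", "cloud"
    free_gb: int
    cost_per_gb: float  # illustrative figure, not real pricing


@dataclass(frozen=True)
class Policy:
    name: str
    allowed_media: tuple  # media types the workload may land on
    min_headroom_gb: int  # free space that must remain after provisioning


def provision(policy, size_gb, pools):
    """Single control point: choose a pool from the policy, not from hardware details."""
    candidates = [
        p for p in pools
        if p.media in policy.allowed_media
        and p.free_gb - size_gb >= policy.min_headroom_gb
    ]
    if not candidates:
        raise LookupError(f"no pool satisfies policy {policy.name!r}")
    # Assumed tie-breaker: the cheapest pool that meets the policy wins.
    return min(candidates, key=lambda p: p.cost_per_gb)


pools = [
    Pool("fast", "ssd", 500, 0.20),
    Pool("bulk", "hdd", 5000, 0.05),
    Pool("archive", "cloud", 100000, 0.01),
]
analytics = Policy("analytics", ("ssd", "hdd"), 100)
print(provision(analytics, 200, pools).name)
```

The programmer asks for 200 GB under the "analytics" policy and never mentions a specific array; changing where the data lands means editing the policy, not the application.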
Programmatic access to software defined storage is the only practical way to keep up with the demands of big data; the multiple steps and conditions involved in big data analytics would be impossible to configure by hand.
And, of course, lower operating costs are another of the software defined storage benefits. Using object-based data storage and policy-based management means you can draw on cloud storage as needed. That means fewer dedicated storage arrays and a lower investment in storage hardware.
These are only a few of the benefits of using software defined storage for your big data strategy. What benefits can you add to the list?