Software defined storage is getting a lot of trade media attention lately because of the possibilities it offers for demanding applications such as big data. The ability to decouple the physical storage hardware from the controlling software opens up new levels of performance and scalability that promise big results to big data VARs. And while vendors are offering software defined storage today, software defined storage trends are still evolving.
Software defined storage development is lagging behind software defined networking, largely because of storage's tight integration with controllers and media. However, virtualization and other developments are making it easier to create virtual data storage pools that can be used for applications such as big data analytics.
One of the biggest obstacles to software defined storage adoption is the need to support storage heterogeneity. Storage hardware and operating systems tend to vary more than server hardware. Even different storage arrays from the same vendor can have multiple operating systems and unique features. Storage is one of the fastest growing market segments in IT, and the result is a hodgepodge of storage types supporting the enterprise, each with unique performance and data protection characteristics.
Storage Capacity Moves to the Cloud
According to IDC, annual sales of storage capacity will increase 30 percent from now until 2017, and enterprises are on track to buy 138 exabytes of storage capacity in 2017. Clearly, data storage can be a profitable offering for big data VARs. However, the growth rate is expected to start slowing thanks to an emerging trend toward “lean storage.” Data deduplication, compression, and storage virtualization are slowing demand for enterprise storage, although the appetite for storage continues to grow with demands for more data to support projects like big data. With software defined storage removing the physical boundaries to data repositories, much of that data will migrate to cloud storage systems.
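Deduplication is one of the "lean storage" techniques mentioned above. The idea is simple: hash each chunk of incoming data and keep only one physical copy of identical chunks. Here is a minimal sketch in Python; the `DedupStore` class, its fixed-size chunking, and the byte counters are illustrative, not any vendor's implementation.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}        # SHA-256 digest -> chunk bytes
        self.logical_bytes = 0  # bytes clients wrote
        self.stored_bytes = 0   # bytes actually kept on "disk"

    def write(self, data):
        """Split data into fixed-size chunks, storing each unique chunk once.
        Returns the list of digests needed to reassemble the data."""
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                self.stored_bytes += len(chunk)
            self.logical_bytes += len(chunk)
            digests.append(digest)
        return digests

    def read(self, digests):
        return b"".join(self.chunks[d] for d in digests)

store = DedupStore()
block = b"x" * 4096
recipe = store.write(block * 10)  # ten identical chunks
print(store.logical_bytes)        # 40960 bytes written by the client...
print(store.stored_bytes)         # ...but only 4096 bytes stored
```

Real systems use variable-size chunking and persist the index, but the effect is the same: highly repetitive data (backups, VM images) consumes far less physical capacity than its logical size suggests.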
Using cloud services to support software defined storage offers a number of advantages:
- Improved visibility and simplified administration of storage resources.
- Scalability by adding more cloud storage capacity on demand; a must for big data projects.
- Fast provisioning using automated processes with high availability; also important to deliver the performance big data demands.
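The "capacity on demand" and "fast provisioning" points above usually come down to an automated policy: project near-term growth and request more cloud capacity before you run out. A minimal sketch of such a policy follows; the function name, thresholds, and 512 GB increment are all hypothetical, not any cloud provider's API.

```python
def plan_provisioning(used_gb, total_gb, growth_gb_per_day,
                      headroom_days=7, increment_gb=512):
    """Hypothetical auto-provisioning policy: if projected usage over the
    next `headroom_days` would exceed current capacity, request enough
    `increment_gb`-sized cloud volumes to restore headroom."""
    projected = used_gb + growth_gb_per_day * headroom_days
    if projected <= total_gb:
        return 0  # no new capacity needed
    shortfall = projected - total_gb
    increments = -(-shortfall // increment_gb)  # round up to whole increments
    return increments * increment_gb

# 900 GB used of 1 TB, growing 50 GB/day: a week out we'd need ~1250 GB,
# so the policy requests one 512 GB increment now.
print(plan_provisioning(used_gb=900, total_gb=1024, growth_gb_per_day=50))
```

In practice this logic would run on a schedule and call the cloud provider's provisioning API, which is exactly the kind of automation that keeps a big data pipeline from stalling on a full volume.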
More Virtualized Storage
An important distinction that big data VARs should appreciate is that virtualized storage is not the same as software defined storage; virtualization is a subset of software defined storage.
Virtualization abstracts workloads from the underlying hardware, essentially transferring the workload to a virtual machine, which is really nothing more than a software construct. Software defined storage decouples the storage hardware from the control software in order to make control autonomous from the hardware, but it also does much more. Where storage virtualization pools distributed data sources into a central virtual repository, software defined storage gives you control over that data, including how the data is accessed, and lets you automate those controls.
As big data VARs understand, virtualization is an important part of any big data project. Virtualizing data storage and computing resources is the best way to get the speed and response time needed for big data analytics. Using virtualized storage allows you to pool data from enterprise and cloud repositories in a seamless fashion and get the performance you need; software defined storage allows you to control and automate access to the data pool.
Migration to More Flash-based Storage
Another of the leading software defined storage trends is the adoption of more flash-based data storage. All-flash storage systems and storage systems that mix solid-state drives (SSDs), hard disks, and flash-based caches are becoming increasingly popular to boost overall performance. Since software defined storage opens up storage possibilities to any platform, adding more flash storage is logical and very cost-effective.
Performance and scalability are the two biggest storage requirements for big data. Available data storage has to be scalable to accommodate changing demands, such as adding social media conversations to big data analytics. The data storage also has to be elastic enough to handle vast amounts of new data as needed. And storage has to be fast in order to meet the demands of real-time analytics.
Flash-based storage meets both of these criteria. Deploying multiple virtual machines on the same physical server increases demand for more input/output operations per second (IOPS), so the solution is either server-side flash or more SSDs.
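Some rough arithmetic shows why consolidation pushes hosts toward flash. The figures below (150 IOPS for a 10K RPM hard disk, 50,000 IOPS for a SATA SSD, 500 IOPS per VM) are ballpark illustrations, not benchmarks of any specific product.

```python
def required_iops(vm_count, iops_per_vm):
    """Aggregate IOPS a virtualization host must sustain (illustrative)."""
    return vm_count * iops_per_vm

hdd_iops = 150     # rough figure for a 10K RPM hard disk
ssd_iops = 50_000  # rough figure for a SATA SSD

demand = required_iops(vm_count=20, iops_per_vm=500)
print(demand)                  # 10000 aggregate IOPS
print(-(-demand // hdd_iops))  # ~67 hard disks to satisfy the load
print(-(-demand // ssd_iops))  # a single SSD covers it
```

Spindle counts that large are impractical for one host, which is why dense VM deployments gravitate to server-side flash or SSD tiers even when per-gigabyte cost favors disk.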
When you incorporate flash-based storage using software defined storage you get more options for optimization, new possibilities in data center design, new ways to virtualize big data workloads, and generally better performance and scalability.
Software defined storage is going to force enterprise architects to rethink their data storage designs using commodity hardware, cloud storage, virtual storage arrays, and abstract control methodologies. These innovations in data storage are what make new trends such as big data analytics possible.