Big data doesn’t have to mean a big budget. Even without the financing needed for a large data center, small and medium-sized businesses (SMBs) can take advantage of big data, including big data storage. It’s all a matter of finding creative ways to harness cloud computing for well-defined projects. By using the cloud, storage capacity for projects such as big data can grow as the needs of the company grow.
Data storage is one of the biggest costs of big data projects, and demand for it is growing rapidly. According to a Research Now survey of CIOs:
- 37 percent of CIOs indicated they are storing between 500,000 and 1 million GB of data. These CIOs anticipated their data storage needs will grow by 11 percent over two years.
- 19 percent of CIOs are storing between 499 and 1 million GB of data. CIOs said their data storage needs will grow 21 percent in two years.
- 60 percent of CIOs surveyed said they will outgrow their storage within 12 months.
- 46 percent said they retain data for six to 10 years for regulatory compliance.
- 80 percent of CIOs said they are paying between $0.11 and $0.50 per GB/month to store cold data.
Hadoop helps make these data stores more accessible and less expensive. Consider that, in 2008, Oracle sold a database with 168 TB of data for $2.33 million, or roughly $14,000 per TB (not including operational costs). In 2012, RainStor estimated that running a 75-node Hadoop cluster supporting 300 TB would cost $1.05 million over three years. For SMBs, it’s a matter of finding the right combination of data stores to suit the organization’s analytic needs and budget.
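To make the comparison concrete, the arithmetic behind those two figures can be sketched in a few lines of Python. The function and variable names here are illustrative only, and the dollar amounts and capacities are the ones quoted above.

```python
def cost_per_tb(total_cost_usd: float, capacity_tb: float) -> float:
    """Return the cost in US dollars for each terabyte of capacity."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return total_cost_usd / capacity_tb


# Figures quoted in the article.
oracle_2008 = cost_per_tb(2_330_000, 168)   # purchase price, no operating costs
hadoop_2012 = cost_per_tb(1_050_000, 300)   # three-year running-cost estimate
ratio = oracle_2008 / hadoop_2012

print(f"Oracle (2008): ${oracle_2008:,.0f} per TB")
print(f"Hadoop (2012): ${hadoop_2012:,.0f} per TB")
print(f"Ratio: {ratio:.1f}x")
```

The two numbers are not strictly like for like: the Oracle figure excludes operational costs, while the Hadoop figure is an estimate of running costs over three years. Even so, the per-terabyte gap is roughly fourfold.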
Start With Small Data
To take advantage of big data while minimizing storage costs, SMBs have to stay focused on big data projects that will yield bigger returns. SMBs need to identify the business value first and then assess their big data options.
For example, will social media and blog analytics improve sales? What about analyzing customer service and churn in order to understand customer retention? These kinds of projects take advantage of data stored in-house, such as sales and customer records, and leverage large pools of external data, such as Web and social media analytics. Big data is ideal for this kind of query since it can accommodate large pools of unstructured data, such as social media commentary. The results should yield insight that justifies the cost of the project.
Using cloud services allows SMBs to start small and to scale as needed. It also minimizes up-front capital expenditures, and SMBs can control costs by turning off services that aren’t needed and by using open-source software wherever possible.
An experienced big data architect or reseller can assist with a design strategy that takes into consideration the SMB’s current enterprise infrastructure and big data storage needs. The goal is to design an infrastructure that doesn’t box you into a corner, that delivers big data ROI, and that lays the groundwork for future projects and big data growth.
Expanding Options With Object Storage
SMBs are going to have various types of data storage systems to handle business computing and backup. A big data storage strategy has to be able to eliminate the barriers between data silos in order to access these data storage pools for big data analysis. Let’s consider some of the most common types of data storage:
- Direct-attached storage (DAS) – Data storage attached directly to a PC or server, typically through an interface such as USB, SATA, or SAS.
- Network-attached storage (NAS) – A storage device connected directly to the network. Like a file server, it accepts multiple storage drives using RAID for redundancy.
- Online storage – Cloud services for data storage. Note that there are cloud data pools for active data storage, and there are cloud storage services specifically designed for archiving cold data, which present their own issues for big data access.
- Private cloud – A proprietary implementation of cloud data storage that offers the same flexibility as cloud services but without having to leave sensitive data in the hands of third-party cloud vendors. Private cloud services have become more affordable and are now within economic reach of many SMBs.
All of these data storage options can be included as part of SMB big data storage, which means you need to apply the right technology in order to provide data access across all platforms for analytics.
Object data storage is increasingly being adopted for big data storage strategies. Rather than moving huge blocks of data around the network, object-based storage uses extended metadata to abstract the storage files. That way, analytics software can gather stored data without having to physically move the data files or know where the data is stored. Object-based storage is especially valuable for cloud computing, since it allows source files to be stored almost anywhere.
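The idea of finding data through metadata instead of through file locations can be illustrated with a toy, in-memory sketch. This is not any vendor's API: the `ObjectStore` class, its `put`, `find`, and `get` methods, and the metadata keys `source` and `year` are all hypothetical names chosen for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str
    location: str          # where the bytes live; callers never need this
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy object store: objects are found by metadata, not by path."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, data: bytes, location: str, **metadata) -> str:
        """Store an object and return its generated identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(
            object_id, location, data, dict(metadata)
        )
        return object_id

    def find(self, **criteria) -> list[str]:
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid
            for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]

    def get(self, object_id: str) -> bytes:
        """Fetch an object's contents by ID, wherever it is stored."""
        return self._objects[object_id].data


store = ObjectStore()
store.put(b"order,total\n1001,59.90\n", location="nas", source="sales", year=2013)
store.put(b"great service!", location="public-cloud", source="social", year=2013)
store.put(b"order,total\n0907,12.00\n", location="private-cloud", source="sales", year=2012)

# The analytics side asks for "sales" data without knowing where it lives.
sales_ids = store.find(source="sales")
sales_data = [store.get(oid) for oid in sales_ids]
```

In this sketch the sales records sit in two different places (a NAS and a private cloud), yet the query is the same single metadata lookup. Real object stores add durability, access control, and network transport, none of which is modeled here.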
Ultimately, the goal of big data is to yield meaningful analytics, not to store more data. Using object storage, the data can reside anywhere, either on premises or in the cloud, and big data software can access the file, examine it, and release it using node-centralized storage rather than vast data pools.
Choosing the right data sources for big data analytics for SMBs is really a matter of closely defining the big data objectives. With a well-defined use case, you can determine the best combination of data storage solutions, blending existing servers and storage arrays with cloud storage. Then, by designing the right kind of analytics using tools such as object-based storage, SMBs can optimize their big data investment without spending a fortune to build their own data lake.
The reseller has a critical role in helping SMBs make the leap to big data. Resellers should use a small pilot project to show customers the potential returns from big data. Once SMB customers see the results, they will come back for more, and the big data infrastructure, including data storage, will expand to meet their growing needs.