For the past few years, everyone involved in networking technology has been fixated on the cloud, which has become the extensible paradise for enterprise computing, offering a seemingly limitless pool of resources to expand infrastructure capacity on demand. However, is the cloud really a panacea, especially when it comes to data center operations and big data? True, the cloud offers more storage and computing power when you need it, which is valuable for big data applications. But how much infrastructure can you reliably host in the cloud versus installing on premise? What are the technical and business drivers for choosing an on-premise solution over a hosted approach?
Software-as-a-service, platform-as-a-service, and everything-as-a-service are gaining market momentum for sound business reasons. Rather than investing in the hardware, software, and staff needed to scale their computing, companies can rent the resources they require and hold a cloud service provider accountable. According to ongoing research by Software Advice, the migration to the cloud is well underway: in 2008, 88 percent of buyers preferred on-premise software, but by 2014, 87 percent said they preferred cloud solutions. In 2014, only 5 percent of buyers indicated any desire to evaluate on-premise software.
In developing a big data sales strategy, VARs must be mindful of customer expectations and preferences as well as big data requirements. They need to be able to articulate the value of on-premise versus cloud solutions in the context of Hadoop and big data computing.
The Argument for Cloud Computing
Cloud service providers offer a number of compelling reasons to adopt cloud computing for big data and Hadoop applications. Here are just four:
- Capacity – On-premise big data requires significant capacity, and that capacity translates into added cost. A physical platform for Hadoop analytics can be expensive, requiring large servers and server clusters, plus the IT staff to keep it all running. Cloud services require no up-front hardware investment, offer scalable storage and analytics, and charge you only for what you use.
- Extensibility – Cloud services offer elasticity, especially when it comes to storage. Big data applications can require petabytes of data storage—more than you want to pay to install in the enterprise. You can add thousands of virtual servers in the cloud in minutes.
- Collaboration – For shared projects, one of the advantages of using cloud services is access. Cloud resources can be accessed anywhere at any time.
- Security – Data security is always a consideration. Most cloud service providers have excellent security capabilities, and using hosted services means you need fewer in-house resources to oversee data security.
Balancing Control Versus Expedience
Of course, cloud services don’t offer the best solution for every big data application. Migrating to the cloud means surrendering some control. It also means more demand for bandwidth to handle the added storage and data transfer. If cloud computing isn’t part of the initial big data strategy, there will be added costs down the road to upgrade the infrastructure to integrate off-premise resources.
However, there are going to be situations when adding cloud services will be a logical extension of any big data project. Here are four examples:
- Adding to existing cloud resources – Many organizations already have cloud computing integrated into their enterprise infrastructure. If that’s the case, adding more capacity to accommodate big data projects may be a simple matter of extending current contracts or leveraging existing cloud capacity in a different way, which is more cost-effective than adding on-premise capacity.
- High-volume data sources that require preprocessing – Much of the cost of big data projects comes from storing and processing extremely large data sets, such as social media feeds. If the organization doesn’t have the storage and processing capacity for such large volumes of data, then using cloud-based services for filtering and pre-processing data is the most cost-effective option.
- Tactical Hadoop applications – If the organization already has a dedicated Hadoop infrastructure on premise, then adding cloud services could facilitate expansion and special projects. For new big data projects that require more data capacity, or for immediate access to needed computing power, extending services to the cloud will quickly provide the necessary resources for the short term, without a huge investment in time and money.
- Adding analytics sandboxes – Cloud resources can be invaluable for short-term, fast-turnaround projects that require an exploratory data mart. For quick results and fast access to more resources to test analytics, the cloud is often the fastest way to get there.
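The filtering-and-preprocessing pattern described above can be sketched in a few lines. This is a minimal, hypothetical example: it assumes a JSON-lines social media feed and a simple keyword filter, standing in for a cloud-side job that discards irrelevant records before anything reaches on-premise storage.

```python
import json

# Hypothetical raw feed: in practice this would stream from a cloud
# message queue or object store, not an in-memory list.
raw_feed = [
    '{"user": "a", "text": "our product is great", "lang": "en"}',
    '{"user": "b", "text": "unrelated chatter", "lang": "en"}',
    '{"user": "c", "text": "product broke today", "lang": "en"}',
    '{"user": "d", "text": "hola mundo", "lang": "es"}',
]

KEYWORDS = {"product"}  # assumed filter criteria for this sketch


def keep(record: dict) -> bool:
    """Keep only English records that mention a tracked keyword."""
    if record.get("lang") != "en":
        return False
    words = set(record.get("text", "").lower().split())
    return bool(words & KEYWORDS)


# Filter in the cloud; only the survivors are shipped on premise.
filtered = []
for line in raw_feed:
    record = json.loads(line)
    if keep(record):
        filtered.append(record)

print(f"kept {len(filtered)} of {len(raw_feed)} records")  # → kept 2 of 4 records
```

The payoff is that the expensive on-premise cluster stores and processes only the filtered subset, while the cheap, elastic cloud capacity absorbs the raw-volume spike.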
A hybrid infrastructure makes sense for many big data projects. Consider using a colocation provider to create a private cloud infrastructure. This offers the best of both on-premise control and cloud performance.
So when weighing the value of on-premise versus cloud computing resources for the data center and big data, consider both long-term and short-term ROI. If the computing and data storage will be an ongoing need for big data projects, then investing in more on-premise capacity could pay off in the end. If you are provisioning for a short-term project or need more capacity right away, then extending the project to the cloud will likely pay off. Either way, in designing the big data infrastructure, make sure you have sufficient bandwidth and on-premise capacity to handle cloud services when the demand arises.
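The long-term versus short-term ROI trade-off can be made concrete with a simple break-even calculation. The figures below are purely illustrative assumptions, not benchmarks; the point is the shape of the comparison: a large one-time on-premise investment with low monthly cost versus zero up-front cloud cost with higher monthly spend.

```python
# Illustrative, assumed costs -- substitute real quotes before deciding.
ON_PREM_UPFRONT = 500_000   # hardware and installation (one-time)
ON_PREM_MONTHLY = 10_000    # power, space, share of admin staff
CLOUD_MONTHLY = 30_000      # rented capacity of equivalent size


def cumulative_cost(months: int, upfront: float, monthly: float) -> float:
    """Total spend after the given number of months."""
    return upfront + monthly * months


# Find the first month where on-premise becomes cheaper than cloud.
break_even = next(
    m for m in range(1, 121)
    if cumulative_cost(m, ON_PREM_UPFRONT, ON_PREM_MONTHLY)
    < cumulative_cost(m, 0, CLOUD_MONTHLY)
)
print(f"on-premise pays off after {break_even} months")  # → 26 months
```

Under these assumed numbers, a project expected to run well past the break-even point favors on-premise investment, while anything shorter favors renting cloud capacity, which mirrors the long-term/short-term guidance above.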