When developing a big data strategy for your customers, you need a game plan that you can scale and adapt to any business situation. The elements required to harness big data are largely the same for every deployment, so it pays to develop a template for your big data strategy that will deliver consistent results.
According to a survey by InfoChimps, 55 percent of big data projects are never completed. Of those failures, 58 percent fail because of inaccurate scope, 41 percent fail because of technical issues, and 39 percent fail because of lack of internal cooperation and siloed data. Clearly, when creating a big data strategy you need to develop an approach that meets the needs of all the stakeholders – senior management, those who need the results, and IT – and you have to develop a clear objective and agree on the infrastructure required to deliver the best results.
Here are five considerations to help you create a winning big data strategy for your customers:
- Be clear about the business objectives – Too often the technical team wants to dive into the project details without taking a step back to look at the objectives first. With any big data engagement you have to establish the goals before you approach the process. You want to be sure that your system architects and analytics experts work closely with the subject matter experts to define the right questions to ask. Try starting with high-level vision statements to define your objectives, such as “increasing the number of repeat customers.” Create objectives that are sufficiently well-defined so that data scientists, developers, analysts, and stakeholders can collaborate on the types of insights needed, analytical models, and other details.
- Agree on how to use the data in advance – To ensure a positive outcome you want to agree, in advance, how the data will be applied to the business. Make sure that the findings from the big data project are directly applicable to improving operations or increasing production. The objective is to deliver insight that is repeatable and directly applicable to processes and methodologies. Be sure that the developers, analysts, and stakeholders all agree on how to use the findings.
- Identify what data you need for best results – Now that you have identified the overall objectives and how the data will be applied, take an inventory of available data resources. Look at the existing data warehouse and in-house data repositories. Then make an inventory of external data sources that could be valuable, such as social media feeds. Once you have an inventory of data sources, assess the value of each source as it relates to the big data objective and discard those data sources with little or no value.
- Build an end-to-end big data pipeline – When you are dealing with petabytes of data you clearly need to automate the data pipeline. There are four steps to consider as part of big data gathering:
  - Acquire and store the data – Gather data from the various sources, including legacy data, social media, and mobile data, and store the data using batch, near real-time, and real-time modes.
  - Prepare the data – Integrate and cleanse the data for analysis, collecting technical details and metadata to facilitate categorization and reuse.
  - Curate and assess – Create visual representations of the outcome to show patterns, trends, and insights. Identify the data sets that have the most business value as they relate to your initial objectives.
  - Distribute the results – Release the results to your stakeholders and end users through reports, mobile dashboards, enterprise applications, and other means to turn insight into action.
- Keep the data lake clear – The pool of big data that encompasses cloud storage, SANs, and legacy data warehouses is considered the data lake. Over time this lake will become bigger as you add new data sources, dig deeper into legacy data, and add real-time data. The ongoing challenge is to make sure the data lake doesn’t become a swamp. Make sure all data sources are cleansed and easy to organize and search, and make sure you have all the necessary governance and privacy policies in place. You want to keep the data in the lake clean so anyone can access it for reliable data analysis. And be prepared to add new storage capacity as big data needs grow.
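To make the four pipeline steps listed above (acquire, prepare, assess, distribute) more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production design: the `Record` type, the function names, and the in-memory dictionary of sources are all hypothetical, and a real deployment would swap in actual ingestion, storage, and reporting tools.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical unit of data flowing through the pipeline."""
    source: str
    value: str
    metadata: dict = field(default_factory=dict)


def acquire(sources):
    """Step 1: gather raw values from each named source into one list."""
    return [
        Record(source=name, value=raw)
        for name, values in sources.items()
        for raw in values
    ]


def prepare(records):
    """Step 2: cleanse (trim, lowercase, drop empties and duplicates)
    and attach simple metadata for later categorization."""
    seen = set()
    cleaned = []
    for record in records:
        value = record.value.strip().lower()
        key = (record.source, value)
        if not value or key in seen:
            continue
        seen.add(key)
        cleaned.append(Record(record.source, value, {"length": len(value)}))
    return cleaned


def assess(records):
    """Step 3: count usable records per source as a crude value signal."""
    counts = {}
    for record in records:
        counts[record.source] = counts.get(record.source, 0) + 1
    return counts


def distribute(summary):
    """Step 4: render a plain-text report for stakeholders."""
    return "\n".join(
        f"{source}: {count} usable records"
        for source, count in sorted(summary.items())
    )


def run_pipeline(sources):
    """Chain the four steps so the whole flow can be automated."""
    return distribute(assess(prepare(acquire(sources))))
```

Calling `run_pipeline({"crm": [" Alice ", "alice", ""], "social": ["Bob"]})` would report one usable record per source, since the duplicate and the empty value are removed during preparation. Keeping each step as a separate function means any stage can be replaced (for example, batch acquisition with a streaming one) without touching the others.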
These are only a few of the basics that should go into a big data strategy. There are other considerations, such as how to incorporate real-time analytics and optimize your data warehouse, but if you start with the basics you can develop a big data strategy that can be adapted for almost any engagement.
What’s your biggest challenge in helping customers achieve their big data objectives? What steps do you consider essential to any big data strategy?