For most IT professionals, big data training immediately implies learning how to write code for Hadoop or NoSQL platforms to support big data analytics. Knowing coding techniques is important, but it’s only one piece of the big data puzzle. Effective big data training needs to address each step of the process, from project inception to the final conclusions. Any big data initiative is only as strong as the weakest link in the chain of discovery.
The lack of available big data talent is impeding big data adoption. According to a CompTIA study, 50 percent of companies are leveraging data, but 71 percent feel their staff lacks the data management and analytic skills to make big data succeed. McKinsey Global Institute predicts there will be a shortage of 1.7 million big data workers by 2018, including 1.5 million managers and analysts.
In a recent interview, Shawn Rogers, vice president of research for business intelligence (BI), data warehousing and analytics at Enterprise Management Associates Inc. (EMA), noted that all companies tackling big data face a learning curve. He recently conducted an online job search and saw more than 800 openings with “data science” or “data scientist” in the job description. And finding big data scientists is only one part of the staffing challenge.
Effective big data training has to cover every step in the big data process. Big data training can be broken into four basic categories:
- Defining the business problem – Before you can start assembling your big data infrastructure, you have to know what you are searching for. The first step in big data training is learning how to convert business problems into big data questions. The use case is the core of the big data project. Unless the use case is defined in terms that big data can address, the project is doomed from the start.
Effective big data training will explain the proper criteria for developing a big data use case. It will teach you to determine when a question needs big data analysis versus old-fashioned business intelligence. It will explain how to define the question so you aren’t stirring the data ocean fishing for answers. Defining the parameters of a big data project requires skills and expertise that most project managers won’t have.
- Developing the data model – Determine what data sources you need to address the use case and where the data resides. Big data training will show you how to combine data stored in silos within the organization with outside data sources. It will also show you how to develop a common data model with fields, naming conventions, relationships, and attributes to align your data, and how to integrate synchronous and asynchronous data.
You also can think of this as the development phase of big data: the stage where you need to apply Hadoop or NoSQL expertise to develop big data algorithms. Hadoop has the advantage of offering a scalable, open platform, and you can either find Hadoop experts or train your team in Hadoop. Expertise in Java, PHP, Ruby, Perl, and Linux adapts well to Hadoop.
- Designing the big data infrastructure – In addition to the software, you need to rethink the enterprise infrastructure. Big data training will show you how big data platforms differ from a relational database management system (RDBMS) and why analytical databases need massively parallel processing.
Big data training also will explain how to modify the enterprise to accommodate big data analytics. You will need to add more computing power, virtualization, and lots of data storage both within the enterprise and in the cloud.
- Delivering insight – The final step is to present the big data findings in a manner that is clear and that provides actionable insight. Some experts say that you need a big data scientist to effectively present big data findings. You will need someone with experience in R or a similar language for statistical analysis and graphical presentation. And you will need to have the expertise to properly interpret the data.
The best big data training here will show you how to bridge the gap between statistical modeling and business insight. Having a firm understanding of the business challenges will be invaluable. Effective training will show you how to connect the dots from the patterns revealed by the data and how to apply that insight.
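To make the “Developing the data model” step above a little more concrete, here is a minimal Python sketch of aligning two sources to one common data model. Everything in it is an assumption for illustration only: the source names (`crm`, `partner_feed`), the field names, the date formats, and the `CustomerRecord` shape are hypothetical and do not come from any particular product or training course.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical common data model shared by every source."""
    customer_id: str
    full_name: str
    signup_date: date
    source: str


# Hypothetical per-source maps: raw field name -> common field name.
FIELD_MAPS = {
    "crm": {"CustID": "customer_id", "Name": "full_name", "SignupDate": "signup_date"},
    "partner_feed": {"id": "customer_id", "customer_name": "full_name", "joined": "signup_date"},
}

# Each source is assumed to write dates in its own format.
DATE_FORMATS = {"crm": "%m/%d/%Y", "partner_feed": "%Y-%m-%d"}


def to_common(record: dict, source: str) -> CustomerRecord:
    """Rename fields and normalize values so records from any source line up."""
    mapping = FIELD_MAPS[source]
    renamed = {common: record[raw] for raw, common in mapping.items()}
    return CustomerRecord(
        customer_id=str(renamed["customer_id"]).strip(),
        # Collapse stray whitespace and apply one naming convention for names.
        full_name=" ".join(str(renamed["full_name"]).split()).title(),
        signup_date=datetime.strptime(renamed["signup_date"], DATE_FORMATS[source]).date(),
        source=source,
    )


# One record from an internal silo and one from an outside source.
crm_row = {"CustID": 1042, "Name": "  ada   lovelace ", "SignupDate": "03/15/2013"}
partner_row = {"id": "1042", "customer_name": "Ada Lovelace", "joined": "2013-03-15"}

a = to_common(crm_row, "crm")
b = to_common(partner_row, "partner_feed")

# After mapping, the two records agree on every shared attribute.
print(a.customer_id == b.customer_id, a.full_name == b.full_name, a.signup_date == b.signup_date)
```

The point of the sketch is the design choice rather than the code itself: keeping the field mappings and formats in one place per source means a new data source can be added by describing it, not by rewriting the analysis that depends on the common model.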
Big data training spans a variety of disciplines, and you don’t have to master every one. However, you should have a firm understanding of what it takes to deliver a big data project so you can assess your strengths, get training for your weaknesses, and know when it’s time to delegate to someone with greater expertise and the appropriate big data training.