By its very nature, big data requires a greater volume of information from multiple sources to power analytics. The value of big data in revealing new operational efficiencies and new market opportunities is too great to ignore, so big data projects are becoming more prevalent and bigger, using more data sources for analytics. All that data has to be secure, but security for big data analytics is not just a matter of strengthening existing enterprise security tools. The size of the data pool and the diverse nature of the data itself make it more difficult to monitor and protect, which means you need new strategies to ensure security for big data analytics.
The volume of data that has to be managed and secured is escalating faster than conventional enterprise technology can cope with. Estimates are that 2.5 billion gigabytes of data are created every day, and 80 percent of that data is unstructured content such as business documents, contracts, and social media. All of this content can be valuable for big data, but the more data you add to the mix, the greater the need for security for big data analytics. Large enterprises, depending on their size, are already generating between 10 billion and 100 billion security events per day. That number will grow with more big data projects.
Using Big Data to Secure Big Data
Security for big data analytics can be considered in two ways: security derived from big data and security of big data.
The big data analytics themselves can provide the information necessary to secure the big data enterprise. Much of the value from big data comes from real-time analytics processing, i.e. monitoring streams of machine data or other information as it happens to create analytics that initiate a real-time response, such as delivery of an ad or, in the case of security, stopping malware.
Real-time monitoring can be used to support security for big data analytics by using big data itself to identify potential threats to the enterprise before they happen. Big data algorithms can be written to detect traffic anomalies or other events that may bypass conventional enterprise security and flag those anomalies as a potential threat. Another advantage of using big data for enterprise security is that it consolidates threat management, including data traffic and end user behavior, into a single view, so fewer point tools are required for security. And because the approach is itself built on big data, it can scale with the addition of new data streams.
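As a rough illustration of flagging traffic anomalies, the sketch below compares each new reading against a trailing window of recent readings and flags values that deviate sharply. The function name, the window size, the threshold, and the sample request counts are all assumptions made for this example; a production system would run comparable logic over live streams with far richer features.

```python
from statistics import mean, stdev

def flag_anomalies(counts, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations (hypothetical helper)."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if counts[i] != mu:
                flagged.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Made-up requests-per-minute readings with one obvious spike.
requests_per_minute = [100, 104, 98, 101, 99, 102, 950, 100, 97]
print(flag_anomalies(requests_per_minute))
```

A trailing window keeps the baseline current as traffic patterns shift, at the cost of a spike temporarily inflating the baseline for the readings that follow it.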
Securing the Data Repositories
And then there is securing the big data pools themselves. Security for big data analytics means you have to protect the data. It is the organization’s responsibility to safeguard sensitive data, including data from outside sources such as social media streams.
Since big data spans a variety of virtual resources, both within the enterprise and in the cloud, there has to be ongoing vigilance as part of your security strategy. The basic steps are 1) find and classify the data; 2) monitor and audit data access; 3) enforce policies and protect the data; and 4) assess vulnerabilities and harden security. These steps work together in an ongoing cycle, and any security strategies you adopt to protect big data analytics will fall somewhere within these four steps.
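The first step, finding and classifying the data, can be sketched as a simple pattern scan. The two patterns and the `classify` helper below are illustrative assumptions only; real classification rules would cover many more data types and would likely combine pattern matching with metadata and context.

```python
import re

# Illustrative patterns only (assumed for this example).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify(text):
    """Return the sorted labels of sensitive-data patterns found in `text`."""
    return sorted(
        label for label, pattern in PATTERNS.items() if pattern.search(text)
    )

print(classify("Contact jane@example.com, SSN 123-45-6789"))
print(classify("Quarterly revenue summary"))
```

Labels produced this way could then feed the later steps, for example by deciding which records need stricter access monitoring or encryption.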
Here are six specific strategies to consider:
- Data provenance: As part of big data discovery and classification, the authenticity and integrity of the data need to be checked. This not only eliminates garbage data but also provides an opportunity to scan for malware before the data is stored.
- Configuration management: Stored files should be checked for access privileges, including database configuration files and executables.
- Change auditing: Audit for changes by comparing a snapshot of secure configurations against new configurations, including real-time activity.
- Monitoring activity: Real-time monitoring of data activity using Hadoop, including creating security policies based on security intelligence, will reveal any suspicious activity.
- Data encryption and loss prevention: Masking or encrypting data in transit prevents theft or eavesdropping at the network layer.
- Compliance management: Regular compliance management and reporting are not only good security practice but also essential for compliance with regulations such as HIPAA or Sarbanes-Oxley.
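Two of the strategies above, data provenance and change auditing, lend themselves to a short sketch. The example assumes the data source publishes a SHA-256 digest alongside each payload and that configurations can be represented as flat key-value dictionaries; the helper names and the sample settings are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify_integrity(data: bytes, expected_digest: str) -> bool:
    """Check incoming data against a digest recorded at the source."""
    return sha256_hex(data) == expected_digest

def diff_config(baseline: dict, current: dict) -> dict:
    """Map each changed key to a (before, after) pair.
    A key absent from one side is reported as None on that side."""
    changes = {}
    for key in baseline.keys() | current.keys():
        before = baseline.get(key)
        after = current.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes

# Provenance: accept the payload only if it matches the recorded digest.
payload = b"sensor-feed-batch"
recorded = sha256_hex(payload)
print(verify_integrity(payload, recorded))
print(verify_integrity(b"tampered-batch", recorded))

# Change auditing: compare a known-good snapshot with the live settings.
secure_snapshot = {"auth": "kerberos", "encrypt_in_transit": True}
live_settings = {"auth": "simple", "encrypt_in_transit": True, "debug": True}
print(diff_config(secure_snapshot, live_settings))
```

A digest check only shows that data was not altered after the digest was recorded; establishing that the source itself is authentic would need something stronger, such as signed digests.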
These are only a few of the strategies you should adopt for security for big data analytics. Remember that securing big data isn’t only about dealing with more data from different sources; it’s also about consolidating security and using new automated techniques, many of which are actually powered by big data analytics, to monitor and protect your big data infrastructure.
Where do you see the greatest challenge in security for big data analytics? Is it data monitoring? Policy enforcement? Data scrubbing and classification? Is there one challenge that overshadows the others?