Data security is one of the biggest headaches for any CIO, especially as hackers continue to find new ways to steal sensitive data from companies like Target, Anthem Health, and The Home Depot. Every time you add a new system to the enterprise infrastructure, you could be creating a new hole through which sensitive company information can leak. Since big data, by its very nature, has to handle large quantities of data, IT managers have to be careful that data used for analytics do not create a security problem. The security question for IT becomes “Is big data part of the security problem or part of the solution?”
Aon Corporation, an insurance and risk management company, estimates that 80 percent of business security breaches result in less than $1 million in damages, 15 percent of breaches cost between $1 million and $20 million, and 5 percent are the mega-breaches, which cost more than $20 million. IBM estimates that the average cost of a data breach is about $3.8 million, and that cost has risen 23 percent since 2013.
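As a quick check on the IBM figure, the implied 2013 average can be backed out from the two numbers quoted above. This is a rough sketch that assumes the 23 percent rise is measured relative to the 2013 average, which the estimate as quoted does not spell out; the variable names are illustrative only.

```python
# Back out the implied 2013 average breach cost from the figures quoted above.
# Assumption: the 23 percent increase is relative to the 2013 average.
current_avg_cost_musd = 3.8   # average breach cost today, in millions of dollars
increase_since_2013 = 0.23    # 23 percent rise since 2013

implied_2013_cost_musd = current_avg_cost_musd / (1 + increase_since_2013)
print(f"Implied 2013 average: about ${implied_2013_cost_musd:.1f} million")
```

Under that assumption, the average breach cost roughly $3.1 million in 2013.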
Different markets are subject to different types of attacks. A recent study by ID Experts shows that the cause of data breaches in healthcare has shifted from lost or stolen devices and employee negligence to criminal attacks. Cyber attacks in healthcare have grown 125 percent, and the majority of healthcare breaches are now attributed to cyber criminals. Across industries, the largest share of data breaches comes through point-of-sale (POS) systems, which account for 28.5 percent of breaches, followed by malware or “crimeware,” which makes up 25.1 percent.
Since much of the data used for analytics comes from healthcare records, POS systems, and related sources, it is no wonder that IT managers are concerned about big data and security.
Big data poses a security risk largely because of the question of big data ownership. Enterprises that maintain their own, proprietary data pools for analytics have the luxury of pulling up the drawbridge to keep out the barbarians at the gates. Security measures can isolate the data and make it easier to defend from hackers and malware. However, most enterprises do not have the data storage capacity to handle big data in-house, so they rely on cloud computing and cloud data storage, which are more difficult to secure. The problem boils down to who owns the data and who has responsibility for security.
For example, more companies are storing petabytes of data from clickstreams, Web logs, social media conversations, and other sources as fodder for big data analysis. This data can be used to provide deeper customer insight, but it also can be used to introduce malware into the system. Information ownership and information classification become more difficult. And just because the data is stored in the cloud does not mean you do not have responsibility for securing that data or meeting regulatory requirements.
Unlike data kept in-house, data in the cloud cannot be protected by digging a moat around it. If you cannot secure the data repository, then you have to secure the data itself. It may be necessary to adopt new security strategies, such as attribute-based encryption to manage access control, where attributes of the data are protected rather than the storage environment; these strategies are still foreign to most data centers.
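The idea of tying access to attributes of the data rather than to where it is stored can be sketched in a few lines. Note that this is not attribute-based encryption itself, which enforces the policy cryptographically; it is a plain policy check that only illustrates the access rule. The `Record` class, the `can_read` function, and the attribute strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A piece of data that carries its own access policy (hypothetical model)."""
    payload: str
    required_attributes: frozenset  # attributes a reader must hold


def can_read(user_attributes, record):
    """Allow access only if the user holds every attribute the record requires."""
    return record.required_attributes <= set(user_attributes)


# The policy travels with the record, wherever the record is stored.
record = Record(
    payload="patient 1042: blood panel",
    required_attributes=frozenset({"role:clinician", "dept:cardiology"}),
)

clinician = {"role:clinician", "dept:cardiology", "site:boston"}
analyst = {"role:analyst", "dept:marketing"}

print(can_read(clinician, record))  # True
print(can_read(analyst, record))    # False
```

In a real attribute-based encryption scheme, the same rule would be embedded in the ciphertext or the keys, so a reader without the right attributes could not decrypt the payload even with full access to the storage.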
Authentication is another security solution. Hadoop is still one of the most popular development platforms for big data applications. When Hadoop was developed, security was not a priority, since the objective was to share large data sets and data processing. As it became clear that Hadoop needed security controls, developers started using authentication patterned after Kerberos. Today the data used in most big data initiatives are protected by data encryption and token-based authentication.
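To make token-based authentication concrete, here is a minimal sketch of issuing and verifying a signed, expiring token using only Python's standard library. It is not Kerberos and not Hadoop's actual mechanism; the key, the token layout, and the function names `issue_token` and `verify_token` are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"demo-only-secret"  # hypothetical key; never hard-code one in practice


def issue_token(user, ttl_seconds=300, now=None):
    """Return a token of the form <payload>.<signature>, both base64url-encoded."""
    now = time.time() if now is None else now
    payload = json.dumps({"user": user, "exp": now + ttl_seconds}).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "."
            + base64.urlsafe_b64encode(signature).decode())


def verify_token(token, now=None):
    """Return the user name if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)  # parsed only after the signature checks out
    if claims["exp"] < now:
        return None
    return claims["user"]


token = issue_token("alice", ttl_seconds=60, now=1000.0)
print(verify_token(token, now=1010.0))  # alice
print(verify_token(token, now=2000.0))  # None (expired)
```

The service never needs to store the token: it recomputes the signature from the payload and its own key, which is what lets this style of authentication scale across many nodes.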
Big Data Networks Police Themselves
Even with authentication and encryption in place, the sheer volume of data transactions is creating a security challenge for big data users. Conventional security information and event management (SIEM) technologies cannot handle the volume of data, so in many environments, big data is being deployed to protect big data.
Consider the case of Barclays Bank. The bank records 44 billion security events each month and climbing. With such a volume of security events, SIEM systems cannot keep pace. Using big data analytics to monitor network activity, Barclays is able to implement real-time controls and head off cyber threats.
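A back-of-the-envelope calculation shows why that volume is hard to keep up with. The sketch below converts the monthly figure into an average per-second rate, assuming a 30-day month and an even spread of events (real traffic is burstier).

```python
# Convert 44 billion security events per month into an average per-second rate.
events_per_month = 44_000_000_000
seconds_per_month = 30 * 24 * 60 * 60  # assumption: a 30-day month

events_per_second = events_per_month / seconds_per_month
print(f"{events_per_second:,.0f} events per second on average")
```

That works out to roughly 17,000 events every second, around the clock.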
Cyber criminals typically look for weaknesses in siloed data sets and security products. Using big data to monitor changes across the entire infrastructure can eliminate the need to deploy point solutions for security; contextual awareness, situational awareness, and analytics are used instead to detect a security breach. In other words, big data watches the end-to-end infrastructure, looking for anomalies and using real-time analytics to identify the problem and automatically apply a solution, such as isolating an infected server or rerouting traffic. Using big data analytics makes it possible to provide real-time protection of the entire enterprise, including other big data applications.
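The detect-and-respond loop described above can be illustrated with a deliberately simple sketch: compare each host's current event count against its own history and flag large deviations. Production systems use far richer models and streaming platforms; the z-score rule, the threshold, the host names, and the function `find_anomalous_hosts` are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalous_hosts(history, current, threshold=3.0):
    """Flag hosts whose current event count sits far above their own baseline.

    history: dict of host -> list of past per-interval event counts
    current: dict of host -> event count in the latest interval
    """
    flagged = []
    for host, count in current.items():
        past = history.get(host, [])
        if len(past) < 2:
            continue  # not enough history to estimate a baseline
        baseline = mean(past)
        # Floor the spread at 1.0 so a perfectly flat history cannot divide by zero.
        spread = max(stdev(past), 1.0)
        if (count - baseline) / spread > threshold:
            flagged.append(host)
    return flagged


history = {
    "web-01": [100, 110, 95, 105, 102],
    "db-01": [40, 42, 38, 41, 39],
}
current = {"web-01": 104, "db-01": 400}

for host in find_anomalous_hosts(history, current):
    # A real deployment would call an orchestration or firewall API here.
    print(f"isolate {host}")  # isolate db-01
```

Because each host is judged against its own history, a busy web server and a quiet database can share one rule without constant manual tuning.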
So big data does not necessarily mean more security headaches, but it does require adopting a new perspective on data security. Rather than protecting the infrastructure, consider encryption and authentication strategies that protect the data. And if you are looking for new ways to protect the enterprise, big data analytics can be your best real-time defense for identifying potential threats and heading them off before they trigger a security breach.