Previously, only large entities, such as governments and large enterprises, could afford the massive infrastructure necessary for hosting and mining big data. Now, as the technology becomes more powerful and affordable, big data use cases are expanding rapidly in numerous industries. Like many new technologies, big data presents numerous challenges as well as opportunities.
Many organizations are struggling with the question of what to do with the data. Analyzing big data in order to reveal insights that can improve decision making and business operations is a growing challenge. Rather than relying on human analysts, organizations are increasingly leveraging machine learning and cognitive computing to make sense of big data.
The 3 Vs: Defining Big Data, Causing Security Vulnerabilities
The Cloud Security Alliance “Big Data Security and Privacy Handbook: 100 Best Practices in Big Data Security and Privacy” states that security vulnerabilities are compounded by the diversity of big data sources and formats, the streaming nature of data acquisition, and the transmission of data between distributed cloud infrastructures. The large volume of big datasets also results in large attack surfaces.
In other words, the very attributes that define big data are the same attributes that contribute to security vulnerabilities: volume, variety, and velocity.
Balancing Access and Restriction
Utility and privacy of data often work in opposition. Leaving data free and open for all would certainly let every interested party access and use the data to its greatest advantage. But, of course, this is not an option. Fortunately, a reasonable balance between enabling necessary access and restricting unauthorized access is possible.
Securing and encrypting big data is a big challenge. The Gemalto 2015 Breach Level Index showed a continuing failure among many organizations to prevent data breaches and actually protect their information assets of all sizes.
According to The Big Data Security and Privacy Handbook, “Traditional security mechanisms tailored to secure small-scale, static data on firewalled and semi-isolated networks are inadequate.”
Security shouldn’t hurt performance and cause lag times. After all, velocity is one of the characteristics that define big data.
As Big Data Advances, So Must Security
Often big data use cases involve public data such as traffic patterns and residency stats. Anonymizing the data is a common solution. Unfortunately, it’s simply inadequate.
Just as perimeter security is no longer adequate to secure an organization’s IT assets, big data has outgrown the tactics used in its early days. Anonymizing does not provide adequate security, particularly as additional big datasets emerge and create opportunities to combine datasets in ways that reveal personally identifiable information. And of course, anonymizing was never a viable tactic for securing proprietary big datasets.
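To make the risk of combining datasets concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the records are made up, and the field names (zip, birth_year, sex) and the function link_records are illustrative assumptions, not drawn from any real dataset or tool. It shows how an "anonymized" dataset with names removed can be joined to a second, public dataset on shared attributes.

```python
from collections import Counter, defaultdict

# Hypothetical, made-up records: names were removed from the first dataset,
# but the remaining attributes still overlap with a second, public dataset.
anonymized_health = [
    {"zip": "02138", "birth_year": 1965, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1980, "sex": "M", "diagnosis": "flu"},
]
public_roll = [
    {"name": "Alice Example", "zip": "02138", "birth_year": 1965, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1980, "sex": "M"},
    {"name": "Carl Example", "zip": "02139", "birth_year": 1980, "sex": "M"},
]

# Assumed set of shared attributes used to join the two datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def record_key(record):
    """Build the join key from the shared attributes."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(anonymized, public):
    """Return (name, diagnosis) pairs where the join key is unique in both datasets."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[record_key(person)].append(person["name"])
    anonymized_counts = Counter(record_key(r) for r in anonymized)

    reidentified = []
    for record in anonymized:
        key = record_key(record)
        names = names_by_key.get(key, [])
        # Only an unambiguous one-to-one match counts as re-identification.
        if len(names) == 1 and anonymized_counts[key] == 1:
            reidentified.append((names[0], record["diagnosis"]))
    return reidentified


matches = link_records(anonymized_health, public_roll)
```

In this toy data, one record is re-identified because its combination of attributes is unique in both datasets, while the two records that share the same attributes stay ambiguous. The more datasets an attacker can combine, the more combinations become unique.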
However, one of 100 best practices listed by the CSA Big Data Security and Privacy Handbook is to “De-identify data. All personally identifiable information (PII), such as name, address, social security number, etc., must be either masked or removed from data.”
It strikes me as ironic that a security tactic commonly cited as inadequate remains a best practice.
But that is the fact of the matter.
While de-identifying data is not adequate alone, it can be a helpful component of a broader security program.
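As a rough illustration of "masked or removed," the sketch below uses only the Python standard library. The field names, the split between removed and masked fields, and the helper names mask_value and deidentify are assumptions made for this example. The handbook does not prescribe this method, and keyed hashing (HMAC) is just one possible masking technique.

```python
import hashlib
import hmac

# Assumed field policy for this example: drop some PII fields outright,
# and replace others with a keyed hash so records can still be correlated.
FIELDS_TO_REMOVE = {"name", "address"}
FIELDS_TO_MASK = {"ssn"}


def mask_value(value, secret_key):
    """Replace a value with a truncated HMAC-SHA256 digest (a pseudonym)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, secret_key):
    """Return a copy of the record with PII fields removed or masked."""
    cleaned = {}
    for field, value in record.items():
        if field in FIELDS_TO_REMOVE:
            continue  # removed entirely
        if field in FIELDS_TO_MASK:
            cleaned[field] = mask_value(str(value), secret_key)
        else:
            cleaned[field] = value  # non-PII fields pass through unchanged
    return cleaned


# Hypothetical record and key; a real key would come from a managed key store.
example_record = {
    "name": "Alice Example",
    "address": "1 Sample Street",
    "ssn": "000-00-0000",
    "zip": "02138",
    "purchase_total": 42.5,
}
example_key = b"example-only-secret"
cleaned_record = deidentify(example_record, example_key)
```

Note that fields left in place, such as the zip code here, can still act as identifiers when combined with other datasets, which is why de-identification works only as one layer among several. The masking key also has to be protected like any other secret.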
The Need for Big Data Encryption
Breach prevention, while still a vital component of IT security, is also not enough. According to the 2016 Data Security Confidence Index, most IT decision makers say it is possible for unauthorized users to access their network, yet they’re not dedicating funds to encryption.
As stated in The Gemalto 2015 Breach Level Index, “The security strategy of today should include a change of mindset, and the implementation of solutions that control access and the authentication of users, provide encryption of all sensitive data, and securely manage and store all encryption keys.”
Like all information security, big data security requires a multi-layered approach to maximize efficacy. Those layers include not only efforts to prevent a breach, but also tactics to mitigate the consequences of one.
You must secure the data itself and the users who access it, not merely the perimeter. You must also securely store and manage all of your encryption keys, and control user access and authentication.
360 Degrees through Space and Time
Unfortunately, attempting to retroactively protect your big data is more difficult than securing it from the beginning. 360-degree protection involves not just encrypting data throughout its lifecycle, both at rest and in motion, but also starting from the very beginning of your big data project.
Too often, security is tacked on at the end, reluctantly, and viewed as an irritating delay to the launch of a new app or project. But taking the time from the start to implement a comprehensive big data encryption program with multiple 360-degree rings will reduce the risk of your business suffering the numerous painful consequences of a breach.
To discover how you can adopt secure big data technology solutions on-premises, in the cloud, or across both, read our Big Data White Paper today!