Cloudera’s Security Architect on data and security
Cloudera’s Chief Security Architect, Eddie Garcia, has his work cut out for him. “My main goal is to protect our customers’ data.”
This includes data at rest, in transit and as it moves from one system to another. “Enabling customers to analyse and process even their most sensitive data is what makes Cloudera the most secure distribution of Apache Hadoop.
“An important KPI is the number of customer wins where security is the main differentiator.”
All too often, the need to gain insights from data overshadows the need to do so securely, and the requirement for the two to go hand in hand is overlooked. Cloudera finds, however, that its customers are becoming more discerning about security. Garcia explained, “As Cloudera becomes a true cloud company, the lines between engineering and IT fade away.
“Our customers not only expect our software to be secure, but our internal processes and systems to be secure as well.”
Cloudera is also helping customers with their data security by using Hadoop Cybersecurity to detect threats to their networks in real time. “By using analytics with machine learning on packet data, netflow and DNS data, we can detect threats that may escape existing threat detection technology that are based on static signatures.”
The Intel factor
Garcia has previously worked with AMD, a semiconductor company, and is drawing on that experience to work very closely with Intel, a major Cloudera partner that has invested over USD740 million in the company.
“We have collaborated on methods to take advantage of chip encryption, security, compression, etc.
“We will also see many cool developments in the future where Cloudera will be taking advantage of Intel hardware.”
How to treat your data
According to Garcia, data ethics is a hard problem to solve.
“Organisations in the past, with their limited tools to analyse data, gave little thought to what a customer could expect about acceptable usage of their data.
“With new technologies and new ways to aggregate customer data, it is possible to know things about customers that they would rather not share.”
In Garcia’s opinion, three best practices for data ethics are as follows:
- If you don’t need the data, don’t collect the data in the first place.
- If you do collect the data, treat it all as sensitive.
- When in doubt, consult your ethics committee, and if you don’t have an ethics committee, create one.