The standard cliché used to be that you are what you eat. Which remains true, of course. But it’s also incomplete—so last century. Today, you are what you do online, which is almost everything.
Your age, gender, what you buy or sell, where you go, who you see, what you post on social media, what you study, what you believe, what you wear, what kind and how much exercise you do, how healthy you are, your relationships, where you work, what you earn, and more—it’s all online, creating a staggering trail of “digital breadcrumbs” that can assemble a more detailed and intimate profile of a person than ever before possible in human history.
You are your data. And if your data isn’t private, you have no privacy. That’s the reason for National Data Privacy Day, on January 28 every year.
The event was launched in Europe in 2007 as Data Protection Day, and in the U.S. with its current title two years later. In the decade-plus since, it has raised awareness about how vast and intrusive data collection can be, and it has been a catalyst for the passage of nearly 130 data privacy laws worldwide.
It has also spurred ongoing debate about the state of personal privacy. Some argue that there are reasonable privacy protections in place, while others say privacy has never been as dead as it is now, and point to the “golden age of surveillance” by both the private and public sectors.
But there isn’t much debate over the fact that for data to be private, it has to be secure. It doesn’t really matter if a law restrains what organizations can do with the data they collect if it’s vulnerable to theft by criminals.
So one of the primary goals of any initiative to stay in compliance with privacy laws should be to push for use of the best tools, techniques, and practices to keep data secure. And that requires a focus on securing the data itself and also securing the environments where data is stored, which are powered and controlled by software.
When it comes to securing data, the most familiar requirements come from the most famous data privacy law on the books so far: the European Union’s General Data Protection Regulation (GDPR), which took effect May 25, 2018, and has served as a template for many laws that followed. While it is an EU law, as a practical matter it’s global, because it applies to any organization that offers goods or services to people in the EU or monitors their behavior, wherever that organization is based.
But the GDPR’s most specific focus is on restricting the use of data: its collection, processing, sharing, selling, storage, retention, deletion, and so on. Security requirements are also included, but they tend to be more general.
The law says data must be “processed in a manner that ensures appropriate security of the personal data, including protection against unauthorized or unlawful processing and against accidental loss, destruction, or damage, using appropriate technical or organizational measures.”
That mandates a specific result but doesn’t spell out what an organization needs to do to “ensure appropriate security” of data or what “appropriate technical or organizational measures” are.
Most experts say that’s a good thing—that government should require a result (in this case, data security) but not dictate to the private sector how to achieve that result. Expecting government legislation to keep up with the evolution of technology is like expecting a horse and buggy to keep up with a high-performance sports car.
But although most companies collect and use data, they aren’t directly in the data security business. Which means they could use a set of best practices.
Ian Ashworth, senior sales engineer with the Synopsys Software Integrity Group, offers a short list of fundamentals.
Not all data is the same, nor is it equally valuable to an organization, to its competitors, or to attackers. “The first step to keeping data safe is to classify it,” Ashworth said. “This is a fundamental element of any information security program. Keeping data safe can be costly, so classifying it ensures that the right proportion of any spend is efficient and justified.”
Classification is also directly focused on the confidentiality component of the CIA triad (confidentiality, integrity, availability) for data.
A good tool to help classify data is metadata (data about data), which can help find personally identifiable information (PII). Ashworth said a good way to start is with high-level classifications like restricted, confidential, or sensitive.
“From there you can look within and classify individual attributes or fields such as payment cards, which have very distinctive data patterns, or rely on the metadata, which might aptly and accurately name a field and lead to it being classified as PII,” he said.
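The two approaches Ashworth describes, matching distinctive data patterns and reading field names, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of name hints, the function names, and the mapping of payment cards to “restricted” and PII-style field names to “confidential” are all hypothetical choices, not anything prescribed by Ashworth or by a privacy law. The Luhn checksum is a standard way to reduce false positives when looking for card numbers.

```python
import re

# Hypothetical field-name hints; a real program would maintain its own list.
PII_NAME_HINTS = ("name", "email", "phone", "address", "ssn", "birth")

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"^\d(?:[ -]?\d){12,18}$")


def passes_luhn(digits: str) -> bool:
    """Return True if the digit string satisfies the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def looks_like_payment_card(value: str) -> bool:
    """Pattern check plus Luhn checksum to cut down on false positives."""
    if not CARD_PATTERN.match(value):
        return False
    return passes_luhn(re.sub(r"[ -]", "", value))


def classify_field(field_name: str, sample_value: str) -> str:
    """Assign a coarse label from the value pattern, then from the field name."""
    if looks_like_payment_card(sample_value):
        return "restricted"      # assumed label for payment card data
    if any(hint in field_name.lower() for hint in PII_NAME_HINTS):
        return "confidential"    # assumed label for likely PII
    return "unclassified"
```

Name-based matching is deliberately crude here (the hint “name” would also match a field called “filename”), so anything it flags would still need human review.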
Access controls set conditions for who can access certain data and for what purposes. Access should be granted on a least-privilege basis: If you don’t need the data to do your job, then you don’t need access to it.
“A website may require read-only access to data for the purposes of displaying it,” Ashworth said. “To edit or modify it, you would need to have a reason, such as the owner wishing to change their registered phone number on an account. That higher level of edit control might only be available to the account owner and system administrator.”
Those controls could be contractual or enforced through authorization based on a person’s electronic identity.
“There is a common acronym, AAA (authentication, authorization, and accounting), in cyber security parlance that refers to measured access controls to certain digital resources,” he said.
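A least-privilege policy like the one in Ashworth’s example can be sketched as a deny-by-default check. The role names, the `User` type, and the in-memory audit log below are hypothetical, and the sketch assumes authentication has already happened elsewhere; it covers only the authorization and accounting parts of AAA.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    READ = "read"
    EDIT = "edit"


@dataclass(frozen=True)
class User:
    user_id: str
    role: str  # hypothetical roles: "website", "customer", "admin"


# Accounting: every decision is recorded (in memory, for illustration only).
audit_log: list[tuple[str, str, str, bool]] = []


def is_allowed(user: User, action: Action, record_owner_id: str) -> bool:
    """Deny by default; grant only what each role needs."""
    is_owner = user.user_id == record_owner_id
    if action is Action.READ:
        # Read access for the display service, the owner, and administrators.
        return user.role in {"website", "admin"} or is_owner
    if action is Action.EDIT:
        # Edit access only for the account owner and administrators.
        return user.role == "admin" or is_owner
    return False


def check_access(user: User, action: Action, record_owner_id: str) -> bool:
    """Authorize the request and log the outcome."""
    allowed = is_allowed(user, action, record_owner_id)
    audit_log.append((user.user_id, action.value, record_owner_id, allowed))
    return allowed
```

With this policy the website can display an account but cannot change it, while the account owner can update a registered phone number and another customer can do neither.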
“Layers of security do play a part in defending against unauthorized prying eyes, but the most critical measure is encryption,” Ashworth said.
And for encryption to be effective, it has to be both rigorous and comprehensive, applying to data both when it’s at rest and in transit. That means organizations need to map their data—both where it’s stored and how it flows from one place to another. You can’t protect assets if you don’t know where they are.
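One simple way to act on a data map is to record each store and flow along with whether it is encrypted, then list the gaps. The sketch below assumes a hand-maintained inventory; the `DataAsset` type and the asset names in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    kind: str        # "store" for data at rest, "flow" for data in transit
    encrypted: bool


def unencrypted_assets(data_map: list[DataAsset]) -> list[str]:
    """Return the names of stores and flows that are not encrypted."""
    return [asset.name for asset in data_map if not asset.encrypted]
```

For example, a map containing an encrypted “customer-db” store and an unencrypted “nightly-export” flow would report only “nightly-export” as needing attention.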
Cryptography can also help ensure the second element of the CIA triad, the integrity of the data. Cryptographic hash algorithms, which are designed to be impractical to reverse, can let an organization that has been breached know if its data has been modified.
“If data is passed through such a hashing algorithm and the hash is stored independently, then if the original text is changed, a different hash would result, confirming it is no longer the original stored text,” Ashworth said.
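The check Ashworth describes can be sketched with Python’s standard `hashlib` module. SHA-256 is an assumed choice of algorithm, and the function names are hypothetical; the important part is that the stored hash lives somewhere an attacker who alters the data cannot also alter.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, stored_fingerprint: str) -> bool:
    """Compare the current hash with the one stored independently."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(fingerprint(data), stored_fingerprint)
```

Hashing a record when it is written, storing that hash separately, and rerunning `is_unmodified` later will flag even a one-character change, because the altered text produces a different hash.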
Done correctly, encryption means that even if an organization is breached and data is compromised, it’s useless to attackers.
Then there’s the software security component. Software is embedded into every digital component of a company, and is therefore a major piece of what protects, or doesn’t protect, the data.
There are thousands of examples of what can happen if insecure software is exploited, but one of the most notorious is the 2017 breach of credit-reporting giant Equifax, which compromised the Social Security numbers and other personal data of more than 147 million consumers. It happened because the company failed to apply a patch to the popular open source web application framework Apache Struts, a software patch that had been available for several months.
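The underlying check, comparing what is installed against the earliest version known to contain a fix, is simple enough to sketch. Everything named below is hypothetical: the component names, the version numbers, and the idea that advisories arrive as a plain dictionary. Real version schemes can also include letters or pre-release tags that this dotted-number parser would not handle.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.5.10.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def needs_patch(installed: str, minimum_patched: str) -> bool:
    """True if the installed version is older than the first fixed version."""
    return parse_version(installed) < parse_version(minimum_patched)


def outdated_components(
    inventory: dict[str, str], advisories: dict[str, str]
) -> list[str]:
    """List components whose installed version predates the patched one."""
    return sorted(
        name
        for name, installed in inventory.items()
        if name in advisories and needs_patch(installed, advisories[name])
    )
```

Running a check like this on a schedule, against an accurate inventory of open source components, is one way to catch an unapplied patch before attackers do.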
There are a number of fundamentals to help organizations avoid becoming the next Equifax.
Building security into software requires the use of multiple tools and processes throughout the software development life cycle.
Once software is up and running, an ongoing requirement is to keep it secure. Those measures include:
Doing all this won’t guarantee perfect security. Nothing will. But an organization that invests in best practices to secure both its data and its software will have a pretty good chance of never hearing from the regulators who enforce the GDPR or any other privacy law. It will also develop much happier and more trusting customers.
All of which is good for the bottom line.