Anonymisation and pseudonymisation

Terminology

Anonymisation

Anonymisation is an irreversible process that removes any data that can be used to identify a person either directly or indirectly, rendering the data subject unidentifiable by the service owner or by third parties.

Once data is truly anonymised and individuals are no longer identifiable, the data will no longer be considered as personal data and thus not fall within the scope of OC 11.

Personal data

According to OC 11, "Personal Data means any information, in any form or medium, relating to an identified or identifiable person. It includes data such as name, passport or other national registration details, CERN ID number, banking information, personnel records, images and video-surveillance footage, online and device identifiers, addresses and telephone numbers, and Sensitive Personal Data."

This means personal data has to be information that relates to an individual who is either identified or identifiable.

Any individual who can be distinguished from others is considered identifiable.

An individual is directly identifiable if you can identify them solely from the information you hold.
They are indirectly identifiable if you cannot identify them from that information alone, but you could do so by combining it with other information you hold or you can access from another source (including information held by third parties).
A third party using your data and combining it with information they can access to identify an individual is another form of indirect identification.

An easy example of information that could be used to indirectly identify someone is an individual’s license plate number. The police (a third party) can match a name to a license plate number. Consequently, the license plate number is personal data.

Pseudonymisation

Pseudonymisation is a way of protecting personal data by replacing data which could be used to directly identify an individual with, for example, a fake name or ID number (i.e. a pseudonym).

This process makes it more difficult to directly identify an individual, but not impossible! It is considered that through pseudonymisation individuals are still identifiable or indirectly identified, because with use of additional information it is still possible to know who they are.

For this reason (that is, because pseudonymised data are personal data), OC 11 remains fully applicable to pseudonymised data.

Anonymisation and pseudonymisation are two apparently similar techniques used to protect personal data, but they work in significantly different ways.

Anonymisation is an irreversible process that completely removes or changes information that could identify a person. This means that once the data has been anonymised, it can't be traced back to an individual. As such, when data are truly anonymised, they are no longer considered personal data. Therefore, OC 11 is not applicable to anonymised data.

Pseudonymisation, on the other hand, is a method where personal identifiers are replaced with something else, like a fake name or a fake ID number. This doesn't completely conceal a person's identity, as it's still possible to figure out who they are with the right additional information.
Since pseudonymised data can potentially be linked back to individuals, it's still considered personal data. Therefore, OC 11 remains fully applicable.

Anonymisation and Pseudonymisation in a nutshell

What is anonymisation?	What is pseudonymisation?
Anonymisation is an irreversible process that completely removes personal identifiers from data, making it impossible to trace it back to an individual.	Pseudonymisation replaces personal identifiers with pseudonyms, making it more difficult (but not impossible!) to link it back to an individual.

Examples

anonymous vs. pseudonymous

Anonymisation and pseudonymisation at CERN - Why are these techniques important to Services?

Both of these techniques can help Controlling and Processing Services to safeguard personal data, as they are practical methods to implement the OC 11 principles, such as data minimisation and Privacy by design and by default.

For example, imagine you're doing a project to see how old most people working at CERN are. Instead of asking for everyone's exact birthday and name, you can just ask if they are in their 20s, 30s, 40s, and so on. This way, you get the information you need without making CERN's members of the personnel identifiable. In other words, you collect anonymised data.

By taking this approach, you ensure that your Service has the necessary data to analyse the age without collecting or otherwise processing exact birth dates, names, or any other personal data that could be linked back to an individual. This effectively reduces the risk for the individuals whose data you are processing while still allowing for the analysis of the age demographics within CERN.

This means you are practically and effectively adhering to the data minimisation principle (since you only collect “must-have” data), and, at the same time, you're using a "Privacy by design" approach, because from the start of your project, you're thinking about how to protect people's privacy.

Remember! Both anonymisation and pseudonymisation can help Services to protect personal data. But always keep in mind that while anonymised data are not considered personal data under OC 11, pseudonymised data are!
In practice, this means that if your activity involves pseudonymised data, you still need to full comply with your data privacy obligations, such as drafting a privacy notice in case you are a Controlling Service.

CERN Accelerating science

Anonymisation

Personal data

Pseudonymisation