Personal data is being discussed a great deal at the moment, as the new European General Data Protection Regulation (GDPR) becomes applicable next May 25th.

People are increasingly attentive to how their data is used. Organizations will now be obliged to give clear evidence, both to the supervisory authority and to the people to whom the collected information belongs, of all the technical and organizational procedures they adopt to manage this data.

But what is meant by “personal data”? What kind of data falls into this classification?

The regulation defines personal data as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person” (Art. 4(1)).

Technological development has significantly influenced both the methods of data acquisition and the granularity of the information collected.
The regulation therefore rightly extends the scope of “personal data”.

The Internet has facilitated the collection of very detailed information on how people behave. What can be observed and turned into data is not limited to registrations, purchases and archived records; it also includes the micro-steps that lead to these activities.
“Personal data” therefore includes data derived from monitoring the behavior of individuals who are tracked on the internet: an IP address, for example, if it can be traced back to the person using the device, and the detected position of the person’s device are both considered “personal data”.

New technologies and sensors make granular observation possible in both the physical and the virtual world. All major shopping centers have CCTV cameras, and images can be transformed into data. Cars are equipped with sensors that continuously measure the vehicle’s operating metrics. The combination of online and physical observation has facilitated the expansion of observational data. Although this data originates from the actions of individuals, those individuals are not active participants in its creation.

People therefore cannot know which personal information such advanced technologies collect.
Still less can anyone be aware of what information is derived from predictive statistical processing that, by combining profiled data with generic behavioral data, can build a precise profile of a natural person in order to make decisions about that person or to analyze or predict his or her preferences, behaviors and personal positions.

“Personal data” therefore also includes data that is collected automatically or that results from subsequent reprocessing.
To better understand what counts as personal data, we can refer to the classification of the Information Accountability Foundation, which differentiates the types of personal data according to the origin of the information.

The Foundation is globally recognized and is a reference point for legislative authorities as well. It works to frame and advance data protection law and practice through accountability-based information governance that protects people’s rights to privacy and autonomy.

The Information Accountability Foundation distinguishes four main types of data based on their origin:


  • PROVIDED DATA – data provided directly by the person:
    • INITIATED – data produced by individuals who undertake an action that starts a relationship, such as requesting a loan, registering to vote, signing a license, or registering on a website.
    • TRANSACTIONAL – data created when a person is involved in a transaction. Transactions can include buying a product with a credit card, paying an invoice, answering a question, or taking a test.
    • POSTED – data created when individuals express themselves proactively in a post on social networks, generating information that will be seen or listened to by others.
  • OBSERVED DATA – data that is observed and recorded:
    • ENGAGED – data that comes, for example, from online cookies, loyalty cards, sensors on personal devices and other cases in which the individual is informed of the observation at a given time.
    • NOT ANTICIPATED – data collected by sensors whose existence people are aware of, while having little perception that those sensors handle data concerning them. For example, a person may know that there are sensors in the car’s wheels and the engine oil sump, but may not realize that they collect data about how he maintains his car.
    • PASSIVE – data collected from observation of which the individual is completely unaware. For example, consider CCTV videotaping in public places when combined with facial recognition.
  • DERIVED DATA – data derived in a fairly mechanical way from other data, becoming a new data element related to the individual:
    • COMPUTATIONAL – data calculated by an arithmetic process performed on existing numerical elements. For example, a lender could create computational data by calculating the ratio of mortgage debt to total consumer debt; an online seller could calculate the average spend per visitor; and a trader could calculate the percentage of items returned compared to those purchased. Each of these new computational data products constitutes information that the organization could use to understand behavior better and make decisions about the individual. Typically, the person is not aware of the creation of this data.
    • NOTATIONAL – data derived from classifying individuals as part of a group, based on the common characteristics shown by group members. For example, a seller might notice that his customers share six attributes and look for the same characteristics in a group of potential customers.
  • INFERRED DATA – data deduced from an analytical process based on probability. It derives from further analysis that aggregates information from different contexts. The person is usually not aware that data concerning him is being created as the product of the inferences drawn from this type of analysis:
    • STATISTICAL – data derived from statistical processing. Examples include credit risk scores, most fraud scores, response scores, and profitability scores.
    • ADVANCED – data generated by advanced analytical processes such as those found in big data. This data is generated by correlative analysis on larger and different data sets. For example, in the medical field, Big Data is starting to generate insights into the likelihood of future health outcomes.
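
The taxonomy above lends itself to a small data structure, for example when building an inventory of the data an organization holds. The following Python sketch is only an illustration: the names `Origin`, `SUBTYPES`, `DataElement` and `return_rate` are assumptions introduced here and are not part of the regulation or of the classification itself. It tags a data element with its origin and works through one of the computational examples from the list, the percentage of items returned compared to those purchased.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """The four top-level origins named in the list above."""
    PROVIDED = "provided"
    OBSERVED = "observed"
    DERIVED = "derived"
    INFERRED = "inferred"


# Sub-types listed under each origin in the text.
SUBTYPES = {
    Origin.PROVIDED: ("initiated", "transactional", "posted"),
    Origin.OBSERVED: ("engaged", "not anticipated", "passive"),
    Origin.DERIVED: ("computational", "notational"),
    Origin.INFERRED: ("statistical", "advanced"),
}


@dataclass(frozen=True)
class DataElement:
    """One entry in a hypothetical data inventory."""
    name: str
    origin: Origin
    subtype: str

    def __post_init__(self) -> None:
        # Reject sub-types that do not belong to the chosen origin.
        if self.subtype not in SUBTYPES[self.origin]:
            raise ValueError(
                f"{self.subtype!r} is not a sub-type of {self.origin.value} data"
            )


def return_rate(items_returned: int, items_purchased: int) -> float:
    """Computational data: percentage of purchased items that were returned."""
    if items_purchased == 0:
        return 0.0
    return 100.0 * items_returned / items_purchased


# A derived value is recorded in the inventory like any other personal data.
element = DataElement("return_rate", Origin.DERIVED, "computational")
rate = return_rate(items_returned=3, items_purchased=12)
print(element.origin.value, element.subtype, rate)
```

Recording a derived value in the same inventory as the data the person provided directly is a deliberate choice in this sketch, since the person is typically not aware that such a value exists.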

Whatever its origin, data that is directly attributable to an individual is to be considered “personal data” under the law.
Data that has been de-identified, encrypted or pseudonymised, but that can still be used to re-identify the person, is also considered “personal data”.
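
To see why pseudonymised data can remain personal data, consider a minimal Python sketch that uses a keyed hash (HMAC-SHA256 from the standard library) as one possible pseudonymisation technique. The key, the function names `pseudonymise` and `reidentify`, and the e-mail addresses are hypothetical and chosen only for illustration; the regulation does not prescribe this or any other specific method.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be stored separately from the data set.
SECRET_KEY = b"example-key-kept-separately"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(pseudonym, known_identifiers, key: bytes = SECRET_KEY):
    """Return the identifier behind a pseudonym, or None if it is not found.

    Whoever holds the key and a list of candidate identifiers can recompute
    the hashes and find the match.
    """
    for identifier in known_identifiers:
        if hmac.compare_digest(pseudonymise(identifier, key), pseudonym):
            return identifier
    return None


token = pseudonymise("alice@example.com")
print(token != "alice@example.com")
print(reidentify(token, ["bob@example.com", "alice@example.com"]))
```

In this sketch the token no longer shows the e-mail address, yet anyone holding the key and a list of candidate identifiers can link it back to the person, which is why such data would still have to be treated as personal data.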

The regulation requires organizations to take all necessary security measures to prevent breaches of the confidentiality of personal data and so to respect the privacy of individuals.

Now that we have shed light on how broad the collection and processing of personal data really is, it is clear that an efficient IT security system, together with the related procedures and organization, must receive close attention from companies and public organizations.
But that is not all. IT security is the natural domain of IT departments, yet their skills are often specialized around the company’s core business or tied strictly to operational and technological capabilities. Adopting a technical system does not solve the problem of protection, which as we have seen is much broader; it only provides tools with which the necessary mitigation measures can be implemented.
Which tools to adopt and when to activate them cannot be a decision left to the IT department alone.

Classifying data and evaluating the effectiveness of possible measures require specific analytical skills that can focus efforts in the right direction.
Alongside the adaptation of company processes and procedures to the protection standards, a correct analysis of the asset that company data represents also brings the opportunity to extract significant business value from it.

How ZeraTech can help you:

We combine our consolidated skills in the design of complex information systems with new skills in the field of Data Analysis. With an end-to-end approach, we support organizations in identifying and classifying their data and its sources. We help and guide IT departments in the difficult task of putting the necessary data protection in place, and at the same time we support the line of business in analyzing and defining the strategic value of the data that the company owns.
