Three layers of your digital profile

9 min. read

Whether you like it or not, commercial and public actors tend to trust the string of 1s and 0s that represent you more than the story you tell them. When filing a credit application at a bank or being recruited for a job, your social network, credit-card history, and postal address can be viewed as immutable facts more credible than your opinion.

But your online profile is not always built on facts. It is shaped by technology companies and advertisers who make key decisions based on their interpretation of seemingly benign data points: what movies you choose watch, the time of day you tweet, or how long you take to click on a cat video.

Many decisions that affect your life are now dictated by the interpretation of your data profile rather than personal interactions. And it’s not just about advertising banners influencing the brand of the soap you buy—the same mechanics of profiling users and targeting messages apply to political campaigns and visa applications as much as supermarket metrics. When advertising looks like news and news look like entertainment, all types of content are profiled on the basis of your data.

So what story does your data tell about you?

The layers of your online profile

It would be nice to think that we have control over our online profile. After all, we’re the ones who feed terabytes of personal data into mobile apps and online platforms. We decide which photos we want to share and which should remain private. We accept or reject invitations, control tags, and think twice before publishing a post or a comment. We are critical and selective about the content we like or share. So why wouldn’t we be in control?

The bad news is that when it comes to your digital profile, the data you choose to share is just the tip of an iceberg. We do not see the rest that is hidden under the water of the friendly interfaces of mobile apps and online services. The most valuable data about us is inferred beyond our control and without our consent. It’s these deeper layers we can’t control that really make the decisions, not us.

Let’s peel open this data onion.

Examples of data that can be shared by users (first layer), observed (second layer) and generated by algorithmic analysis (third layer).

Click on the image to see the full size.

The first layer is the one you do control. It consists of data you feed into social media and mobile applications. This includes what you have revealed in your profile information, your public posts and private messages, likes, search queries, uploaded photos, tests and surveys you took, events you attended, websites you visited, and other types of conscious interactions.

The second layer is made of behavioral observations. These are not so much choices you consciously make, but the metadata that gives context to those choices. It contains things that you probably do not want to share with everybody, like your real-time location and a detailed understanding of your intimate and professional relationships. (By looking at location patterns that reveal devices that often meet in the same office buildings or “sleep” together in the same houses, tech companies can tell a lot about who you spend your time with.) It also tracks your patterns of when you’re online and offline, content you clicked on, time you spent reading it, shopping patterns, keystroke dynamics, typing speed, and movements of your fingers on the screen (which some companies believe reveal emotions and various psychological features).

The third layer is composed of interpretations of the first and second. Your data are analyzed by various algorithms and compared with other users’ data for meaningful statistical correlations. This layer infers conclusions about not just what we do but who we are based on our behavior and metadata. It is much more difficult to control this layer, as although you can control the inputs (posting photos of your newborn), you don’t know the algorithm that is spitting the output (that you might need to order nappies).

Here's how it works in practice:

The task of these profile-mapping algorithms is to guess things that you are not likely to willingly reveal. These include your weaknesses, psychometric profile, IQ level, family situation, addictions, illnesses, whether we are about to separate or enter in a new relationship, your little obsessions (like gaming), and your serious commitments (like business projects).

Those behavioral predictions and interpretations are very valuable to advertiser. Since advertising is meant to create needs and drive you to make decisions that you haven’t made (yet), marketers will try to exploit your subconscious mechanisms and automatic reactions. Since they cannot expect that you will tell them how to do this, they hunt for behavioral data and employ algorithms to find meaningful correlations in this chaos.

Binding decisions made by banks, insurers, employers, and public officers are made by big data and algorithms, not people. It saves a lot time and money to look at data instead of talking to humans, after all. And it seems more rational to place statistical correlations over a messy individual story.

Therefore, there’s a shared belief in the advertising industry that big data does not lie—that statistical correlations tell the “truth” about humans, their behavior, and their motivations.

But do they?

When your data double is wrong

The troubling thing is that we as users might not like or recognize ourselves in the profiles that are created for us. How would it feel if you discovered your "data double" is sick or emotionally unstable, not credit worthy, or not simply not cool enough, all because of the way you type, your search queries, or any “strange” relationships you may have?

Your online simulation may look nothing like your real-life one—yet it’s the one that the internet will treat you as.

Market players do not care about you—they care about numbers. Algorithms make decisions based on statistical correlations. If you happen to not be a typical individual, showing unusual characteristics, there is a chance that an algorithm will misinterpret your behavior. It may make a mistake regarding your employment, your loan, or your right to cross the border. As long as those statistical correlations remain true, nobody will care to revise this particular judgement. You’re an anomaly.

If the result of this algorithmic analysis is discriminatory or unfair—for example, your credit application is refused because you live in the “wrong” district, or your job application does not make it through because your social network is not “robust enough”—there is no market incentive to correct it. Why would they? You’re a single data point in a wave of billions. Why make an exception in the system just for you?

We can already see this playing out in China. As part of their “social credit score” system, every citizen is ranked on professional and personal interactions, online activity, and public appearances. Fail to pay a parking ticket? Look up banned topics online? Your actions in real life have lasting effects, such as your ability to buy train tickets or send your kids to good schools.

Scoring systems in the West place the same blind trust in big data, ignoring the specificity and uniqueness of individual cases. We can shake our heads at the absurdity of China’s social credit score all we like—but are we really that far off ourselves?

Will the real digital you please stand up?

We must take back control of our digital shadows. If we don’t, we’ll continue to be incorrectly and unfairly penalized in our lives, both online and off.

We can take measures to control the first layer of our online profile. Even though we are often impulsive or spontaneous with the data we share, we have the tools to control this process. We can choose not to post status updates or like pages. We do not have to use messaging systems embedded into social media platforms. We can encrypt our private communication by choosing certain messaging apps and block tracking scripts by installing simple plug-ins. We can even switch off metadata being stored in our photos by changing the default settings on our phones and making sure that they don’t have access to our locations.

But even if we make that effort, we cannot control what is observed and interpreted by algorithms. The second and third layer of our profiles will continue to be generated by machines.

The only way to regain full control over our profiles is to convince those who do the profiling to change their approach. Instead of hiding this data from us, they could become more transparent. Instead of guessing our location, relationships, or hidden desires behind our backs, they could ask questions and respect our answers.

Instead of manipulation, let’s have a conversation. Imagine that instead of having data brokers guess who you are, you could just tell them. Sharing real information would help make your online experience (and any offline ramifications) more accurate.

Sounds too radical or naive? Not really. European law already requires companies that engage in tracking and profiling to make it more transparent. The data protection regulation GDPR that was put in place in May 2018 gives European users the right to verify their data, including marketing profiles generated by data brokers, internet platforms, or online media. While companies can still protect their code and algorithms as business secrets, they can no longer hide personal data they generate about their users.

GDPR and its logic gives users a good starting point for negotiating the new power balance in data-driven industry. But what will make further transactions possible in the future is building trust. As long as we treat data brokers and marketers as the enemy and they treat us as an exploitable resource, there is no space for open conversation.

It is therefore time to treat users as active players, not passive participants. With GDPR in force and new companies building their competitive advantage on trust and transparency, new models of marketing and financing online content become realistic. Solutions that seem counterintuitive and risky may turn out to be the most natural way forward: Instead of telling users who they are, try listening to what they say.

Katarzyna Szymielewicz

The article was originally published in Quartz.