This article has five parts. The third part opens the anonymity discussion and paints a bleak picture of global surveillance.
Contents of part 3:
11. Being of name unknown
12. Quick and easy shortcuts to anonymity
13. Adversary model
14. The ways you could be found
15. Addresses and other identifiers
16. Global data collection
• • •
Starting now, I will focus exclusively on the threat of de-anonymization - that someone finds out who you are.
I will formulate it more strictly - the threat is that an unspecified adversary is able to reveal your real world identity from your persistent virtual identity. In this context, the word "anonymous" itself will be used as if it also meant "untraceable", as in "untraceable to your real identity".
Note: hacking and covert messaging are not covered, although both require anonymity. Those are typically associated with illegal activity, and although much of the same reasoning applies, they have quite different MOs.
Disclaimer: What you have here is just an educated guess from some guy. There is always room for error, misinformation and further research. Make of it what you will.
Option №1: Don't go online
Since there is no online activity, de-anonymization is not an issue.
Option №2: Don't do anything of interest
In a sense, the Internet is a billion nobodies. If all you ever watch on YouTube are cooking channels, you might as well be anonymous. If you dare to speak up, but no one cares about what you say, you might as well be anonymous. Nobody cares about who you are, or that you are not even a dog. Nobody has a reason to de-anonymize you. There is no adversary.
Jokes aside, an adversary must exist. Depending on how powerful it is, the steps you'll have to take will vary.
Your meekest possible adversary is just another Internet user, holding a grudge against you for whatever reason, but having no legal firepower, and no insider knowledge.
Assuming you never revealed any information about yourself, intentionally or accidentally, his evidence is limited to your IP address (which is sometimes made public under posts), and his leverage is limited to filing a complaint with the site's support. As I mentioned before, an IP address alone is seldom enough to identify you, even if it's your home address. And if you were using any VPN, even without any special precautions - fuggedaboutit.
At the other end of this spectrum is a god-like adversary, who is capable of intercepting any transmission, hijacking any hardware and accessing any log file, database or surveillance footage at will. The kind of a government agency portrayed in conspiracy thrillers.
To be honest, I don't believe such an ideal adversary exists as a single entity, but in its absolute form (to which I will later refer as the collective "they") it makes a perfect model against which we can benchmark our defenses. Then we don't have to know what kind of subpoena is required to extract such and such data, we simply assume they could do it. If we can withstand an attack from such an omnipotent enemy (spoiler: we can't), then we are absolutely safe. And if barriers exist in their way (spoiler: they do), it gives us a chance.
There could be other adversaries, such as hackers, who could come into play more or less accidentally and probably won't have a direct goal of de-anonymizing you. They will not be held back by legal procedures, but since we have already granted our adversary all the powers, that doesn't give them much of an additional edge.
De-anonymization is not a virtual game, they need to find you in the real world. They won't stop after geo-location gives them a 5 mile radius, saying "oh, bother". They could go door to door. They could ask your neighbors. They could do anything outside the box.
Also, investigators don't need proof, they need a clue. The game is over when you become a suspect, not when you are found guilty. Hopefully, what you are doing is not against the law, so guilt is out of the question anyway, but even a rumored allegation could hurt you, and so could doxxing.
It is best to know exactly what kinds of information are transmitted, stored or otherwise associated with your activity, and how those breadcrumbs could lead back to you, and I will try to present a broad view of what is possible.
Generally speaking, there are two ways of finding you.
The first is following the direct evidence generated from your actions. In virtual space it's the log files that get created everywhere. You can think about it this way: each device that handles your data, be it a web server, VPN server, router, Wi-Fi access point or cellular base station, could record every piece of information it has at every moment. Even with limited storage capacity, it could store at least the addresses of the devices accessing it. And since it could, it's safe to assume it does.
Then, in theory, all it takes is to follow the chain of addresses backwards and it leads right to you.
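To make the "chain of addresses" idea concrete, here is a toy sketch. Everything in it is invented for illustration: the device names, the addresses (taken from the reserved documentation ranges) and the assumption that every hop kept exactly one record. Real logs are messier and time-dependent, so treat this as a picture of the reasoning, not of any actual tool.

```python
from typing import Dict, List

# Hypothetical logs: for each link in the chain, the address it saw the
# request arrive from. All names and addresses are made up.
SEEN_FROM: Dict[str, str] = {
    "web server": "203.0.113.7",            # site log: address of the VPN exit
    "203.0.113.7": "198.51.100.24",         # VPN log: client's ISP-assigned address
    "198.51.100.24": "subscriber account",  # ISP log: address lease -> customer
}

def trace_back(start: str, logs: Dict[str, str]) -> List[str]:
    """Follow the records backwards until some link has no record."""
    chain = [start]
    while chain[-1] in logs and logs[chain[-1]] not in chain:
        chain.append(logs[chain[-1]])
    return chain

if __name__ == "__main__":
    print(" <- ".join(trace_back("web server", SEEN_FROM)))
```

If any single record is missing (say, the VPN entry), the walk stops at that link, which is the whole point of trying to break the chain somewhere.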
In the same category falls the personal data that you give up voluntarily as you register at the sites. If you used your real e-mail address or cell phone to receive that one-time confirmation code, you have tied the account to your name just the same. With that kind of evidence no search is required at all.
And if that wasn't enough, there is a second, even more powerful approach, called correlation. Correlation is a technique by which data from multiple unrelated information sources is cross-matched to find patterns and connect events that would otherwise appear unconnected. "Wing of a butterfly" kind of intelligence.
As an example, let's say that you have an anonymous Twitter account and a real Facebook account, and every time you post to Twitter, you also open Facebook, because you are already at the computer, and who could resist the temptation, right ? Now if they match Twitter logs to Facebook logs, it's trivial to see that the activity at your two accounts correlates perfectly. It is circumstantial and requires access to unrelated information sources, but it could potentially connect all the dots.
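A rough sketch of how such a cross-match could look, assuming the adversary already holds activity timestamps (in seconds) for the anonymous account and for a few candidate real accounts. The function name, the 5-minute window and the sample data are all my own assumptions; real correlation would be statistical and far more careful about coincidences.

```python
from bisect import bisect_left
from typing import Sequence

def match_ratio(anon_times: Sequence[int],
                candidate_times: Sequence[int],
                window: int = 300) -> float:
    """Share of anonymous-account events that have a candidate-account
    event within +/- `window` seconds."""
    if not anon_times:
        return 0.0
    cand = sorted(candidate_times)
    hits = 0
    for t in anon_times:
        # First candidate event not earlier than (t - window).
        i = bisect_left(cand, t - window)
        if i < len(cand) and cand[i] <= t + window:
            hits += 1
    return hits / len(anon_times)

if __name__ == "__main__":
    anon = [1000, 5000, 9000, 20000]           # invented sample data
    alice = [1100, 4900, 9200, 20100, 30000]
    bob = [1050, 12000, 15000]
    print("alice:", match_ratio(anon, alice))
    print("bob:  ", match_ratio(anon, bob))
```

A candidate whose ratio stays close to 1.0 over weeks of activity stands out quickly, even though no single match proves anything.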
Therefore, in order to stay anonymous, it's not enough just to hide your traffic. You need to exercise a great deal of discipline, hygiene and paranoia in real life as well.
The idea that every device records all the information it has was of course an exaggeration, but not by much. It is absolutely correct in two important respects.
First, that there exists an extended chain of devices and/or services that transfer data between you and your target, whether it's a direct web site connection, or an e-mail being delivered later.
Second, that each device or service in that chain always has some kind of information identifying the adjacent links, even if only for the short duration of the transmission, and this is not restricted to IP addresses. Typically, it is some kind of number, used as an address in the underlying network protocol. And although we can't be sure what is being recorded where, logging just a number and a timestamp takes almost no space and is sufficient for tracing.
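For a feel of how little space "a number and a timestamp" takes, here is a back-of-the-envelope sketch. The record layout (an 8-byte timestamp plus a 4-byte IPv4 address) and the figure of one million connections per day are my assumptions, not a description of any real logging system.

```python
import ipaddress
import struct

# Assumed layout: 8-byte unsigned timestamp + 4-byte IPv4 address,
# network byte order, no padding.
RECORD = struct.Struct("!Q4s")

def pack_record(timestamp: int, ip: str) -> bytes:
    """Pack one (timestamp, address) pair into a fixed-size record."""
    return RECORD.pack(timestamp, ipaddress.IPv4Address(ip).packed)

def yearly_bytes(connections_per_day: int) -> int:
    """Storage needed to keep one record per connection for a year."""
    return RECORD.size * connections_per_day * 365

if __name__ == "__main__":
    print(len(pack_record(1_700_000_000, "198.51.100.24")), "bytes per record")
    print(yearly_bytes(1_000_000) / 1e9, "GB per year")
```

Under these assumptions a full year of records for a fairly busy device fits on a cheap USB stick, so storage cost is hardly a reason not to log.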
It is important to realize that none of these addresses could be faked, otherwise the transmission would have been impossible.
But what about VPNs ? Are they not supposed to give you a fake IP address of some kind ? No, not at all. All a VPN does for you is act as a front, replacing your real IP address with its own real IP address. It's just a detour, and just like when you drive away from the mall in a rented car, it can still be traced back to the garage, if not to the exact parking spot. In technical terms, when a service assumes your identity for a while, it is called "proxying". So a VPN is nothing but a proxy that temporarily lends you its IP address.
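Here is a deliberately simplified toy model of proxying. The class, its methods and the addresses are hypothetical, and a real VPN works differently under the hood (encrypted tunnels, address translation). It only shows the one thing that matters here: to deliver replies, the proxy has to remember which client is behind each outgoing connection.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str

class ToyProxy:
    """Toy stand-in for a VPN server; not how any real product works."""

    def __init__(self, own_address: str) -> None:
        self.own_address = own_address
        self.sessions: Dict[str, str] = {}  # outgoing src -> real client address
        self._next_port = 40000

    def forward(self, packet: Packet) -> Packet:
        # Replace the client's address with the proxy's own address and port.
        out_src = f"{self.own_address}:{self._next_port}"
        self._next_port += 1
        self.sessions[out_src] = packet.src  # needed to route the reply back
        return Packet(out_src, packet.dst, packet.payload)

    def client_for(self, out_src: str) -> str:
        # The same table that routes replies also answers "who was it ?"
        return self.sessions[out_src]

if __name__ == "__main__":
    proxy = ToyProxy("203.0.113.7")
    seen = proxy.forward(Packet("198.51.100.24", "192.0.2.80", "GET /"))
    print("site sees: ", seen.src)
    print("proxy knows:", proxy.client_for(seen.src))
```

The site only ever sees the proxy's address, but the mapping exists at the proxy for at least as long as the connection does. Whether it gets written to disk is a separate question.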
And IP addresses are not the only numbers that appear in the logs. VPNs therefore couldn't protect you from revealing the rest, even theoretically. Hopefully, it's clear now why I did not specifically address VPNs yet. It's because they are not the only part of the puzzle, not by any stretch.
What other addresses could get logged besides IP ?
Every device that you use has a unique identifying number. Network cards all have MAC addresses, which are revealed at transmission. Cell phones and cellular modems have IMEIs, SIM cards have (guess !) phone numbers, and that pretty much covers the devices through which you could get online.
Account names and emails, although not numbers, are also effectively addresses and are routinely logged. So are credit card numbers, payments, tickets and transaction identifiers that link to banking, payment and transportation systems and other similar data sources.
Bridging the gap into the physical world are cell phones and video surveillance.
Without going into details, the approximate location of your cell phone is always known (and hence recorded). It is routinely used by law enforcement for corroborating alibis, incriminating connections and building a list of suspects (or at least potential witnesses) in the area. But the cell phone you can at least forget at home.
Your face you can't. When you are being watched all the time, you leave a trace even when you don't engage in any activity whatsoever. The mere fact of your presence is deemed worthy of recording. And your face is the ultimate identifier: it follows you around and correlates all your other actions perfectly.
In its current state, video surveillance is inefficient, to say the least. It takes a great amount of storage (and therefore needs to be overwritten periodically), and to find anything you need a pair of eyes watching for hours with no guarantee of success.
The game will be over when face recognition systems kick in universally. Then there will be no need to keep terabytes of useless video. Instead, a stream of face fingerprints would be stored forever, and the search across this database would be automated.
That will effectively be the end of legal anonymity in public.
As you can see, the more data they have, the more power to correlate it gives them. And if you are still not concerned, here is something else to think about.
Governments and Big Tech corporations are acutely interested in global data collection (although their reasons are different). Even when they are not looking for you or anyone in particular (at the moment), having collected as much information as possible is a highly desirable goal in itself.
Governments want control to make sure nothing threatens their power. They get there by making illegal everything they can, under the pretext of protecting the children, preventing terrorist attacks or something equally fear-inducing. But at the end of the day it is you who are under complete control, and freedom of speech is outlawed. Does it surprise you that certain governments make VPNs illegal ?
Corporations simply want to steer everyone's life, even if for profits alone (and that is a billion-dollar incentive already). If you believe it is justifiable to have your data collected across web platforms, messaging and voice calls, e-commerce, shops and payment systems, just so they can have better margins, you ought to accept everything that comes along with it, including their power to redefine your thoughts.
There is even more danger in this than in what governments do. At least some governments were elected by people, and some are under checks and balances. Corporations exercise their power in a legal vacuum, with no control whatsoever. Unrestricted data collection and processing are not even questioned.
As far as freedom of speech goes, the Big Tech corporations come up with their own arbitrary "Community Guidelines" akin to a church covenant, but then have them modified and applied at their own discretion. It is impossible to fight back, because unfair deranking, demonetization and other forms of censorship are asserted to be the acts of the fair and well meaning God-like Algorithm which works in mysterious ways. "Algorithm made us do it" is how the excuse goes.
Both parties use globalization to put more information at their disposal.
Corporations use acquisitions, and it is exactly the user base which is being sought (notice how we ourselves measure the power of social networks by their user counts). No one in this market cares about "algorithms", "expertise", "intellectual property" (aside from securing the patents), or any such nonsense. It is the size of the (potential) user base, along with the terabytes of their (future) data, that matters. And then, do you think many startups would think twice about collecting everything they could, if doing so would increase their chances of being acquired ?
And what do you think AI and machine learning are all about ? Forget walking robots and self-driving cars, think automated processing of enormous amounts of raw information, finding patterns, correlating facts and then building a searchable data set out of everything that happens in the world.
Governments cannot buy each other out, of course, at least not in the first world. What they can do is form alliances. They sign treaties to cooperate in making information available to every participating country. The current examples are the "eyes" alliances. The "5 eyes", "9 eyes" and "14 eyes" are increasingly large groups of countries that pool their surveillance powers and jurisdictions, forming a kind of law enforcement umbrella. Such conglomerates get created with only one goal in mind - more control.
• • •
Thank you for reading !
In the next part of the article:
17. The barriers
18. The traces that you leave
19. No-logging VPNs
20. Traces you didn't know you were leaving
21. Plan of defense