For nearly a decade privacy advocates have been warning that modern web browsers are leaking vast amounts of data that can be used to identify users online. Over the years the pendulum has swung back and forth as browser technology advancements and the desire for privacy seem to always be at odds with one another. This trend has continued, and an increase in consumer privacy awareness has paved the way for questions such as “Why do companies need all this information anyway? What does this data get used for? Should I be concerned?”
What is browser fingerprinting & why does it matter?
Browser fingerprints are combinations of persistent and non-persistent identifiers that are gathered passively through application programming interfaces (APIs) built into modern web browsers. These APIs are developed by standards bodies such as the W3C, which define what data attributes a browser should support in order to facilitate seamless cross-platform experiences.
Some examples of data attributes that can be collected and reported via browser fingerprinting are featured in the table below. This list is by no means comprehensive but rather is intended to illustrate elements that consumers would likely be familiar with.
Browser fingerprinting is used across a wide variety of industries. Several modern use cases are featured below; this is not a comprehensive list but rather highlights common use cases that most consumers are likely familiar with.
Digital Marketing – Online advertisers and merchants utilize fingerprints to target individuals online with digital marketing campaigns to purchase goods or services.
User Experience – Web developers and designers use fingerprints for analytics to build websites that provide a consistent user experience across platforms.
Return Device Recognition – Many organizations (B2B & B2C) utilize browser fingerprints to recognize users that are returning to use online platforms, products, and services.
Fraud Prevention – Banks and merchants frequently use browser fingerprints and associated technologies to detect and mitigate fraud attacks targeting their customers.
While some of these use cases may appear to use the technology for more legitimate purposes than others, the fact of the matter is browser fingerprinting can be used for good or bad purposes. While many consumers may not approve of the technology being used by merchants or advertisers to track their buying habits, most would likely agree that banks using the technology for purposes of fraud mitigation would be welcome.
What data can be collected from different browsers?
Modern web browsers can collect and report dozens of specific data attributes from the host device being profiled. These can provide significant insights into the user, their location, the type of device, connection methodology, OS settings, hardware specifications, and sensors built into a mobile device. While this data is very useful in and of itself, it becomes increasingly powerful when paired with sophisticated machine learning algorithms and third-party data sources with the intent of identifying a device. Below is a brief description of some of the data that can be gathered directly through the browser, along with some examples of how the information can be used in real-world applications.
Network Information – Based on the communications between the browser and server, it is possible to gather information about the device’s network connection. This includes things like IP addresses (public, private, DNS), the region where the device is located, which ISP it connected through, whether a VPN or proxy was used, the operating system of the host device, and which SSL/TLS connections are supported.
Hardware Information – Examples include information about the audio system, screen resolution, CPU type, number of CPU cores, amount of device memory (RAM), as well as the ability to identify the presence of a battery, microphone, camera, or Bluetooth.
Operating System Settings – This includes environmental settings and custom configurations made to the operating system, such as language, system time, time zone, supported languages, and supported fonts. Generally speaking, users within a given region tend to share similar settings, so a device that deviates from the regional norm can stand out.
Graphics Processing – Many modern web browsers have evolved over time to support advanced graphics for applications and games that are delivered directly through the browser. Along with this capability comes a lot of information about the graphics system itself such as the underlying make and model of the GPU and the ability to generate highly unique fingerprints based on the image processing capabilities of the device.
Sensor Data – One of the more recent additions to browser data comes from sensors built into many mobile devices. This would include signals emanating from sensors monitoring things like ambient light, acceleration, gravity, rotation, magnetic field, air pressure, and location.
Note: Availability of the above data will vary by browser and device in terms of precision, volume, and access rights.
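To make the idea of combining attributes more concrete, here is a minimal sketch in Python of what a server might do with the values a browser reports. It is an illustration under stated assumptions, not how any particular vendor works: the attribute names, the sample values, and the `fingerprint` and `similarity` helpers are all hypothetical. It shows two things: hashing a set of attributes into a single identifier, and why an exact hash alone is brittle when one attribute changes between visits.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Return a stable SHA-256 digest of the reported attributes."""
    # Sort keys so the same attributes always serialize identically.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def similarity(a: dict, b: dict) -> float:
    """Fraction of attribute names (from either profile) whose values match."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    matches = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return matches / len(keys)


# Hypothetical values a page might report for one device (assumed, not real data).
first_visit = {
    "language": "en-US",
    "time_zone": "America/Chicago",
    "screen": "1920x1080",
    "cpu_cores": 8,
    "device_memory_gb": 8,
    "gpu": "Example GPU 1000",
}
# Same device later, after the user connects a different monitor.
return_visit = dict(first_visit, screen="2560x1440")

print(fingerprint(first_visit) == fingerprint(return_visit))  # False
print(round(similarity(first_visit, return_visit), 2))  # 0.83
```

The exact hashes differ after a single change, while five of the six attributes still match. This is one reason real-world systems are likely to rely on scoring or machine learning rather than a simple hash comparison.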
How does this work in the real world? Why can’t I just hide myself?
Consumers often install additional software or browser plugins to bolster their online privacy. Others use VPNs, private browsing, and ad blockers in an attempt to hide themselves or become ‘invisible’.
While many of these tools are very useful in and of themselves, they are not necessarily the best way to hide oneself online. The very fact that these tools are being used is rather trivial to detect. The more of these ‘protective’ measures that get layered on top of each other, the more interesting and unique the device becomes, making it easier to identify. That runs counter to what most people would expect, so perhaps an analogy will help illustrate this.
Let’s say an individual in real life didn’t want anyone to see their face, what they were wearing, or their height. In order to achieve this the person dresses in all black, wears a ski-mask, long black trench coat, hat, and platform shoes. Mission accomplished. However, if that individual were to travel to a shopping mall, they would be extremely easy to spot as they would stick out from the crowd. The mall probably doesn’t get very many people coming through dressed that way on a regular basis so the more often our fictitious person were to return, the easier they could be re-recognized.
While this example may sound extreme or even ridiculous, it’s very similar to what people using multiple tools to disguise themselves look like online. The mere fact that few people use these technologies, and even fewer use several together, makes the persona highly unique and easier to spot. So, what should people do if they care about privacy and the amount of data their browser is leaking?
One strategy is to blend in with the crowd. The more a persona looks like everyone else online, the harder it is to pick out. Web browsers should be kept up to date and set to update automatically. Persistent identifiers such as an IMEI or MAC address should be blocked or randomized whenever possible. Non-persistent identifiers such as cookies and advertiser IDs should be regularly wiped and reset. If the website or application you are using offers opt-in or opt-out options, read these carefully and set your preferences accordingly. These activities may be difficult depending on the type of device, operating system, and apps/sites you are using, but it is always prudent to use products and services that are privacy-centric by design.
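The ‘blend in with the crowd’ argument can be put in rough numbers. The Python sketch below measures how much identifying information a trait reveals (its surprisal, in bits) and estimates how many other visitors would look the same. Every figure in it is made up for illustration: the trait names, the frequencies, and the population size are assumptions, and the calculation treats traits as statistically independent, which real traits often are not.

```python
import math

# Hypothetical share of visitors exhibiting each trait (made-up numbers).
TRAIT_FREQUENCY = {
    "common_browser_latest_version": 0.40,
    "ad_blocker_detected": 0.25,
    "vpn_exit_ip": 0.05,
    "canvas_readback_blocked": 0.01,
}


def surprisal_bits(frequency: float) -> float:
    """Identifying information, in bits, revealed by a trait seen at this frequency."""
    return -math.log2(frequency)


def expected_matches(traits: list[str], population: int) -> float:
    """Visitors expected to share all traits, assuming traits are independent."""
    share = math.prod(TRAIT_FREQUENCY[t] for t in traits)
    return population * share


baseline = ["common_browser_latest_version"]
layered = baseline + ["ad_blocker_detected", "vpn_exit_ip", "canvas_readback_blocked"]

for name, traits in (("baseline", baseline), ("layered", layered)):
    bits = sum(surprisal_bits(TRAIT_FREQUENCY[t]) for t in traits)
    print(name, round(bits, 1), "bits;", expected_matches(traits, 1_000_000), "look-alikes")
```

Under these assumed numbers, the layered persona shares its profile with far fewer visitors than the baseline one. That is the ski-mask effect from the analogy above: each additional rare trait shrinks the crowd a device can hide in.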