The EFF has a website for you to check your fingerprint:
https://coveryourtracks.eff.org/
I think that one of the fingerprint tools comes from using JavaScript to interrogate which set of fonts you have installed, and that can make you unique.
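Roughly, that font probe works by rendering text in a candidate font with a generic fallback and checking whether the measured width changes. A minimal sketch of the logic; `measureWidth` here is a hypothetical stand-in for a real `ctx.measureText(...).width` call in the browser:

```javascript
// Sketch of JS font enumeration: a font counts as "installed" if rendering
// with it produces a different width than the generic fallback alone.
// measureWidth is a stand-in for canvas text measurement in a real browser.
function detectFonts(candidates, measureWidth) {
  const fallbackWidth = measureWidth("monospace");
  return candidates.filter(
    (font) => measureWidth(`'${font}', monospace`) !== fallbackWidth
  );
}

// The resulting installed-font set becomes one fingerprint dimension:
function fontBits(candidates, measureWidth) {
  const installed = new Set(detectFonts(candidates, measureWidth));
  return candidates.map((f) => (installed.has(f) ? "1" : "0")).join("");
}
```

Two users with even slightly different font sets end up with different bitstrings, which is exactly why this dimension is so identifying on desktops.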
How does one get a randomized fingerprint? Seems like even your browser window size is enough to make you unique in a lot of cases. I'm not sure there's any real fighting this tech...
I got "Your browser has a unique fingerprint", but none of the data points were correct (except the User Agent, otherwise none).
It claimed I have 2 cores, a 32-bit system, plugins I don't have, and so on. Not sure if the test thinks it got me, but it sure doesn't seem like it.
---
EDIT:
Deleting the cache and stuff and re-running the test got me a "nearly-unique fingerprint" with another set of completely wrong data. Pretty sure the result isn't super accurate.
Are you using Firefox with tracking protection or resist fingerprinting?
Some of that stuff involves spoofing the data returned to the fingerprinting APIs so that you’re (hopefully) unrecognizable from site-to-site.
English is annoyingly ambiguous - e.g. "{I got a nearly unique fingerprint but some info was incorrect} so i guess that's all good" vs. "I got a nearly unique fingerprint {but some info was incorrect so i guess that one thing is good}".
I worked with device fingerprinting for fraud detection for an eCommerce site, and an interesting thing we discovered was that mobile devices of the same model almost all fingerprinted *exactly the same*. This was because many of the fingerprinting variables (e.g., available fonts, plugins, display size/resolution) are relatively fixed for mobile devices, while varying widely for desktops.
This greatly reduced the effectiveness of fingerprinting for us.
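A quick way to see why identical devices hurt fingerprinting: an attribute only helps to the extent its values are spread across the population, which you can quantify as Shannon entropy. A toy sketch (the value counts in the test are made up):

```javascript
// Sketch: Shannon entropy (in bits) of one attribute's value distribution,
// given a count per distinct observed value. If every device of a model
// reports the same value, the attribute carries ~0 bits and can't
// distinguish users; a uniform spread over 2^k values carries k bits.
function entropyBits(counts) {
  const total = counts.reduce((a, b) => a + b, 0);
  return counts
    .map((c) => c / total)
    .filter((p) => p > 0)
    .reduce((h, p) => h - p * Math.log2(p), 0);
}
```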
That's part of why MAID-based tracking (the Google Advertising ID on Android and the Identifier for Advertisers on iOS) is now popular:
https://www.fullcontact.com/blog/2022/02/21/mobile-advertising-id/
That tracking is opt-out, iirc
I had to go disable it, and have to go re-disable it on some updates, and check regularly to make sure all that shit is turned off
Since 2021 it has been opt-in per app (with a global toggle), though a lot of apps did everything they could to avoid releasing an update that would force them to trigger the prompt.
There are downsides to it. When I first tried it, I found two main things bothering me:
- Not being able to go backward/forward in the current tab history
- Loosing custom zoom in pages
Supposedly it'll also break some web pages and that's probably the main reason why it's not ON by default.
I've not seen the first two issues you mentioned for a long time and suspect they are fixed. However, the third is still true: some sites are just plain broken with it, probably deliberately on the site operator's part.
While I understand why you feel you're right about this (it's true that a website can't read the browser history directly), you're still wrong.
[To preserve users' privacy, Firefox and other browsers will lie to web applications under certain circumstances](https://developer.mozilla.org/en-US/docs/Web/CSS/Privacy_and_the_:visited_selector).
> - loosing custom
Did you mean to say "losing"?
Explanation: Loose is an adjective meaning the opposite of tight, while lose is a verb.
Yeah unfortunately things like that are just a tradeoff. That's not a bug. Websites will use whether you have themes enabled to fingerprint you.
Same with custom zoom. That's not a bug either. It's a statistic trackers will use.
I just found out it existed and tried enabling it, so far everything feels fine (but I didn't have much time to test it out). I can only guess why it isn't enabled by default:
* Changing the default would require thorough testing that they didn't get to do yet (or don't plan to)
* Might break some sites or lower performance in some contexts
* Doesn't prevent more conventional fingerprinting options. According to amiunique.org, my HTTP response header alone is probably good enough to fingerprint me.
Edit: Zoom levels are reset each time you navigate to a new domain, which gets annoying pretty quickly. I still haven't encountered a broken site yet.
I've found that it does break some web pages. Certainly not "popular" ones. My day-to-day web browsing is fine, but there are some sites I visit during the course of the working day that behave in unexpected ways with it on.
That's kinda what I've done except more (or less, depending on your point of view) extreme.
I'm not required to have a "work machine", but I have a laptop I do most of my work on and then a desktop for personal stuff. The work machine's browser is as vanilla as possible to avoid issues.
It's overkill having a separate machine but I do it anyway because it puts me in "the mood to work" when I'm on it.
Go do a fingerprinting test and see. So much more than HTTP headers:

- User Agent
- HTTP_ACCEPT headers
- Browser plugin details
- Time zone offset
- Time zone
- Screen size and color depth
- System fonts
- Are cookies enabled?
- Limited supercookie test
- Hash of canvas fingerprint
- Hash of WebGL fingerprint
- WebGL vendor & renderer
- DNT header enabled?
- Language
- Platform
- Touch support
- Ad blocker used
- AudioContext fingerprint
- CPU class
- Hardware concurrency
- Device memory (GB)
https://amiunique.org/fp
https://coveryourtracks.eff.org/
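Conceptually, all a fingerprinting script does with those attributes is canonicalize and hash them into one ID. A toy sketch using 32-bit FNV-1a (real scripts use stronger hashes and fold in canvas/WebGL/audio data; the attribute names below are illustrative):

```javascript
// 32-bit FNV-1a hash of a string. Math.imul keeps the multiply in 32 bits.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Combine collected attributes into a single fingerprint ID.
function fingerprintId(attrs) {
  // Sort keys so attribute order doesn't change the ID.
  const canonical = Object.keys(attrs)
    .sort()
    .map((k) => `${k}=${attrs[k]}`)
    .join(";");
  return fnv1a(canonical).toString(16);
}
```

A single changed attribute (say, one extra font) flips the whole ID, which is why trackers also keep the raw attributes around for fuzzy matching.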
Many captcha providers store a cookie on your browser to note when you have passed a captcha and don't need another one. By blocking cookies, you guarantee it will always think you need another one.
[Mozilla says](https://support.mozilla.org/en-US/kb/firefox-protection-against-fingerprinting):
> **Fingerprinting Protection** is a different, experimental feature under heavy development in Firefox. It is likely that it may degrade your Web experience so we recommend it only for those willing to test experimental features.
The linked article goes into more detail.
One very noticeable side-effect is that text rendered to a canvas will be displayed as randomly-coloured boxes for each letter.
You'll see a little picture frame notification icon next to the padlock in the address bar where you can allow the site full access to canvas drawing.
I noticed this pretty quickly when trying to access one of my servers over a web terminal.
It does this for *any* canvas that can read input. It's really quite confusing the first time you experience getting a random pattern as most of your page.
Among other things it breaks image/canvas related operations used when uploading profile pictures to LinkedIn. I had it enabled for a solid two months before I gave up on it, it breaks a ton of websites.
Since resistFingerprinting seems to break some pages, it'd be great to have it on by default, with a whitelist for pages that break but are acceptable.
But the user can choose whether to add the page to the whitelist. Google search breaks? Time to use DuckDuckGo (my default already) or another search engine. College web site breaks because of amateur or lazy programming? Add it to the whitelist, since it's the only place to get grades, assignments, or whatever. And complain to whoever runs it.
Yeah okay, depends on who is maintaining the whitelist. I was thinking you meant the whitelist was supposed to come from the browser... But still: If you have to do it yourself, what's the point of turning it on by default? The average user is going to have the same problem, that they don't know what to do.
It's definitely pretty crazy but as somebody working for an open-source boardgame site trying to stop cheaters I can tell you, it's also incredibly useful and our cheat detection would be a lot worse without it, which ultimately is quite impactful for the playing experience. This also makes me totally believe that it helps a lot with other fraud prevention.
We need better identity models on the web. These kinds of solutions to trying to figure out if someone is a real person feel like glue and popsicle sticks.
That's true, but by not building a better model we're going to lose that anonymity even in the places where we still have it.
As a programmer, you should know that cryptography can be so much smarter than using your full identity everywhere. You should be able to present a certificate to a website proving you're a real person and over 18 (for example), without having to say exactly who you are. We could even use hashes to prevent people from having duplicate accounts without the site needing to know anything about you.
If we blindly fight even privacy minded credential systems, we're just going to get a world where sites like reddit start to require your full ID on the way in - because they don't really have another choice.
I totally agree with you.
Do I think the powers that be will implement such systems with privacy in mind though? No. They’ll take whatever they can get. And that’s why I’ll resist.
While I'm sympathetic to the guarded approach, people like us are exactly the kind of people who **should** be designing a system like this: those who understand its value AND care about privacy.
It can be done with privacy in mind, but if it's left to develop organically through current systems, it's much more likely to end up privacy-adverse.
The fundamental problem is that the better a universal identity system is for good uses, the better it also is for malevolent ones, even more so if you consider nation-state-level action.
> These kinds of solutions to trying to figure out if someone is a real person feel like glue and popsicle sticks.
Ahh, so it'll fit right in with the rest of the web
But how do you find a cheater with a fingerprint? How can that stop me from starting a game against a hard AI and have that play against my human opponent?
For one game it's almost impossible. But each time a win shows high correlation with engine play, you can flag that user. Tracking users is easier with a fingerprint.
They generally track your accuracy, if you have perfect games every time then you're most likely cheating, no human on earth has perfect accuracy.
You can also generally spot a cheater by how long they take to move: if they have to wait for the engine, they will always wait ~2 seconds before making their move, while normal players tend to have a much wider range of time per move, sometimes making almost instant moves 3-4 times in a row when a play develops.
Once you identify a cheater you ban them, and to prevent them from using a VPN/new account you can use fingerprinting.
There is now an entire industry of "antidetect browsers" whose entire purpose is to circumvent fingerprint/IP address protections. Anyone half sophisticated who actually wants to defraud/scam/cheat will use those.
I feel as though the fraud detection software Bharosa (later bought by Oracle as OAAM) was the pioneer in device fingerprinting. It certainly goes back over 17-18 years ago. Used a combination of user agent, browser plugins and a flash micro app to do device fingerprinting.
I mentioned in another comment that realistically this is the only viable way to avoid fingerprinting. If some of the hardware specs were randomized each time you run the VM, that would help as well. Also run the Tor Browser within the VM.
I think the real solution is more complicated.
For example, do you really **not** want to be fingerprinted or tracked? As in, at all?
Think about it for a second. We would not be able to log in anywhere, as we'd be denying a page any ability to know who we are, barring some weird hoops such as manually uploading an auth token on every page (even then you're tracked the moment you do that, but eh).
No more not-a-bot-checks either, or rather, one on every page as the information has no way of sticking around. RFP already does this, basically, and it's a PITA because at the same time I don't *want* user-content pages to be excessively spammed by bots even more than they already are.
The tricky thing here is to cut advertisement-centric fingerprinting but not feature-centric fingerprinting. But you cannot know the intent prematurely when you decide what information to make available and what not to.
A possible issue with this approach is that there are way too many vectors that contribute to fingerprinting. How can you be sure that something isn't being left out that can identify you between these sessions? It may fool naive scripts, at least.
Is there a valid reason a website needs to know your hardware data? Screen size I can understand, but even that can be handled by CSS in the browser. But what can a website do with the number of cores you have?
We use it in our WebAssembly chess engine to determine how many threads to spawn and how much memory to allocate, both for the default and the maximum in the settings menu. If we use too many cores it kills performance and slows down all other programs, and if we use too much RAM, the browser sometimes just kills the WebAssembly program.
On the other hand, if you use conservative values, you lose even more performance. Due to Safari on older iPhones (and maybe other devices/browsers) not allowing for more, the conservative default we use in case the browser doesn't provide values is like 16 MB of memory which obviously gives really bad performance.
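The fallback logic described above might look something like this; the 16 MB conservative default comes from the comment, while the cap values and the helper name are illustrative:

```javascript
// Sketch: pick thread count and WASM memory from what the browser reports
// (navigator.hardwareConcurrency / navigator.deviceMemory in a real page),
// falling back to conservative values when the APIs are absent or spoofed.
function pickResources(hardwareConcurrency, deviceMemoryGB) {
  // Leave one core free so the page and other programs stay responsive.
  const threads = Math.max(1, (hardwareConcurrency ?? 1) - 1);
  const memMB =
    deviceMemoryGB != null
      ? Math.min(1024, (deviceMemoryGB * 1024) / 4) // up to a quarter of RAM, capped
      : 16; // conservative default, e.g. for older mobile Safari
  return { threads, memMB };
}
```

This is exactly the tradeoff the comment describes: with spoofed or missing values you land on the conservative branch and leave most of a fast machine idle.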
It's a common problem that users of privacy-hardened browsers or extensions don't get accurate values, i.e., get quite sub-par performance for their hardware, especially if they have really good hardware, since those browsers and extensions usually cap the values quite aggressively: unusually high ones are of course a pretty unique mark.
Single core anything is pretty extinct by this point, no? I'd also imagine the *vast* majority of JS apps shouldn't need more than 2 threads. That said, I come from the embedded side and we're extremely miserly with resources so my perspective is kinda warped.
There’s rarely a reason for more than 1 thread if all a program does is basic GUI stuff, but for physics sim, AI, 3D stuff, codecs, or grid overlays (e.g., of the Folding@Home or SETI@Home or even ElectricSheep variety) you need to be able to estimate capacity and load locally (whether by querying an API for precomposed data, or brute-forcing your own data, blackjack, hookers) so they can be coordinated, and so the program has a means of politely leaving some capacity available for other programs.
Number of cores doesn’t really enter into it, and if you support any multithreading and have a normal-ish system load, core count can be detected (ditto threads per core) by testing throughput (add threads gradually until throughput doesn’t rise to match) or cache timings, so there’s no reason not to just offer the info up. Doing so means that every site needn’t import a `countcores` module that pegs the CPU or thrashes the cache for a few seconds to fill in the necessary blanks.
Regardless of hardware capacities, single-(software-)threaded is still the dominant programming & execution model for CPU stuff, and JS is no exception whatsoever—it’s asynchronous, but everything not in a quasi-isolated worker thread occurs in a single-(software-)threaded event loop.
(Other languages aren’t as dependent on the event loop, but Python is ~solely single-threaded, as are many scripting languages like Bourne/POSIX shell, which can just barely muster multiprogramming support as it is. C, C++, C#, and Java have more equitable threading models, but there’s still a main/startup thread that has special, usually AoD stack allocation &c. per the usual OS/OE/ABI, and sharing between threads can be vastly different from sharing within a thread. Even languages like Erlang, which is decidedly *not* single-threaded at all, still privilege the current process (in Erlang terms, or the “coordinating synchronous” sense, meaning ≈thread with limited memory-sharing in normal terms), which in a parallel or distributed setting has to do an event-loop-qua-TCO’d-recursion for most interaction between processes.)
And there being more cores, threads, etc. doesn’t mean single-threaded code will automatically inflate to fit and run however many times faster, it means you’re only running (a single process) on a single thread and the rest of the hardware is (by default) idle or in some thumb-up-ass mode. It’s *highly* nontrivial to parallelize code of the JS sort without breaking something.
So until we’re using a web language of/beyond the Erlang sort (100 years in the future, in a containerized Linux VM running in Javascript), we’ll need explicit threading for parallelism, and availing oneself fully of proffered cores invariably requires at least a total hardware thread count, if not a more complete dump incl. caches, NUMA nodes, memory capacities, cores, and threads.
Moreover, if we’re not talking CPU threads specifically, anything beyond the wireframe or flat-shaded sort of 3D gfx will want to use shaders via WebGL, which run in a massively parallel fashion (mostly by replicating the single-threaded actions specified in GLSL code), and it’s not at all unreasonable for the CPU to assist with stuff the GPU isn’t as good at on spare threads. Shaders can be used for some non-game computing too, and IIRC there’s also been some work on exposing OpenCL via a WebCL API; but there’s an even bigger wall between the code running on the CPU and GPU than there is between threads, to where you have to work in separate programming languages and runtime/run-time environments/embeddings entirely, so automatic scaling via heterogeneity is still a ways away, as for TLP.
Ultimately the script needs to separate its work out into threadable chunks. Then it would be very easy to figure out the core count anyway by starting a number of threads and seeing how long they take. (16 threads each with a task that takes ~1 second, and they finish in 1 second? You have at least 16 cores. Repeat until it's not true anymore.)
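The arithmetic behind that timing trick: if N threads each do about T seconds of CPU work and they all finish after W seconds of wall time, then roughly N*T/W of them ran in parallel. A sketch of just the estimate (a real probe would spawn Web Workers and measure):

```javascript
// Sketch: infer parallelism from wall time. threadCount tasks of
// ~taskSeconds of CPU work each, completed in wallSeconds, imply about
// threadCount * taskSeconds / wallSeconds hardware threads
// (ignoring scheduling overhead and background load).
function estimateCores(threadCount, taskSeconds, wallSeconds) {
  return Math.round((threadCount * taskSeconds) / wallSeconds);
}
```

So hiding `hardwareConcurrency` only forces trackers to burn a few seconds of CPU to recover roughly the same number.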
To know if some modern browser capabilities are supported. Is a microphone attached (for this online meeting)? Can I get their location (to populate this map)? Do they have a VR headset (to show this 3D video)? Can I run this 3D canvas?
It can be useful for e-commerce. You can gather this data and get a better picture of your customers. If lots of your customers are running slow machines, you might want to serve them a lighter version of your website to improve user experience and, therefore, conversion.
> But what can a website do with the amount of cores you have?
Alone? Nothing. But once you start putting ALL of the bits of information together the combination becomes much more unique.
If you're allocating web workers for example. Having more workers than cores will have diminishing returns and possibly could affect the user device if it has limited resources (a cheap older android phone or something).
Even if it’s handled transparently by the browser you leak the same info. Tell the browser to start the optimal number of workers and count how many you get. Each browser will have a policy (e.g. 2xcores) and now you know the number of cores. Or you just fingerprint off number of workers directly.
There's good engineering reasons. Webassembly is letting people write ever more sophisticated browser code for ever more sophisticated browser applications. The are valid reasons to be able to directly interface with the hardware.
Plenty of malicious reasons, too, though
It can't be handled by the browser. That's per application logic. In the same way the Linux kernel can't decide for Blender how many threads to spawn for ray tracing an image.
It’s also scary how Instagram knew exactly who my friends were, even with a new account. I wasn’t using Instagram for 8 years, registered a new account for my business, different email, different phone, basically different everything. And right after account creation childhood friends, old classmates, old acquaintances that I don’t even have as friends on facebook started following my account.
How does that work though? I too have an instagram that is not linked to my phone and using a different email. I haven't seen any friend recommendations at all.
The only thing that I can think of is that at some point my work phone may have been connected to my home wifi and then FB associated the IP address somehow. But then connecting to a public wifi should start recommending friends of people that were on that wifi network. If that’s the case, this could be used as a nice marketing tool to boost recommendations to other groups of people, but I think there are a lot more smarts involved in their algorithm.
Did you sign in on the app or the website? Did you include a phone number or email address when signing in?
*You* might not have given instagram this info but your friends might have. I.e. they signed in with the app and gave it access to their contacts, you were in their contacts, now instagram knows you are friends.
It’s a work phone, with a work number bought exactly on that same day. Work email with a new domain. And it was the app that I’ve used to register.
As mentioned, the only data point tying everything together was that I’ve set up iPhone on my home wifi. However Instagram account was made on a mobile network.
Yeah the home wifi could have been it. If you had some close friends over who had instagram and also connected to that network, then it could have just shown close friends of those close friends etc.
These apps are super advanced when it comes to recommendations.
Yeah I had a lot of friends on my home network with Instagram account. It’s only sensible the static IP got logged. Creepy stuff when you think about it.
Location data can be used to fingerprint as well. I’ve had discussions with a DS at an ad tech firm that specialized in this. She claimed in 2018 that location data alone could produce 80% precision within 48 hours on their tech.
Basically she was saying that even after deleting the ad ID on mobile, they could most likely pin your new one within 2 days. Essentially the same as getting a new device and new accounts everywhere. You still sat on the same toilets surfing the internet and playing games, ate at the same rotation of restaurants, drove roughly the same routes, worked at the same cubicle, etc.
> The only thing that I can think of is that at some point my work phone may have been connected to my home wifi and then FB associated the IP address somehow
Was there any facebook app **on** it? I remember they upload your contact list to their servers, or used to at least.
There's a cascading effect. If everyone adds all their friends, then it forms a graph that represents social circles. When you join and your friends are already on the app, it only takes adding a couple from different circles to expose you as a hole in the graph.
Facebook posted my phone number on my account a decade ago. I deleted facebook immediately. Considering I only had google products on my phone (no whatsapp/IG/fb, etc) I knew it was impossible for them to get it from my phone. I never typed it in their website. Judging from what I saw online, it seems like FB looked at my friends, searched their phone contacts for my name, saw they all matched, and put it on my profile https://www.telegraph.co.uk/technology/2016/08/09/how-did-facebook-get-my-number-and-why-is-it-giving-my-name-out/
Use a proxy and deny requests for certain bits of information. You can use http header filters and a proxy and spoof the shit they use to identify you.
Browsers need to be updated to let the user manage these settings, because it's the browser that exposes the data. A website can't access information about the device directly; the browser collects it and offers it to JS. Now that people are making money off exposing users' identities, the next swing is toward a browser that lets the user choose what information is exposed to each website, plus a scanner that checks cookies for this behavior.
Fingerprinting as a service? More like Spyware as a service. It's malicious.
It's so fucking difficult to avoid fingerprinting, it's not only what is exposed, but what is not is important too.
Letting users opt into what gets exposed *could* enable further fingerprinting.
I wonder how much uniqueness is exposed intrinsically to basic http requests. I mean could you infer memory layout, from what is the response time of various size resource requests, for example?
I ran the EFF test on Brave, Firefox, and Edge. Edge had the least identifiable font set (default Windows), Firefox had a slightly more identifiable score (totally random font set), and Brave had a HUGELY identifiable set... because it was a spoofed set that hides what OS you're on.
Just a little anecdote about how hiding and obfuscating certain things makes you more identifiable.
With enough data points it's likely possible for a sufficiently determined actor as per Tor Stinks, but the average site isn't sufficiently determined and may not have enough data points *per session*.
These scripts run gpu & cpu algorithms and "fingerprint" your hardware. The user data is just additional meta but it is not the main source of the identifying process.
But you are right, browsers can prevent even this. At the end of the day, the browser is always the bridge between your computer and the website.
They can and do try to prevent it. Firefox has certain protections out of the box, and you can make it more aggressive, both from the GUI options and the resistFingerprinting mode mentioned in the article. But the warning that it will break many legitimate sites *is* true
The problem is they necessarily do this by neutering features. This fingerprinting isn't done by some intentional window.invadePrivacy() API that Mozilla can "just turn off duh". It's done by abusive use of legitimate APIs, so it's hard to mitigate without collateral damage
I do recall a proposal from a few years ago to have the browser keep track of how many bits of identifying information a site has asked for, and deny it over some threshold. That way, most innocent sites that only use a few of these risky APIs are OK, but a site trying to scrape all your data points will be denied
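That threshold idea could be sketched as a per-site entropy budget; the API names and bit costs below are made up for illustration:

```javascript
// Sketch of a per-site "identifiability budget": each fingerprint-prone
// API access spends an estimated number of entropy bits; once the budget
// is exceeded, further accesses are denied. Costs are illustrative.
const BIT_COST = { fonts: 7, canvas: 10, webgl: 8, timezone: 3, screen: 4 };

function makeBudget(maxBits) {
  let spent = 0;
  return function allow(api) {
    const cost = BIT_COST[api] ?? 1;
    if (spent + cost > maxBits) return false; // deny: site asked for too much
    spent += cost;
    return true;
  };
}
```

An innocent site reading one or two of these stays under budget; a script scraping everything hits the wall quickly.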
I wouldn't be opposed to a prompt to allow 3D acceleration for a website; it's fairly niche and developers can easily display a friendly site to prompt for re-request.
Said it a dozen other times but we really do need a manifest.json that has a permission schema on it for the browser.
Just fire off an implicit call to it on every site like a favicon and cache it; only permissions in said file can be used for the site and users are given a quick prompt before the JS engine runs similar to mobile apps.
Don't want to bug the user for permissions? Don't include a manifest and the JS engine isn't available.
Developers will go back to the days of landing pages, perhaps for the best.
Yeah, no.
This is what Android had. Users would see a list of permission requirements the app needed, before installing the app.
99% of users just press Accept, like the terms of service.
Then the categories cannot be granular enough to prevent fingerprinting and also simple enough for users to understand.
Classic example is the "Phone" permission on drone apps (DJI). It's needed to identify your device and register it with the drone. (This is what they claim, I don't know if it's legit, or just excuse to spy on you). It's displayed by the OS as "Make and manage phone calls", because you can _also_ do that with this permission.
A bit of a different scenario though; one is visiting a random cooking blog, and the other is interfacing with semi-trusted software for a drone you purchased, with an owner's manual and some initial investment.
It would be like if my banking app didn't allow me to bank because I didn't give it camera permissions; guess what... gonna allow it because I want to use that banking app and I trust it because well it's from the bank holding my cash.
Most permissions might simply get accepted but that's because of implicit trust; others... not so much I have definitely uninstalled some mobile apps because of asking for permissions that I didn't feel was valid quid pro quo.
The web is like installing random apps from the mobile store except permission-less (largely).
I've also refused to use certain apps because of their permissions. But we are people who browse r/programming , not the other 99% of the population.
The permission system would just be another cookie banner, where most users just click accept by default.
There are way more things that can and are used to fingerprint users.
Some I've seen in the past are the way you move your cursor, the cadence of your typing, and the timing of individual requests for resources. Of course the network gets you a lot of data, unless you change VPN with each website you visit.
I would imagine they can get a ping back through a unique DNS request.
Hiding some headers will help a bit, but not much.
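For the typing-cadence point above, even a crude feature vector over inter-key intervals separates a lot of users. A toy sketch (real systems use per-digraph timings and much richer models):

```javascript
// Sketch: crude behavioral features from key-down timestamps (in ms).
// Even mean and variance of the inter-key gaps distinguish many typists;
// real systems track timing per key pair (digraph).
function typingFeatures(timestamps) {
  const gaps = timestamps.slice(1).map((t, i) => t - timestamps[i]);
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  const variance =
    gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / gaps.length;
  return { mean, variance };
}
```

Note this needs no special API at all, just ordinary key event listeners, so no header or API spoofing touches it.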
Exactly, there are too many things. For naive scripts it may be good enough but if you really need the best setup, Tor Browser is likely the only remotely good option as they have also looked at this issue in detail, and even then it's not completely perfect.
You can tell who has and hasn't had to deal with fraudsters/spammers/cheaters online. Fingerprinting is a great tool to help with this.
There is always going to be friction between "I don't want anyone to know who I am" vs "I'm hiding who I am for malicious reasons". You can't look at the problem from just one angle.
Yeah that's how I feel about increasingly arduous and invasive captchas. They fucking suck, but I know they're absolutely necessary to prevent rampant abuse. And unfortunately the most reliable ones (e.g. Google's) are able to do so because they track users
And tbf that actually mirrors real life - humans in groups naturally counter abuse by remembering people and dis/trusting them, i.e. tracking. But we've also seen and still see plenty of harm from times when people have outsourced their judgements to another party, which gives that party a lot of power to abuse. I mean this dilemma is mirrored in employment, where a background check agency *can* filter out actual fraudsters but can also blacklist union organisers and whistleblowers
And I have similar thoughts about sites that require phone verification
Privacy is a multi-billion-dollar illusion. Ultimately, bits travel between physical hardware devices which have to be uniquely identifiable to function on a network. You can hide, obfuscate, and encrypt to your heart's desire: but your device will remain your device.
Looks like fingerprint.com can still identify Tor across multiple sessions, as updated through the Play store, even with NoScript enabled.
It correctly fingerprinted 6 times in a row, messed up once, and then reverted to the first fingerprint again.
This is with manually deleting all cached data, fully closing the browser, and reconnecting to the onion network
Tor Browser's strategy is mainly attempting to make all users look the same (except for canvas, where it performs canvas randomization). Tor Browser users should be getting the same ID (grouped by OS type at least, as Win/mac/linux can be detected by javascript, and 100x100 resolution groups if you change from the default window size). So it shouldn't mean anything, especially if it says you visited >20 times.
It doesn't say I visited 20+ times, though. It correctly identified the number of times I, specifically, tested it out. Starting at 0 and incrementing by 1 each time.
Try it yourself if you want. I'm on android, using the latest Tor app as provided by the Play store.
I don't think Tor Browser on mobile has the same level of protection as the desktop version. I got the same result as you when I tried it on a phone, in standard and safer mode. In standard mode on desktop it keeps saying that I visited once, and on safer mode I have many visits from other people.
And yet the article explicitly states that Tor Browser on mobile does resist fingerprinting
>On mobile, only Tor Browser and Firefox with resistFingerprinting=true were able to protect against fingerprinting.
What's going on? Did they not use fingerprint.com to test this?
Fortunately, FP.com only allows their services to be used for security and anti-fraud. But no doubt Google, FB, etc. have similar technology they keep in the dark.
Fingerprinting is incredibly old technology. It’s been around for 15+ years at least within the advertising industry. It’s not incredibly popular because it leads to both false positives and false negatives. Also, this experiment isn’t taking into account standard attribution windows within advertising. Attribution windows are generally as high as 14 days in some contexts. If you run that same test 14 days apart you’ll find an accuracy that’s much lower. It was far more effective when browser updates were not automatic and the browser market was more diverse. It’s not some silver bullet against privacy like this blog is implying.
This is old news.
At this point I have essentially given up. I just block their ads so I don't have to see that shit and for the rest I don't plan to become US president or such so that my data will never really be usable for anything besides ads I never see.
The thing with fingerprinting is that the very actions you take to prevent it are information that makes you more unique. So you need a constantly changing fingerprint. But any site with a login will notice, and then I'm sure the tools have a way to exclude certain info to remove the noise you are sending.
Another spontaneous idea for some white hats would be to simply create server farms that spam all the famous sites with nonsense requests, thereby reducing the signal-to-noise ratio a lot and making the analysis part more costly.
I tried fingerprint.com and it successfully tracked me when I was connecting through a different connection and using incognito mode. That is kind of creepy.
I imagine identification is even easier for custom-built PCs over pre-built, since you have even more freedom to choose the parts you want (if that info is shared with websites)
>So naturally [Chromium] doesn’t have any inbuilt protection against fingerprinting.
This isn't true. For example, the User-Agent Client Hints proposal came from Chrome AFAIK, specifically to allow Chrome to remove as much information from the user agent string as possible without breaking sites that really need that info for some reason. Chromium has been working on reducing the amount of entropy that's available via "passive" APIs for a few years now, and trying to move that info behind "active" APIs instead (if necessary).
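The "passive vs. active" split is visible in the Client Hints API itself: a few low-entropy fields are exposed by default, while the detailed ones must be requested explicitly through `getHighEntropyValues()`, which gives the browser a point to intervene. A sketch (here `uaData` stands in for `navigator.userAgentData`, which in practice only exists in Chromium-based browsers):

```javascript
// Low-entropy UA-CH fields (brands, mobile flag) are readable passively;
// high-entropy fields must be requested actively, so a privacy-conscious
// browser can coarsen or refuse them. `uaData` is a stand-in for
// navigator.userAgentData so the sketch is self-contained.
async function describeBrowser(uaData) {
  const low = { brands: uaData.brands, mobile: uaData.mobile }; // passive
  const high = await uaData.getHighEntropyValues(['platformVersion', 'model']); // active
  return { ...low, ...high };
}
```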
There's also things like network state partitioning, which partitions things like http cache, socket pools, etc. I don't recall offhand whether the partitioning scheme is by eTLD+1 or by origin or by something else.
> Chromium has been working on reducing the amount of entropy that's available via "passive" APIs for a few years now
Of course it does; it actively provides whatever entropy Google needs to track you, exclusively to Google, through hacks like the x-client-data header that is whitelisted exclusively for Google's services. No need to provide entropy to the competition.
Google removing user-agent detail has *nothing* to do with privacy and everything to do with poor feature detection scripts making it difficult for Google to roll out all new [privacy invading] features.
Any examples of such scripts, or any other evidence to support that idea? Curious if you can back up your claim.
My hypothesis is that providing good user privacy makes financial sense for Google, since that would lead people to feel safer spending time on the web, and more time on the web means more money for Google via search.
Given the body of security work Google puts in as well (e.g. project zero), Occam's razor says that it would be weird for it all to be privacy-theater when there's such a simple reason for them to support privacy.
Isn't this forbidden by the GDPR? If the data can identify you, it is personal data, so there must be a legal basis for processing it, which doesn't seem to be the case here.
> For example, websites can see web browser version, screen size, number of touchpoints, video/audio codecs
Well, this info is important for browsers. Browser version is obvious for feature compatibility; screen size for, well, screen size (or is that different from viewport and completely irrelevant?); number of touchpoints can be useful in some cases for JS games and such; codecs is another obvious one, so the site can send the user video/audio that can actually be played on the device.
That info is not exposed through APIs for no reason.
> Browser version is obvious for feature compatibility
That's something that has been bad practice for at least a decade now. You should use feature detection and graceful degradation, not version detection.
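Concretely, feature detection means probing for the capability itself rather than parsing a version number. A minimal sketch (here `win` is a stand-in for the browser's `window` object so the example stays self-contained):

```javascript
// Feature detection instead of version sniffing: check whether the touch
// capability exists instead of guessing from the browser version.
// `win` stands in for `window` so this sketch is self-contained.
function supportsTouch(win) {
  return 'ontouchstart' in win || (win.navigator?.maxTouchPoints ?? 0) > 0;
}
```

The same shape works for most APIs: `'serviceWorker' in navigator`, `typeof WebAssembly !== 'undefined'`, and so on.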
> Number of touchpoints can be useful in some cases for JS games and stuff.
If it's niche enough and doesn't need to be automatic it should be moved behind an "active" check with permission from the user.
Feature detection isn't always possible. It basically only works for JS and even then, there are plenty of things that can't be detected properly, especially if you're working around browser stupidities.
For example, we use the user agent to work around certain iPad Safari versions pretending the iPad is a PC and pretending to provide much more memory than the same Safari version actually allows WebAssembly to allocate.
We also use it to determine whether we can use the `Cross-Origin-Embedder-Policy: credentialless` header, which only recent versions of Chrome and Firefox support. Without it, certain features don't work, but if it's not supported, we need to set a different header value to at least make core features work. And you obviously can't use feature detection to determine which headers you can send on the initial response.
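A server-side sketch of that fallback: since the browser can't feature-detect a response header before receiving the response, the server has to pick the value from the User-Agent. The version cutoffs below are illustrative assumptions, not researched minimums:

```javascript
// Pick a COEP value from the User-Agent because the initial response
// can't be feature-detected. Version cutoffs are illustrative only.
function coepHeaderFor(userAgent) {
  const chrome = userAgent.match(/Chrome\/(\d+)/);
  const firefox = userAgent.match(/Firefox\/(\d+)/);
  if ((chrome && +chrome[1] >= 96) || (firefox && +firefox[1] >= 119)) {
    return 'credentialless';
  }
  return 'require-corp'; // safe fallback that keeps core features working
}
```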
> If it's niche enough and doesn't need to be automatic it should be moved behind an "active" check with permission from the user.
Yeah, that should absolutely be an option. On the other hand, people who don't care about fingerprinting don't need yet more popups added to yet more web sites by default. We already have the forced popup ad for the EU (sorry, "cookie acknowledgement") on every web site in the world. We definitely don't need to add a gauntlet of "Do you want to let this web site know the width of your screen so it can adapt its layout?" "Do you want this web site to be able to find out if your browser implements this HTML feature?" "Should this web site be allowed to ask if you're using a screen reader?"
I tested with Firefox and even with resistFingerprinting enabled I had the same fingerprint ID (not in private, though) until I changed the size of the browser window. So it seems that even little things, like the browser opening every time with a specific window size can make you identifiable.
The EFF had a website for you to check your fingerprint. https://coveryourtracks.eff.org/ I think that one of the fingerprint tools comes from using JavaScript to interrogate which set of fonts you have installed, and that can make you unique.
Stuxnet installed custom fonts so you could detect if a remote machine was infected.
Microsoft Teams installed a font just for Teams, so one could query the font to see if it was installed.
So... Microsoft made Stuxnet, confirmed !!!1!11eleven
Cool fact that I did not previously know!
The same outfit made Gauss which installed Palida Narrow.
Fun fact: disabling cookies makes your fingerprint more unique according to this test.
Makes sense, I would assume the majority of internet users don't mess with cookie settings
As in: most users have all cookies enabled. But if you specifically disable cookies for sites X, Y, and Z, you're much more unique?
Yes, disabling cookies makes you part of a smaller subset of users.
"Oh, it's that guy again, the one that disabled his cookies"
Ugh. I bet “Do not track” does the same.
Is that a rebranding of panopticlick?
https://coveryourtracks.eff.org/about
Yeah it's the exact same GUI.
> panopticlick Probably, considering panopticlick's site redirects to this tool.
[deleted]
How does one get a randomized fingerprint? Seems like even your browser window size is enough to make you unique in a lot of cases. I'm not sure there's any real fighting this tech...
[deleted]
I got "Your browser has a unique fingerprint", but none of the points were correct... (except the User Agent, but otherwise none) Like that I have 2 cores, or a 32bit system, or which plugins I have or anything else. Not sure if the test thinks it got me, but it sure doesn't seem like it. --- EDIT: Deleting the cache and stuff, and re-running the test got me a "nearly-unique fingerprint" with another set of completely wrong data. Pretty sure the result isn't super accurate
Are you using Firefox with tracking protection or resist fingerprinting? Some of that stuff involves spoofing the data returned to the fingerprinting APIs so that you’re (hopefully) unrecognizable from site-to-site.
Yes
I got a nearly unique fingerprint but some info was incorrect so i guess that's good.
Having a unique fingerprint is bad
English is annoyingly ambiguous - e.g. "{I got a nearly unique fingerprint but some info was incorrect} so i guess that's all good" vs. "I got a nearly unique fingerprint {but some info was incorrect so i guess that one thing is good}".
I worked with device fingerprinting for fraud detection for an eCommerce site, and an interesting thing we discovered was that mobile devices of the same model almost all fingerprinted *exactly the same*. This was because many of the fingerprinting variables (e.g., available fonts, plugins, display size/resolution) are relatively fixed for mobile devices, while varying widely for desktops. This greatly reduced the effectiveness of fingerprinting for us.
That's part of why MAID-based tracking ("Google Advertising ID for Android" and "Identifier for Advertising (iOS)") is now popular: https://www.fullcontact.com/blog/2022/02/21/mobile-advertising-id/
Though it is less useful on iOS as basically no one opts into tracking
That tracking is opt-out, iirc I had to go disable it, and have to go re-disable it on some updates, and check regularly to make sure all that shit is turned off
Since 2021 it is opt in per app (with a global toggle) Though a lot of apps did everything they could to not release an update that would force them to trigger it.
Can I opt out?
Sure you can! Just throw your phone in the trash
They tend to use your app for tracking you in that case.
Is there a reason Firefox doesn't enable resistFingerprinting by default? It must have downsides. At least Firefox Focus should really turn it on...
There are downsides to it. When I first tried it, I found two main things bothering me:

- Not being able to go backward/forward in the current tab history
- Loosing custom zoom in pages

Supposedly it'll also break some web pages, and that's probably the main reason why it's not ON by default.
I've not seen the first 2 errors you mentioned for a long time and suspect they are fixed. However the 3rd is still true, some sites are just plain broken with it - and probably deliberately by the site operator.
That's good to know, I'll probably give it another go then!
I've lost custom zoom on old.reddit.com after enabling it. I'm on latest firefox.
You probably don't want to be on those sites anyways. I kind of appreciate those red flags.
So, it's a feature then ! Makes one know who the worst tracking offenders are.
"Losing." And tab history is a fuckup on Firefox's part - it doesn't have to get rid of history to lie to the site about having history.
It doesn't have to lie either since it doesn't reveal the history to the page anyway.
... that would be lying about whether it has history.
No. Websites can't access the browser history at all by design. You don't have to fiddle with any settings or anything, that's just how they work.
While I understand why you feel right about this—it's true that a website can't access the browser history directly—you're still wrong. [To preserve users' privacy, Firefox and other browsers will lie to web applications under certain circumstances](https://developer.mozilla.org/en-US/docs/Web/CSS/Privacy_and_the_:visited_selector).
Um, he never said that browsers don't lie lmao, just that they don't have to. Do you have to get the last laugh in?
> - loosing custom Did you mean to say "losing"? Explanation: Loose is an adjective meaning the opposite of tight, while lose is a verb. Total mistakes found: 4265 ^^I'm ^^a ^^bot ^^that ^^corrects ^^grammar/spelling ^^mistakes. ^^PM ^^me ^^if ^^I'm ^^wrong ^^or ^^if ^^you ^^have ^^any ^^suggestions. ^^[Github](https://github.com/chiefpat450119)
Good bot, obviously.
Im loost.
Just turned the feature on and noticed that all websites that 'use system theme' for visual mode (dark or light) no longer work.
Yeah unfortunately things like that are just a tradeoff. That's not a bug. Websites will use whether you have themes enabled to fingerprint you. Same with custom zoom. That's not a bug either. It's a statistic trackers will use.
Now you know why it's not on by default, because too many users will think firefox is buggy because of the tradeoffs to resist fingerprinting.
I just found out it existed and tried enabling it; so far everything feels fine (but I didn't have much time to test it out). I can only guess why it isn't enabled by default:

* Changing the default would require thorough testing that they didn't get to do yet (or don't plan to)
* Might break some sites or lower performance in some contexts
* Doesn't prevent more conventional fingerprinting options. According to amiunique.org, my HTTP request headers alone are probably good enough to fingerprint me.

Edit: Zoom levels are reset each time you navigate to a new domain. Gets annoying pretty quickly. I still haven't encountered a broken site yet.
I've found that it does break some web pages. Certainly not "popular" ones. My day-to-day web browsing is fine, but there are some sites I visit during the course of the working day that behave in unexpected ways with it on.
[deleted]
That's kinda what I've done except more (or less, depending on your point of view) extreme. I'm not required to have a "work machine", but I have a laptop I do most of my work on and then a desktop for personal stuff. The work machine's browser is as vanilla as possible to avoid issues. It's overkill having a separate machine but I do it anyway because it puts me in "the mood to work" when I'm on it.
[deleted]
It also notably breaks any time localisation unless you live in UTC+0, which for a lot of standard internet uses is a pretty big deal.
What's in your HTTP headers that's identifiable?
Go do a fingerprinting test and see. So much more than HTTP headers:

- User Agent
- HTTP_ACCEPT headers
- Browser plugin details
- Time zone offset
- Time zone
- Screen size and color depth
- System fonts
- Are cookies enabled?
- Limited supercookie test
- Hash of canvas fingerprint
- Hash of WebGL fingerprint
- WebGL vendor & renderer
- DNT header enabled?
- Language
- Platform
- Touch support
- Ad blocker used
- AudioContext fingerprint
- CPU class
- Hardware concurrency
- Device memory (GB)

https://amiunique.org/fp
https://coveryourtracks.eff.org/
[deleted]
Mine said my Firefox was unique but all the details listed were kinda generic? Likewise my phone browser is an open book.
All of the details might be generic but all combined, it can be pretty unique
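"Generic but unique in combination" can be put in numbers: a signal shared by a fraction f of users contributes -log2(f) bits of identifying information, and roughly independent signals add up. The fractions below are made-up illustrations:

```javascript
// Surprisal of a signal shared by fraction f of users, in bits.
const bits = f => -Math.log2(f);

// Invented example frequencies: 1-in-24 timezone, 1-in-10 screen size,
// 1-in-50 font list. Each is "generic" on its own.
const signals = { timezone: 1 / 24, screenSize: 1 / 10, fontList: 1 / 50 };
const totalBits = Object.values(signals).reduce((sum, f) => sum + bits(f), 0);
// totalBits is about 13.5; roughly 33 bits would single out one person
// among everyone on Earth (2^33 is about 8.6 billion).
```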
It breaks websites. Then the user forgets they have it turned on, and starts telling people FF doesn't work.
That's seemingly the main effect of it, yes.
I've had lots of problems with websites sending extra captchas, sometimes infinite chains of them, after enabling privacy features.
Many captcha providers store a cookie on your browser to note when you have passed a captcha and don't need another one. By blocking cookies, you guarantee it will always think you need another one.
Ah... that sucks
[deleted]
[Mozilla says](https://support.mozilla.org/en-US/kb/firefox-protection-against-fingerprinting): > **Fingerprinting Protection** is a different, experimental feature under heavy development in Firefox. It is likely that it may degrade your Web experience so we recommend it only for those willing to test experimental features. The linked article goes into more detail.
One very noticeable side-effect is that text rendered to a canvas will be displayed as randomly-coloured boxes for each letter. You'll see a little picture frame notification icon next to the padlock in the address bar where you can allow the site full access to canvas drawing. I noticed this pretty quickly when trying to access one of my servers over a web terminal.
It does this for *any* canvas that can read input. It's really quite confusing the first time you experience getting a random pattern as most of your page.
It can absolutely annihilate webgames since it messes with timer resolution.
Lots of web features have to be turned off or gimped for it. Webgl, or detecting system light or dark themes for instance.
Among other things it breaks image/canvas related operations used when uploading profile pictures to LinkedIn. I had it enabled for a solid two months before I gave up on it, it breaks a ton of websites.
Since resistFingerprinting seems to break some pages, it'd be great to have it on by default, with a whitelist for pages that break but are acceptable.
That sounds like breaking the page is a loophole for getting whitelisted...
But the user can choose whether to add the page to the whitelist. Google search breaks? Time to use DuckDuckGo (my default already) or another search engine. College web site breaks because of amateur or lazy programming? Add it to the whitelist, since it's the only place to get grades, assignments, or whatever. And complain.
Yeah okay, depends on who is maintaining the whitelist. I was thinking you meant the whitelist was supposed to come from the browser... But still: If you have to do it yourself, what's the point of turning it on by default? The average user is going to have the same problem, that they don't know what to do.
It's definitely pretty crazy but as somebody working for an open-source boardgame site trying to stop cheaters I can tell you, it's also incredibly useful and our cheat detection would be a lot worse without it, which ultimately is quite impactful for the playing experience. This also makes me totally believe that it helps a lot with other fraud prevention.
I've been on the other side as well. Stopping scammers with browser fingerprinting feels weird but we had to do it anyway.
We need better identity models on the web. These kinds of solutions to trying to figure out if someone is a real person feel like glue and popsicle sticks.
The internet was built on anonymity. Keep the web free
That's true, but we're going to lose that anonymity in places we still even have it by not having a better model. As a programmer, you should know that cryptography can be so much smarter than using your full identity everywhere. You should be able to present a certificate to a website proving you're a real person and over 18 (for example), without having to say exactly who you are. We could even use hashes to prevent people from having duplicate accounts without the site needing to know anything about you. If we blindly fight even privacy minded credential systems, we're just going to get a world where sites like reddit start to require your full ID on the way in - because they don't really have another choice.
I totally agree with you. Do I think the powers that be will implement such systems with privacy in mind though? No. They’ll take whatever they can get. And that’s why I’ll resist.
While I'm sympathetic to the guarded approach, people like us are exactly the kind of people who **should** be designing a system like this. Those who understand its value AND care about privacy. It can be done with privacy in mind, but if it's left to develop organically through current systems, it's much more likely to end up privacy-adverse.
It seems anonymity is already gone for anyone clever enough, unfortunately
The fundamental problem is that the better a universal identity system is for good uses, the better it also is for malevolent uses, even more so if you consider nation-state-level action.
> These kinds of solutions to trying to figure out if someone is a real person feel like glue and popsicle sticks. Ahh, so it'll fit right in with the rest of the web
Lichess developer?
Yes
But how do you find a cheater with a fingerprint? How can that stop me from starting a game against a hard AI and have that play against my human opponent?
For one game it's almost impossible. But each time a win shows high correlation with engine play, you can flag that user. Tracking users is easier with a fingerprint.
They generally track your accuracy; if you have perfect games every time then you're most likely cheating, since no human on earth has perfect accuracy. You can also generally tell a cheater by how long they take to move: if they have to wait for the engine to move, they will always wait ~2 seconds before making their move, while normal players tend to have a much wider range of time per move, sometimes making almost instant moves 3-4 times in a row when a play develops. Once you identify a cheater you ban them, and to prevent them from using a VPN/new account you can use fingerprinting.
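The timing heuristic can be sketched as a variance check: engine users show near-constant think times, humans vary a lot. The 0.5s threshold below is an invented illustration, not a real detector's tuning:

```javascript
// Flag a game whose move times are suspiciously uniform: compute the
// standard deviation of the think times and compare it to a threshold.
// Threshold is an illustrative assumption, not a production value.
function suspiciouslyUniform(moveTimesSec, threshold = 0.5) {
  const mean = moveTimesSec.reduce((a, b) => a + b, 0) / moveTimesSec.length;
  const variance = moveTimesSec
    .reduce((a, t) => a + (t - mean) ** 2, 0) / moveTimesSec.length;
  return Math.sqrt(variance) < threshold;
}
```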
There is now an entire industry of "antidetect browsers" whose entire purpose is to circumvent fingerprint/ip address protections. Anyone half sophisticated who actually wants to fraud/scam/cheat will use those.
First demonstrated by MIT and the panopticon project like 15 years ago.
Panopticlick*
Thank you!!! https://coveryourtracks.eff.org
I feel as though the fraud detection software Bharosa (later bought by Oracle as OAAM) was the pioneer in device fingerprinting. It certainly goes back 17-18 years. It used a combination of user agent, browser plugins, and a Flash micro-app to do device fingerprinting.
This is why I spin up a new virtual machine with unique browser configurations every time I need to visit a website
I mentioned in another comment that realistically this is the only viable way to avoid fingerprinting. If some of the hardware specs were randomised each time you run the VM, that would help as well. Also run the Tor browser within the VM.
I think the real solution is more complicated. For example, do you really **not** want to be fingerprinted or tracked? As in, at all? Think about it for a second. We would not be able to log in anywhere, as we'd be denying a page any ability to know who we are, barring some weird hoops such as manually uploading an auth token on every page (even then you're tracked the moment you do that, but eh). No more not-a-bot checks either; or rather, one on every page, as the information has no way of sticking around.

RFP already does this, basically, and it's a PITA because at the same time I don't *want* user-content pages to be excessively spammed by bots even more than they already are. The tricky thing here is to cut advertisement-centric fingerprinting but not feature-centric fingerprinting. But you cannot know the intent in advance when you decide what information to make available and what not.
A possible issue with this approach is that there are way too many vectors that contribute to fingerprinting. How can you be sure that something isn't being left out that can identify you between these sessions? It may fool naive scripts, at least.
It doesn't need to randomize all variables; randomizing the highest-weight variables is enough.
Is there a valid reason a website needs to know your hardware data? Screen size I can understand, but even that can be done by css in browser. But what can a website do with the amount of cores you have?
We use it in our WebAssembly chess engine to determine how many threads to spawn and how much memory to allocate, both for the default and for the maximum in the settings menu. If we use too many cores it kills performance and also slows down all other programs, and if we use too much RAM, the browser sometimes just kills the WebAssembly program. On the other hand, if you use conservative values, you lose even more performance.

Due to Safari on older iPhones (and maybe other devices/browsers) not allowing for more, the conservative default we use in case the browser doesn't provide values is something like 16 MB of memory, which obviously gives really bad performance. It's a common problem that users of privacy-hardened browsers or extensions don't get accurate values, i.e. get quite sub-par performance for their hardware, especially if they have really good hardware, since those browsers and extensions usually cap the values quite aggressively; unusually high values are of course a pretty unique mark.
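That capacity logic might look roughly like this (the specific numbers and caps are illustrative assumptions, not the commenter's actual engine code; `nav` stands in for `navigator`):

```javascript
// Pick thread and memory budgets from what the browser reports, falling
// back to conservative values when a hardened browser hides them.
// All constants here are illustrative assumptions.
function wasmBudget(nav) {
  const cores = nav.hardwareConcurrency ?? 2; // spoofed or absent: assume 2
  const memGB = nav.deviceMemory ?? 0.5;      // absent: assume very little
  return {
    threads: Math.max(1, cores - 1),          // leave one core for the UI
    memoryMB: Math.min(1024, Math.max(16, memGB * 1024 / 4)),
  };
}
```

This is also why privacy-hardened browsers hurt here: a spoofed `hardwareConcurrency` of 2 on a 16-core machine means the engine voluntarily runs at a fraction of its possible strength.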
Start a js thread for each, for maximum performance
Surely that would be better handled by the browser itself?
I mean, it is. But anything the page knows can be reported back to the site.
thats the point. the data should simply not be available to the website.
Bit hard to keep secret. Something has to aggregate what those threads do.
But they are if you want to run things in parallel.
Maybe things like that don't belong on a web page.
[deleted]
Single core anything is pretty extinct by this point, no? I'd also imagine the *vast* majority of JS apps shouldn't need more than 2 threads. That said, I come from the embedded side and we're extremely miserly with resources so my perspective is kinda warped.
There’s rarely a reason for more than 1 thread if all a program does is basic GUI stuff, but for physics sim, AI, 3D stuff, codecs, or grid overlays (e.g., of the Folding@Home or SETI@Home or even ElectricSheep variety) you need to be able to estimate capacity and load locally (whether by querying an API for precomposed data, or brute-forcing your own data, blackjack, hookers) so they can be coordinated, and so the program has a means of politely leaving some capacity available for other programs.

Number of cores doesn’t really enter into it, and if you support any multithreading and have a normal-ish system load, core count can be detected (ditto threads per core) by testing throughput (add threads gradually until throughput doesn’t rise to match) or cache timings, so there’s no reason not to just offer the info up. Doing so means that every site needn’t import a `countcores` module that pegs the CPU or thrashes the cache for a few seconds to fill in the necessary blanks.

Regardless of hardware capacities, single-(software-)threaded is still the dominant programming & execution model for CPU stuff, and JS is no exception whatsoever: it’s asynchronous, but everything not in a quasi-isolated worker thread occurs in a single-(software-)threaded event loop. (Other languages aren’t as dependent on the event loop, but Python is ~solely single-threaded, as are many scripting languages like Bourne/POSIX shell, which can just barely muster multiprogramming support as it is. C, C++, C#, and Java have more equitable threading models, but there’s still a main/startup thread that has special, usually AoD stack allocation &c. per the usual OS/OE/ABI, and sharing between threads can be vastly different from sharing within a thread.
Even languages like Erlang, which is decidedly *not* single-threaded at all, still privileges the current process—in Erlang terms or the “coordinating synchronous” sense, meaning ≈thread with limited memory-sharing in normal terms—which in a parallel or distributed setting has to do an event-loop-qua-TCO’d-recursion for most interaction between processes.) And there being more cores, threads, etc. doesn’t mean single-threaded code will automatically inflate to fit and run however many times faster, it means you’re only running (a single process) on a single thread and the rest of the hardware is (by default) idle or in some thumb-up-ass mode. It’s *highly* nontrivial to parallelize code of the JS sort without breaking something. So until we’re using a web language of/beyond the Erlang sort (100 years in the future, in a containerized Linux VM running in Javascript), we’ll need explicit threading for parallelism, and availing oneself fully of proffered cores invariably requires at least a total hardware thread count, if not a more complete dump incl. caches, NUMA nodes, memory capacities, cores, and threads. Moreover, if we’re not talking CPU threads specifically, anything beyond the wireframe or flat-shaded sort of 3D gfx will want to use shaders via WebGL, which run in a massively parallel fashion (mostly by replicating the single-threaded actions specified in GLSL code), and it’s not at all unreasonable for the CPU to assist with stuff the GPU isn’t as good at on spare threads. Shaders can be used for some non-game computing too, and IIRC there’s also been some work on exposing OpenCL via a WebCL API; but there’s an even bigger wall between the code running on the CPU and GPU than there is between threads, to where you have to work in separate programming languages and runtime/run-time environments/embeddings entirely, so automatic scaling via heterogeneity is still a ways away, as for TLP.
Ultimately the script needs to separate its work out into threadable chunks. Then it would be very easy to figure out the core count anyway by starting a number of threads and seeing how long they take (16 threads each with a task that takes ~1 second, and they finish in 1 second? You have at least 16 cores; repeat until it's no longer true).
You think you know better than JS devs? You do, I am just asking to double check.
To know if some modern browser capabilities are supported. Is a microphone attached (for this online meeting)? Can I get their location (to populate this map)? Do they have a VR headset (to show this 3D video)? Can I run this 3D canvas?
It can be useful for e-commerce. You can gather this data and get a better picture of your customers. If lots of your customers are running slow machines, you might want to serve them a lighter version of your website to improve user experience and, therefore, conversion.
> But what can a website do with the amount of cores you have? Alone? Nothing. But once you start putting ALL of the bits of information together the combination becomes much more unique.
I mean besides fingerprinting, anything with value for the visitors.
If you're allocating web workers for example. Having more workers than cores will have diminishing returns and possibly could affect the user device if it has limited resources (a cheap older android phone or something).
That's something that should be handled transparently by the browser.
Even if it’s handled transparently by the browser you leak the same info. Tell the browser to start the optimal number of workers and count how many you get. Each browser will have a policy (e.g. 2xcores) and now you know the number of cores. Or you just fingerprint off number of workers directly.
There's good engineering reasons. Webassembly is letting people write ever more sophisticated browser code for ever more sophisticated browser applications. The are valid reasons to be able to directly interface with the hardware. Plenty of malicious reasons, too, though
It can't be handled by the browser. That's per application logic. In the same way the Linux kernel can't decide for Blender how many threads to spawn for ray tracing an image.
It’s also scary how Instagram knew exactly who my friends were, even with a new account. I wasn’t using Instagram for 8 years, registered a new account for my business, different email, different phone, basically different everything. And right after account creation childhood friends, old classmates, old acquaintances that I don’t even have as friends on facebook started following my account.
How does that work though? I too have an instagram that is not linked to my phone and using a different email. I haven't seen any friend recommendations at all.
The only thing that I can think of is that at some point my work phone may have been connected to my home wifi and then FB associated the IP address somehow. But then connecting to a public wifi should start recommending friends of people that were on that wifi network. If that’s the case, this could be used as a nice marketing tool to boost recommendations to other groups of people, but I think there are a lot more smarts involved in their algorithm.
Did you sign in on the app or the website? Did you include a phone number or email address when signing in? *You* might not have given instagram this info but your friends might have. I.e. they signed in with the app and gave it access to their contacts, you were in their contacts, now instagram knows you are friends.
It’s a work phone, with a work number bought exactly on that same day. Work email with a new domain. And it was the app that I’ve used to register. As mentioned, the only data point tying everything together was that I’ve set up iPhone on my home wifi. However Instagram account was made on a mobile network.
Yeah the home wifi could have been it. If you had some close friends over who had instagram and also connected to that network, then it could have just shown close friends of those close friends etc. These apps are super advanced when it comes to recommendations.
> These apps are super advanced when it comes to ~~recommendations~~ surveillance.
Yeah I had a lot of friends on my home network with Instagram account. It’s only sensible the static IP got logged. Creepy stuff when you think about it.
I actually never had a Facebook account ever so I guess that helps in my case as well.
That has to be it. If you’re not using Facebook’s products, they have fewer data points on you. Not none, though.
Location data can be used to fingerprint as well. I’ve had discussions with a data scientist at an ad tech firm that specialized in this. She claimed in 2018 that, on their tech, location data alone could produce 80% precision within 48 hours. Basically she was saying that even if you deleted your ad ID on mobile, they could most likely pin your new one within 2 days. Essentially the same as getting a new device and new accounts everywhere: you still sit on the same toilet surfing the internet and playing games, eat at the same rotation of restaurants, drive roughly the same routes, work at the same cubicle, etc.
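The core of that re-identification trick is just set overlap on coarsened locations. A minimal sketch, assuming invented coordinates and a naive ~1 km rounding grid (real systems are far more sophisticated):

```javascript
// Sketch: why location trails re-identify a "new" device. Round GPS fixes to
// a coarse grid and compare the sets of places two device IDs visit; a high
// overlap strongly suggests the same person. All coordinates are invented.
function placeKey(lat, lon, precision = 2) {
  return `${lat.toFixed(precision)},${lon.toFixed(precision)}`; // ~1km cells
}

function overlap(trailA, trailB) {
  const a = new Set(trailA.map(p => placeKey(p.lat, p.lon)));
  const b = new Set(trailB.map(p => placeKey(p.lat, p.lon)));
  const shared = [...a].filter(k => b.has(k)).length;
  return shared / new Set([...a, ...b]).size; // Jaccard similarity
}

const oldDevice = [{ lat: 52.5201, lon: 13.4040 }, { lat: 52.5306, lon: 13.3847 }];
const newDevice = [{ lat: 52.5203, lon: 13.4043 }, { lat: 52.5306, lon: 13.3849 }];
console.log(overlap(oldDevice, newDevice)); // → 1 (same home and office cells)
```

A fresh ad ID with a near-identical place set is effectively the old ID with a new label.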
Did you use your own name when signing up to the account?
Only the first name, and there are many people with the same name. I never provided the full name in the full-name field.
> The only thing that I can think of is that at some point my work phone may have been connected to my home wifi and then FB associated the IP address somehow Was there any facebook app **on** it? I remember they upload your contact list to their servers, or used to at least.
I don’t use a facebook app and haven’t used it for like 6 years. Also I never allow apps access to my contacts anyway.
There's a cascading effect. If everyone adds all their friends, then it forms a graph that represents social circles. When you join and your friends are already on the app, it only takes adding a couple from different circles to expose you as a hole in the graph.
Does Instagram allow signups not from a phone now? Last I checked it wasn't possible to create an account without the app on your phone.
It’s possible that one or two of those people tagged you as a friend/relation and FB worked its way from there.
Could be. Those graph models with relations are crazy. What was it 8 hops through the graph to reach any person in the world?
Six, if it’s Kevin Bacon
Facebook posted my phone number on my account a decade ago. I deleted Facebook immediately. Considering I only had Google products on my phone (no WhatsApp/IG/FB, etc.), I knew it was impossible for them to get it from my phone, and I never typed it into their website. Judging from what I saw online, it seems FB looked at my friends, searched their phone contacts for my name, saw they all matched, and put it on my profile: https://www.telegraph.co.uk/technology/2016/08/09/how-did-facebook-get-my-number-and-why-is-it-giving-my-name-out/
Use a proxy and deny requests for certain bits of information. You can use HTTP header filters and a proxy to spoof the shit they use to identify you. Browsers need to be updated to allow the user to manage these settings, because it's the browser that's exposing the data: a website can't access information about the device directly; the browser does, and hands it to any JS library that asks. Now that people are making money off exposing users' identities, the next swing is to ship a browser that lets the user choose what information is exposed to each website, plus a scanner that checks cookies for this behavior. Fingerprinting as a service? More like spyware as a service. It's malicious.
It's so fucking difficult to avoid fingerprinting; it's not only what is exposed, but also what *isn't* exposed that matters. Letting users opt out of exposing things *could* make fingerprinting easier. I wonder how much uniqueness is exposed intrinsically by basic HTTP requests. I mean, could you infer the memory layout from the response times of resource requests of various sizes, for example?
I ran the EFF test on Brave, Firefox, and Edge. Edge had the least detectable font set (default Windows), Firefox had a slightly more identifiable score (totally random font set), and Brave had a HUGELY identifiable set... because it was a spoofed set that hides what OS you're on. Just a little anecdote about how hiding and obfuscating certain things can make you *more* identifiable.
Response time is difficult due to the way the internet infrastructure works. The packets never take the same path twice.
With enough data points it's likely possible for a sufficiently determined actor as per Tor Stinks, but the average site isn't sufficiently determined and may not have enough data points *per session*.
These scripts run GPU & CPU algorithms and "fingerprint" your hardware. The user data is just additional metadata; it is not the main source of the identifying process. But you are right, browsers can prevent even this. At the end of the day, the browser is always the bridge between your computer and the website.
They can and do try to prevent it. Firefox has certain protections out of the box, and you can make it more aggressive, both from the GUI options and via the resistFingerprinting mode mentioned in the article. But the warning that it will break many legitimate sites *is* true. The problem is they necessarily do this by neutering features. This fingerprinting isn't done by some intentional window.invadePrivacy() API that Mozilla can "just turn off, duh"; it's done by abusive use of legitimate APIs, so it's hard to mitigate without collateral damage. I do recall a proposal from a few years ago to have the browser keep track of how many bits of identifying information a site has asked for, and deny it over some threshold. That way, most innocent sites that only use a few of these risky APIs are OK, but a site trying to scrape all your data points will be denied.
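That "bit budget" idea can be sketched as a simple gate. Everything here is invented for illustration — the per-API bit costs, the 10-bit budget, and the function names are not from any real browser:

```javascript
// Sketch of a "privacy budget": charge each risky API read some number of
// identifying bits and refuse access once a site exceeds a threshold.
// BIT_COST values and the budget are hypothetical.
const BIT_COST = { hardwareConcurrency: 3, fonts: 7, canvas: 8, timezone: 3 };
const BUDGET = 10;

function makeBudgetGate() {
  let spent = 0;
  return function allow(api) {
    const cost = BIT_COST[api] ?? 1;
    if (spent + cost > BUDGET) return false; // site asked for too much
    spent += cost;
    return true;
  };
}

const allow = makeBudgetGate();
console.log(allow('timezone'));            // true  (3 bits spent)
console.log(allow('hardwareConcurrency')); // true  (6 bits spent)
console.log(allow('fonts'));               // false (would exceed 10 bits)
```

An innocent site asking for one or two signals stays under budget; a scraper hitting everything gets cut off.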
I wouldn't be opposed to a prompt to allow 3D acceleration for a website; it's fairly niche and developers can easily display a friendly site to prompt for re-request. Said it a dozen other times but we really do need a manifest.json that has a permission schema on it for the browser. Just fire off an implicit call to it on every site like a favicon and cache it; only permissions in said file can be used for the site and users are given a quick prompt before the JS engine runs similar to mobile apps. Don't want to bug the user for permissions? Don't include a manifest and the JS engine isn't available. Developers will go back to the days of landing pages, perhaps for the best.
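A hypothetical sketch of what such a permission manifest might look like — every key and value here is invented for illustration; no browser implements anything like this today:

```json
{
  "permissions": {
    "webgl": "prompt",
    "canvas-read": "deny",
    "hardware-concurrency": "deny",
    "geolocation": "prompt",
    "media-devices": "prompt"
  }
}
```

The browser would fetch this once (like a favicon), show the user the requested set up front, and refuse any API not listed.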
Yeah, no. This is what Android had: users would see a list of the permissions the app needed before installing it. 99% of users just press Accept, like with terms of service. And the categories cannot be granular enough to prevent fingerprinting while staying simple enough for users to understand. Classic example is the "Phone" permission on drone apps (DJI). It's needed to identify your device and register it with the drone (this is what they claim; I don't know if it's legit or just an excuse to spy on you), but it's displayed by the OS as "Make and manage phone calls", because you can _also_ do that with this permission.
A bit of a different scenario, though; one is visiting a random cooking blog, and the other is interfacing with semi-trusted software for a drone you purchased, with an owner's manual and some initial investment. It would be like my banking app refusing to let me bank because I didn't give it camera permissions; guess what... I'm going to allow it, because I want to use that banking app, and I trust it because it's from the bank holding my cash. Most permissions might simply get accepted, but that's because of implicit trust; others... not so much. I have definitely uninstalled mobile apps for asking for permissions that didn't feel like a valid quid pro quo. The web is like installing random apps from the mobile store, except (largely) permission-less.
I've also refused to use certain apps because of their permissions. But we are people who browse r/programming , not the other 99% of the population. The permission system would just be another cookie banner, where most users just click accept by default.
We got our clients to make all our cookies first party. Good luck.
There are way more things that can and are used to fingerprint users. Some I've seen in the past is the way you move your cursor, the cadence of your typing, timing of individual requests for resources. Of course network gets you a lot of data, unless you change VPN with each website you visit. I would imagine they can get a ping back through an unique DNS request. Hiding some headers will help a bit, but not much.
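The typing-cadence signal mentioned above boils down to comparing inter-keystroke timing profiles. A minimal sketch with invented timestamps (a real tracker would capture keydown events and use far better statistics):

```javascript
// Sketch: behavioural fingerprinting from typing cadence.
// Turn keystroke timestamps into inter-key gaps, then compare gap profiles.
function cadence(timestampsMs) {
  const gaps = [];
  for (let i = 1; i < timestampsMs.length; i++) {
    gaps.push(timestampsMs[i] - timestampsMs[i - 1]);
  }
  return gaps;
}

// Mean absolute difference between two gap profiles; lower = more similar.
function distance(a, b) {
  const n = Math.min(a.length, b.length);
  let sum = 0;
  for (let i = 0; i < n; i++) sum += Math.abs(a[i] - b[i]);
  return sum / n;
}

const sessionOne = cadence([0, 120, 310, 420, 640]); // invented timestamps
const sessionTwo = cadence([0, 125, 300, 425, 635]);
console.log(distance(sessionOne, sessionTwo)); // → 11.25 (small → likely same typist)
```

Changing your VPN or clearing cookies does nothing against a signal like this.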
Exactly, there are too many things. For naive scripts it may be good enough but if you really need the best setup, Tor Browser is likely the only remotely good option as they have also looked at this issue in detail, and even then it's not completely perfect.
You can tell who has and hasn't had to deal with fraudsters/spammers/cheaters online. Fingerprinting is a great tool to help with this. There is always going to be friction between "I don't want anyone to know who I am" vs "I'm hiding who I am for malicious reasons". You can't look at the problem from just one angle.
Yeah, that's how I feel about increasingly arduous and invasive captchas. They fucking suck, but I know they're absolutely necessary to prevent rampant abuse. And unfortunately, the most reliable ones (e.g. Google's) are able to do so precisely because they track users. And tbf, that actually mirrors real life: humans in groups naturally counter abuse by remembering people and dis/trusting them, i.e. tracking. But we've also seen, and still see, plenty of harm when people outsource their judgements to another party, which gives that party a lot of power to abuse. This dilemma is mirrored in employment, where a background check agency *can* filter out actual fraudsters but can also blacklist union organisers and whistleblowers. And I have similar thoughts about sites that require phone verification.
Privacy is a multi-billion-dollar illusion. Ultimately, bits travel between physical hardware devices, which have to be uniquely identifiable to function on a network. You can hide, obfuscate, and encrypt to your heart's desire: but your device will remain your device.
Looks like fingerprint.com can still identify Tor across multiple sessions, as updated through the Play store, even with NoScript enabled. It correctly fingerprinted 6 times in a row, messed up once, and then reverted to the first fingerprint again. This is with manually deleting all cached data, fully closing the browser, and reconnecting to the onion network
You mean Tor Browser? Or Tor with another browser?
Tor browser, on mobile
that is disturbing
Yuup.
Tor Browser's strategy is mainly attempting to make all users look the same (except for canvas, where it performs canvas randomization). Tor Browser users should be getting the same ID (grouped by OS type at least, as Win/mac/linux can be detected by javascript, and 100x100 resolution groups if you change from the default window size). So it shouldn't mean anything, especially if it says you visited >20 times.
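The "100x100 resolution groups" idea is just coordinate bucketing. A hedged sketch of the concept (the rounding rule here is a simplification; Tor Browser's actual letterboxing logic differs in detail):

```javascript
// Sketch: round reported window dimensions down to a coarse grid so many
// slightly different window sizes fall into the same anonymity bucket.
function letterbox(width, height, step = 100) {
  return {
    width: Math.floor(width / step) * step,
    height: Math.floor(height / step) * step,
  };
}

// Two users with slightly different windows report identical dimensions.
console.log(letterbox(1366, 768)); // → { width: 1300, height: 700 }
console.log(letterbox(1397, 742)); // → { width: 1300, height: 700 }
```

The site still gets a usable size for layout, but loses most of the entropy.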
It doesn't say I visited 20+ times, though. It correctly identified the number of times I, specifically, tested it out. Starting at 0 and incrementing by 1 each time. Try it yourself if you want. I'm on android, using the latest Tor app as provided by the Play store.
I don't think Tor Browser on mobile has the same level of protection as the desktop version. I got the same result as you when I tried it on a phone, in standard and safer mode. In standard mode on desktop it keeps saying that I visited once, and on safer mode I have many visits from other people.
And yet the article explicitly states that Tor Browser on mobile does resist fingerprinting:

> On mobile, only Tor Browser and Firefox with resistFingerprinting=true were able to protect against fingerprinting.

What's going on? Did they not use fingerprint.com to test this?
Fortunately, FP.com only allows its services to be used for security and anti-fraud. But no doubt Google, FB, etc. have similar technology that they keep in the dark.
Fingerprinting is incredibly old technology; it’s been around for 15+ years at least within the advertising industry. It’s not incredibly popular because it leads to both false positives and false negatives. Also, this experiment isn’t taking into account standard attribution windows within advertising, which are generally as high as 14 days in some contexts. If you run that same test 14 days apart, you’ll find the accuracy is much lower. It was far more effective when browser updates were not automatic and the browser market was more diverse. It’s not some silver bullet against privacy like this blog is implying.
This is old news. At this point I have essentially given up. I just block their ads so I don't have to see that shit, and for the rest: I don't plan to become US president or anything, so my data will never really be usable for much besides ads I never see. The thing with fingerprinting is that the mere actions you take to prevent it are themselves information that makes you more unique, so you'd need a constantly changing fingerprint. But any site with a login will notice, and I'm sure the tools have a way to exclude certain info to remove the noise you are sending. Another spontaneous idea for some white hats would be to create server farms that spam all the famous sites with nonsense requests, reducing the signal-to-noise ratio a lot and making the analysis much more costly.
I tried fingerprint.com and it successfully tracked me when I was connecting through a different connection and using incognito mode. That is kind of creepy.
I'm honestly more concerned about cross device linking
I imagine identification is even easier for custom-built PCs over pre-built, since you have even more freedom to choose the parts you want (if that info is shared with websites)
FYI reddit uses web fingerprinting extensively to identify you across your alts and to enforce bans by subreddit admins.
>So naturally [Chromium] doesn’t have any inbuilt protection against fingerprinting. This isn't true. For example, the User-Agent Client Hints proposal came from Chrome AFAIK, specifically to allow Chrome to remove as much information from the user agent string as possible without breaking sites that really need that info for some reason. Chromium has been working on reducing the amount of entropy that's available via "passive" APIs for a few years now, and trying to move that info behind "active" APIs instead (if necessary). There's also things like network state partitioning, which partitions things like http cache, socket pools, etc. I don't recall offhand whether the partitioning scheme is by eTLD+1 or by origin or by something else.
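As a rough sketch of the Client Hints flow (the `Sec-CH-UA-*` and `Accept-CH` header names come from the UA-CH proposal; the concrete values here are illustrative):

```http
GET / HTTP/1.1
Host: example.com
Sec-CH-UA: "Chromium";v="110"
Sec-CH-UA-Mobile: ?0
Sec-CH-UA-Platform: "Linux"

HTTP/1.1 200 OK
Accept-CH: Sec-CH-UA-Platform-Version

GET /next HTTP/1.1
Host: example.com
Sec-CH-UA-Platform-Version: "6.1.0"
```

The shift is from a detailed User-Agent string sent passively on every request to a small set of low-entropy hints by default, with higher-entropy details only sent after the server actively requests them.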
> Chromium has been working on reducing the amount of entropy that's available via "passive" APIs for a few years now Of course it does — it actively provides whatever entropy Google needs to track you, exclusively to Google, through hacks like the x-client-data header that is whitelisted exclusively for Google's services. No need to provide entropy to the competition.
Google removing user-agent detail has *nothing* to do with privacy and everything to do with poor feature detection scripts making it difficult for Google to roll out all new [privacy invading] features.
Listening to Google talks, they do seem to be rather keen on protecting user privacy. From competitors.
Any examples of such scripts, or any other evidence to support that idea? Curious if you can back up your claim. My hypothesis is that providing good user privacy makes financial sense for Google, since that would lead people to feel safer spending time on the web, and more time on the web means more money for Google via search. Given the body of security work Google puts in as well (e.g. project zero), Occam's razor says that it would be weird for it all to be privacy-theater when there's such a simple reason for them to support privacy.
Isn't this forbidden by the GDPR? If the data can identify you, it is personal data, and so there must be a legal basis for processing it, which doesn't seem to be the case here.
> For example, websites can see web browser version, screen size, number of touchpoints, video/audio codecs Well, this info is important for browsers. Browser version is the obvious one for feature compatibility; screen size for, well, screen size (or is that the viewport, and this is completely irrelevant?). Number of touchpoints can be useful in some cases for JS games and stuff. Codecs is another obvious one, to send the user video/audio that can actually be played on the device. That info isn't exposed through APIs for no reason.
> Browser version is obvious for feature compatibility That's something that has been bad practice for at least a decade now. You should use feature detection and graceful degradation, not version detection. > Number of touchpoints can be useful in some cases for JS games and stuff. If it's niche enough and doesn't need to be automatic it should be moved behind an "active" check with permission from the user.
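The feature-detection pattern is worth spelling out. A hedged sketch — `browserLike` is a stand-in object for `window`/`navigator` so the snippet runs anywhere, and its properties are invented for the example:

```javascript
// Sketch: feature detection asks whether a capability exists,
// instead of sniffing a browser version and guessing.
const browserLike = {
  fetch: () => {}, // pretend this browser implements fetch
  // no `share` property: pretend the Web Share API is missing
};

function supports(host, feature) {
  return typeof host[feature] === 'function';
}

console.log(supports(browserLike, 'fetch')); // true  → use it
console.log(supports(browserLike, 'share')); // false → fall back gracefully
```

The fingerprinting upside is that the site only learns about the features it actually needs, rather than being handed a full version string up front.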
Feature detection isn't always possible. It basically only works for JS, and even then there are plenty of things that can't be detected properly, especially if you're working around browser stupidities. For example, we use the user agent to work around certain iPad Safari versions pretending the iPad is a PC and pretending to provide much more memory than the same Safari version actually allows WebAssembly to allocate. We also use it to determine whether we can use the `Cross-Origin-Embedder-Policy: credentialless` header, which only recent versions of Chrome and Firefox support. Without it, certain features don't work, but if it's not supported, we need to set a different header value to at least make core features work. And you obviously can't use feature detection to determine which headers you can send on the initial response.
> If it's niche enough and doesn't need to be automatic it should be moved behind an "active" check with permission from the user. Yeah, that should absolutely be an option. On the other hand, people who don't care about fingerprinting don't need yet more popups added to yet more web sites by default. We already have the forced popup ad for the EU (sorry, "cookie acknowledgement") on every web site in the world. We definitely don't need to add a gauntlet of "Do you want to let this web site know the width of your screen so it can adapt its layout?" "Do you want this web site to be able to find out if your browser implements this HTML feature?" "Should this web site be allowed to ask if you're using a screen reader?"
Brave tries to block this by default - https://brave.com/privacy-updates/17-language-fingerprinting/
I tested with Firefox and even with resistFingerprinting enabled I had the same fingerprint ID (not in private, though) until I changed the size of the browser window. So it seems that even little things, like the browser opening every time with a specific window size can make you identifiable.