Safari Safe Browsing, China, and Privacy
Matthew Green (tweet, Hacker News):
It appears that, at least on iOS 13, Apple is sharing some portion of your web browsing history with the Chinese conglomerate Tencent. This is being done as part of Apple’s “Fraudulent Website Warning”, which uses the Google-developed Safe Browsing technology as the back end. This feature appears to be “on” by default in iOS Safari, meaning that millions of users could potentially be affected.
[…]
Google first computes the SHA256 hash of each unsafe URL in its database, and truncates each hash down to a 32-bit prefix to save space.
[…]
If the prefix is found in the browser’s local copy, your browser now sends the prefix to Google’s servers, which ship back a list of all full 256-bit hashes of the matching URLs, so your browser can check for an exact match.
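To make the flow in the two excerpts above concrete, here is a minimal sketch of a hash-prefix lookup. It is an illustration under stated assumptions, not Google's or Apple's actual code: the `Provider` class, `is_unsafe`, and the example URLs are all hypothetical, and real Safe Browsing clients canonicalize URLs and check several host/path variants, which is skipped here.

```python
import hashlib

PREFIX_BYTES = 4  # 32-bit prefix, as described in the excerpt


def full_hash(url: str) -> bytes:
    """SHA-256 of the URL (simplified: no canonicalization)."""
    return hashlib.sha256(url.encode("utf-8")).digest()


def prefix(url: str) -> bytes:
    return full_hash(url)[:PREFIX_BYTES]


class Provider:
    """Hypothetical stand-in for the remote Safe Browsing server."""

    def __init__(self, unsafe_urls):
        self.full_hashes = {full_hash(u) for u in unsafe_urls}
        self.seen_prefixes = []  # everything the provider gets to observe

    def prefix_list(self):
        # The compact list a browser would download and keep locally.
        return {h[:PREFIX_BYTES] for h in self.full_hashes}

    def lookup(self, p: bytes):
        self.seen_prefixes.append(p)
        return {h for h in self.full_hashes if h[:PREFIX_BYTES] == p}


def is_unsafe(url: str, local_prefixes, provider: Provider) -> bool:
    p = prefix(url)
    if p not in local_prefixes:
        return False  # decided locally; nothing leaves the device
    # Prefix hit: only now is the prefix sent to the provider.
    return full_hash(url) in provider.lookup(p)


provider = Provider(["http://evil.example/login"])
local = provider.prefix_list()
print(is_unsafe("https://example.org/", local, provider))       # local miss
print(is_unsafe("http://evil.example/login", local, provider))  # remote check
print(len(provider.seen_prefixes))  # prefixes the provider observed
```

The point of the sketch is the `seen_prefixes` list: the provider learns nothing about local misses, but each prefix hit arrives alongside the requester's IP address.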
[…]
The weakness in this approach is that it only provides some privacy. The typical user won’t just visit a single URL, they’ll browse thousands of URLs over time. This means a malicious provider will have many “bites at the apple” (no pun intended) in order to de-anonymize that user. A user who browses many related websites — say, these websites — will gradually leak details about their browsing history to the provider, assuming the provider is malicious and can link the requests. (There has been some academic research on such threats.)
MacJournals covered Safe Browsing back in 2008:
We must point out here that this system provides, indirectly, a way for Google to estimate what pages you’re visiting. If the URL of a page you want to visit matches the hash prefix of a known malicious page, Safari 3.2 appears to send that prefix to Google and ask for the entire 256-bit hash to make sure that this really is a malicious page (and also to verify that the page hasn’t been removed from Google’s lists since Safari’s last list update). Millions and millions of URLs could produce hashes that start with the same 32 bits, but if Google gets several requests for the same value, the company could reasonably infer that people were visiting the malicious page it had tracked, and since the request from Safari to Google comes from your IP address, Google might infer data from that as well. Mozilla’s privacy policy would forbid use of that data except to improve the service, but Apple’s privacy policy does not. Neither Apple nor Google state anywhere that they would only use such data to improve the phishing and malware protection features.
[…]
Safari 3.2’s “SafeBrowsing.db” file does not appear to contain data for Google’s whitelist, but the specification confirms that some clients can, with Google’s permission, use an “enhanced mode” that looks up each page you visit rather than maintaining the list on the client computer.
Rene Ritchie (MacRumors, Hacker News):
First, here’s Apple’s statement[…]
[…]
Because Safari is communicating with Google and Tencent, they do see the IP address of the device, and because they have the hash prefix, they do know the general pool to which the site belongs.
I assume the URLs are not very private, despite being hashed, because with knowledge of the full set of URLs and visit frequency, it’s probably possible to estimate what the hash prefixes map to. The main source of privacy is not the hashing but the fact that most URLs are only checked locally.
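A rough sketch of why the hashing adds little privacy, assuming the provider holds a large crawl of known URLs: it can precompute a reverse index from 32-bit prefix to candidate URLs and look up whatever prefix it receives. The function names and URLs below are hypothetical, and ranking candidates by visit frequency is left out.

```python
import hashlib
from collections import defaultdict


def prefix(url: str) -> bytes:
    """First 32 bits of the URL's SHA-256 hash."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:4]


def build_reverse_index(known_urls):
    """Map each 32-bit prefix to every known URL that produces it."""
    index = defaultdict(list)
    for url in known_urls:
        index[prefix(url)].append(url)
    return index


def candidates(index, observed_prefix: bytes):
    """URLs a provider could associate with an observed prefix."""
    return index.get(observed_prefix, [])


# Hypothetical crawl; a real provider would have billions of entries.
crawl = [
    "http://clinic.example/appointments",
    "http://news.example/politics",
    "http://bank.example/login",
]
index = build_reverse_index(crawl)
observed = prefix("http://clinic.example/appointments")
print(candidates(index, observed))
```

With billions of known URLs a prefix will map to several candidates rather than one, which is where popularity data and repeated requests from the same IP address would narrow things down.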
And you’d hope that only “unsafe” URLs would be looked up with Google/Tencent. But the implementation, at least initially, used a Bloom filter to save space. Since Bloom filters allow false positives, this means that the browser would be sending lookup requests even for some URLs whose prefixes didn’t actually match the local data set, i.e. ones that were not even suspected to be dangerous.
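To illustrate the false-positive point, here is a toy Bloom filter, deliberately undersized so the effect is visible. It is only a sketch: the `BloomFilter` class, its parameters, and the synthetic prefixes are assumptions for the example and say nothing about how Safari's on-disk format actually worked.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: never a false negative, sometimes a false positive."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # Python int used as a bit array

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def prefix(url: str) -> bytes:
    return hashlib.sha256(url.encode("utf-8")).digest()[:4]


# Tiny filter (64 bits, 3 hashes) holding 20 "unsafe" prefixes.
unsafe = [prefix(f"http://unsafe{i}.example/") for i in range(20)]
bloom = BloomFilter(num_bits=64, num_hashes=3)
for p in unsafe:
    bloom.add(p)

# Probe with prefixes that were never inserted.
probes = [prefix(f"http://harmless{i}.example/") for i in range(1000)]
false_positives = sum(1 for p in probes if p in bloom and p not in unsafe)
print(f"{false_positives} of {len(probes)} harmless prefixes would trigger a lookup")
```

A production filter would be sized for a far lower false-positive rate, but the rate is never zero, so some fraction of harmless URLs would still have caused a request to the provider.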
In a perfect world, a more privacy-centric company like DuckDuckGo or Apple would be able to maintain and use their own lists, both internationally and inside China. In the meantime, some system that anonymizes and relays requests, like Siri does or like Sign in with Apple, perhaps, could improve privacy within the current implementation.
This likely wouldn’t have much performance impact, since it would only affect URLs whose hash prefixes already matched.
My assumption was that Apple was only using Tencent in mainland China, where Google services are banned. Apple’s statement today makes it clear that that is true. But Apple brought this mini-controversy upon itself, because Apple’s own description of the feature doesn’t specify when the Fraudulent Website Warning feature uses Google and when it uses Tencent.
Via Dino Dai Zovi, a user on Hacker News disassembled the code for Safari’s Fraudulent Website Warning feature and verified that it only uses Tencent (instead of Google) if the region code is set to mainland China.
Update (2019-12-16): Rosyna Keller:
Oh yeah, the Safari Privacy statement was updated in iOS 13.3 to more accurately describe how fraudulent websites work.
1 Comment
>I assume the URLs are not very private, despite being hashed
Yeah, the idea that hashing URLs makes them private and okay to share with third parties is ridiculous.