Hi there,
We have been trying to set up URL filtering for our app but have run into a wall with generating the bloom filter.
Firstly, some context about our setup:
- OHTTP handlers
  - Uses pre-warmed Lambdas to expose the gateway and the configs endpoints, using the JavaScript library referenced here: https://developers.cloudflare.com/privacy-gateway/get-started/#resources
  - Status = untested
    - We have not yet been given access to Apple's relay servers
- PIR service
  - We run the PIR service on AWS ECS behind an ALB
  - The container clones https://github.com/apple/swift-homomorphic-encryption and, apart from config changes, has no custom functionality
  - Status = working
    - From the logs, everything seems to be working here: it responds to queries when they are sent and never blocks anything it shouldn’t
- Bloom filter generation
  - We generate a bloom filter from the following URL list:
      https://example.com
      http://example.com
      example.com
  - Then we put the result into the URL filtering example application from here: https://developer.apple.com/documentation/networkextension/filtering-traffic-by-url
  - The info generated from the above URLs is:
      { "bits": 44, "hashes": 11, "seed": 2538058380, "content": "m+yLyZ4O" }
  - Status = broken
    - We think this is broken because we are getting requests to our PIR server for every single website we visit
    - We would have expected to only receive requests to the PIR server when going to example.com, because it’s in our block list
    - It’s possible that, behind the scenes, Apple sporadically makes requests regardless of the bloom filter result, but that isn’t what we’d expect
  - We are generating our bloom filter in the following way:
    - We double hash the URL, using fnv1a for the first hash and murmurhash3 for the second:
        hashTwice(value: any, seed?: any): any {
          return {
            first: Number(fnv1a(value, { size: 32 })),
            second: murmurhash3(value, seed),
          };
        }
    - We calculate the index positions from the following function/formula, as seen in https://github.com/ameshkov/swift-bloom/blob/master/Sources/BloomFilter/BloomFilter.swift#L96
        doubleHashing(n: number, hashA: number, hashB: number, size: number): number {
          return Math.abs((hashA + n * hashB) % size);
        }
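To make the scheme above easier to reproduce, here is a minimal, self-contained sketch of it. It is an illustration under stated assumptions, not our production code, and it is not known to match Apple's format. The names `fnv1a32`, `murmur3_32`, `probeIndex`, `buildFilter` and `mightContain` are hypothetical. The assumptions are: 32-bit FNV-1a and MurmurHash3 x86_32 over the UTF-8 bytes of the URL, the seed applied only to the second hash, and bits stored LSB-first within each byte.

```typescript
const encoder = new TextEncoder();

// 32-bit FNV-1a over the UTF-8 bytes of the input (unsigned result).
function fnv1a32(value: string): number {
  const bytes = encoder.encode(value);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193); // FNV prime, wraps at 32 bits
  }
  return hash >>> 0;
}

function rotl32(x: number, r: number): number {
  return (x << r) | (x >>> (32 - r));
}

// MurmurHash3 x86_32 over the UTF-8 bytes of the input (unsigned result).
function murmur3_32(value: string, seed = 0): number {
  const bytes = encoder.encode(value);
  const c1 = 0xcc9e2d51;
  const c2 = 0x1b873593;
  let h = seed | 0;
  const blockEnd = bytes.length - (bytes.length % 4);
  for (let i = 0; i < blockEnd; i += 4) {
    // Read one little-endian 32-bit block.
    let k = bytes[i] | (bytes[i + 1] << 8) | (bytes[i + 2] << 16) | (bytes[i + 3] << 24);
    k = Math.imul(rotl32(Math.imul(k, c1), 15), c2);
    h = rotl32(h ^ k, 13);
    h = (Math.imul(h, 5) + 0xe6546b64) | 0;
  }
  // Mix in the remaining 0 to 3 tail bytes.
  const tail = bytes.length % 4;
  let tailK = 0;
  if (tail >= 3) tailK ^= bytes[blockEnd + 2] << 16;
  if (tail >= 2) tailK ^= bytes[blockEnd + 1] << 8;
  if (tail >= 1) {
    tailK ^= bytes[blockEnd];
    h ^= Math.imul(rotl32(Math.imul(tailK, c1), 15), c2);
  }
  // Finalization (avalanche) step.
  h ^= bytes.length;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// n-th probe position. Both hashes are unsigned and n is small, so the sum
// stays exact and non-negative and Math.abs is not needed here.
function probeIndex(n: number, hashA: number, hashB: number, size: number): number {
  return (hashA + n * hashB) % size;
}

function buildFilter(items: string[], bits: number, hashes: number, seed: number): Uint8Array {
  const content = new Uint8Array(Math.ceil(bits / 8));
  for (let i = 0; i < items.length; i++) {
    const a = fnv1a32(items[i]);
    const b = murmur3_32(items[i], seed);
    for (let n = 0; n < hashes; n++) {
      const index = probeIndex(n, a, b, bits);
      content[index >> 3] |= 1 << (index & 7); // ASSUMPTION: LSB-first bit order
    }
  }
  return content;
}

function mightContain(content: Uint8Array, item: string, bits: number, hashes: number, seed: number): boolean {
  const a = fnv1a32(item);
  const b = murmur3_32(item, seed);
  for (let n = 0; n < hashes; n++) {
    const index = probeIndex(n, a, b, bits);
    if ((content[index >> 3] & (1 << (index & 7))) === 0) return false;
  }
  return true;
}

// Usage with the parameters from the filter info above.
const exampleUrls = ["https://example.com", "http://example.com", "example.com"];
const exampleFilter = buildFilter(exampleUrls, 44, 11, 2538058380);
console.log(mightContain(exampleFilter, "example.com", 44, 11, 2538058380));
```

Any of these choices (hash variants, seed handling, bit order, or whether the URL is normalised before hashing) could differ from what the client expects, which is why we are asking for a reference implementation below.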
Questions:
- What hashing algorithms are used, and can you link to an implementation that you know is compatible with Apple’s?
- How are the index positions calculated from the iteration number, the size, and the hash results?
- There was mention of a tool for generating a bloom filter that could be used for Apple’s URL filtering implementation. When can we expect the release of this tool?
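One further data point: a quick sanity check on the filter info quoted above suggests that ordinary false positives are unlikely to explain a lookup for every site. The sketch below decodes `content`, counts the set bits, and estimates the false-positive rate two ways: from the observed fill ratio, and from the textbook approximation (1 - e^(-kn/m))^k with k = 11 hashes, n = 3 URLs and m = 44 bits. The helper names are hypothetical, and the estimate assumes the client probes the filter the same way we built it.

```typescript
interface FilterInfo {
  bits: number;
  hashes: number;
  content: string; // base64-encoded bit array
}

// Decode standard base64 (optional '=' padding) into an array of byte values.
function decodeBase64(text: string): number[] {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const stripped = text.replace(/=+$/, "");
  const bytes: number[] = [];
  let buffer = 0;
  let bitCount = 0;
  for (let i = 0; i < stripped.length; i++) {
    const value = alphabet.indexOf(stripped.charAt(i));
    if (value < 0) throw new Error("invalid base64 character: " + stripped.charAt(i));
    buffer = (buffer << 6) | value;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      bytes.push((buffer >> bitCount) & 0xff);
      buffer &= (1 << bitCount) - 1; // keep only the bits not yet emitted
    }
  }
  return bytes;
}

// Count the 1 bits across all bytes.
function countSetBits(bytes: number[]): number {
  let total = 0;
  for (let i = 0; i < bytes.length; i++) {
    for (let b = bytes[i]; b !== 0; b >>= 1) total += b & 1;
  }
  return total;
}

function summarizeFilter(info: FilterInfo, itemCount: number) {
  const bytes = decodeBase64(info.content);
  const setBits = countSetBits(bytes);
  // ASSUMPTION: padding bits beyond `bits` are zero, so every set bit is a real filter bit.
  const fillRatio = setBits / info.bits;
  // A non-member is reported as present only if all k probed bits happen to be set.
  const observedFpRate = Math.pow(fillRatio, info.hashes);
  const theoreticalFpRate = Math.pow(
    1 - Math.exp((-info.hashes * itemCount) / info.bits),
    info.hashes
  );
  return { bytes, setBits, fillRatio, observedFpRate, theoreticalFpRate };
}

const filterSummary = summarizeFilter({ bits: 44, hashes: 11, content: "m+yLyZ4O" }, 3);
console.log(filterSummary);
```

Decoding gives 6 bytes (9B EC 8B C9 9E 0E) with 26 bits set, and both estimates come out well under 1%. The top four bits of the last byte are zero, which would be consistent with LSB-first bit order and four padding bits, although that is only an inference from this one filter.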