Tuesday, September 8, 2020

Applebot

Apple (via Hacker News):

Applebot is the web crawler for Apple. Products like Siri and Spotlight Suggestions use Applebot.

Traffic coming from Applebot is identified by its user agent, and reverse DNS shows it in the *.applebot.apple.com domain, originating from the 17.0.0.0 net block.
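As a rough illustration of the check Apple describes, here is a minimal Python sketch that combines the three signals: user agent, source address, and reverse DNS. It rests on several assumptions that are not in Apple's text: the "17.0.0.0 net block" is read as 17.0.0.0/8, the user agent is assumed to contain the substring "Applebot", and the names `looks_like_applebot` and `reverse_lookup` are hypothetical. The lookup is passed in as a parameter so the logic can be exercised without touching the network.

```python
import ipaddress
import socket

# Assumption: the "17.0.0.0 net block" is interpreted as the /8 network.
APPLE_NET = ipaddress.ip_network("17.0.0.0/8")
APPLEBOT_SUFFIX = ".applebot.apple.com"


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def looks_like_applebot(ip, user_agent, lookup=reverse_lookup):
    """Heuristic check: user agent, Apple address range, and reverse DNS."""
    # Assumption: the crawler's user agent contains "Applebot".
    if "applebot" not in user_agent.lower():
        return False
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    if addr not in APPLE_NET:
        return False
    host = lookup(ip)
    if host is None:
        return False
    # Strip a trailing dot so fully qualified names compare cleanly.
    return host.lower().rstrip(".").endswith(APPLEBOT_SUFFIX)
```

A user agent on its own is trivial to spoof, which is presumably why Apple mentions reverse DNS as well. A stricter check would also resolve the returned hostname forward and confirm that it maps back to the same address.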

jd20:

Some fun facts:

  • Applebot was originally written in Go (and uncovered a user agent bug on redirects, revealing its Go origins to the world, which Russ Cox fixed the next day).
  • Up until the release of iOS 9, Applebot ran entirely on four Mac Pros in an office. Those four Mac Pros could crawl close to 1B web pages a day.
  • In its first week of existence, it nearly took Apple’s internal DNS servers offline. It was then modified to do its own DNS resolution and caching, fond memories…

2 Comments

>Up until the release of iOS 9, Applebot ran entirely
>on four Mac Pros in an office (...) Those four Mac
>Pros could crawl close to 1B web pages a day

That's bonkers. I often wonder how high the barrier to entry for competing with Google actually is. I'd love to know what DuckDuckGo's infrastructure looks like.

At this point, maybe the search engine market would be vulnerable to another PageRank-like disruption that genuinely improves search results.

Google may have indexed more pages (probably), but more critically, I sadly find it to be smarter about the result ordering, even with pages I presume DDG does have indexed. I don't think throwing more machines at it is enough to solve that problem.
