Tuesday, October 28, 2025

Reddit Sues SerpApi

Practically overnight, a class of companies like SerpApi — known as “data scrapers” — found a new business selling data scraped from Google to companies looking to train their A.I. chatbots.

On Wednesday, the internet message board Reddit decided to fight the data scrapers. It filed a lawsuit in the U.S. District Court for the Southern District of New York claiming that four companies had illegally stolen its data by scraping Google search results in which Reddit content appeared.

Three of those companies — SerpApi; a Lithuanian start-up, Oxylabs; and a Russian company, AWMProxy — sold data to A.I. companies like OpenAI and Meta, according to the lawsuit. The fourth company, Perplexity, is a San Francisco start-up that makes an A.I. search engine.

Via John Gruber (Mastodon):

The entire premise of their business is crazy. SerpApi prints the crime right on the tin, describing their service as a “Google Search API” and “Scrape Google and other search engines from our fast, easy, and complete API.” What makes this so crazy is that Google doesn’t offer a search API. SerpApi is offering the Google search API that Google itself doesn’t offer, and charging companies money for it. Everyone, upon hearing the premise and nature of SerpApi, asks the same question: How is this legal? The answer is, it probably isn’t. But right on SerpApi’s home page they claim to offer customers a “U.S. Legal Shield”[…]

[…]

Why Google hasn’t sued them yet, I don’t understand.

This is a weird case. SerpApi is not like Common Crawl, building an index by scraping the Web. It’s scraping Google search results. Google actually does have legal access to scrape Reddit. And SerpApi is probably right that there’s First Amendment protection for indexing public search results, just as there is for indexing other public content. But, obviously, they’re trying to get at the Reddit data without paying to license it, and maybe the means for doing this violate the DMCA. On the one hand, hiring a hitman is illegal; you don’t get a legal shield by contracting out the crime. On the other hand, it’s not exactly clear to me which step of this chain is illegal, especially if Google seems not to object. Whatever the result, I expect it to have far-reaching consequences for the Web.

Mike Masnick:

Reddit is NOT arguing that these companies are illegally scraping Reddit, but rather that they are illegally scraping… Google (which is not a party to the lawsuit) and in doing so violating the DMCA’s anti-circumvention clause, over content Reddit holds no copyright over. And, then, Perplexity is effectively being sued for linking to Reddit.

[…]

And, incredibly, within their lawsuit, Reddit defends its arguments by claiming it’s filing this lawsuit to protect the open internet. It is not. It is doing the exact opposite.

[…]

Reddit has a license to the content users post in order to operate the service, but they don’t hold the copyright on it. Indeed, Reddit’s terms state clearly that users retain “any ownership rights you have in Your content.” Because of Reddit’s agreement that it can license content, the deal with Google could sorta squeeze under that term, but that doesn’t give Reddit the right to then sue over users’ copyrights (as it’s doing in this case).

[…]

But here, Reddit is doing something even crazier. Because it’s saying that since these companies (allegedly) get around Google’s technological measures, then somehow Reddit can accuse them of violating 1201.

Nick Heer:

I am glad Masnick wrote about this despite my disagreement with his views on how much control a website owner ought to have over scraping. This is a necessary dissection of the suit, though I would appreciate views on it from actual intellectual property lawyers. They might be able to explain how a positive outcome of this case for Reddit would have clear rules delineating this conduct from the ways in which artificial intelligence companies have so far benefitted from a generous reading of fair use and terms of service documents.

Jeff Johnson:

OpenAI is blatantly ignoring my robots.txt

User-agent: ChatGPT-User
Disallow: /

ClaudeBot too, apparently.
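
For reference, here is a minimal sketch of how a crawler that honors robots.txt would read the two rules Johnson quotes, using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the `example.com` URL are made up for illustration, and the file contents are limited to the two lines in his post. robots.txt is advisory, so this only shows what a compliant client would conclude, not what any particular bot actually does.

```python
from urllib.robotparser import RobotFileParser

# The two rules quoted in the post; a real robots.txt may contain more.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url.

    Hypothetical helper: parses the rules from a string instead of
    fetching them over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder, not the site from the post.
    for agent in ("ChatGPT-User", "Googlebot"):
        print(agent, is_allowed(agent, "https://example.com/post"))
```

Under these rules, a compliant client identifying as ChatGPT-User is told to stay out of the whole site, while agents that are not named remain unrestricted.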

John Gruber (Mastodon):

At the bottom of their “Use Cases” page, SerpApi lists the following companies and organizations as customers (“They trust us. You are in good company. Join them.”)

[…]

Was Apple removed from the list because they’re no longer (or never were?) a customer, or because they remain a customer but don’t want to be listed?



Bart

October 29, 2025 10:11 AM

The irony of these companies fighting over content they didn’t generate. And which their current business models threaten the very future of.

I hear the sound of a golden goose being strangled.
