Wednesday, April 8, 2026

Apple Scraping YouTube for AI Training Data

Joe Rossignol:

Three established YouTube channels have sued Apple, alleging that the company violated the U.S. Digital Millennium Copyright Act (DMCA) by unlawfully accessing and scraping millions of copyrighted videos from YouTube to train its AI models.

[…]

Apple “deliberately circumvented” YouTube’s protections against video scraping and “profited substantially” by doing so.

Apple’s research papers indicate that some of the YouTube videos uploaded by the plaintiffs were used to train its AI models, the complaint alleges.

Malcolm Owen:

This apparently involved using computers with rotating IP addresses to scrape the data.

[…]

This data was then used to create an archive that was used to train “Apple AI Video.” As proof of this, the suit refers to an academic paper from Apple’s researchers disclosing it had trained using Panda-70M.

Panda-70M is described as a dataset made entirely of YouTube videos. All acquired via scraping YouTube for content. Ted Entertainment’s content is in a total of 438 videos, with MrShortGameGolf’s content in 8 videos, and Golfholics in 62 videos.

And yet when Musi made an app where users could watch individual YouTube videos, with no circumvention, Apple pulled it from the App Store.

2 Comments


Someone else

Seems like that’s conflating two dissimilar things.

Also, isn’t there a ‘fair use’ copyright exemption for research?


"In recent months, the same three YouTube channels have filed similar lawsuits against other tech giants, including Meta, Nvidia, ByteDance, and Snap."

Fixed.
