{"id":51522,"date":"2026-04-08T14:26:52","date_gmt":"2026-04-08T18:26:52","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=51522"},"modified":"2026-04-08T14:26:52","modified_gmt":"2026-04-08T18:26:52","slug":"apple-scraping-youtube-for-ai-training-data","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2026\/04\/08\/apple-scraping-youtube-for-ai-training-data\/","title":{"rendered":"Apple Scraping YouTube for AI Training Data"},"content":{"rendered":"<p><a href=\"https:\/\/www.macrumors.com\/2026\/04\/06\/apple-sued-by-three-youtube-channels\/\">Joe Rossignol<\/a>:<\/p>\n<blockquote cite=\"https:\/\/www.macrumors.com\/2026\/04\/06\/apple-sued-by-three-youtube-channels\/\">\n<p>Three established YouTube channels have <a href=\"https:\/\/www.scribd.com\/document\/1022659389\/Ted-Entertainment-v-Apple\">sued Apple<\/a>, alleging that the company violated the U.S. Digital Millennium Copyright Act (DMCA) by unlawfully accessing and scraping millions of copyrighted videos from YouTube to train its AI models.<\/p>\n<p>[&#8230;]<\/p>\n<p>Apple &ldquo;deliberately circumvented&rdquo; YouTube&rsquo;s protections against video scraping and &ldquo;profited substantially&rdquo; by doing so.<\/p>\n<p>Apple&rsquo;s research papers indicate that some of the YouTube videos uploaded by the plaintiffs were used to train its AI models, the complaint alleges.<\/p>\n<\/blockquote>\n\n<p><a href=\"https:\/\/appleinsider.com\/articles\/26\/04\/06\/apple-may-have-scraped-youtube-videos-without-permission-for-ai-training\">Malcolm Owen<\/a>:<\/p>\n<blockquote cite=\"https:\/\/appleinsider.com\/articles\/26\/04\/06\/apple-may-have-scraped-youtube-videos-without-permission-for-ai-training\">\n<p>This apparently involved using computers with rotating IP addresses to scrape the data.<\/p>\n<p>[&#8230;]<\/p>\n<p>This data was then used to create an archive that was used to train &ldquo;Apple AI Video.&rdquo; As proof of this, the suit refers to an academic paper from Apple&rsquo;s researchers disclosing it had trained using Panda-70M. <\/p>\n<p>Panda-70M is described as a dataset made entirely of YouTube videos. All acquired via scraping YouTube for content. Ted Entertainment&rsquo;s content is in a total of 438 videos, with MrShortGameGolf&rsquo;s content in 8 videos, and Golfholics in 62 videos.<\/p>\n<\/blockquote>\n\n<p>And yet when Musi made an app where <em>users<\/em> could watch individual YouTube videos, with no circumvention, Apple pulled it from the App Store.<\/p>\n\n<p>Previously:<\/p>\n<ul>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2026\/03\/19\/apple-wins-musi-app-store-removal-lawsuit\/\">Apple Wins Musi App Store Removal Lawsuit<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2024\/06\/24\/ai-companies-ignoring-robots-txt\/\">AI Companies Ignoring Robots.txt<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2024\/06\/19\/apple-intelligence-training\/\">Apple Intelligence Training<\/a><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Joe Rossignol: Three established YouTube channels have sued Apple, alleging that the company violated the U.S. Digital Millennium Copyright Act (DMCA) by unlawfully accessing and scraping millions of copyrighted videos from YouTube to train its AI models. [&#8230;] Apple &ldquo;deliberately circumvented&rdquo; YouTube&rsquo;s protections against video scraping and &ldquo;profited substantially&rdquo; by doing so. Apple&rsquo;s research papers [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2026-04-08T18:26:55Z","apple_news_api_id":"ea6abd04-4353-4e20-b69b-0ac00d1469b4","apple_news_api_modified_at":"2026-04-08T18:26:55Z","apple_news_api_revision":"AAAAAAAAAAD\/\/\/\/\/\/\/\/\/\/w==","apple_news_api_share_url":"https:\/\/apple.news\/A6mq9BENTTiC2mwrADRRptA","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[2],"tags":[38,2602,1351,167,844,31,41,209,30,96,555],"class_list":["post-51522","post","type-post","status-publish","format-standard","hentry","category-technology","tag-apple","tag-apple-intelligence","tag-artificial-intelligence","tag-copyright","tag-digital-millennium-copyright-act-dmca","tag-ios","tag-lawsuit","tag-legal","tag-mac","tag-web","tag-youtube"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/51522","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=51522"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/51522\/revisions"}],"predecessor-version":[{"id":51523,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/51522\/revisions\/51523"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=51522"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=51522"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=51522"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}