{"id":50199,"date":"2025-11-26T15:01:58","date_gmt":"2025-11-26T20:01:58","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=50199"},"modified":"2025-11-26T15:01:58","modified_gmt":"2025-11-26T20:01:58","slug":"apple-intelligence-training-lawsuit","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2025\/11\/26\/apple-intelligence-training-lawsuit\/","title":{"rendered":"Apple Intelligence Training Lawsuit"},"content":{"rendered":"<p><a href=\"https:\/\/www.engadget.com\/ai\/apple-faces-lawsuit-over-alleged-use-of-pirated-books-for-ai-training-160016161.html\">Mariella Moon<\/a>:<\/p>\n<blockquote cite=\"https:\/\/www.engadget.com\/ai\/apple-faces-lawsuit-over-alleged-use-of-pirated-books-for-ai-training-160016161.html\"><p>Two authors have <a href=\"https:\/\/www.reuters.com\/sustainability\/boards-policy-regulation\/apple-sued-by-authors-over-use-books-ai-training-2025-09-05\/\">filed a lawsuit<\/a> against Apple, accusing the company of infringing on their copyright by using their books to train its artificial intelligence model without their consent. The plaintiffs, Grady Hendrix and Jennifer Roberson, claimed that Apple used a dataset of pirated copyrighted books that include their works for AI training. They said in their complaint that Applebot, the company&rsquo;s scraper, can &ldquo;reach &lsquo;shadow libraries&rsquo;&rdquo; made up of unlicensed copyrighted books, including (on information) their own. The lawsuit is currently seeking class action status, due to the sheer number of books and authors found in shadow libraries.<\/p><\/blockquote>\n\n<p><a href=\"https:\/\/appleinsider.com\/articles\/25\/09\/06\/authors-claim-apple-intelligence-may-be-trained-with-pirated-books\">Malcolm Owen<\/a>:<\/p>\n<blockquote cite=\"https:\/\/appleinsider.com\/articles\/25\/09\/06\/authors-claim-apple-intelligence-may-be-trained-with-pirated-books\"><p>The suit hinges on whether Apple used the dataset referred to as &ldquo;Books3.&rdquo; The suit alleges that Books3 is based on the contents of a &ldquo;shadow library&rdquo; website known as Bibliotik, which allegedly hosted the contents of thousands of books.<\/p><p>The dataset was available on HuggingFace before being removed in October 2023, and it was also included as part of the RedPajama dataset. RedPajama was used as part of the OpenELM open-source models, which Apple made <a href=\"https:\/\/appleinsider.com\/articles\/24\/04\/24\/apples-four-new-open-source-models-could-help-make-future-ai-more-accurate\">available in 2024<\/a>.<\/p><p>Since Apple used a dataset that was connected to pirated books for OpenELM, the suit believes that Apple probably used the same techniques to train its <a href=\"https:\/\/appleinsider.com\/articles\/25\/06\/09\/apple-intelligence-opened-up-to-all-developers-with-foundation-models-framework\">Foundation Language Models<\/a>.<\/p><p>[&#8230;]<\/p><p>In July, Apple doubled down on its claims of being ethical, including items accessible from the Internet. In a <a href=\"https:\/\/machinelearning.apple.com\/papers\/apple_intelligence_foundation_language_models_tech_report_2025.pdf\">research paper<\/a>, it explained that, if a publisher didn&rsquo;t agree to data being scraped for training, it will not scrape the content.<\/p><\/blockquote>\n\n<p>Previously:<\/p>\n<ul>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2024\/06\/24\/ai-companies-ignoring-robots-txt\/\">AI Companies Ignoring Robots.txt<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2024\/06\/19\/apple-intelligence-training\/\">Apple Intelligence Training<\/a><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Mariella Moon: Two authors have filed a lawsuit against Apple, accusing the company of infringing on their copyright by using their books to train its artificial intelligence model without their consent. The plaintiffs, Grady Hendrix and Jennifer Roberson, claimed that Apple used a dataset of pirated copyrighted books that include their works for AI training. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2025-11-26T20:02:02Z","apple_news_api_id":"381ce01f-548c-4023-9386-a46eeb868f02","apple_news_api_modified_at":"2025-11-26T20:02:02Z","apple_news_api_revision":"AAAAAAAAAAD\/\/\/\/\/\/\/\/\/\/w==","apple_news_api_share_url":"https:\/\/apple.news\/AOBzgH1SMQCOThqRu64aPAg","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[2],"tags":[38,2602,1351,167,41,209],"class_list":["post-50199","post","type-post","status-publish","format-standard","hentry","category-technology","tag-apple","tag-apple-intelligence","tag-artificial-intelligence","tag-copyright","tag-lawsuit","tag-legal"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/50199","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=50199"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/50199\/revisions"}],"predecessor-version":[{"id":50200,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/50199\/revisions\/50200"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=50199"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=50199"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=50199"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}