Archive for November 26, 2025

Wednesday, November 26, 2025

Internet Archive Wayback Machine Link Fixer

Internet Archive Wayback Machine Link Fixer is a WordPress plugin designed to combat link rot—the gradual decay of web links as pages are moved, changed, or taken down. It automatically scans your post content—on save and across existing posts—to detect outbound links. For each one, it checks the Internet Archive’s Wayback Machine for an archived version and creates a snapshot if one isn’t available.

When a linked page disappears, the plugin helps preserve your user experience by redirecting visitors to a reliable archived version. It also works proactively by archiving your own posts every time they’re updated, creating a consistent backup of your content’s history.

This is such a great idea. I’ve had it installed for a few weeks now but have mixed thoughts on the execution. The initial version had a bunch of significant bugs, and they seem to be doing a good job of fixing them. It seems to be thoughtfully designed to process a large number of old posts without overloading your server. The queueing functionality is also important because the Internet Archive’s own servers frequently go down.

The part where it submits your own posts, and the pages your post links to, to the archive seems to work well. I think this is the most important part because you can always go back and fix broken links, but you can’t go back and archive pages that weren’t archived. However, some of my posts since installing the plug-in (e.g. this one) don’t seem to have made it into the archive. This may be because the archive was down at the time of the post. Presumably, the Auto Archiver will eventually come back around and submit them again.
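One way to check whether a given post made it into the archive is the Wayback Machine's public availability endpoint. The following is a minimal sketch, not the plug-in's own code: the function names are hypothetical, and it assumes the endpoint's documented JSON shape (`archived_snapshots.closest`). Only `is_archived` touches the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url):
    """Build the availability query for a page URL."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(payload):
    """Return the closest snapshot URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def is_archived(page_url, timeout=10):
    """Network call: ask the Wayback Machine whether it has a snapshot."""
    with urlopen(availability_url(page_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response)) is not None
```

If the archive is down, `is_archived` raises an exception instead of returning `False`, so a caller can tell "not archived" apart from "could not check" and retry later.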

The part where it replaces broken links with archive links is implemented in JavaScript. I like that it doesn’t modify the post content in your database. It seems safe to install the plug-in without worrying about it messing anything up. However, I had kind of hoped that it would fix the links as part of the PHP rendering process. Doing it in JavaScript means that the fixed links are not available in the actual HTML tags on the page. And the data that the JavaScript uses is stored in an invisible <div> under the attribute data-iawmlf-post-links, which makes the page fail validation.
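To illustrate what fixing the links at render time could look like, here is a rough sketch in Python rather than PHP. It is not how the plug-in works: `rewrite_broken_links` and the `archived` mapping (broken URL to archive URL) are hypothetical, and a regular expression is a simplification compared with a real HTML parser. The stored post content is left alone; only the rendered HTML changes.

```python
import re
from html import escape, unescape

# Matches the href attribute of an <a> tag, with either quote style.
HREF_PATTERN = re.compile(
    r'(<a\b[^>]*?\shref=)(["\'])(.*?)\2',
    re.IGNORECASE | re.DOTALL,
)


def rewrite_broken_links(html, archived):
    """Swap hrefs listed in `archived` for their archive URLs.

    `archived` maps a broken URL to its replacement; links that are
    not in the mapping are returned unchanged.
    """
    def swap(match):
        prefix, quote, href = match.groups()
        replacement = archived.get(unescape(href))
        if replacement is None:
            return match.group(0)
        return prefix + quote + escape(replacement, quote=True) + quote

    return HREF_PATTERN.sub(swap, html)
```

Because the replacement ends up in the `href` itself, the fixed link is visible to anything that reads the HTML without running JavaScript, and no extra data attribute is needed.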

I have in the past manually inserted Internet Archive links when I came across links that were broken, and I thought I might use the plug-in to help with that instead of relying on the JavaScript fix-ups. However, when I set it to show broken links that are archived, it doesn’t list any. It’s currently showing me 188 pages of links where the Archive Status is “Link is excluded from being archived.” I tried sorting by Archive Status, but it still doesn’t show any that are both broken and archived.

The part where it finds broken links that are not archived is also not very useful because there are a huge number of links where it shows a 403 error even though the link still works. There doesn’t seem to be a way to separate the URLs that are genuinely gone from the ones that the Internet Archive doesn’t have permission to access.
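A link checker could separate these cases by bucketing the HTTP status code. The sketch below is a heuristic under stated assumptions, not the plug-in's logic: the function name and the category labels are made up for illustration, and treating 401, 403, and 429 as "blocked" rather than "gone" is a judgment call.

```python
def classify_link_status(status_code):
    """Bucket an HTTP status code from a link check.

    Returns one of "alive", "gone", "blocked", "uncertain", or
    "unreachable" (when no response was received at all).
    """
    if status_code is None:
        return "unreachable"
    if 200 <= status_code < 400:
        return "alive"
    if status_code in (404, 410):
        # The server answered and says the page does not exist.
        return "gone"
    if status_code in (401, 403, 429):
        # The server refused this client; a browser may still succeed.
        return "blocked"
    # Server errors and anything else may be temporary.
    return "uncertain"
```

With buckets like these, only the "gone" links would need attention, and the 403 results could be filtered out or rechecked by hand.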

Ashley Belanger:

Last month, the Internet Archive’s Wayback Machine archived its trillionth webpage, and the nonprofit invited its more than 1,200 library partners and 800,000 daily users to join a celebration of the moment. To honor “three decades of safeguarding the world’s online heritage,” the city of San Francisco declared October 22 to be “Internet Archive Day.”

[…]

An Internet Archive spokesperson confirmed to Ars that the archive currently faces no major lawsuits and no active threats to its collections.

Datacide Internet Archive This Blog Web WordPress

Viewing Metadata in the Finder

Howard Oakley:

The Finder can display more information about files than their size and datestamps, and for some types of file can extend to a lot of useful metadata. These are shown in the Preview pane containing the file’s QuickLook thumbnail, in the Get Info dialog, and some can be added to the columns shown in List View.
[…]
To a degree, the user determines which fields are displayed in the Information shown in the Preview pane, although Apple doesn’t mention the key setting involved. Select the file, ensure the blue text to the right of Information is set to Show Less, then open its Preview Options using the Finder’s View menu.
[…]
It’s only when the Preview pane is showing less information that your Preview Options are applied, and they’re now used the same for all types of Image.

Unfortunately:

Most of the metadata can’t be displayed unless the file is in a folder indexed by Spotlight. It can’t even tell you the dimensions of an image.
With my typical window sizes, there’s barely any space below the preview to see the metadata. I like to hide the Quick Actions and Last Opened date, to make more space for what I care about, but this has to be set separately for each type of file. I assume these settings are fragile and will have to be reapplied many times.
There’s no way to adjust the order (like you can in Lightroom). And the order in the Preview Options inspector doesn’t fully match the order in the actual Finder window (e.g. Tags is at the top in one and the bottom in the other).

Previously:

Folder Preview 1.6

Exchangeable Image File Format (EXIF) Finder Mac macOS Tahoe 26 Metadata Spotlight

UK iCloud Lawsuit

Tim Hardwick (2024):

[British consumer group] Which? alleges that the company makes it difficult for customers to use alternative cloud storage providers “by giving its iCloud storage service preferential treatment,” and “‘trapping’ customers with Apple devices into using iCloud.”
The consumer group filed the legal action with the Competition Appeal Tribunal, and said it was seeking damages for 40 million Apple users in the UK. If successful, the lawsuit could result in a £70 payout per customer. According to the Consumer Rights Act 2015, all those eligible are automatically included in the claim unless they choose to opt out.
Which? said it was urging Apple “to resolve this claim without the need for litigation by offering consumers their money back and opening up iOS to allow users a real choice for cloud services.”

Part of Apple’s defense is that almost 50% of customers don’t pay for iCloud+, which probably means that their photos and other data aren’t backed up. iOS doesn’t support backing up to other cloud services, and local backups now have added friction.

Hartley Charlton:

Apple told the Competition Appeal Tribunal that Which had not provided enough clarity about its third-party funder, Litigation Capital Management (LCM), which is paying for the legal action. LCM recently suffered a severe financial decline, losing 99% of its share value from its November 2024 level, leaving it worth about $16 million. Apple argued that this collapse raised questions about whether LCM could still support the lawsuit.
It also said that if it were allowed to pursue an appeal later in the process or if Which’s funding is withdrawn, Apple could face a significant risk of not being able to recover its legal costs because LCM might not be able to pay them.

Antitrust Apple Backup iCloud iOS Lawsuit Legal United Kingdom

Apple Intelligence Training Lawsuit

Mariella Moon:

Two authors have filed a lawsuit against Apple, accusing the company of infringing on their copyright by using their books to train its artificial intelligence model without their consent. The plaintiffs, Grady Hendrix and Jennifer Roberson, claimed that Apple used a dataset of pirated copyrighted books that include their works for AI training. They said in their complaint that Applebot, the company’s scraper, can “reach ‘shadow libraries’” made up of unlicensed copyrighted books, including (on information) their own. The lawsuit is currently seeking class action status, due to the sheer number of books and authors found in shadow libraries.

Malcolm Owen:

The suit hinges on whether Apple used the dataset referred to as “Books3.” The suit alleges that Books3 is based on the contents of a “shadow library” website known as Bibliotik, which allegedly hosted the contents of thousands of books.
The dataset was available on HuggingFace before being removed in October 2023, and it was also included as part of the RedPajama dataset. RedPajama was used as part of the OpenELM open-source models, which Apple made available in 2024.
Since Apple used a dataset that was connected to pirated books for OpenELM, the suit believes that Apple probably used the same techniques to train its Foundation Language Models.
[…]
In July, Apple doubled down on its claims of being ethical, including items accessible from the Internet. In a research paper, it explained that, if a publisher didn’t agree to data being scraped for training, it will not scrape the content.

Apple Apple Intelligence Artificial Intelligence Copyright Lawsuit Legal
