Wednesday, December 21, 2016

More macOS Preview PDF Trouble

Brooks Duncan (via Eddie Smith):

In the comments to my blog post about ScanSnap on Sierra, awesome DocumentSnap reader Alex writes this:

Since updating to macOS 10.12.2 I have found that Preview destroys the OCR layer of PDFs scanned and OCR’d with the latest ScanSnap Manager software if you make any sort of edit with Preview (e.g. deleting or reordering pages). After editing and saving with Preview, the PDF is no longer searchable and text is not selectable. Managed to replicate the problem on another Mac running 10.12.2. Doesn’t seem to affect PDFs scanned and OCR’d with other scanners or applications. Just wanted to warn everyone to perhaps wait before updating, and check that they haven’t unwittingly destroyed their OCR if they have already updated.


As you can see, it seems to be something to do with Preview on macOS Sierra 10.12.2. Alex said that he didn’t see the issue with other scanners, but I ran into it with both ScanSnap and Doxie. Both of those scanners use ABBYY for OCR, so that may be relevant.

I ran into a lot of PDF bugs in macOS 10.12.0. None have been fixed, as far as I can tell, and I’ve already filed two Radars for new issues in 10.12.2. It’s sad that basic functionality remains broken for so long—especially given that PDF was an area where Apple used to excel.

Update (2017-01-02): Adam C. Engst:

It pains me to say this, speaking as the co-author of “Take Control of Preview,” but I have to recommend that Sierra users avoid using Preview to edit PDF documents until Apple fixes these bugs. If editing a PDF in Preview is unavoidable, be sure to work only on a copy of the file and retain the original in case editing introduces corruption of any sort. Smile’s PDFpen is the obvious alternative for PDF manipulation of all sorts (and for documentation, we have “Take Control of PDFpen 8” too), although Adobe’s Acrobat DC is also an option, albeit an expensive one.

In the meantime, we’ll be watching closely to see which of these PDF-related bugs Apple fixes in 10.12.3, which is currently in beta testing.
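
The "work only on a copy" advice is easy to automate. Here is a minimal Python sketch, assuming you are happy to keep the copy next to the original. The function name `make_working_copy` and the `-edit-` naming scheme are hypothetical, chosen only for illustration.

```python
import shutil
import time
from pathlib import Path


def make_working_copy(original: Path) -> Path:
    """Copy `original` next to itself and return the path of the copy to edit.

    The original file is never modified; only the returned copy should be
    opened in Preview (or any other editor).
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme: "scan.pdf" -> "scan-edit-20170102-101500.pdf"
    copy_path = original.with_name(f"{original.stem}-edit-{stamp}{original.suffix}")
    if copy_path.exists():
        # Refuse to overwrite an earlier working copy.
        raise FileExistsError(copy_path)
    # copy2 copies the contents and, where possible, the file timestamps.
    shutil.copy2(original, copy_path)
    return copy_path
```

For example, `make_working_copy(Path("scan.pdf"))` returns the path of a timestamped duplicate, and that duplicate is the file to open for editing.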

John Gruber (tweet):

On the bright side, when this happened with the iWork suite, the Mac apps eventually gained back most of the functionality that was removed for parity with iOS. But it sure seems like Apple pulled the trigger on this at least a year before it was ready.

Update (2017-01-03): Chuq Von Rospach:

“parity with iOS took priority” over backward compatibility. As it did with Keynote, Pages, Numbers, iMovie, Photos… Very Apple.

See also: MacRumors and Hacker News.

Update (2017-01-05): Lloyd Chambers:

[Data] loss supports the “disdain and contempt” theory, but does not rule out sheer incompetence.

Note the “common core” thing—a very dangerous trend for future APIs in terms of reliability, compatibility and data integrity particularly since Apple seems to have no idea what unit testing is.

Whose data of any kind is safe when Apple has no qualms about rewriting APIs that damage user files?

Update (2017-01-23): David Sparks:

I am receiving a lot of email lately from readers encountering PDF problems on the Mac. If that’s you, you’re not alone.

Update (2017-03-28): macOS 10.12.4 fixes some of the PDF display bugs.

Update (2017-04-03): Adam C. Engst:

Last week, I polled the developers who had commented on the topic for my first article. The consensus was that Apple’s rewritten-for-Sierra PDFKit framework continues to improve, while simultaneously introducing new bugs.

I also found a new PDF display bug since talking with Engst.

Update (2017-05-17): Certain types of PDF scrolling remain broken in macOS 10.12.5.

28 Comments

Meanwhile Preview pretty much strips images from PDFs on my Mac, requiring them to be opened in Adobe Reader. Quick Look fails, too.

Eduard Rozenberg

I don't use Preview to make edits to PDFs any longer. In addition to the problems already mentioned, I found that for certain files, adding minor markup using Preview would cause the saved file to become much larger, sometimes hugely so. It appears Preview was re-interpreting and re-saving all of the contents of the PDF (images, etc.), not just adding the markup, causing the size to balloon. Reported in Radars a long time ago, no fixes seen yet. I'm using PDF Expert for my edits now, and PDFpen might work equally well.

An aside - certain markup done via some of these 3rd party tools may not be visible in Preview or Quick Look. For example, lines I drew in PDF Expert were not visible when using Quick Look to view the saved PDF. When I need to send a PDF with markup to someone else, I do a Print -> Save as PDF from the PDF editor tool to make sure the recipient sees all the markup. A bit sad that PDF editing and markup is not all consistency and roses in MacLand these days.

Yep, I have the same behaviour as Brooks Duncan - edit the PDF and the OCR goes away.

10.12.2 started stripping off visible signatures as well (Preview never bothered to handle digital signatures, but at least it was showing their visible part until now).

They also seem to have totally broken saving encrypted PDFs!

Guess they gutted the entire thing! Nice job Apple!

@John There’s also a bug displaying a regular PDF after displaying an encrypted one.

Ohhh! There may be light at the end of the tunnel! I just installed 10.12.3 Beta 2 and everything SEEMS to be working again!!!

@John Glad to hear that some issues are fixed for you. I’m still seeing a lot of rendering bugs, broken text selection, broken scroll wheel and swiping.

I scanned a doc via ScanSnap, saved it, opened it back up (looked fine), saved it again as encrypted, opened it up, and it looks fine! So at least that much is working fine.. I'll have to dig into more of my docs to see if they're looking right..

Let's hope Apple screwed their heads back on right with this and put back in the old PDFKit..

@John Does it now preserve the invisible text layer if your ScanSnap is set to use OCR?

I'm not sure, how can I check?

@John You should be able to Select All and Copy and then paste the text into another app.

Okay.. Let me try..

YUP! It totally works!!! WHOOOHOOOOO!!!!!

Actually doing more testing it appears that if you edit the file after the scan then it wipes it out! Bummer!

The OCR layer that is...
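
The Select All and Copy check described above can be roughly approximated in a script. This is only a sketch under loud assumptions: it uses nothing but Python's standard library, it only understands uncompressed and Flate-compressed streams, it ignores encryption and object streams, and it reports whether text-drawing operators exist, not whether the text is meaningful. The names `has_text_operators`, `STREAM_RE`, and `TEXT_OP_RE` are hypothetical.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# A text object (BT ...) that reaches a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"\bBT\b.*?\bT[jJ]\b", re.DOTALL)


def _inflate(data: bytes) -> bytes:
    """Return Flate-decoded bytes, or the raw bytes if they are not Flate data."""
    try:
        return zlib.decompressobj().decompress(data)
    except zlib.error:
        return data


def has_text_operators(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any non-image stream contains text-showing operators."""
    for match in STREAM_RE.finditer(pdf_bytes):
        # Look at the object header just before the stream and skip images,
        # whose pixel data could contain the operator names by coincidence.
        header_start = pdf_bytes.rfind(b"obj", 0, match.start())
        header = pdf_bytes[max(header_start, 0):match.start()]
        if b"/Image" in header:
            continue
        if TEXT_OP_RE.search(_inflate(match.group(1))):
            return True
    return False
```

Running `has_text_operators(Path("scan.pdf").read_bytes())` (after `from pathlib import Path`) on a file before and after editing gives a quick hint: if it flips from `True` to `False`, the text layer was probably dropped. A `True` result does not prove that the text is still correct, so the copy-and-paste test remains the real check.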

[…] it’s being neglected. Apple doesn’t have the resources and attention to make sure that PDF files work or that their customers have a good Wi-Fi experience. And of course there’s plenty of iOS and […]

I just installed and tried Beta 3 and it appears that it's ALL working again! But yeah, you're totally right, they seem to be just worrying about how macOS and iOS can be more alike, forgetting the fact that we rely on macOS features..

Let's hope they'll remember this..

@John I just tried Beta 4, and all the PDF bugs I’ve been tracking are still present. In fact, one of the early Sierra ones that had been fixed is back, too. :-(

Really? Wow, I just installed B4 and no issues.. Dang! I'd like to find out if they stuck in the old PDFKit or are fixing the new one.. Hmmmm..

So you still have the ScanSnap OCR issues? What else?

@John I haven’t tested the ScanSnap stuff. I’m still seeing lots of PDFs that don’t display properly, text selection totally broken, scroll wheel and swiping broken.

[…] has the potential to help developers work around the bugs and limitations in Apple’s PDF Kit. However, it does not include a replacement for the top-level PDFView […]

[…] Various PDF rendering and user interface issues remain. […]

[…] Previously: PSPDFKit for macOS, More macOS Preview PDF Trouble. […]

[…] I don’t intend to save. That remains a problem because Preview has been crashy ever since the PDF rewrite, so it can overwrite your document, then crash, and then you have to remember to try to restore the […]

I'm sad to report that the ScanSnap OCR issue has returned for me with Big Sur. Couldn't find any mention of it anywhere yet, I guess the core user base has learned from the past and not upgraded to Big Sur yet? Or maybe people just don't notice it happening, because the bug is so hard to discover. (Modify PDF in Preview, save the file, close and reopen the file, and then the invisible text layer is just garbage data.)

[…] to assume, however this is not the first time Apple messed this up. Sure, even Apple can’t account for all use cases when altering complex stuff like internal PDF […]

[…] slow decline of Preview and PDF rendering in general since MacOS Sierra is one of the more heartbreaking products of Apple’s annual software churn cycle. To be […]
