Thursday, August 29, 2019

Spotlight Excludes Mail Folder on macOS 10.15

iQser_Developer:

Spotlight search for emails works neither with MDQueryRef nor with mdfind in Terminal.app, even if the user has granted the app Full Disk Access in the security settings. The application logic of our app works on High Sierra but no longer on Catalina. Is this a bug or a feature? Hopefully it’s not a feature. Other content like documents, calendar events and contacts can be retrieved by MDQueryRef.

If I search with the default Spotlight interface (command-space), I can find emails. But even if I select “Show in Finder” in the result list, the Finder window is empty.

This seems to be due to Mail using Core Spotlight. This newer API makes it possible to index items that don’t have a one-to-one correspondence with files on disk. However, though the Spotlight user interface integrates results from both the old file-based Spotlight and Core Spotlight, it seems that the APIs for apps to do their own global querying only work with the former. Mail has been shifting towards using Core Spotlight for several releases, but until now the actual files in ~/Library/Mail were still searchable. The message files are still there in Catalina, and there’s still a Spotlight importer plug-in that can read them, but you can’t actually search them.

And this extends to system apps that use those APIs:

We realized that even Automator has a problem if one uses Spotlight in a workflow. Alternatively, one can use Find Email Messages, but this takes minutes before Automator shows a result.
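
For reference, the kind of file-based query discussed in this thread can be sketched in Python by shelling out to mdfind. This is only a minimal sketch, not anyone's actual code: the function names are hypothetical, and it assumes that Mail's message files are indexed with the content type com.apple.mail.emlx and expose their subject via kMDItemSubject.

```python
import shutil
import subprocess

# Assumption: the content type that Mail's importer assigns to message files.
MAIL_MESSAGE_UTI = "com.apple.mail.emlx"


def build_mail_query(subject_term=None):
    """Return a Spotlight query string that matches Mail message files."""
    clauses = [f'kMDItemContentType == "{MAIL_MESSAGE_UTI}"']
    if subject_term:
        # Escape backslashes and double quotes so the term stays inside the literal.
        escaped = subject_term.replace("\\", "\\\\").replace('"', '\\"')
        # "*" is a wildcard; the "cd" suffix ignores case and diacritics.
        clauses.append(f'kMDItemSubject == "*{escaped}*"cd')
    return " && ".join(clauses)


def find_mail_messages(subject_term=None, only_in=None):
    """Run mdfind and return the matching paths, or None if mdfind is missing."""
    if shutil.which("mdfind") is None:
        return None  # mdfind only exists on macOS
    command = ["mdfind"]
    if only_in:
        # Restrict the search to one directory tree, e.g. ~/Library/Mail.
        command += ["-onlyin", str(only_in)]
    command.append(build_mail_query(subject_term))
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return [line for line in result.stdout.splitlines() if line]
```

Going by the reports above, a call such as find_mail_messages("invoice") would be expected to return paths under ~/Library/Mail on High Sierra and an empty list on Catalina, even with Full Disk Access granted.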

houdah:

In prior versions of macOS, Spotlight searches allowed programmatic access to mail messages. Mail metadata was readily available and well documented as properties on MDItem. The removal of this OS feature breaks what in essence was public API.

[…]

As a user I am worried to see more and more application data confined to closed “silos”. Previous versions of macOS / OS X have removed Safari bookmarks and history, Apple notes, etc. from indexing and thus from access by third party applications. This reduces the extensibility, scriptability, flexibility and thus usefulness of these core applications and ultimately the platform as a whole.

eskimo:

There’s an obvious conflict between the original Spotlight architecture (every app has access to everything) and the Core Spotlight architecture (every app is siloed), and it’s also obvious that the latter is better aligned with Apple’s ongoing privacy efforts.

houdah:

I don’t think the siloing of Core Spotlight is part of Apple’s privacy effort or that it actually aligns with this. The privacy effort is focussed on user consent. Once consent is given the data should be readily available. This allows for application integration, automation, platform extension and avoids duplicated effort. To me it seems we are actually looking at an incomplete implementation. API to search Core Spotlight is missing. API to access mail messages is missing.

For example: once access to photos is granted, photos can be accessed via the file system, via scripting, via PHPhotoLibrary, and via MLMediaLibrary frameworks. All sorts of things become (resp. remain) possible. All hinges on user consent.

The current siloing and move to Core Spotlight has two problems:

  • Much information is no longer available as siloing proceeds faster than API evolution. E.g. there is no API to access notes or email messages. This limits integration and automation opportunities. In some cases, third-party developers can resort to duplicating effort, e.g. by direct access to IMAP servers.
  • Where public APIs to silos exist (PHPhotoLibrary, Contacts, …), the APIs lack the unifying nature of Spotlight / NSMetadataQuery. A public API to Core Spotlight should solve that.

Update (2019-11-27): Pierre Bernard:

Spotlight was the de-facto API for accessing Mail messages. It gave access to messages, their subject, sender and recipient names, as well as a wealth of other well-documented metadata. Spotlight also provided notifications when new mail was downloaded.

This allowed applications and scripts to work with mail without duplicating the effort of connecting to mail servers. Automation tools could set up actions to run upon receiving email messages. E.g. a mail to self to “turn on screen sharing”.

[…]

This may appear to be a cautious approach that favors security and privacy over application interoperability and productivity. In truth, the new situation is likely to undo privacy benefits provided by the “Full Disk Access” protection introduced with macOS Mojave.

Power users and third-party applications are likely to create their own search indexes. These additional copies of the private data contained in mail messages will not benefit from SIP / “Full Disk Access” protection.

[…]

Since there is no way for third-party applications to search Core Spotlight, no third-party can offer a full-featured alternative to the Spotlight window.

5 Comments

I find Spotlight to be unreliable at searching Mail already. It will find certain terms in emails, but not other similar terms in the same email (such as a list of part numbers from an online order). On one hand, I'm glad it's not Windows Search, and for the most part Spotlight works. But it's also the kind of thing that needs to be 100% all the time so I can trust it. Because if I search for a part number and Spotlight says "No Results" then I should 100% know that that part is nowhere in my email archive, i.e. I never ordered it. But as of now, I'm only 99% sure because sometimes if I manually scan my emails one by one, sure enough, there's the part number in a past order confirmation even though Spotlight never found it.

@Ben I have also had problems with Mail not finding terms that are there, as well as finding messages that should not have been matched given the query. Another reason to use EagleFiler for reliable archive searching…

As an addition to this, I've been fiddling with some code of mine that uses NSMetadataQuery, and it appears that if your app does not yet have access to one of these protected folders (e.g. Desktop, Downloads), then the query will simply not return any results from that folder. Once the app has been given permission by the user, the queries then start returning results from that folder. So I think this is pretty much baked into the Spotlight architecture on 10.15.

I am demo'ing EagleFiler for the reasons noted.

But I cannot get Spotlight search to work on EagleFiler's database. I have excluded then re-included the folder in ~/Documents where the EagleFiler database is stored, restarted, etc., all with no luck. Can Spotlight find emails within EagleFiler or did I just misunderstand its capabilities?

@Rob EagleFiler has its own search index that it uses when searching within the app. Since the EagleFiler library is just a regular folder of files, its contents are also available to Spotlight. However, as always with Spotlight, it can only search file types that it has importer plug-ins for. Spotlight does not know how to index e-mail messages that are stored in mbox files, which is the default format that EagleFiler uses.
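
Because an mbox file is plain text, it can also be searched directly without any Spotlight importer. The following is a rough sketch using Python's standard mailbox module. It is not how EagleFiler's own index works, and the function name search_mbox is hypothetical.

```python
import mailbox


def search_mbox(mbox_path, term):
    """Return (subject, sender) for messages whose subject or plain-text body contains term."""
    needle = term.lower()
    matches = []
    # create=False: raise an error instead of creating an empty file if the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            subject = str(message.get("Subject", ""))
            texts = [subject]
            for part in message.walk():
                if part.get_content_type() != "text/plain":
                    continue
                payload = part.get_payload(decode=True) or b""
                charset = part.get_content_charset() or "utf-8"
                try:
                    texts.append(payload.decode(charset, errors="replace"))
                except LookupError:
                    # Unknown charset name: fall back to UTF-8.
                    texts.append(payload.decode("utf-8", errors="replace"))
            # Case-insensitive substring match over the subject and text parts.
            if any(needle in text.lower() for text in texts):
                matches.append((subject, str(message.get("From", ""))))
    finally:
        box.close()
    return matches
```

This scans every message on each call, so it is only practical for small archives. A real search index avoids that cost.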
