Live Help Menu Searching via NSUserInterfaceItemSearching
Gus Mueller (tweet):
When rendering the documentation (542MB of it, 1.22GB pre-render!), FMWrite also creates a SQLite index (1MB) of all the text content, which I then copy into Acorn’s resources folder at build time. Acorn then ships with this SQLite file.
You don’t really need to build your own documentation app. But you do need an index of your documentation to ship with your app. SQLite worked great for us.
Step 2: Let me introduce you to NSUserInterfaceItemSearching, which is a protocol that shipped in 10.6 but I didn’t notice till about six months ago. It’s pretty simple: you just register a class which conforms to it, and you’re asked for entries when the user searches for something via the Help menu.
Note that you don’t have to implement your own index and conform to this protocol to get live searching in the Help menu. You get that for free if you’re using Apple Help and have an hiutil index. The protocol is for when you want to search the help locally but host it on your server.
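As a rough illustration of the "ship an index with your app" step, here is a minimal Python sketch. It is not Mueller's FMWrite code. The table layout and the names `build_help_index` and `search_help` are assumptions made for this example, and it uses a plain substring match rather than whatever query Acorn actually runs. The idea is only that a build step writes the SQLite file and the app opens it read-only when the Help menu asks for matches.

```python
import sqlite3
from pathlib import Path


def build_help_index(db_path, pages):
    """Build-time step: write (title, path, body) rows into a fresh SQLite file.

    The `pages` schema is hypothetical; any layout that maps text to a help
    page location would do.
    """
    conn = sqlite3.connect(db_path)
    with conn:  # commits the transaction on success
        conn.execute("DROP TABLE IF EXISTS pages")
        conn.execute("CREATE TABLE pages (title TEXT, path TEXT, body TEXT)")
        conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", pages)
    conn.close()


def search_help(db_path, term):
    """Run-time step: open the shipped file read-only and return
    (title, path) pairs whose title or body contains `term`."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT title, path FROM pages "
            "WHERE instr(lower(title || ' ' || body), lower(?)) > 0 "
            "ORDER BY title",
            (term,),
        ).fetchall()
    finally:
        conn.close()
    return rows
```

In an app, the results of something like `search_help` would be what you hand back when the Help menu search field asks your registered object for entries.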
3 Comments
Inspired by Gus Mueller's idea, I posted some thoughts on FTS4 and on using contentless tables for this kind of setup, where one is indexing large immutable documents that don't really need to be in the database itself: http://cocoamine.net/blog/2015/09/07/contentless-fts4-for-large-immutable-documents/
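For readers unfamiliar with contentless FTS4 tables, a small Python sketch of the idea follows. It is not taken from the linked post. The table names `documents` and `doc_index` and the helper names are made up for illustration, and it assumes the SQLite build behind Python's `sqlite3` module has FTS4 enabled. With `content=""`, the full-text table stores only the index, so a query can return `docid` values but not the original text. A small ordinary table maps each `docid` back to the document's location.

```python
import sqlite3


def build_contentless_index(conn, documents):
    """Index (path, text) pairs without storing the text itself."""
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, path TEXT NOT NULL)"
    )
    # content="" makes the FTS4 table contentless: only the index is kept.
    conn.execute('CREATE VIRTUAL TABLE doc_index USING fts4(content="", body)')
    for path, text in documents:
        doc_id = conn.execute(
            "INSERT INTO documents (path) VALUES (?)", (path,)
        ).lastrowid
        # Contentless tables require an explicit docid on insert.
        conn.execute(
            "INSERT INTO doc_index (docid, body) VALUES (?, ?)", (doc_id, text)
        )
    conn.commit()


def search_paths(conn, query):
    """Return the paths of documents matching an FTS query, in insertion order."""
    rows = conn.execute(
        "SELECT path FROM documents WHERE id IN "
        "(SELECT docid FROM doc_index WHERE doc_index MATCH ?) "
        "ORDER BY id",
        (query,),
    )
    return [path for (path,) in rows]
```

The trade-off is that functions needing the original text, such as `snippet()`, are not available on a contentless table. The application has to load the document from `path` if it wants to show an excerpt.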
> Would be interesting to compare with Search Kit.
Definitely curious about performance, both in terms of file size and query speed.
A number of SearchKit's features, like stop words, tokenization, etc., are more advanced than SQLite's, though normalization/tokenization can be done using various OS X APIs.
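To make the "do the normalization yourself" option concrete, here is a small Python sketch of preprocessing text before handing it to SQLite. It stands in for the OS X APIs mentioned above rather than using them. The stop-word list and the function name `normalize_for_index` are purely illustrative assumptions.

```python
import re
import unicodedata

# Illustrative stop-word list only; a real one would be much longer
# and language-specific.
STOP_WORDS = frozenset({"a", "an", "and", "of", "the", "to"})


def normalize_for_index(text):
    """Strip diacritics, case-fold, split into word tokens, drop stop words."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = re.findall(r"\w+", stripped.casefold())
    return " ".join(t for t in tokens if t not in STOP_WORDS)
```

The same function would need to be applied to the user's query as well, so that the indexed terms and the search terms are normalized the same way.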
Then, an important part of SearchKit is its ability to index any type of file for which there is a Spotlight importer. There are a bunch of built-in ones, e.g. for PDFs, HTML, and MS Office documents. As far as I can tell, there is no way to take advantage of the text representation the importers provide to the system:
> kMDItemTextContent
> Contains a text representation of the content of the document. Data in multiple fields should be combined using a whitespace character as a separator. An application's Spotlight importer provides the content of this attribute.
> Applications can create queries using this attribute, but are not able to read the value of this attribute directly.
So you are left writing your own parsers or using private APIs.