Archive for December 4, 2015

Friday, December 4, 2015

Edit Distance and Edit Steps

Dave DeLong:

One of the interesting things about NSMetadataQuery is that after it has done its initial “gathering” of the results, all further updates are reported as a single array of results. Did something get added to the results? Here’s a new array of the state of all results now. Did something get removed? Here’s another array.

[…]

In other words, the Levenshtein algorithm can easily be generalized to work on any CollectionType of Equatables. Suddenly, it looks like just the thing we need to implement more-efficient array diffing.

One downside of the Levenshtein algorithm is that it only returns an Int. It only tells us how many steps we would need, but not what the actual steps are. For that, we’ll turn to a specific implementation of the Levenshtein algorithm, called the Wagner-Fischer algorithm.
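To make the distinction between the step count and the steps themselves concrete, here is a minimal Python sketch of the Wagner-Fischer table with a backtracking pass that recovers one possible list of edit steps. It is not DeLong's Swift implementation; the function name `edit_steps` and the `(operation, index, element)` tuple format are assumptions made for illustration. It works on any two sequences whose elements can be compared for equality.

```python
def edit_steps(source, target):
    """Return (distance, steps) for turning `source` into `target`.

    Hypothetical helper for illustration. Each step is a tuple of
    (operation, index, element): "delete" and "substitute" indexes refer
    to positions in `source`, "insert" indexes to positions in `target`.
    """
    m, n = len(source), len(target)

    # dist[i][j] is the edit distance between source[:i] and target[:j].
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i  # delete everything from source[:i]
    for j in range(1, n + 1):
        dist[0][j] = j  # insert everything from target[:j]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back from the bottom-right corner to recover one optimal path.
    steps = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            i, j = i - 1, j - 1  # elements match: no step needed
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            steps.append(("substitute", i - 1, target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            steps.append(("delete", i - 1, source[i - 1]))
            i -= 1
        else:
            steps.append(("insert", j - 1, target[j - 1]))
            j -= 1

    steps.reverse()
    return dist[m][n], steps


distance, steps = edit_steps("kitten", "sitting")
print(distance, steps)
```

The table costs O(m × n) time and memory, and when several optimal paths exist the backtracking pass simply returns the first one it finds.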

Counting Steps With Multiple Devices

David Smith:

As you go about your daily life with an iPhone and Apple Watch, each is constantly trying to measure your movement to determine if you are walking. Depending on what your current activity is, one or the other will be doing a better job of capturing what you are doing.

If, for example, you are pushing a stroller, the iPhone in your pocket will do a much better job than the watch on your wrist. Conversely, if your iPhone is in your purse, the watch on your wrist is much more accurate. The challenge this update solved is how to merge these two data sources in a way that provides a consistent and reliable picture of your day.

You might be wondering why I don’t use Apple’s Health.app merging system for this. After extensive testing of how that works, I determined that it doesn’t really do a good job for step data. The Apple Health algorithm works around the concept of a ‘priority’ device. This priority device’s steps are then used in all instances except where that device is completely unavailable, in which case the secondary device’s data is used to fill in the gaps.
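The priority-device behavior described above can be sketched in a few lines of Python. The per-hour dictionaries, the function names, and the sample numbers are all made up for illustration. The second function, which takes the larger reading in each interval, is only one plausible alternative; the excerpt does not say how Smith's app actually merges the data.

```python
def merge_priority(priority, secondary):
    """Priority merge as described in the quote: use the priority device's
    count wherever it has a reading, and fall back to the secondary device
    only for intervals the priority device did not cover."""
    merged = dict(secondary)
    merged.update(priority)  # priority readings overwrite secondary ones
    return merged


def merge_max(first, second):
    """Hypothetical alternative (an assumption, not the author's stated
    method): for each interval, keep whichever device counted more steps."""
    intervals = set(first) | set(second)
    return {t: max(first.get(t, 0), second.get(t, 0)) for t in sorted(intervals)}


# Invented sample data: steps per hour. At 09:00 the wearer pushes a
# stroller, so the watch (the priority device here) undercounts badly.
watch = {"09:00": 20, "10:00": 450}
phone = {"09:00": 900, "11:00": 300}

print(sum(merge_priority(watch, phone).values()))  # 770
print(sum(merge_max(watch, phone).values()))       # 1650
```

With the priority merge, the stroller hour is stuck at the watch's 20 steps because the watch was technically available, which is the kind of gap the quote is complaining about.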

The Search for a Faster CRC32

Rob Norris:

With the assistance of Linux’s perf utility, we found that most of our CPU time (~10%) was spent in one of the many Cyrus processes, in a function called crc32(). This function computes a checksum (using the common CRC32 algorithm) of some arbitrary chunk of data. The idea is to store the data and the checksum separately and then later, when you read the data, you recompute the checksum and compare with the original. If they’re different, then you know that either the data or the checksum has been corrupted and you can take appropriate action. Over the years, we’ve added checksums all over Cyrus, particularly in its data storage engine (known as twoskip [PDF]), and they’ve saved us more than once.

At that point it became obvious: we calculate billions of CRC32 checksums, and when you add it all up, that’s a lot of CPU time. So we started looking into alternative implementations, because even a small gain will translate into a big win once you run it a few billion times.
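The store-then-verify pattern from the quote is easy to sketch with Python's standard `zlib.crc32`. This only illustrates the idea and has nothing to do with Cyrus's actual C code or the faster implementations the article goes on to evaluate; the `store` and `verify` helper names are assumptions.

```python
import zlib


def store(data: bytes):
    """Return the data together with its CRC32 checksum, to be kept separately."""
    return data, zlib.crc32(data)


def verify(data: bytes, checksum: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(data) == checksum


record, checksum = store(b"hello, mailbox")
print(verify(record, checksum))             # True: data is intact
print(verify(b"hellp, mailbox", checksum))  # False: a byte was corrupted
```

A mismatch says only that the data or the stored checksum changed, not which one, so the caller still has to decide what the "appropriate action" is.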

The TTY Demystified

Linus Åkesson (via Hacker News):

The TTY subsystem is central to the design of Linux, and UNIX in general. Unfortunately, its importance is often overlooked, and it is difficult to find good introductory articles about it. I believe that a basic understanding of TTYs in Linux is essential for the developer and the advanced user.

Beware, though: What you are about to see is not particularly elegant. In fact, the TTY subsystem — while quite functional from a user’s point of view — is a twisty little mess of special cases. To understand how this came to be, we have to go back in time.

Launching PDF Expert for Mac

Denys Zhadanov (via Hacker News):

Watching 51 seconds and understanding the concept is much easier than reading a webpage or article, no matter how great they are. That’s why I strongly recommend shooting a professional video for your product launch.

[…]

Apple is being very helpful these days, and they want developers to succeed. That is why you really should keep in touch with App Store Business Management and keep them updated on what you’re building and when you’re getting ready for a launch.

[…]

I initially thought that the Mac App Store is pretty small, but it’s possible to pull solid revenue from there alone. #1 position in Top Charts will give you 1000–1200 installs a day in the US alone.

However, they also have a direct sale version because they really wanted to offer trials. They send out multiple e-mails to potential customers (who have signed up) before and after the trial period expires.

Update (2016-09-06): Denys Zhadanov:

I believe that developers are able to make good extra revenue on the Mac App Store, as opposed to distributing apps via the web site alone.

[…]

Since it’s more difficult to create a great Mac app, and the trend has been to abandon the Mac App Store, there’s less competition there! If you manage to create a great product that you know people will love, you can be a standout there.