Friday, June 10, 2022

Swift Regex

Meet Swift Regex:

Learn how you can process strings more effectively when you take advantage of Swift Regex. Come for concise literals but stay for Regex builders — a new, declarative approach to string processing. We’ll also explore the Unicode models in String and share how Swift Regex can make Unicode-correct processing easy.

I’m really excited about Swift Regex. It should be much more ergonomic than NSRegularExpression, which has always been awkward to use, and faster, too, since it can work directly in Swift’s native string encoding.
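To make the ergonomics point concrete, here is a minimal sketch that extracts the same data both ways. The input format (`key=value` pairs) and both function names are hypothetical, made up for illustration. It assumes Swift 5.7 or later on an OS that ships the Regex runtime (macOS 13, iOS 16), and it uses the `#/…/#` delimiters so that no extra compiler flag is needed.

```swift
import Foundation

// Hypothetical helper: NSRegularExpression works in NSRange (UTF-16 offsets),
// so every capture has to be converted back into a Range<String.Index>.
func pairsWithNSRegularExpression(_ text: String) -> [(String, String)] {
    guard let regex = try? NSRegularExpression(pattern: #"(\w+)=(\d+)"#) else {
        return []
    }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { result -> (String, String)? in
        guard let keyRange = Range(result.range(at: 1), in: text),
              let valueRange = Range(result.range(at: 2), in: text) else {
            return nil
        }
        return (String(text[keyRange]), String(text[valueRange]))
    }
}

// Hypothetical helper: with Swift Regex the captures are typed Substrings,
// so there is no range conversion step at all.
func pairsWithSwiftRegex(_ text: String) -> [(String, String)] {
    let regex = #/(\w+)=(\d+)/#
    return text.matches(of: regex).map { match in
        (String(match.output.1), String(match.output.2))
    }
}
```

Both functions should return the same pairs for plain ASCII input. The difference is how much bookkeeping sits between the pattern and the result.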

Also, in theory it should handle Unicode edge cases better. I’ve run into problems where NSRegularExpression returns capture ranges that are valid for NSString and String.UTF16View but which do not exactly map to valid indexes into the String itself.
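As a rough illustration of why this should be less of a problem, the sketch below matches against a string containing a decomposed “é” (“e” followed by U+0301). The function name and the sample text are hypothetical. By default, Swift Regex matches at the `Character` level, and a match's `range` is already a `Range<String.Index>`, so it can slice the original string directly.

```swift
// Hypothetical helper: returns the matched text twice, once from the match
// output and once by slicing the original string with the match's range.
func firstCafe(in text: String) -> (matched: String, sliced: String)? {
    // `.` matches one whole Character under the default matching semantics,
    // so the combining accent stays attached to its base letter.
    guard let match = text.firstMatch(of: #/caf./#) else { return nil }
    return (String(match.output), String(text[match.range]))
}

// Decomposed "é": "e" followed by U+0301 COMBINING ACUTE ACCENT.
let sample = "un cafe\u{301} noir"
let cafe = firstCafe(in: sample)
```

Scalar-level matching is still available by opting in with `matchingSemantics(.unicodeScalar)` on the regex.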

I’d like to see Apple bring Swift Regex to Core Data, too. First, it would be nice to get consistent results (both matching and performance characteristics) by using the same engine throughout an app. Second, regex matching within SQLite queries is currently slow because it converts each database string from UTF-8 to UTF-16 before invoking ICU.

Swift Regex: Beyond the basics:

Go beyond the basics of string processing with Swift Regex. We’ll share an overview of Regex and how it works, explore Foundation’s rich data parsers and discover how to integrate your own, and delve into captures. We’ll also provide best practices for matching strings and wielding Regex-powered algorithms with ease.

Ron Avitzur:

Tis nifty watching Xcode refactor /(([0-9]*\.?[0-9]+)([eE][-+]?[0-9]+)?)|NaN/ to a Regex Builder, even if the compiler is unable to type-check this expression in reasonable time.

Steve Canon:

The converter that Xcode uses for regex -> builder is in the experimental-string-processing repo.


Update (2022-06-16): Frank Illenberger:

I’m disappointed by the performance of the new Swift Regex. Take a look at these two versions of a CSS variable lookup. On my Intel iMac, the first one with NSRegularExpression takes 0.0002s to enumerate all matches in a 2,000-line CSS file. The second one takes 0.25s.

Update (2022-10-11): Keith Harrison:

You can improve the type safety of captured values by adding a transform block to the capture. This allows you to transform the generic capture output to a known type. For example, to transform the captured digits to an (optional) integer[…]

And you can use TryCapture to make the regex backtrack if the Swift code returns nil.
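The two capture styles can be sketched as follows. This is not Harrison's original code (which is elided above). The function names and input formats are hypothetical, and it assumes Swift 5.7 or later with the `RegexBuilder` module available.

```swift
import RegexBuilder

// Capture + transform: the captured digits surface as Int? instead of Substring.
// Hypothetical input format: "width: <digits>".
func parseWidth(_ line: String) -> Int? {
    let regex = Regex {
        "width: "
        Capture {
            OneOrMore(.digit)
        } transform: { Int($0) }
    }
    // match.output has the type (Substring, Int?).
    guard let match = line.wholeMatch(of: regex) else { return nil }
    return match.output.1
}

// TryCapture: if the transform returns nil, that capture attempt fails and the
// engine backtracks, so the output type is non-optional: (Substring, UInt8).
func parseByte(_ text: String) -> UInt8? {
    let regex = Regex {
        TryCapture {
            OneOrMore(.digit)
        } transform: { UInt8($0) }
    }
    return text.wholeMatch(of: regex)?.output.1
}
```

With `parseByte`, an input such as "300" consists only of digits but does not fit in a `UInt8`, so the transform returns nil. No shorter run of digits can cover the whole string, so the match fails as a whole rather than producing a bogus value.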

You can use the Foundation date, number, currency and URL parsers with regex builder.

Update (2022-12-02): Shane Crawford:

We’ll be digging into Regex Builder to discover its wide-reaching capabilities.

Sindre Sorhus:

The new regex stuff in Swift is truly amazing! The following would have been an unreadable mess in most other languages.
