{"id":32031,"date":"2021-03-31T10:08:26","date_gmt":"2021-03-31T14:08:26","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=32031"},"modified":"2021-03-31T10:49:26","modified_gmt":"2021-03-31T14:49:26","slug":"making-nsfetchrequest-fetchbatchsize-work-with-swift","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2021\/03\/31\/making-nsfetchrequest-fetchbatchsize-work-with-swift\/","title":{"rendered":"Making NSFetchRequest.fetchBatchSize Work With Swift"},"content":{"rendered":"<p><a href=\"https:\/\/developer.apple.com\/forums\/thread\/651325\">Apple Frameworks Engineer<\/a>:<\/p>\n<blockquote cite=\"https:\/\/developer.apple.com\/forums\/thread\/651325\"><p><code>Set<\/code> in Swift is an immutable value type. We do not recommend making Core Data relationships typed this way despite the obvious convenience. Core Data makes heavy use of Futures, especially for relationship values. These are reference types expressed as <code>NSSet<\/code>. The concrete instance is a future subclass however. This lets us optimize memory and performance across your object graph. Declaring an accessor as <code>Set<\/code> forces an immediate copy of the entire relationship so it can be an immutable Swift <code>Set<\/code>. This loads the entire relationship up front and fulfills the Future all the time, immediately. You probably do not want that.<\/p><\/blockquote>\n\n<p>It&rsquo;s so convenient, though, and often it doesn&rsquo;t matter because it&rsquo;s a small relationship or one that you will be fully accessing anyway. 
Perhaps the answer is to provide a duplicate set of <code>NSSet<\/code> accessors for use when you want the lazy behavior enabled by the class cluster.<\/p>\n\n<blockquote cite=\"https:\/\/developer.apple.com\/forums\/thread\/651325\"><p>Similarly for fetch requests with batching enabled, you do not want a Swift <code>Array<\/code> but instead an <code>NSArray<\/code> to avoid making an immediate copy of the future.<\/p><\/blockquote>\n\n<p>Needless to say, the <a href=\"https:\/\/developer.apple.com\/documentation\/coredata\/nsfetchrequest\/1506558-fetchbatchsize\">documentation<\/a> doesn&rsquo;t mention this, but it does do a good job of explaining what <code>fetchBatchSize<\/code> does:<\/p>\n<blockquote cite=\"https:\/\/developer.apple.com\/documentation\/coredata\/nsfetchrequest\/1506558-fetchbatchsize\"><p>If you set a nonzero batch size, the collection of objects returned when an instance of <code>NSFetchRequest<\/code> is executed is broken into batches. When the fetch is executed, the entire request is evaluated and the identities of all matching objects recorded, but only data for objects up to the <code>batchSize<\/code> will be fetched from the persistent store at a time. The array returned from executing the request is a proxy object that transparently faults batches on demand. (In database terms, this is an in-memory cursor.)<\/p><p>You can use this feature to restrict the working set of data in your application. In combination with <code>fetchLimit<\/code>, you can create a subrange of an arbitrary result set.<\/p><\/blockquote>\n\n<p>Under the hood, this works by eagerly fetching the object IDs and lazily fetching and caching the objects, in batches, as they are accessed. The implementation is more optimized than what you could implement yourself, passing the object IDs to SQLite via temporary tables rather than as parameters to the SQL statement. 
There are some caveats to be aware of:<\/p>\n\n<ul>\n<li><p>If you&rsquo;re using a coordinator with multiple stores, it will get the sorting right, fetching multiple batches and merging them together without ever doing a giant fetch. However, it does seem to eventually load all the objects into memory at once, which mostly defeats the purpose of batching. If you <em>can<\/em> hold everything in memory but just prefer not to, I guess you could refault all the objects after the sorting has completed and let the special array bring them back as needed. Or, you can avoid combining <code>fetchBatchSize<\/code> with multiple stores and instead use a dictionary fetch request to get just the object IDs and the properties needed for sorting, save the IDs, and manually fetch batches of full objects as needed.<\/p><\/li>\n\n<li><p>I&rsquo;m a little worried that there are bugs related to multiple stores. Disassembling <code>_PFBatchFaultingArray<\/code> shows code that anticipates sometimes receiving more object IDs than it expected to fetch, and this has <a href=\"https:\/\/stackoverflow.com\/questions\/35688067\/coredata-error-batched-fetch-request-asked-to-fetch-1-objects-but-received-2-o\">occurred in the wild<\/a>. It looks as if Core Data is querying the <code>Z_PK<\/code> without regard for which store it&rsquo;s supposed to be in. However, I tried to reproduce this situation by deliberately creating objects in multiple stores with the same <code>Z_PK<\/code> and everything seemed to work as expected on macOS 10.15.7.<\/p><\/li>\n<\/ul>\n\n<p>So, how do you get the optimized <code>fetchBatchSize<\/code> behavior when using Swift? The Apple engineer suggests using an <code>NSArray<\/code>, which I take to mean casting the result of the fetch via <code>as NSArray<\/code> to disable automatic bridging and give your Swift code the original <code>NSArray<\/code>. However, my experience is that this doesn&rsquo;t work. 
All the objects get fetched before your code even accesses the array. I think it&rsquo;s because the special <code>as<\/code> behavior is for disabling bridging when calling Objective-C APIs from Swift, but <code>NSManagedObjectContext.fetch(_:)<\/code> is an overlay method implemented in Swift, not just a renaming of <code>-[NSManagedObjectContext executeFetchRequest:error:]<\/code>.<\/p>\n\n<p>This can be worked around by using an Objective-C category to expose the original method:<\/p>\n<pre>@interface NSManagedObjectContext (MJT)\n- (nullable NSArray *)mjtExecuteFetchRequest:(NSFetchRequest *)request error:(NSError **)error;\n@end\n\n@implementation NSManagedObjectContext (MJT)\n- (nullable NSArray *)mjtExecuteFetchRequest:(NSFetchRequest *)request error:(NSError **)error {\n    return [self executeFetchRequest:request error:error];\n}\n@end<\/pre>\n\n<p>Then you can implement a fetching method that preserves the batching behavior:<\/p>\n\n<pre>public extension NSManagedObjectContext {\n    func fetchNSArray&lt;T: NSManagedObject&gt;(_ request: NSFetchRequest&lt;T&gt;) throws -&gt; NSArray {\n        \/\/ @SwiftIssue: Doesn't seem like this cast should be necessary.\n        let protocolRequest = request as! NSFetchRequest&lt;NSFetchRequestResult&gt;        \n        return try mjtExecute(protocolRequest) as NSArray\n    }\n\n    func fetch&lt;T: NSManagedObject&gt;(_ request: NSFetchRequest&lt;T&gt;,\n                                   batchSize: Int) throws -&gt; MJTBatchFaultingCollection&lt;T&gt; {\n        request.fetchBatchSize = batchSize\n        return MJTBatchFaultingCollection(array: try fetchNSArray(request))\n    }\n}<\/pre>\n\n<p>The first method gives you the <code>NSArray<\/code>, but that is not very ergonomic to use from Swift. First, you have to cast the objects back to your <code>NSManagedObject<\/code> subclass. 
Second, it doesn&rsquo;t behave well when an object is deleted (or some other SQLite error occurs) between your fetch and when Core Data tries to fulfill the fault.<\/p>\n\n<p>If you&rsquo;re using Swift, you can&rsquo;t catch the <code>NSObjectInaccessibleException<\/code>, so you should be using <code>context.shouldDeleteInaccessibleFaults = true<\/code>. This means that instead of an exception you get a sort of tombstone object that&rsquo;s of the right class, but with all its properties erased.<\/p>\n\n<p>But it&rsquo;s hard to remember to check for that each time you use one of the objects in the <code>NSArray<\/code>, and you probably don&rsquo;t want to accidentally operate on the empty properties. So the second method uses a helper type to try to make the abstraction less leaky, always giving you either a valid, non-fault object or <code>nil<\/code>:<\/p>\n\n<pre>public struct MJTBatchFaultingCollection&lt;T: NSManagedObject&gt; {\n    let array: NSArray\n    let bounds: Range&lt;Int&gt;\n\n    \/\/ array is presumed to be a _PFBatchFaultingArray from a fetch request\n    \/\/ using fetchBatchSize.\n    public init(array: NSArray, bounds: Range&lt;Int&gt;? = nil) {\n        self.array = array\n        self.bounds = bounds ?? 0..&lt;array.count\n    }\n}\n\nextension MJTBatchFaultingCollection: RandomAccessCollection {\n    public typealias Element = T?\n    public typealias Index = Int\n    public typealias SubSequence = MJTBatchFaultingCollection&lt;T&gt;\n    public typealias Indices = Range&lt;Int&gt;\n    \n    public var startIndex: Int { bounds.lowerBound }\n    public var endIndex: Int { bounds.upperBound }\n       \n    public subscript(position: Index) -&gt; T? {\n        guard\n            let possibleFault = array[position] as? T,\n            let context = possibleFault.managedObjectContext,\n            \/\/ Unfault so that isDeleted will detect an inaccessible object.\n            let object = try? 
context.existingObject(with: possibleFault.objectID),\n            let t = object as? T else { return nil }\n        return t.isDeleted ? nil : t\n    }\n\n    public subscript(bounds: Range&lt;Index&gt;) -&gt; SubSequence {\n        MJTBatchFaultingCollection&lt;T&gt;(array: array, bounds: bounds)\n    }\n}\n\nextension MJTBatchFaultingCollection: CustomStringConvertible {\n    public var description: String {\n        \/\/ The default implementation would realize all the objects by printing\n        \/\/ the underlying NSArray.\n        return \"&lt;MJTBatchFaultingCollection&lt;\\(T.self)&gt; bounds: \\(bounds)&gt;\"\n    }\n}<\/pre>\n\n<p>It&rsquo;s still a bit leaky, because you have to be careful to only access the collection from the context&rsquo;s queue. But this is somewhat obvious because it has a separate type, so you&rsquo;ll get an error if you try to pass it to a method that takes an <code>Array<\/code>.<\/p>\n\n<p>The batch faulting behavior and batch size are preserved if you iterate over the collection or slice it. (When iterating the <code>NSArray<\/code> directly, small batch sizes don&rsquo;t work as expected because <a href=\"https:\/\/github.com\/apple\/swift\/blob\/a73a8087968f9111149073107c5242d83635107a\/stdlib\/public\/Darwin\/Foundation\/NSArray.swift#L120\">NSFastEnumerationIterator<\/a> will always load at least 16 objects at a time.)<\/p>\n\n<p>Previously:<\/p>\n<ul>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2021\/03\/31\/replacing-vs-migrating-core-data-stores\/\">Replacing vs. 
Migrating Core Data Stores<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2021\/03\/02\/apple-developer-forums-can-now-monitor-threads\/\">Apple Developer Forums Can Now Monitor Threads<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2017\/09\/25\/surprising-behavior-of-non-optional-nsmanaged-properties\/\">Surprising Behavior of Non-optional @NSManaged Properties<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2017\/08\/29\/swift-4-bridging-peephole-for-as-casts\/\">Swift 4: Bridging Peephole for &ldquo;as&rdquo; Casts<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2016\/04\/21\/core-data-type-safety-with-swift\/\">Core Data Type Safety With Swift<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2015\/12\/11\/double-core-data-accessors-by-omitting-nsmanaged\/\">Double Core Data Accessors by Omitting @NSManaged<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2015\/10\/07\/core-data-in-el-capitan\/\">Core Data in El Capitan<\/a><\/li>\n<li><a href=\"https:\/\/mjtsai.com\/blog\/2015\/03\/17\/using-core-data-with-swift\/\">Using Core Data With Swift<\/a><\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Apple Frameworks Engineer: Set in Swift is an immutable value type. We do not recommend making Core Data relationships typed this way despite the obvious convenience. Core Data makes heavy use of Futures, especially for relationship values. These are reference types expressed as NSSet. The concrete instance is a future subclass however. 
This lets us [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"2021-03-31T14:08:30Z","apple_news_api_id":"3349cde3-e3e6-452e-9a62-116d5a0d2b02","apple_news_api_modified_at":"2021-03-31T14:49:31Z","apple_news_api_revision":"AAAAAAAAAAAAAAAAAAAAAg==","apple_news_api_share_url":"https:\/\/apple.news\/AM0nN4-PmRS6aYhFtWg0rAg","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[109,164,31,1837,30,1666,1891,54,138,71,901],"class_list":["post-32031","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-coredata","tag-documentation","tag-ios","tag-ios-14","tag-mac","tag-macos-10-15","tag-macos-11-0","tag-objective-c","tag-optimization","tag-programming","tag-swift-programming-language"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/32031","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=32031"}],"version-history":[{"count":4,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/32031\/revisions"}],"predecessor-version":[{"id":32043,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/pos
ts\/32031\/revisions\/32043"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=32031"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=32031"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=32031"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}