{"id":18459,"date":"2017-07-25T14:23:04","date_gmt":"2017-07-25T18:23:04","guid":{"rendered":"https:\/\/mjtsai.com\/blog\/?p=18459"},"modified":"2017-07-25T14:23:04","modified_gmt":"2017-07-25T18:23:04","slug":"dissecting-objc_msgsend-on-arm64","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2017\/07\/25\/dissecting-objc_msgsend-on-arm64\/","title":{"rendered":"Dissecting objc_msgSend on ARM64"},"content":{"rendered":"<p><a href=\"https:\/\/mikeash.com\/pyblog\/friday-qa-2017-06-30-dissecting-objc_msgsend-on-arm64.html\">Mike Ash<\/a> (<a href=\"https:\/\/news.ycombinator.com\/item?id=14676236\">Hacker News<\/a>):<\/p>\n<blockquote cite=\"https:\/\/mikeash.com\/pyblog\/friday-qa-2017-06-30-dissecting-objc_msgsend-on-arm64.html\">\n<p><code>objc_msgSend<\/code> has a few different paths it can take depending on circumstances. It has special code for handling things like messages to <code>nil<\/code>, tagged pointers, and hash table collisions. I&rsquo;ll start by looking at the most common, straight-line case where a message is sent to a non-<code>nil<\/code>, non-tagged pointer and the method is found in the cache without any need to scan. I&rsquo;ll note the various branching-off points as we go through them, and then once we&rsquo;re done with the common path I&rsquo;ll circle back and look at all of the others.<\/p>\n<\/blockquote>\n\n<p><a href=\"https:\/\/mikeash.com\/pyblog\/friday-qa-2017-06-30-dissecting-objc_msgsend-on-arm64.html#comment-febe20b6c6941ee9d1481d42db88fd12\">Greg Parker<\/a>:<\/p>\n<blockquote cite=\"https:\/\/mikeash.com\/pyblog\/friday-qa-2017-06-30-dissecting-objc_msgsend-on-arm64.html#comment-febe20b6c6941ee9d1481d42db88fd12\"><p>Incrementing to the end of the cache requires an extra instruction or two to calculate where the end of the cache is. 
The start of the cache is already known - it&rsquo;s the pointer we loaded from the class - so we decrement towards that.<\/p>\n<p>[&#8230;]<\/p>\n<p>The extra scanned-twice check prevents power-draining infinite loops in some cases of memory corruption or invalid objects. For example, heap corruption could fill the cache with non-zero data, or set the cache mask to zero. Corruption like this would otherwise cause the cache scan loop to run forever without a cache hit or a cache miss. The extra check stops the loop so we can turn the problem into a crash log instead.<\/p>\n<p>There are also cases where another thread simultaneously modifying the cache can cause this thread to neither hit nor miss on the first scan. The C code does extra work to resolve that race. A previous version of <code>objc_msgSend<\/code> handled this incorrectly - it immediately aborted instead of falling back to the C code - which caused rare crashes when the threads were unlucky.<\/p><\/blockquote>\n\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=14677544\">Mike Ash<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=14677544\"><p>However, Objective-C does not require <code>objc_msgSend<\/code>.<\/p>\n<p>[&#8230;]<\/p>\n<p>Instead of <code>objc_msgSend<\/code>, the runtime can provide a function which looks up the method implementation and returns it to the caller. The caller can then invoke that implementation itself. This is how the GNU runtime does it, since it needs to be more portable. Their lookup function is called <code>objc_msg_lookup<\/code>.<\/p>\n<p>[&#8230;]<\/p>\n<p>However, each call now suffers the overhead of two function calls, so it&rsquo;s a bit slower. 
Apple prefers to put in the extra effort of writing assembly code to avoid this, since it&rsquo;s so critical to their platform.<\/p><\/blockquote>\n\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=14678174\">Louis Gerbarg<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=14678174\"><p>It actually is not the extra function call that is the big hit, since if you think about it <code>objc_msgSend<\/code> also does two calls (the call to <code>msgSend<\/code>, which at the end then tail calls the imp). The dynamic instruction count is also roughly the same.<\/p>\n<p>In fact <code>objc_msgLookup<\/code> actually ends up being faster in some micro benches since it plays a lot better with modern CPU branch predictors: <code>objc_msgSend<\/code> defeats them by making every call site jump to the same dispatch function, which then makes a completely unpredictable jump to the imp. By using <code>msgLookup<\/code> you essentially decouple the branch source from the lookup which greatly improves predictability. Also, with a &ldquo;sufficiently smart&rdquo; compiler it can be a win because it allows you to do things like hoist the lookup out of loops, etc (essentially really clever automated <code>IMP<\/code> caching tricks).<\/p><\/blockquote>","protected":false},"excerpt":{"rendered":"<p>Mike Ash (Hacker News): objc_msgSend has a few different paths it can take depending on circumstances. It has special code for handling things like messages to nil, tagged pointers, and hash table collisions. 
I&rsquo;ll start by looking at the most common, straight-line case where a message is sent to a non-nil, non-tagged pointer and the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"","apple_news_api_id":"","apple_news_api_modified_at":"","apple_news_api_revision":"","apple_news_api_share_url":"","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[262,770,800,31,1380,54,760,138,71],"class_list":["post-18459","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-arm","tag-assembly-language","tag-concurrency","tag-ios","tag-ios-10","tag-objective-c","tag-objective-c-runtime","tag-optimization","tag-programming"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/18459","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=18459"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/18459\/revisions"}],"predecessor-version":[{"id":18460,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/18459\/revisions\/18460"}],"wp:attachment":[{"href":"https:\/\/mjt
sai.com\/blog\/wp-json\/wp\/v2\/media?parent=18459"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=18459"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=18459"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}