{"id":10009,"date":"2014-10-25T23:10:46","date_gmt":"2014-10-26T03:10:46","guid":{"rendered":"http:\/\/mjtsai.com\/blog\/?p=10009"},"modified":"2014-10-25T23:11:26","modified_gmt":"2014-10-26T03:11:26","slug":"trust-no-one-not-even-performance-counters","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2014\/10\/25\/trust-no-one-not-even-performance-counters\/","title":{"rendered":"Trust No One, Not Even Performance Counters"},"content":{"rendered":"<a href=\"http:\/\/www.pvk.ca\/Blog\/2014\/10\/19\/performance-optimisation-~-writing-an-essay\/\">Paul Khuong<\/a> (via <a href=\"https:\/\/twitter.com\/Catfish_Man\/status\/525801888604626944\">David Smith<\/a>):\n<blockquote cite=\"http:\/\/www.pvk.ca\/Blog\/2014\/10\/19\/performance-optimisation-~-writing-an-essay\/\"><p>I can guess why we observe this effect; it&rsquo;s not like Intel is\nintentionally messing with us.  <code>mfence<\/code> is a full pipeline flush: it\nslows code down because it waits for all in-flight instructions to\ncomplete their execution.  Thus, while it&rsquo;s flushing that slows us\ndown, the profiling machinery will assign these cycles to any of the\ninstructions that are being flushed.  Locked instructions instead\naffect stores that are still queued.  By forcing such stores to\nretire, locked instructions become responsible for the extra cycles\nand end up &ldquo;paying&rdquo; for writes that would have taken up time anyway.<\/p><\/blockquote>","protected":false},"excerpt":{"rendered":"<p>Paul Khuong (via David Smith): I can guess why we observe this effect; it&rsquo;s not like Intel is intentionally messing with us. mfence is a full pipeline flush: it slows code down because it waits for all in-flight instructions to complete their execution. Thus, while it&rsquo;s flushing that slows us down, the profiling machinery will [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"","apple_news_api_id":"","apple_news_api_modified_at":"","apple_news_api_revision":"","apple_news_api_share_url":"","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[770,138,71],"class_list":["post-10009","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-assembly-language","tag-optimization","tag-programming"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/10009","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=10009"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/10009\/revisions"}],"predecessor-version":[{"id":10010,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/10009\/revisions\/10010"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=10009"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=10009"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=10009"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}