{"id":16555,"date":"2016-12-07T19:16:46","date_gmt":"2016-12-08T00:16:46","guid":{"rendered":"http:\/\/mjtsai.com\/blog\/?p=16555"},"modified":"2016-12-07T19:16:46","modified_gmt":"2016-12-08T00:16:46","slug":"why-does-calloc-exist","status":"publish","type":"post","link":"https:\/\/mjtsai.com\/blog\/2016\/12\/07\/why-does-calloc-exist\/","title":{"rendered":"Why Does calloc Exist?"},"content":{"rendered":"<p><a href=\"https:\/\/vorpus.org\/blog\/why-does-calloc-exist\/\">Nathaniel J. Smith<\/a> (via <a href=\"https:\/\/news.ycombinator.com\/item?id=13108434\">Hacker News<\/a>):<\/p>\n<blockquote cite=\"https:\/\/vorpus.org\/blog\/why-does-calloc-exist\/\"><p>So there are lots of books and\nwebpages out there that will claim that the <code>calloc<\/code> call above is\nequivalent to calling <code>malloc<\/code> and then calling <code>memset<\/code> to fill\nthe memory with zeros[&#8230;] So&#8230; why does <code>calloc<\/code> exist, if it&rsquo;s equivalent to these 2 lines?\nThe C library is not known for its excessive focus on providing\nconvenient shorthands!<\/p><p>[&#8230;]<\/p><p>When <code>calloc<\/code> multiplies <code>count * size<\/code>, it checks for overflow,\nand errors out if the multiplication returns a value that can&rsquo;t fit\ninto a 32- or 64-bit integer (whichever one is relevant for your\nplatform).<\/p><p>[&#8230;]<\/p><p>So that&rsquo;s the first way that <code>calloc<\/code> cheats: when you\ncall <code>malloc<\/code> to allocate a large buffer, then <em>probably<\/em> the memory\nwill come from the operating system and already be zeroed, so there&rsquo;s\nno need to call <code>memset<\/code>. But you don&rsquo;t know that for sure! Memory\nallocators are pretty inscrutable. So <em>you<\/em> have to call <code>memset<\/code>\nevery time just in case. But <code>calloc<\/code> lives inside the memory\nallocator, so <em>it<\/em> knows whether the memory it&rsquo;s returning is fresh\nfrom the operating system, and if it is then it skips calling\n<code>memset<\/code>.<\/p><p>[&#8230;]<\/p><p>It turns out that the kernel is also cheating! When we ask it for 1 GiB of memory, it doesn&rsquo;t actually go out and find that much RAM and write zeros to it and then hand it to our process. Instead, it fakes it, using virtual memory[&#8230;]<\/p><\/blockquote>\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=13109019\">bluefox<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=13109019\"><p>That&rsquo;s a nice alternative history fiction.<\/p>\n<p>Here&rsquo;s an <a href=\"https:\/\/github.com\/dspinellis\/unix-history-repo\/blob\/Research-V7\/usr\/src\/libc\/gen\/calloc.c\">early implementation<\/a> [that just calls <code>malloc<\/code>].<\/p><\/blockquote>\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=13110538\">LukeShu<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=13110538\"><p>There are several interesting things we learn from poking around V6 though:<\/p><ul><li><code>calloc<\/code> originated not on UNIX, but as part of Mike Lesk&rsquo;s &ldquo;iolib&rdquo;, which was written to make it easier to write C programs portable across PDP 11 UNIX, Honeywell 6000 GCOS, and IBM 370 OS[0]. Presumably the reason calloc is the-way-it-is is hidden in the history of the implementation for GCOS or IBM 370 OS, not UNIX. Unfortunately, I can&rsquo;t seem to track down a copy of Bell Labs &ldquo;Computing Science Technical Report #31&rdquo;, which seems to be the appropriate reference.<\/li><li><code>calloc<\/code> predates <code>malloc<\/code>. As you can see, there was a <code>malloc<\/code>-like function called just <code>alloc<\/code> (though there were also several other functions named <code>alloc<\/code> that allocated things other than memory)<\/li><\/ul><\/blockquote>\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=13111581\">ksherlock<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=13111581\"><p>OpenBSD added <code>calloc<\/code> overflow checking on July 29th, 2002. glibc added <code>calloc<\/code> overflow checking on August 1, 2002. Probably not a coincidence. I&rsquo;m going to say nobody checked for overflow prior to the August 2002 security advisory.<\/p><\/blockquote>\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=13113245\">mnay<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=13113245\"><p>It is not only a security flaw but also violation of C Standards (even the first version ratified in 1989, usually referred to as C89). [&#8230;] So if it cannot allocate space for <em>an array of nmemb objects, each of whose size is size,<\/em> then it has to return null pointer.<\/p><\/blockquote>\n<p><a href=\"https:\/\/news.ycombinator.com\/item?id=13111367\">nicolast<\/a>:<\/p>\n<blockquote cite=\"https:\/\/news.ycombinator.com\/item?id=13111367\"><p>And then there&rsquo;s of course when calloc returns non-zeroed memory <a href=\"https:\/\/bugzilla.redhat.com\/show_bug.cgi?id=1293976\">once in a while<\/a>, which causes... &lsquo;interesting&rsquo; bugs.<\/p><\/blockquote>","protected":false},"excerpt":{"rendered":"<p>Nathaniel J. Smith (via Hacker News): So there are lots of books and webpages out there that will claim that the calloc call above is equivalent to calling malloc and then calling memset to fill the memory with zeros[&#8230;] So&#8230; why does calloc exist, if it&rsquo;s equivalent to these 2 lines? The C library is [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"apple_news_api_created_at":"","apple_news_api_id":"","apple_news_api_modified_at":"","apple_news_api_revision":"","apple_news_api_share_url":"","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_is_hidden":false,"apple_news_is_paid":false,"apple_news_is_preview":false,"apple_news_is_sponsored":false,"apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":"\"\"","apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[4],"tags":[45,295,863,46,571,138,71,48,163],"class_list":["post-16555","post","type-post","status-publish","format-standard","hentry","category-programming-category","tag-c","tag-history","tag-integer-overflow","tag-languagedesign","tag-memory-management","tag-optimization","tag-programming","tag-security","tag-unix"],"apple_news_notices":[],"_links":{"self":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/16555","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/comments?post=16555"}],"version-history":[{"count":1,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/16555\/revisions"}],"predecessor-version":[{"id":16556,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/posts\/16555\/revisions\/16556"}],"wp:attachment":[{"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/media?parent=16555"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/categories?post=16555"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mjtsai.com\/blog\/wp-json\/wp\/v2\/tags?post=16555"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}