Wednesday, December 7, 2016

Why Does calloc Exist?

Nathaniel J. Smith (via Hacker News):

So there are lots of books and webpages out there that will claim that the calloc call above is equivalent to calling malloc and then calling memset to fill the memory with zeros[…] So… why does calloc exist, if it’s equivalent to these 2 lines? The C library is not known for its excessive focus on providing convenient shorthands!


When calloc multiplies count * size, it checks for overflow, and errors out if the multiplication returns a value that can’t fit into a 32- or 64-bit integer (whichever one is relevant for your platform).
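
As a rough illustration of that overflow check, here is a minimal sketch of a calloc-like wrapper built on malloc. The name checked_calloc is hypothetical and not taken from any real C library. Real implementations may detect overflow differently, for example with compiler builtins.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only: behaves like calloc,
 * but built from malloc + memset with an explicit overflow check. */
void *checked_calloc(size_t count, size_t size)
{
    /* count * size wraps around exactly when count > SIZE_MAX / size.
     * The size != 0 test avoids dividing by zero. */
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;

    size_t total = count * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);   /* always clears, unlike a real calloc */
    return p;
}
```

Without the check, a request such as checked_calloc(SIZE_MAX, 2) would wrap around to a tiny allocation and return a buffer far smaller than the caller expects.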


So that’s the first way that calloc cheats: when you call malloc to allocate a large buffer, then probably the memory will come from the operating system and already be zeroed, so there’s no need to call memset. But you don’t know that for sure! Memory allocators are pretty inscrutable. So you have to call memset every time just in case. But calloc lives inside the memory allocator, so it knows whether the memory it’s returning is fresh from the operating system, and if it is then it skips calling memset.
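
To make the "skips calling memset" idea concrete, here is a toy sketch rather than how any real allocator is written. Everything prefixed toy_ is hypothetical, the block size is fixed, and calloc stands in for the operating system handing out already-zeroed pages.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_BLOCK 4096        /* hypothetical fixed block size */

static void  *toy_cached;     /* one recycled block; its contents are stale */
static size_t toy_memsets;    /* how many times we had to clear by hand */

/* Stand-in for the OS (assumption): hands out memory that is already zero. */
static void *toy_from_os(void)
{
    return calloc(1, TOY_BLOCK);
}

/* Keep at most one freed block around for reuse. */
void toy_free(void *p)
{
    if (toy_cached == NULL)
        toy_cached = p;
    else
        free(p);
}

/* Return a zeroed block, clearing it only when it is a recycled one. */
void *toy_zalloc(void)
{
    if (toy_cached != NULL) {
        void *p = toy_cached;
        toy_cached = NULL;
        memset(p, 0, TOY_BLOCK);   /* recycled memory may hold old data */
        toy_memsets++;
        return p;
    }
    return toy_from_os();          /* fresh memory: known zero, no memset */
}
```

The point is only that the allocator knows where a block came from, while code calling malloc from outside does not and has to clear the buffer every time.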


It turns out that the kernel is also cheating! When we ask it for 1 GiB of memory, it doesn’t actually go out and find that much RAM and write zeros to it and then hand it to our process. Instead, it fakes it, using virtual memory[…]
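
A small sketch of what that lazy behavior looks like from the program's side, assuming a 4096-byte page size. The function name and the ASSUMED_PAGE constant are made up for illustration. On a system with demand paging, only the pages actually written should end up backed by physical RAM, though that depends on the kernel and the allocator.

```c
#include <stdlib.h>

#define ASSUMED_PAGE 4096   /* assumption; the real page size varies by system */

/* Allocate `bytes` of zeroed memory, write one byte to every
 * `stride_pages`-th page, and return how many pages were written. */
size_t touch_every_nth_page(size_t bytes, size_t stride_pages)
{
    if (stride_pages == 0)
        return 0;               /* a zero stride would never advance */

    /* volatile keeps the compiler from optimizing the writes away */
    volatile unsigned char *buf = calloc(1, bytes);
    if (buf == NULL)
        return 0;

    size_t touched = 0;
    for (size_t off = 0; off < bytes; off += stride_pages * ASSUMED_PAGE) {
        buf[off] = 1;           /* first write to a page forces it to be backed */
        touched++;
    }

    free((void *)buf);
    return touched;
}
```

Calling this with a large size and a large stride, while watching the process's resident memory, is one way to see that the untouched pages do not seem to cost real RAM.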


That’s a nice alternative history fiction.

Here’s an early implementation [that just calls malloc].


There are several interesting things we learn from poking around V6 though:

  • calloc originated not on UNIX, but as part of Mike Lesk’s “iolib”, which was written to make it easier to write C programs portable across PDP 11 UNIX, Honeywell 6000 GCOS, and IBM 370 OS[0]. Presumably the reason calloc is the-way-it-is is hidden in the history of the implementation for GCOS or IBM 370 OS, not UNIX. Unfortunately, I can’t seem to track down a copy of Bell Labs “Computing Science Technical Report #31”, which seems to be the appropriate reference.
  • calloc predates malloc. As you can see, there was a malloc-like function called just alloc (though there were also several other functions named alloc that allocated things other than memory)


OpenBSD added calloc overflow checking on July 29th, 2002. glibc added calloc overflow checking on August 1, 2002. Probably not a coincidence. I’m going to say nobody checked for overflow prior to the August 2002 security advisory.


It is not only a security flaw but also violation of C Standards (even the first version ratified in 1989, usually referred to as C89). […] So if it cannot allocate space for an array of nmemb objects, each of whose size is size, then it has to return null pointer.


And then there’s of course when calloc returns non-zeroed memory once in a while, which causes... ‘interesting’ bugs.
