
Holier Than Thou

Since C++ (by deliberate design) does not include a native garbage collector or memory compactor, programs that perform dynamic memory allocation and de-allocation (via explicit or implicit use of the “new” and “delete” operators) cause small “holes” to accumulate in the free store over time. I guess you could say that C++ is “holier than thou“. 😦

[Figure: free-store fragmentation, a la the “memory holes” diagram in Stroustrup’s book]

Mind you, drilling holes in the free store is not the same as leaking memory. A memory leak is a programming bug that needs to be squashed; a fragmented free store is a “feature“. His holiness can become an issue only for long-running C++ programs and/or programs that run on hardware with small RAM footprints. For those types of niche systems, the best, and perhaps only, practical options available for preventing holes from accumulating in your briefs are to eliminate all deletes from the code:

  • Perform all of your dynamic memory allocations at once, during program initialization (global data – D’oh!), and only utilize the CPU stack during runtime for object creation/destruction.
  • If your application inherently requires post-initialization dynamic memory usage, then use pre-allocated, fixed size, unfragmentable, pools and stacks to acquire/release data buffers during runtime.

If your system maps into the “holes can freakin’ kill someone” category and you don’t bake those precautions into your software design, then the C++ runtime may toss a big, bad-ass std::bad_alloc exception into your punch bowl and crash your party if it can’t find a contiguous memory block big enough for your impending new request. For large software systems, refactoring the design to mitigate the risk of a fragmented free store after it’s been coded/tested falls squarely into the expensive and ominous “stuff that’s hard to change” camp. And by “stuff that’s hard to change“, I mean that it may be expensive, time consuming, and technically risky to reweave your basket (case). In an attempt to verify that the accumulation of free store holes will indeed crash a program, I wrote the hole-drilling simulator listed in Note2 below. Each time through the while loop, the code randomly allocates and then deallocates 1000 chunks of memory. Each chunk is comprised of a random number of bytes between 1 and 1 MB. Thus, the absolute maximum amount of memory that can be allocated/deallocated during each pass through the loop is 1 GB – half of the 2 GB of RAM configured on the Linux virtual box I am currently running the code on. The hole-drilling simulator has been running for days – at least 5 straight. I’m patiently waiting for it to exit – and hopefully when it does so, it does so gracefully. Do you know how I can change the code to speed up the time to exit?

Note1: This post doesn’t hold a candle to Bjarne Stroustrup’s thorough explanation of the “holiness” topic in chapter 25 of “Programming: Principles and Practice Using C++, Second Edition“. If you’re a C++ programmer (beginner, intermediate, or advanced) and you haven’t read that book cover-to-cover, then… you should.

Note2: For those of you who would like to run/improve the hole-drilling simulator, here is the non-.png source code listing:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <new>
#include <random>
#include <vector>

int main() {

    //create a functor for generating uniformly
    //distributed, contiguous memory chunk sizes between 1 and
    //MAX_CHUNK_SIZE bytes
    constexpr int32_t MAX_CHUNK_SIZE{1000000};
    std::default_random_engine randEng{};
    std::uniform_int_distribution<int32_t> distribution{
        1, MAX_CHUNK_SIZE};

    //load up a vector of NUM_CHUNKS chunks;
    //with each chunk randomly sized
    //between 1 and MAX_CHUNK_SIZE long
    constexpr int32_t NUM_CHUNKS{1000};
    std::vector<int32_t> chunkSizes(NUM_CHUNKS); //no init braces!
    for(auto& el : chunkSizes) {
        el = distribution(randEng);
    }

    //declare the allocator/deallocator function
    //before calling it.
    void allocDealloc(std::vector<int32_t>& chunkSizes,
            std::default_random_engine& randEng);

    //loop until our free store is so "holy"
    //that we must bow down to the lord and leave
    //the premises!
    bool tooManyHoles{};
    while(!tooManyHoles) {
        try {
            allocDealloc(chunkSizes, randEng);
        }
        catch(const std::bad_alloc&) {
            tooManyHoles = true;
        }
    }

    std::cout << "The Free Store Is Too "
              << "Holy to Continue!!!"
              << std::endl;
}

//the function for randomly allocating/deallocating chunks of memory
void allocDealloc(std::vector<int32_t>& chunkSizes,
        std::default_random_engine& randEng) {

    //create a vector for storing the dynamic pointers
    //to each allocated memory block
    std::vector<char*> handles{};

    //randomize the memory allocations and store each pointer
    //returned from the runtime memory allocator
    std::shuffle(chunkSizes.begin(), chunkSizes.end(), randEng);
    for(const auto& numBytes : chunkSizes) {
        handles.emplace_back(new char[numBytes]{}); //array of Bytes
        std::cout << "Allocated "
                  << numBytes << " Bytes" << std::endl;
    }

    //randomize the memory deallocations
    std::shuffle(handles.begin(), handles.end(), randEng);
    for(const auto& elem : handles) {
        delete [] elem;
    }

}

If you do copy/paste/change the code, please let me know what you did and what you discovered in the process.

Note3: I have an intense love/hate relationship with the C++ posts that I write. I love them because they attract the most traffic into this gawd-forsaken site and I always learn something new about the language when I compose them. I hate my C++ posts because they take for-freakin’-ever to write. The number of create/reflect/correct iterations I execute for a C++ post prior to queuing it up for publication always dwarfs the number of mental iterations I perform for a non-C++ post.

  1. aljensen
    June 1, 2015 at 3:17 am

    Reblogged this on C Programming with Al Jensen.

  2. June 2, 2015 at 4:38 pm

    Isn’t this what jemalloc and such were created to address?

    • June 2, 2015 at 8:21 pm

      I think there are several third party bolt-on garbage collectors for C++.

  3. Davidbrcz
    June 2, 2015 at 8:21 pm

    Why not use placement new on a dedicated chunk of memory allocated once and for all before the loop?

    • June 3, 2015 at 2:45 am

      Sorry, but I don’t understand how that would help.

    • Alan Witmer
      June 3, 2015 at 7:31 am

      Indeed, I believe this is what the author is referring to in terms of managing pools: you’d do your placement new in a managed pool of nearly-same-size buckets (perhaps Fibonacci-based sizes, for example). You pay slightly in general memory efficiency and a bit of up-front planning, with the benefit of “never?” pathologically fragmenting. And of course you can replace your memory management with this scheme, at which point I think you’re dangerously close to reinventing jemalloc 🙂

  4. June 3, 2015 at 4:10 am

    If I read your code correctly, it is deallocating all of the memory that it has allocated, before starting the next cycle. Even primitive memory allocation routines will consolidate adjacent free blocks, so you’re always going to reset back to your starting state.
    Memory fragmentation occurs when you have some blocks retained whilst freeing others (as per your initial illustration).
    If you want to run out of memory quickly, try allocating 1,047,551 bytes, then 1,048,576 bytes, then free the 1,047,551 bytes and repeat. You should run out of memory twice as quickly as without the free’d allocs.

    • June 3, 2015 at 5:32 am

      Damn, I think you’re right John. Since Bjarne’s example code never frees the Nodes, it has a memory leak. Eventually the program will consume the whole heap – holes or no holes. As you point out, it will simply run out of memory faster if it drills holes during its death march. 🙂

      I thought (wrongly?) that holes and leaks were independent of each other. Can I conclude that if a program doesn’t have any leaks, it won’t ever run out of memory – as long as it never requires all of the system’s memory at any given point in time?

      Thanks!

      • June 3, 2015 at 11:06 am

        No, even if you don’t have any leaks, you can still “run out of memory” in the sense that you can’t allocate a new object of a given size. For example, assume you are running a 32-bit program in which your access pattern is like this, and you need all the 1-byte objects (that is, this is not leaking those objects):

        Allocate 1 MB object A.
        Allocate 1 byte object B.
        Allocate 1 MB object C.
        Allocate 1 byte object D.

        Delete Object A.
        Delete Object C.

        repeat above 2500 times.

        So far, everything should be fine.

        Now try to allocate a 2 MByte object.

        This will fail because there is no contiguous block of 2 MBytes available.

        The proof is left as an exercise to the reader. 🙂

  5. Stefan Atev
    June 3, 2015 at 10:46 am

    It is pointless to try big blocks – they get allocated as multiple pages, so unless you’re trying to show virtual address space fragmentation, you are very unlikely to get anywhere with this test program. Virtual address space fragmentation is a serious concern on 32-bit systems. As another comment said, if you deallocate everything, you end up reducing fragmentation, too, so you have to have at least a portion of the blocks (pretty high, too) stay allocated.

  6. June 3, 2015 at 10:56 am

    Fortunately this is easy to prevent, at least in new code that is written by someone who knows what he is doing.
    The basic rule is “NEVER ALLOCATE DYNAMIC MEMORY IN APPLICATION CODE!!!!!!”
    You should use RAII for anything that needs to allocate memory (or any other resources, for that matter, such as file handles).
    Then you will never run into this issue.
    And this is not merely theoretical; following this rule is how I have written large C++ programs that don’t have this issue.

    • June 3, 2015 at 11:08 am

      Actually, after re-reading, RAII won’t prevent this, as the example I gave in my other comment suggests. You still have to make sure that you aren’t leaving holes even if you aren’t leaking anything. One way to prevent this is to allocate only in fixed sizes so that you can reuse freed memory, but that requires careful attention to design.

      • June 3, 2015 at 12:17 pm

        So, I guess your first “easy to prevent” comment was maybe a little premature? 🙂

      • June 3, 2015 at 12:23 pm

        Yep. Everything possible should of course be allocated on the stack (via RAII if applicable), but that won’t solve the problem if you need a longer lifespan for your objects.
        But RAII is still the right way to allocate resources, even if it isn’t a magic bullet for this issue.

      • June 3, 2015 at 12:19 pm

        Yes, by using pools and/or stacks as suggested in the post and in Stroustrup’s book.

  7. June 3, 2015 at 2:28 pm

    I have heard a rumor that modern versions of malloc() maintain multiple free lists of different sized nodes, thus making holes a smaller problem than for the older, simpler style of malloc(). This may be a solved problem.

    • June 3, 2015 at 3:05 pm

      I still don’t think it is solved. It may decrease the possibility, but without a background compactor running, I don’t know if it is solvable – in general. It is app-specific solvable via the use of pools/stacks.

  8. Gerald
    June 3, 2015 at 4:51 pm

    A “leak” is just a block you could free but don’t. One way to demonstrate “holes” is to fill memory with blocks, free every alternate block, and then try allocating another slightly larger block. Bear in mind that consecutive free blocks will be combined, that the memory manager will probably add a header to every block you allocate, and that it probably has some granularity in the block sizes it uses. Example Windows C code using malloc/free …

    #include <stdio.h>
    #include <stdlib.h>
    #include <windows.h>

    int main ()
    {
        DWORD BlockLength = sizeof (PBYTE);     // Length of each block
        PBYTE Huge = malloc (0x10000000);       // Save some space for later
        PBYTE *Chain = malloc (BlockLength);    // Alloc first block
        PBYTE *Ptr = Chain;                     // Set first block as current block
        DWORD AllocCount = 0;                   // Number of blocks allocated
        DWORD DeallocCount = 0;                 // Number of blocks deallocated
        DWORD ReallocLength = 0;                // Length of shortest block that fails
        DWORD FreeBlock = FALSE;                // Flag to free block

        if (! Huge) printf ("Huge alloc failed!");

        // Fill memory with chain of small blocks
        for (;;)
        {
            *Ptr = malloc (BlockLength);        // Alloc next block & add to chain
            Ptr = (PBYTE*) *Ptr;                // Set as current block
            if (! Ptr) break;                   // If alloc failed
            AllocCount++;
        }

        // Free alternate blocks
        while (Chain)
        {
            Ptr = (PBYTE*) *Chain;              // Get next block in chain
            if (FreeBlock)                      // If to free this block
            {
                free (Chain);
                DeallocCount++;
            }
            Chain = Ptr;                        // Point to next block
            FreeBlock ^= TRUE;                  // Toggle flag to free next block
        }

        // Allocate blocks of increasing size until one fails
        for (;;)
        {
            Ptr = malloc (ReallocLength);       // Try allocating block
            free (Ptr);                         // Free it
            if (! Ptr) break;                   // If alloc failed
            ReallocLength++;                    // Increase the size to alloc
        }

        free (Huge);                            // Reclaim some space!
        printf ("Allocated %u blocks of length 0x%X, total length 0x%X\n", AllocCount, BlockLength, AllocCount*BlockLength);
        printf ("Freed %u blocks with total length 0x%X\n", DeallocCount, DeallocCount*BlockLength);
        printf ("Allocating %u bytes failed!\n", ReallocLength);

        return 0;
    }

  9. June 3, 2015 at 4:56 pm

    On Linux, try malloc_info() from malloc.h – it reports numbers on the current internal fragmentation in an XML format that you can then interpret. For apps with less than 32 bits of address space allocated, you can also use the struct returned by mallinfo(); afaik the uordblks, hblkhd, and keepcost values are the interesting ones there.

  10. June 3, 2015 at 5:06 pm

    Hey everybody, thanks for all the interest and widely diverse comments. Much appreciated!

  11. June 4, 2015 at 2:42 am

    Great post. Probably the hole-driller proves a bad alloc, but you can’t be sure until you actually catch that!

    And since this code is copy-pasted a gazillion times from now, could you please, please please replace the catch-all by an actual catch (const std::bad_alloc&)?

  12. Dave Closs
    June 13, 2015 at 12:13 pm

    Wow, something I have experience with on your blog!
    My first task at my first job was to write and test a memory manager for an embedded system. The tester was a random-action, random-size-of-action tester that did very well at ferreting out errors. It was pretty cool to go from catching an error within an hour’s running to ones that would crop up only after a week’s running.

