Posts Tagged ‘programming’

Dude, Let’s Swarm!

September 12, 2015 Leave a comment

Whoo hoo! I just got back from a 3 day, $2,000 training course in “swarm programming.” After struggling to make it through the unforgiving syllabus, I earned my wings along with the unalienable right to practice in this new and exciting way of creating software. Hyper-productive Scrum teams are so yesterday. Turbo-charged swarm teams are the wave of the future!


I found it strangely interesting that not one of my 30 (30 × $2K = $60K) classmates failed the course. But hey, we’re now ready and willing to swarm our way to success. There’s no stoppin’ us.


Easier To Use, And More Expressive

August 20, 2015 15 comments

One of the goals for each evolutionary increment in C++ is to decrease the probability of an average programmer making mistakes by supplanting “old style” features/idioms with new, easier to use, and more expressive alternatives. The following code samples attempt to show an example of this evolution from C++98/03 to C++11 to C++14.


In C++98/03, there were two ways of clearing out the set of inner vectors in the vector-of-vectors-of-doubles data structure encapsulated by MyClass. One could use a plain ole for-loop or the std::for_each() STL algorithm coupled with a remotely defined function object (ClearVecFunctor). I suspect that with the exception of clever language lawyers, blue collar programmers (like me) used the simple for-loop option because of its reduced verbosity and compactness of expression.
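The original code image is gone, so here’s a minimal sketch of the two C++98/03 options, assuming a hypothetical MyClass that wraps the vector-of-vectors in a member I’ve named vecOfVecs_ (the function and member names are my own placeholders):

#include <algorithm>
#include <cstddef>
#include <vector>

//The remotely defined function object mentioned above
struct ClearVecFunctor {
  void operator()(std::vector<double>& v) const { v.clear(); }
};

class MyClass {
public:
  //Option 1: the plain ole for-loop
  void clearWithForLoop() {
    for (std::size_t i = 0; i < vecOfVecs_.size(); ++i) {
      vecOfVecs_[i].clear();
    }
  }

  //Option 2: std::for_each() coupled with the remote functor
  void clearWithForEach() {
    std::for_each(vecOfVecs_.begin(), vecOfVecs_.end(), ClearVecFunctor());
  }

private:
  std::vector<std::vector<double> > vecOfVecs_;
};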

With the arrival of C++11 on the scene, two more options became available to programmers: the range-for loop, and the std::for_each() algorithm combined with an inline-defined lambda function. The range-for loop eliminated the chance of “off-by-one” errors, and the lambda function eliminated the inconvenience of having to write a remotely located functor class.
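Sketched against the same hypothetical MyClass, the two C++11 additions might look like this:

//C++11 option 3: the range-for loop - no index, no off-by-one risk
void clearWithRangeFor() {
  for (auto& v : vecOfVecs_) {
    v.clear();
  }
}

//C++11 option 4: std::for_each() with an inline-defined lambda -
//no remotely located functor class needed
void clearWithLambda() {
  std::for_each(vecOfVecs_.begin(), vecOfVecs_.end(),
      [](std::vector<double>& v) { v.clear(); });
}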

The ratification of the C++14 standard brought yet another convenient option to the table: the polymorphic lambda. By using auto in the lambda argument list, the programmer is relieved of the obligation to explicitly write out the full type name of the argument.
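And the C++14 flavor, sketched the same way:

//C++14 option 5: std::for_each() with a polymorphic lambda - auto
//relieves us of spelling out std::vector<double>& in the argument list
void clearWithGenericLambda() {
  std::for_each(vecOfVecs_.begin(), vecOfVecs_.end(),
      [](auto& v) { v.clear(); });
}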

This example is just one of many evolutionary improvements incorporated into the language. Hopefully, C++17 will introduce many more.

Note: The code compiles with no warnings under gcc 4.9.2. However, as the bug marker placed on line 41 of the original code image showed, the Eclipse CDT indexer has not yet caught up with the C++14 specification. Because auto is used in place of the explicit type name in the lambda argument list, the indexer cannot resolve the std::vector::clear() member function.

8/26/15 Update – As a result of reader code reviews provided in the comments section of this post, I’ve updated the code as follows:


8/29/15 Update – A Fix to the Line 27 begin-begin bug:


Categories: C++, C++11, C++14

Is It Safe?

Remember this classic torture scene in “Marathon Man“? D’oh! Now, suppose you had to create the representation of a message that simply aggregates several numerical measures into one data structure:

[Figure: the two candidate message representations - a binary, type-safe struct vs. a text, type-unsafe string]
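Since the original image is gone, here’s a hedged reconstruction of what the two candidates might have looked like; the type and attribute names (TrackMsgSafe, rangeMeters, etc.) are illustrative guesses, not the originals:

#include <string>

//The binary, type-safe candidate: numerical measures stored as numbers
struct TrackMsgSafe {
  double rangeMeters;
  double bearingDegrees;
  double speedMetersPerSec;
};

//The text, type-unsafe candidate: the same measures flattened into a
//delimited string, e.g. "1234.5,87.3,44.1"
typedef std::string TrackMsgUnsafe;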

Given no information other than the fact that some numerical computations must be performed on each individual target track attribute within your code, which implementation would you choose for your internal processing? The binary, type-safe candidate, or the text, type-unsafe option? If you chose the type-unsafe option, then you’d impose a performance penalty on your code every time you needed to perform a computation on your tracks. You’d have to deserialize and extract the individual track attribute(s) before performing the computations:
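A sketch of that penalty, using the hypothetical types above (the computation itself is an arbitrary stand-in): the text option must parse before it can compute, while the binary option’s attributes are immediately usable:

#include <sstream>

//Text option: deserialize and extract first, compute second
double rangeTimesSpeed(const TrackMsgUnsafe& msg) {
  std::istringstream iss(msg);
  double rangeMeters = 0.0, bearingDegrees = 0.0, speedMetersPerSec = 0.0;
  char comma = 0;
  iss >> rangeMeters >> comma >> bearingDegrees >> comma
      >> speedMetersPerSec;
  return rangeMeters * speedMetersPerSec; //only now can the math begin
}

//Binary option: no toll to pay
double rangeTimesSpeed(const TrackMsgSafe& msg) {
  return msg.rangeMeters * msg.speedMetersPerSec;
}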


If your application is required to send/receive track messages over a “wire” between processing nodes, then you’d need to choose some sort of serialization/deserialization protocol along with an over-the-wire message format. Even if you were to choose a text format (JSON, XML) for the wire, be sure to deserialize the input as soon as possible and serialize on output as late as possible. Otherwise you’ll impose an unnecessary performance hit on your code every time you have to numerically manipulate the fields in your message.
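Here’s a minimal sketch of that boundary discipline, again using the hypothetical TrackMsgSafe type from above: parse once on the way in, do all the math on the binary form, serialize once on the way out:

#include <sstream>
#include <string>

std::string processNode(const std::string& wireInput) {
  //Deserialize the input as soon as possible
  TrackMsgSafe track {};
  char delim = 0;
  std::istringstream iss(wireInput);
  iss >> track.rangeMeters >> delim >> track.bearingDegrees >> delim
      >> track.speedMetersPerSec;

  //All numerical manipulation happens on the type-safe form
  track.rangeMeters += track.speedMetersPerSec; //e.g. a 1-second update

  //Serialize on output as late as possible
  std::ostringstream oss;
  oss << track.rangeMeters << ',' << track.bearingDegrees << ','
      << track.speedMetersPerSec;
  return oss.str();
}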


More generally….


The Big Ones


July 15, 2015 17 comments

In the embedded systems application domain, there is often the need to execute one or more background functions at a periodic rate. Before C++11 rolled onto the scene, a programmer had to use a third party library like ACE/Boost/Poco/Qt to incorporate that functionality into the product. However, with the inclusion of std::thread, std::bind, and std::chrono in C++11, there is no longer any need to pull those well-crafted libraries into the code base to achieve that specific functionality.

Periodic Function Execution

The following figure shows the interface of a class template that I wrote to provide users with the capability to execute a function object (of type std::function<void()>) at a periodic rate over a fixed length of time.

[Figure: the PeriodicFunction class template interface]

Upon instantiation of a PeriodicFunction object, the class executes the Functor immediately and then re-executes it every periodMillis until expiration at durationMillis. The hasExpired(), stop(), and restart() functions provide users with convenient query/control options after construction.

The figure below depicts the usage of the PeriodicFunction class. In the top block of code, a lambda function is created and placed into execution every 200 milliseconds over a 1 second duration. After waiting for those 5 executions to complete, the code restarts the PeriodicFunction to run the lambda every 500 milliseconds for 1.5 seconds. The last code segment defines a Functor class, creates an object instance of it, and places the functor into execution at 50 millisecond intervals over a duration of 250 milliseconds.

[Figure: PeriodicFunction usage examples]

The output of a single run of the program is shown below. Note that the actual output matches what is expected: 5 executions spaced at 200 millisecond intervals; 3 executions spaced at 500 millisecond intervals; and 5 executions spaced at 50 millisecond intervals.

[Figure: PeriodicFunction program output]

In case you want to use, experiment with, or enhance the code, the implementation of the PeriodicFunction class template is provided in copy & paste form as follows:


#include <cstdint>
#include <thread>
#include <mutex>
#include <chrono>
#include <atomic>
#include <functional> //for the std::bind call in threadLoop()

namespace pf {

using std::chrono::steady_clock;
using std::chrono::duration;
using std::chrono::milliseconds;

template<typename Functor>
class PeriodicFunction {
public:
  //Initialize the timer state and start the timing loop
  PeriodicFunction(uint32_t periodMillis, uint32_t durationMillis,
      Functor& callback, uint32_t yieldMillis = DEFAULT_YIELD_MILLIS) :
      func_(callback),
      periodMillis_(periodMillis),
      expirationTime_(steady_clock::now()
              + steady_clock::duration(milliseconds(durationMillis))),
      nextCallTimeMillis_(steady_clock::now()), yieldMillis_(yieldMillis) {
    //Start the timing loop
    t_ = std::thread { &PeriodicFunction::threadLoop, this };
  }

  //Command & wait for the threadLoop to stop
  //before this object gets de-constructed
  ~PeriodicFunction() {
    stop();
  }

  bool hasExpired() const {
    return hasExpired_;
  }

  void stop() {
    isRunning_ = false;
    if (t_.joinable())
      t_.join();
  }

  void restart(uint32_t periodMillis, uint32_t durationMillis) {
    //Stop the current timer if needed; join the thread before
    //taking the lock so the timing loop can't deadlock against us
    stop();

    std::lock_guard<std::mutex> lg(stateMutex_);

    //What time is it right at this instant?
    auto now = steady_clock::now();

    //Set the state for the new timer
    expirationTime_ = now + milliseconds(durationMillis);
    nextCallTimeMillis_ = now;
    periodMillis_ = periodMillis;
    hasExpired_ = false;

    //Start the timing loop
    isRunning_ = true;
    t_ = std::thread { &PeriodicFunction::threadLoop, this };
  }

  //Since we retain a reference to a Functor object, prevent copying
  //and moving
  PeriodicFunction(const PeriodicFunction& rhs) = delete;
  PeriodicFunction& operator=(const PeriodicFunction& rhs) = delete;
  PeriodicFunction(PeriodicFunction&& rhs) = delete;
  PeriodicFunction& operator=(PeriodicFunction&& rhs) = delete;

private:
  //The function to be executed periodically until we're done
  Functor& func_;

  //The period at which the function is executed
  uint32_t periodMillis_;

  //The absolute time at which we declare "done done!"
  steady_clock::time_point expirationTime_;

  //The next scheduled function execution time
  steady_clock::time_point nextCallTimeMillis_;

  //The thread sleep duration; the larger the value,
  //the more we decrease the periodic execution accuracy;
  //allows other sibling threads to use the cpu
  uint32_t yieldMillis_;

  //The default sleep duration of each pass thru
  //the timing loop
  static constexpr uint32_t DEFAULT_YIELD_MILLIS { 10 };

  //Indicates whether the timer has expired or not
  std::atomic<bool> hasExpired_ { false };

  //Indicates whether the monitoring loop is active
  //probably doesn't need to be atomic, but good practice
  std::atomic<bool> isRunning_ { true };

  //Our precious thread resource!
  std::thread t_ { };

  //Protects the timer state from data races
  //between our private thread and the caller's thread
  std::mutex stateMutex_ { };

  //The timing loop
  void threadLoop() {
    while (isRunning_) {
      auto now = steady_clock::now(); //What time is it right at this instant?
      { //Scope the lock so we don't hold it while sleeping
        std::lock_guard<std::mutex> lg(stateMutex_);
        if (now >= expirationTime_) { //Has the timer expired?
          hasExpired_ = true;
        }
        else if (now > nextCallTimeMillis_) { //Time to execute the function?
          nextCallTimeMillis_ = now + milliseconds(periodMillis_);
          std::bind(&Functor::operator(), std::ref(func_))(); //Execute!
          continue; //Skip the sleep
        }
      }
      //Unselfish sharing; let other threads have the cpu for a bit
      std::this_thread::sleep_for(milliseconds(yieldMillis_));
    }
  }
};//End of the class definition
}//namespace pf


One potential improvement that comes to mind is the addition of the capability to periodically execute the user-supplied functor literally “forever” – and not cheating by setting durationMillis to 0xFFFFFFFF (which is not forever). Another improvement might be to support variadic template args (like the std::thread ctor does) to allow any function type to be placed into execution – not just those of type std::function<void()>.
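As a rough illustration of that second idea, here is a hypothetical sketch (my own, not part of the class above): collapse any callable plus its arguments into the single std::function<void()> shape the timing loop already knows how to run, the way the std::thread constructor forwards its arguments.

#include <cstdint>
#include <functional>
#include <utility>

//Hypothetical variadic front-end; the timing loop machinery from
//PeriodicFunction is elided and would simply invoke func_()
class AnyPeriodicFunction {
public:
  template<typename Callable, typename ... Args>
  AnyPeriodicFunction(uint32_t periodMillis, uint32_t durationMillis,
      Callable&& f, Args&& ... args) :
      periodMillis_(periodMillis), durationMillis_(durationMillis),
      func_(std::bind(std::forward<Callable>(f),
                      std::forward<Args>(args)...)) {
    //...start the same timing loop as PeriodicFunction, invoking func_()
  }

private:
  uint32_t periodMillis_;
  uint32_t durationMillis_;
  std::function<void()> func_;
};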

In case you don’t want to type in the test driver code, here it is in copy & paste form:

#include <iostream>
#include <chrono>
#include "PeriodicFunction.h"

int main() {
  //Create a lambda and place it into execution
  //every 200 millis over a duration of 1 second
  using namespace std::chrono;
  steady_clock::time_point startTime { steady_clock::now() };
  steady_clock::time_point execTime { startTime };
  auto lambda = [&] { //We don't need the ()
    execTime = steady_clock::now();
    std::cout << "lambda executed at T="
              << duration_cast<milliseconds>(execTime - startTime).count()
              << std::endl;
    startTime = execTime;
  };
  pf::PeriodicFunction<decltype(lambda)> pf1 { 200, 1000, lambda };
  while (!pf1.hasExpired()) ;
  std::cout << std::endl;

  //Re-run the lambda every half second for a second and a half
  startTime = steady_clock::now();
  execTime = startTime;
  pf1.restart(500, 1500);
  while (!pf1.hasExpired()) ;
  std::cout << std::endl;

  //Define a stateful Functor class
  class Functor {
  public:
    Functor() :
        startTime_ { steady_clock::now() }, execTime_ { startTime_ } {
    }
    void operator()() {
      execTime_ = steady_clock::now();
      std::cout << "Functor::operator() executed at T="
          << duration_cast<milliseconds>(execTime_ - startTime_).count()
          << std::endl;
      startTime_ = execTime_;
    }
  private:
    steady_clock::time_point startTime_;
    steady_clock::time_point execTime_;
  };

  //Create a Functor object and place it into execution
  //every 50 millis over a duration of 250 millis
  Functor myFunction { };
  pf::PeriodicFunction<Functor> pf2 { 50, 250, myFunction, 0 };
  while (!pf2.hasExpired()) ;
}

Categories: C++11

Time To Get Moving!

June 15, 2015 8 comments

Prior to C++11, for every user-defined type we wrote in C++03, all standards-conforming C++ compilers gave us:

  • a “free” copy constructor
  • a “free” copy assignment operator
  • a “free” default constructor
  • a “free” destructor

The caveat is that we only got them for free if we didn’t manually override the compiler and write them ourselves. And unless we defined reference or pointer members inside of our type, we didn’t have to manually write them.

From C++11 onward, we not only get those operations for free for our user-defined types, we also get these turbo-boosters:

  • a “free” move constructor
  • a “free” move assignment operator

In addition, all of the C++ standard library containers have been “move enabled“.

When I first learned how move semantics worked and why this new core language feature dramatically improved program performance over copying, I started wondering about user-defined types that wrapped move-enabled, standard library types. For example, check out this simple user-defined Msg structure that encapsulates a move-enabled std::vector:
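The original image is gone, but based on the vDoubs member referenced below, the struct presumably looked something like this minimal sketch:

#include <vector>

//A user-defined type wrapping a move-enabled standard library type;
//no hand-written special member functions in sight
struct Msg {
  std::vector<double> vDoubs;
};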


Logic would dictate that since I get “move” operations from the compiler for free with the Msg type as written, if I manually “moved” a Msg object in some application code, the compiler would “move” the vDoubs member under the covers along with it – for free.

Until now, I didn’t test out that deduction because I heard my bruh Herb Sutter say in a video talk that deep moves came for free with user-defined types as long as each class member in the hierarchical composition is also move-enabled. However, in a more recent video, I saw an excellent C++ teacher explicitly write a move constructor for a class similar to the Msg struct above:
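That image is gone too; a hand-written move constructor for a Msg-like class would look something like this sketch (which, as the rest of this post shows, is unnecessary):

#include <utility>
#include <vector>

struct MsgManual {
  std::vector<double> vDoubs;

  MsgManual() = default;
  //The explicitly written move constructor: pilfer the source's vector
  MsgManual(MsgManual&& other) : vDoubs(std::move(other.vDoubs)) {}
};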


D’oh! So now I was confused – and determined to figure out what was going on. Here is the program that I wrote to not only verify that manually written “move” operations are not required for the Msg struct, but to also measure the performance difference between moving and copying:
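The original listing is gone; here’s a minimal sketch in the same spirit (the element count is my own assumption):

#include <chrono>
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

struct Msg {
  std::vector<double> vDoubs;
};

int main() {
  using namespace std::chrono;
  const std::size_t NUM_ELEMENTS { 10000000 }; //assumed payload size

  Msg msg1;
  msg1.vDoubs.assign(NUM_ELEMENTS, 1.0);

  //Time a deep copy of msg1
  auto t0 = steady_clock::now();
  Msg msg2 { msg1 };
  auto t1 = steady_clock::now();
  std::cout << "copy took "
            << duration_cast<microseconds>(t1 - t0).count()
            << " us" << std::endl;

  //Time a move of msg1 - the compiler-provided move ctor steals
  //msg1's vector guts instead of copying ten million doubles
  t0 = steady_clock::now();
  Msg msg3 { std::move(msg1) };
  t1 = steady_clock::now();
  std::cout << "move took "
            << duration_cast<microseconds>(t1 - t0).count()
            << " us" << std::endl;
}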


First, the program built cleanly as expected because the compiler provided the free “move” operations for the Msg struct. Second, the following 5-run output results proved that the compiler did indeed perform the deep, under-the-covers “move” that my man Herb promised it would do. If the deep move wasn’t executed, there would have been no noticeable difference in performance between the move and copy operations.


From the eye-popping performance difference shown in the results, we should conclude that it’s time to start replacing copy operations in our code with “move” operations wherever it makes sense. The only thing to watch out for when moving objects from one place to another is that in the scope of the code that performs the move, the internal state of the moved-from object is not needed or used by the code following the move. The following code snippet, which prints out 0, highlights this behavior.
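The snippet image is gone as well; here’s a reconstruction consistent with the description. It prints 0 because the move emptied msg1’s vector (strictly speaking, the moved-from state is valid but unspecified, though mainstream implementations leave it empty):

#include <iostream>
#include <utility>
#include <vector>

struct Msg {
  std::vector<double> vDoubs;
};

int main() {
  Msg msg1;
  msg1.vDoubs = { 1.0, 2.0, 3.0 };

  Msg msg2 { std::move(msg1) }; //msg1's vector guts are pilfered

  //Don't rely on msg1's internal state after the move!
  std::cout << msg1.vDoubs.size() << std::endl; //prints 0
}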


Holier Than Thou

June 1, 2015 25 comments

Since C++ (by deliberate design) does not include a native garbage collector or memory compactor, programs that perform dynamic memory allocation and de-allocation (via explicit or implicit use of the “new” and “delete” operators) cause small “holes” to accumulate in the free store over time. I guess you can say that C++ is “holier than thou“. :(

Mind you, drilling holes in the free store is not the same as leaking memory. A memory leak is a programming bug that needs to be squashed. A fragmented free store is a “feature“. His holiness can become an issue only for loooong running C++ programs and/or programs that run on hardware with small RAM footprints. For those types of niche systems, the best, and perhaps only, practical options available for preventing any holes from accumulating in your briefs are to eliminate all deletes from the code:

  • Perform all of your dynamic memory allocations at once, during program initialization (global data – D’oh!), and only utilize the CPU stack during runtime for object creation/destruction.
  • If your application inherently requires post-initialization dynamic memory usage, then use pre-allocated, fixed size, unfragmentable, pools and stacks to acquire/release data buffers during runtime (see the sketch right after this list).
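As a rough illustration of the second option, here is a minimal sketch (my own, not code from the original post) of a fixed-block pool: all of the storage is reserved up front, so runtime acquire/release cycles can never drill holes in the free store.

#include <array>
#include <cstddef>
#include <mutex>

//A pre-allocated, fixed-block pool; grab it once at program
//initialization and the free store never gets touched again
template<std::size_t BLOCK_SIZE, std::size_t NUM_BLOCKS>
class FixedPool {
public:
  FixedPool() {
    //Thread every block onto the free list
    for (std::size_t i = 0; i < NUM_BLOCKS; ++i) {
      freeList_[i] = storage_.data() + (i * BLOCK_SIZE);
    }
  }

  //Returns nullptr when the pool is exhausted - no surprise
  //std::bad_alloc in the punch bowl
  void* acquire() {
    std::lock_guard<std::mutex> lg(mutex_);
    return (numFree_ > 0) ? freeList_[--numFree_] : nullptr;
  }

  void release(void* block) {
    std::lock_guard<std::mutex> lg(mutex_);
    freeList_[numFree_++] = static_cast<char*>(block);
  }

private:
  std::array<char, BLOCK_SIZE * NUM_BLOCKS> storage_ {};
  std::array<char*, NUM_BLOCKS> freeList_ {};
  std::size_t numFree_ { NUM_BLOCKS };
  std::mutex mutex_ {};
};

Instantiating one of these statically at program start (e.g. FixedPool<1024, 512> msgPool;) keeps its storage entirely out of the hole-drilling business. A production-grade pool would also need to respect alignment and guard against double-releases, but the never-fragments idea is all this sketch is after.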

If your system maps into the “holes can freakin’ kill someone” category and you don’t bake those precautions into your software design, then the C++ runtime may toss a big, bad-ass, std::bad_alloc exception into your punch bowl and crash your party if it can’t find a contiguous memory block big enough for your impending new request. For large software systems, refactoring the design to mitigate the risk of a fragmented free store after it’s been coded/tested falls squarely into the expensive and ominous “stuff that’s hard to change” camp. And by saying “stuff that’s hard to change“, I mean that it may be expensive, time consuming, and technically risky to reweave your basket (case).

In an attempt to verify that the accumulation of free store holes will indeed crash a program, I wrote a hole-drilling simulator (full source listing in Note2 below). Each time through the while loop, the code randomly allocates and deallocates 1000 chunks of memory. Each chunk is comprised of a random number of bytes between 1 and 1 MB. Thus, the absolute maximum amount of memory that can be allocated/deallocated during each pass through the loop is 1 GB – half of the 2 GB RAM configured on the linux virtual box I am currently running the code on.

The hole-drilling simulator has been running for days – at least 5 straight. I’m patiently waiting for it to exit – and hopefully when it does so, it does so gracefully. Do you know how I can change the code to speed up the time to exit?

Note1: This post doesn’t hold a candle to Bjarne Stroustrup’s thorough explanation of the “holiness” topic in chapter 25 of “Programming: Principles and Practice Using C++, Second Edition“. If you’re a C++ programmer (beginner, intermediate, or advanced) and you haven’t read that book cover-to-cover, then… you should.

Note2: For those of you who would like to run/improve the hole-drilling simulator, here is the non-.png source code listing:


#include <algorithm>
#include <cstdint>
#include <iostream>
#include <new> //std::bad_alloc
#include <random>
#include <vector>

int main() {

    //create a functor for generating uniformly
    //distributed, contiguous memory chunk sizes between 1 and
    //MAX_CHUNK_SIZE bytes
    constexpr int32_t MAX_CHUNK_SIZE{1000000};
    std::default_random_engine randEng{};
    std::uniform_int_distribution<int32_t> distribution{
        1, MAX_CHUNK_SIZE};

    //load up a vector of NUM_CHUNKS chunks;
    //with each chunk randomly sized
    //between 1 and MAX_CHUNK_SIZE long
    constexpr int32_t NUM_CHUNKS{1000};
    std::vector<int32_t> chunkSizes(NUM_CHUNKS); //no init braces!
    for(auto& el : chunkSizes) {
        el = distribution(randEng);
    }

    //declare the allocator/deallocator function
    //before calling it.
    void allocDealloc(std::vector<int32_t>& chunkSizes,
            std::default_random_engine& randEng);

    //loop until our free store is so "holy"
    //that we must bow down to the lord and leave
    //the premises!
    bool tooManyHoles{};
    while(!tooManyHoles) {
        try {
            allocDealloc(chunkSizes, randEng);
        }
        catch(const std::bad_alloc&) {
            tooManyHoles = true;
        }
    }

    std::cout << "The Free Store Is Too "
              << "Holy to Continue!!!"
              << std::endl;
}

//the function for randomly allocating/deallocating chunks of memory
void allocDealloc(std::vector<int32_t>& chunkSizes,
        std::default_random_engine& randEng) {

    //create a vector for storing the dynamic pointers
    //to each allocated memory block
    std::vector<char*> handles{};

    //randomize the memory allocations and store each pointer
    //returned from the runtime memory allocator
    std::shuffle(chunkSizes.begin(), chunkSizes.end(), randEng);
    for(const auto& numBytes : chunkSizes) {
        handles.emplace_back(new char[numBytes]{}); //array of Bytes
        std::cout << "Allocated "
                  << numBytes << " Bytes" << std::endl;
    }

    //randomize the memory deallocations
    std::shuffle(handles.begin(), handles.end(), randEng);
    for(const auto& elem : handles) {
        delete [] elem;
    }
}

If you do copy/paste/change the code, please let me know what you did and what you discovered in the process.

Note3: I have an intense love/hate relationship with the C++ posts that I write. I love them because they attract the most traffic into this gawd-forsaken site and I always learn something new about the language when I compose them. I hate my C++ posts because they take for-freakin’-ever to write. The number of create/reflect/correct iterations I execute for a C++ post prior to queuing it up for publication always dwarfs the number of mental iterations I perform for a non-C++ post.

