In this interesting 2006 slide deck, “C++ in safety-critical applications: the JSF++ coding standard”, Bjarne Stroustrup and Kevin Carroll provide the rationale for selecting C++ as the programming language for the JSF (Joint Strike Fighter) jet project:
First, on the language selection:
- “Did not want to translate OO design into language that does not support OO capabilities.”
- “Prospective engineers expressed very little interest in Ada. Ada tool chains were in decline.”
- “C++ satisfied language selection criteria as well as staffing concerns.”
They also articulated the design philosophy behind the set of rules as:
- “Provide “safer” alternatives to known “unsafe” facilities.”
- “Craft rule-set to specifically address undefined behavior.”
- “Ban features with behaviors that are not 100% predictable (from a performance perspective).”
Note that because of the last bullet, post-initialization dynamic memory allocation (using new/delete) and exception handling (using throw/try/catch) were verboten.
Interestingly, Bjarne and Kevin also flipped the coin over and exposed the weaknesses of language subsetting.
What they didn’t discuss in the slide deck was whether the strengths of imposing a large coding standard on a development team outweigh those nasty weaknesses. I suspect that’s because the decision to impose a coding standard was already a done deal.
Much as we don’t want to admit it, it all comes down to economics. How much is lowering the risk of loss of life worth? No rule set can ever guarantee 100% safety. Like trying to move from 8 nines of availability to 9 nines, the financial and schedule costs of chasing a Utopian “certainty” of safety explode exponentially. To add insult to injury, there is always tremendous business pressure to deliver ASAP and thus unconsciously cut corners – like jettisoning corner-case system-level testing and blowing off hundreds of “annoying” rule violations.
Does anyone have any data on whether imposing a strict coding standard actually increases the safety of a system? Better yet, is there any data that indicates imposing a standard actually decreases the safety of a system? I doubt that either of these questions can be answered with any unbiased data. We’ll just continue on auto-believing that the answer to the first question is yes because it’s supposed to be self-evident.
Having learned the “Liskov Substitution Principle” as an object-oriented design aid many years ago, I was delighted to discover this video of Ms. Liskov accepting her Turing award in 2009: Barbara Liskov Lecture Video – A.M. Turing Award Winner.
If you don’t have the time to watch the hour-long lecture by this extraordinary lady, but you’re curious to know more about Ms. Liskov, here are my notes:
- She “accidentally” entered programming.
- She created the CLU and ARGUS programming languages.
- “Let’s face it, in systems we write programs in C & C++ because efficiency really matters.”
- She was surprised that her paper on sub-types took off and her name became acronym-ized: “LSP”.
- She’s not a fan of DSLs or AOP because “readability is more important than writeability.”
- Abstract types, via encapsulation, were a boon to “reasoning about correctness”: understanding where you are in the code and how you got there.
- While creating CLU, she didn’t care about inheritance because it breaks encapsulation and makes “reasoning about correctness” more difficult. She simply used composition and delegated calls to the methods of the contained objects (see the sketch after this list).
- Since she is a practitioner, she stayed away from theoretical work, e.g. polymorphic lambda calculus.
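Her composition-plus-delegation approach is easy to sketch in C++ (an illustrative example of mine, not code from the lecture):

```cpp
#include <iostream>

// Composition + delegation instead of inheritance: Stack reuses List's
// behavior without inheriting from it, so List's encapsulation stays intact.
class List {
public:
    void pushFront(int v) { std::cout << "List::pushFront(" << v << ")\n"; }
    int popFront() { std::cout << "List::popFront()\n"; return 0; }
};

class Stack {
public:
    void push(int v) { impl_.pushFront(v); } // delegate to the contained List
    int pop() { return impl_.popFront(); }   // ditto
private:
    List impl_; // composition: Stack has-a List, not is-a List
};

int main() {
    Stack s;
    s.push(42);
    s.pop();
}
```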
Over a year ago, I posted some C++ code in “Milliseconds Since The Epoch” for those who were looking for a way to generate timestamps for message tagging, event logging, and/or code performance measurements. That code was based on the Boost.Date_Time library. However, with the addition of the <chrono> library in C++11, the code to generate millisecond (or microsecond or nanosecond) precision timestamps is not only simpler, it’s now standard:
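Here’s a minimal sketch (the helper name millisSinceEpoch is mine, not part of the standard library):

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>

// Return the number of milliseconds elapsed since the steady_clock's epoch.
std::int64_t millisSinceEpoch() {
    using namespace std::chrono;
    return duration_cast<milliseconds>(
        steady_clock::now().time_since_epoch()).count();
}

int main() {
    std::cout << millisSinceEpoch() << '\n'; // e.g. timestamp a log entry
}
```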
Here’s how it works:
- steady_clock::now() returns a time_point object relative to the epoch of the steady_clock (which may or may not be the same as the Unix epoch of 1/1/1970).
- steady_clock::time_point::time_since_epoch() returns a steady_clock::duration object that contains the number of tick-counts that have elapsed between the epoch and the occurrence of the “now” time_point.
- The duration_cast<T> function template converts the steady_clock::duration tick-counts from whatever internal time units they represent (e.g. seconds, microseconds, nanoseconds) into time units of milliseconds. The millisecond count is then retrieved and returned to the caller via the duration::count() member function.
I concocted this code from the excellent tutorial on clocks/time_points/durations in Nicolai Josuttis’s “The C++ Standard Library (2nd Edition)”. Specifically, “5.7. Clocks and Timers”.
The C++11 standard library provides three clocks:
- system_clock
- steady_clock
- high_resolution_clock
I used the steady_clock in the code because it’s the only clock that’s guaranteed never to be “adjusted” by some external system action (user change, NTP update). Thus, the time_points obtained from it via the now() member function never decrease as real time marches forward.
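You can verify this guarantee on your platform via each clock’s is_steady member; a quick check (my sketch, not Josuttis’s code):

```cpp
#include <chrono>
#include <iostream>

// Print whether each C++11 clock is monotonic ("steady").
// Only steady_clock is guaranteed by the standard to report true.
int main() {
    using namespace std::chrono;
    std::cout << std::boolalpha
              << "system_clock: " << system_clock::is_steady << '\n'
              << "steady_clock: " << steady_clock::is_steady << '\n'
              << "high_resolution_clock: " << high_resolution_clock::is_steady << '\n';
}
```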
Note: If you need microsecond resolution timestamps, here’s the equivalent code:
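Again a minimal sketch with an assumed helper name; only the duration_cast target type changes:

```cpp
#include <chrono>
#include <cstdint>
#include <iostream>

// Return the number of microseconds elapsed since the steady_clock's epoch.
std::int64_t microsSinceEpoch() {
    using namespace std::chrono;
    return duration_cast<microseconds>(
        steady_clock::now().time_since_epoch()).count();
}

int main() {
    std::cout << microsSinceEpoch() << '\n';
}
```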
So, what about “rollover”, you ask. As Nicolai’s book points out, the number of bits required of C++11 library implementers increases with increasing resolution. Assuming each clock ticks along relative to the Unix epoch, rollover won’t occur for a very, very, very, very long time – no matter which resolution you use.
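A quick back-of-the-envelope check (mine, not the book’s), relying on the fact that the standard requires nanoseconds to use a signed representation of at least 64 bits:

```cpp
#include <chrono>
#include <iostream>

int main() {
    using namespace std::chrono;
    // A signed 64-bit nanosecond counter can run for roughly 292 years
    // before overflowing, so rollover is a non-issue in practice.
    auto years = duration_cast<hours>(nanoseconds::max()).count() / 24 / 365;
    std::cout << "~" << years << " years until a 64-bit ns counter rolls over\n";
}
```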
Of course, the “real” resolution you actually get depends on the underlying hardware of your platform. Nicolai provides source code to discover what these real, platform-specific resolutions and epochs are for each of the three C++11 clock types. Buy the book if you want to build and run that code on your hardware.
I recently spent 3.5 days hunting down and squashing a “showstopper” bug that ended up being a side effect of an earlier fix that I had made to eradicate a long-standing “nuisance” bug.
SIDEBAR: The nuisance bug occurred only during system shutdown. The system would crash on exit, but no data was lost or corrupted. It was long thought to be an “out of sequence” object destruction problem, but because hundreds of lines of nested destructor code are called during shutdown in multiple threads of execution, and because the customer never formally reported the bug (the system is very rarely shut down), its annihilation was put on the shelf – until I “fixed” it.
When I was initially told about the showstopper, I was confounded because the bug seemed to be located somewhere in the code of a simple feature that I thought I had tested pretty thoroughly. And yet, there it was, plainly obvious and easily reproducible – a crash that happens during runtime whenever a specific sequence of operator actions is performed. WTF?
With the system model embedded in my genius mind, I thought the bug HAD to be located somewhere in the massive, preexisting, 150K-line legacy code base. After all, the number of code lines I had added to the beast in order to implement the feature was so small and unassuming that the odds favored my hypothesis.
Even though my hypothesis was that my new code had uncovered and triggered some other dormant bug deep in the bowels of the software, I first inspected the measly few lines I had recently added to the code base. Of course, the inspection yielded no “aha, there it is” moment. Bummer.
Next, I fired up the debugger, sprinkled a bunch of breakpoints throughout the code, and stepped through my brilliant and elegantly simple code. I found that when the control of execution descended into the netherworld below my impeccable work, the bug came out of hiding – crash! However, since a bunch of event-driven callbacks were triggered each time control left my code, I couldn’t trace the execution path so easily.
Exasperated that the debugger didn’t tell me exactly where and what the freakin’ bug was, I started reading and reverse engineering (via targeted UML class and sequence diagram sketches) segments of the legacy code. Since it was my first focused foray down into the dungeon, it was slow going, but a beneficial learning experience.
Finally, after a couple of days of inspection, reverse engineering, and a bazillion debugger runs, I stumbled upon a note written by yours truly in one of the infrastructure callback functions:
“The line of code below was commented out because it triggers a crash on shutdown.”
Bingo, a light went off! Quickly, I uncommented the line of code and reran the program. Yep, the bug was gone! As I initially suspected, the critter did turn out to be living within the infrastructure, but I had unwittingly put it there a while ago in order to kill the long-standing “nuisance” shutdown bug. Ain’t life grand?
Of course, the tradeoff for re-enabling the line of code that killed the nasty bug is that the nuisance bug is alive and well again. And no, unless I’m directly ordered to, I ain’t gonna go uh huntin’ fer it aginn. No good deed goes unpunished.
In “Software’s Hidden Clockwork: A General Theory of Software Defects”, Les Hatton presents these two interesting charts:
The thing I find hard to believe is that Les has concluded that there is no obvious significant relationship between defect density and the choice of programming language. But notice that he doesn’t seem to have any data points on his first chart for the relatively newer, less “tricky”, and easier-to-program languages like Java, C#, Ruby, Python, et al.
So, do you think Les might have jumped the gun here by prematurely asserting the virtual independence of defect density on programming language?
When a failure occurs in a complex, networked, socio-technical system, the probability is high that the root cause is located far away from the failure detection point in time, space, or both. The progression in time goes something like this:
fault ──> error ──> error ──> error ──> failure discovered!
An unanticipated fault begets an error, which begets more errors, which beget still more errors, until the failure is manifest via loss of life or money somewhere, and sometime, downstream in the system. In the case of a software system, the time from fault to catastrophic failure may be milliseconds, but the distance between fault and failure can span hundreds of thousands of lines of source code sprinkled across multiple machines and networks.
Let’s face it. Envisioning, designing, coding, and testing for end-to-end “system level” error conditions in software systems is unglamorous and tedious (unless you’re using Erlang – which is thoughtfully designed to lessen the pain). It’s usually one of the first things to get jettisoned when the pressure is ratcheted up to meet some arbitrary schedule premised on a baseless, one-time estimate elicited under duress when the project was kicked off. Bummer.
In most cases, the “assignment” and “function” styles of initializing objects in C++ behave the same. However, as the example below shows, in some edge cases the function style of initialization can be more efficient. Nevertheless, for all practical purposes they are essentially the same, since the compiler may optimize away the actual assignment step in the two-step “assignment” style.
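Here’s a minimal sketch of the example under discussion (the details of class A are my reconstruction; the tracing strings and the trailing a2 = a1 statement match the discussion below):

```cpp
#include <iostream>

class A {
public:
    A(int v) : val(v) { std::cout << "Doing Conversion Ctor\n"; }
    A(const A& rhs) : val(rhs.val) { std::cout << "Doing Copy Ctor\n"; }
    A& operator=(const A& rhs) {
        val = rhs.val;
        std::cout << "Doing Copy Assignment\n";
        return *this;
    }
private:
    int val;
};

int main() {
    A a1(42);   // "function" style initialization
    A a2 = 42;  // "assignment" style initialization (no operator= involved)
    a2 = a1;    // true assignment between two existing objects
}
```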
The motivation for this post came from a somewhat lengthy debate with a fellow member of the “C++ Developers Group” on LinkedIn.com. I knew that a subtle difference between the two initialization styles existed, but I couldn’t remember where I had read about it. However, after I wrote this post, I browsed through my C++ references again and found the source. The difference is explained in a much more intelligible and elegant way in “Efficient C++: Performance Programming Techniques”. Specifically, the discussion and example code in Chapter 5, “Temporaries – Object Definition”, does the trick.
As my colleague pointed out, the above post is outright wrong with respect to the possibility of the assignment operator being used during initialization. Assignment is only used to copy values from one existing object into another existing object – not when an object is being created. That’s what constructors do. The “Doing Copy Assignment” text in the above code only prints to the console because of the last a2 = a1 statement in main(), which I put there to stop the g++ compiler from complaining about an unused variable. D’oh!
The example in the “Efficient” book that triggered our discussion is provided here:
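I won’t reproduce the book’s listing verbatim, but the three forms it compares look like this sketch (std::string stands in for the class used in the book):

```cpp
#include <string>

int main() {
    std::string s1("hello");                // form 1: direct ("function" style) initialization
    std::string s2 = "hello";               // form 2: copy ("assignment" style) initialization
    std::string s3 = std::string("hello");  // form 3: copy-initialization from an explicit temporary
}
```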
The authors go on to state:
Only the first form of initialization is guaranteed, across compiler implementations, not to generate a temporary object. If you use forms 2 or 3, you may end up with a temporary, depending on the compiler implementation. In practice, however, most compilers should optimize the temporary away, and the three initialization forms presented here would be equivalent in their efficiency.
Ever since I read that book many years ago, I’ve always preferred to use the “function” style initialization over the “assignment” style. But it’s just a personal preference.