Assume that we’ve just finished designing, testing, and integrating the system below:
Now let’s zoom in on the “as-built“, four class, design of SS2 (SubSystem 2). Assume its physical source tree is laid out as follows:
Given this design data after the fact, some questions may come to mind: How did the four class design cluster come into being? Did the design emerge first, the production code second, and the unit tests come third in a neat and orderly fashion? Did the tests come first and the design emerge second? Who gives a sh-t what the order and linearity of creation was, and perhaps more importantly, why would someone give a sh-t?
It seems that the TDD community thinks the way a design manifests is of supreme concern. You see, some hard core TDD zealots think that designing and writing the test code first ala a strict “red-green-refactor” personal process guarantees a “better” final design than any other way. And damn it, if you don’t do TDD, you’re a second class citizen.
BD00 thinks that as long as refactoring feedback loops exist between the designing-coding-testing efforts, it really doesn’t freakin’ matter which is the cart and which is the horse, nor even which comes first. TDD starts with a local, myopic view and iteratively moves upward towards global abstraction. DDT (Design Driven Test) starts with a global, hyperopic view and iteratively moves downward towards local implementation. A chaotic, hybrid, myopia-hyperopia approach starts anywhere and jumps back and forth as the developer sees fit. It’s all about the freedom to choose what’s best in the moment for you.
Notice that TDD says nothing about how the purely abstract, higher level, three-subsystem cluster (especially the inter-subsystem interfaces) that comprise the “finished” system should come into being. Perhaps the TDD community can (should?) concoct and mandate a new and hip personal process to cover software system level design?
It seems to have taken awhile, but people are finally speaking out strongly against the papal infallibility of the TDD high priesthood. Michał Bartyzel, Andrew Binstock, and Jim Coplien (who actually has been speaking out against it for years) are a few of these blasphemous heretics.
Any of the “driven” techniques can work, but to insinuate that they are the “only way(s)” to build superior products is arrogant, hubristic, and plain stupid. In mentally challenging, knowledge-intensive work, people have, and always will have, different ways of creating beautiful products.
When I speak with adherents of test-driven development (TDD) in particular, there is a seeming non-comprehension that truly excellent, reliable code was ever developed prior to the advent of this one practice. I sense their view that the long history of code that put man on the moon, ran phone switches, airline reservation systems, and electric grids was all the result of luck or unique talents, rather than a function of careful discipline and development rigor. – Andrew Binstock
“…these promises were not supported by unambiguous and verifiable data in early years of TDD. The enthusiastic reaction to TDD came first and then some (selective?) measurements were made to verify its promises. – Michał Bartyzel
“If you find your testers (or yourself) splitting up functions to support the testing process, you’re destroying your system architecture and code comprehension along with it. Test at a coarser level of granularity.” – Jim Coplien
Personally, I start sketching out quite a bit of a design upfront (OMG!!!!!) in “bent” UML (OMG, OMG!!!!!) prior to writing a single line of code. I then start writing the code and I subsequently write/run some selective unit tests on that code. In summary, I dynamically build/test the code base toward the coarse, “upfront” design as I go. Of course, according to the high priests of TDD, I’m unprofessional and I obviously produce inferior designs and bug-ridden, unmaintainable code. Gosh, it sux to be me.
For more details on how I develop software, check out this four year old post: PAYGO II. And no, I’m not promoting it as better than your personal process or, gasp, the unassailable holy grail of software development… TDD! No books, magazine articles, conference talks, or two-day certification courses are planned. It’s simply better for me, and perhaps only me.
Plucked from his deliciously titled “Real Architecture: Engineering or Pompous Bullshit?” slide deck, I give you Tom Gilb‘s personal principles of software architecture engineering:
Tom’s proactive approach seems like a far cry from the reactive approaches of the “emergent architecture” and TDA (Test Driven Architecture) communities, doesn’t it?
:)! Tom’s list actually uses the words “engineering” and “the architect“. Maybe that’s why I have always appreciated his work so much.
The figure below depicts an architectural view of a real-time embedded sub-system that I and a team of 8 others built and delivered 10 (freakin!) years ago. At revision number 9, the diagram ended up being the final “as-built” model of the 20,000+ lines-of-code system. Since the software was written in C and, thus, not object-oriented, I chose not to use UML to capture the design at the time. Doing so would have introduced an impedance mismatch and a large intellectual gap of misunderstanding between the procedural C code base and the OO design artifacts. I used structured analysis and functional decomposition to concoct the design and I employed “pseudo” Data Flow Diagrams (DFD) instead.
At the beginning of this “waterfall” project, I created revision 0 of the diagram as the first “build-to” snapshot. Of course, as learning accrued and the system evolved throughout development, I diligently kept the diagram updated and synchronized with the code base in true PAYGO fashion.
As you can see from the picture, the system of 30+ asynchronous application tasks ran under the tutelage of the industrial-strength VxWorks Real Time Operating System (RTOS). Asynchronous inter-task communication was performed via message passing through a series of lock-protected queues. The embedded physical board was powered by a Motorola PowerPC CPU (remember those dinosaurs?). The board housed a myriad of serial and ethernet interface ports for communication to other external sub-systems.
The above diagram was not the sole artifact that I used to record the design. It was simply the highest level, catch-all, overview of the system. I also developed a complementary set of lower level functional diagrams; each of which captured a sliced view of an end-to-end strand of critical functionality. One of these diagrams, the “Uplink/Downlink Processing View“, is shown below. Note that the final “as-built” diagram settled out as revision number 5.
The purpose of this post was simply to give you a taste of how I typically design and evolve a non-trivial software-intensive system that I can’t entirely keep in my head. I use the same PAYGO process for all of my efforts regardless of whether the project is being managed as an agile or waterfall endeavor. To me, project process is way over-emphasized and overblown. “Business Value” creation ultimately distills down to architecture, design, coding, and testing at all levels of abstraction.
The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise. — Edsger Dijkstra
With Edsger’s delicious quote in mind, let’s explore seven levels of abstraction that can be used to reason about big, distributed, systems:
At level zero, we have the finest grained, most concrete unit of design, a single puny line of “source code“. At level seven, we have the coarsest grained, most abstract unit of design, the mysterious and scary “system” level. A line of code is simple to reason about, but a “system” is not. Just when you think you understand what a system does, BAM! It exhibits some weird, perhaps dangerous, behavior that is counter-intuitive and totally unexpected – especially when humans are the key processing “nodes” in the beast.
Here are some questions to ponder regarding the seven level stack: Given that you’re hired to build a big, distributed system, at what level would you start your development effort? Would you start immediately coding up classes using the much revered TDD “best practice” and let all the upper levels of abstraction serendipitously “emerge”? Relatively speaking, how much time “up front” should you spend specifying, designing, recording, communicating the structures and behaviors of the top 3 levels of the stack? Again, relatively speaking, how much time should be allocated to the unit, integration, functional, and system levels of testing?
I’m not a fan of “emergent global architecture“, but I AM a fan of “emergent local design“. To mitigate downstream technical and financial risk, I believe that one has to generate and formally document an architecture at a high level of abstraction before starting to write code. To do otherwise would be irresponsible.
The figure below shows a portion of an initial “local” design that I plucked out of a more “global” architectural design. When I started coding and unit testing the cluster of classes in the snippet, I “discovered” that the structure wasn’t going work out. The API of the architectural framework within which the class cluster runs wouldn’t allow it to work without some major, internal, restructuring and retesting of the framework itself.
After wrestling with the dilemma for a bit, the following workable local design emerged out of the learning acquired via several wretched attempts to make the original design work. Of course, I had to throw away a bunch of previously written skeletal product and test code, but that’s life. Now I’m back on track and moving forward again. W00t!
Assume we have a valuable, revenue-critical software system in operation. The figure below shows one nice and tidy, powerpoint-worthy way to model the system; as a static, enumerated set of executables and libraries.
Given the model above, we can express the size of the system as:
Now, say we run a tool on the code base and it spits out a system size of 200K “somethings” (lines of code, function points, loops, branches, etc).
What does this 200K number of “somethings” absolutely tell us about the non-functional qualities of the system? It tells us absolutely nothing. All we know at the moment is that the system is operating and supporting the critical, revenue generating processes of our borg. Even relatively speaking, when we compare our 200K “somethings” system against a 100K “somethings” system, it still doesn’t tell us squat about the qualities of our system.
So, what’s missing here? One missing link is that our nice and tidy enumerations view and equation don’t tell us nuttin’ about what Russ Ackoff calls “the product of the interactions of the parts” (e.g Lib-to-Lib, Exe-Exe). To remedy the situation, let’s update our nice and tidy model with the part-to-part associations that enable our heap of individual parts to behave as a system:
Our updated model is still nice and tidy, but just not as nice and tidy as before. But wait! We are still missing something important. We’re missing a visual cue of our system’s interactions with “other” systems external to us; you know, “them”. The “them” we blame when something goes wrong during operation with the supra-system containing us and them.
Our updated model is once again still nice and tidy, but just not as nice and tidy as before. Next, let’s take a single snapshot of the flow of (red) “blood” in our system at a given point of time:
Finally, if we super-impose the astronomic number of all possible blood flow snapshots onto one diagram, we get:
D’oh! We’re not so nice and tidy anymore. Time for some heroic debugging on the bleeding mess. Is there a doctor in da house?
Having recently watched a newer incarnation of Barbara Liskov‘s terrific Turing award acceptance speech on InfoQ.com, “The Power Of Abstraction“, I started doodling on my visio canvas to see where it would take me. Somehow, I wanted to explore how the use of abstraction imbues power to its wielders.
The figure below attempts to represent 3 different software designs that can result from the analysis of a given set of requirements (how the requirements came to be “given” in the first place is a whole ‘nother issue).
On the left, we have a seven class solution candidate (C1….. C7 ) organized as three layers of abstraction. On the right, we have a three class flat solution (FC1, FC2, FC3) that implements the same functionality (e.g. FC1 encapsulates the functionality of C1 + C4 + C7). For dramatic contrast, we have a fugly, single-class, monolith in the middle with all the solution functionality entombed within the MC1 class sarcophagus.
So, what advantage, if any, does the three tier, abstract design give stakeholders over the two, flat, down-to-earth designs? Depending on the requirements specifics, it may offer up no advantage and might actually be the worst candidate in terms of code-ability, understandability, and maintainability. There are more “parts” and more inter-part interfaces. It may be overkill to transform the requirements into 3 layers of abstraction before (or during?) coding.
However, as a system to be coded gets larger and more complex, the intelligent use of abstract vertical layering and horizontally balancing can speed up system development and decrease maintenance costs via increased readability and understandability from multiple viewing angles. For large systems, conceptual “chunking“, both vertically in the form of layering and horizontally in the form of balancing is a winning strategy; especially when coupled with Miller’s magic number 7 (no more than 7 +/- 2 abstract elements within a given layer and no more than 7 +/- 2 abstract layers in the stack). Relatively speaking, the smaller, bounded parts can be doled out to team members more easily and integration will be less painful.
Note that doing some just-enough “pre-planning” in terms of layering/balancing the system’s structure/behavior seems to fly in the face of TDD – where you sprinkle a bunch of user stories from the backlog onto a group of programmers and have them start writing tests so that the design can miraculously emerge. But, as the saying goes: “whatever floats your boat“.
When not ranting and raving on this blawg about “great injustices” (LOL) that I perceive are keeping the world from becoming a better place, I design, write, and test real-time radar system software for a living. I use the UML before, during, and after coding to capture, expose, and reason about my software designs. The UML artifacts I concoct serve as a high level coding road map for me; and a communication tool for subject matter experts (in my case, radar system engineers) who don’t know how to (or care to) read C++ code but are keenly interested in how I map their domain-specific requirements/designs into an implementable software design.
I’m not a UML language lawyer and I never intend to be one. Luckily, I’m not forced to use a formal UML-centric tool to generate/evolve my “bent” UML designs (see what I mean by “bent” UML here: Bend It Like Fowler). I simply use MSFT Visio to freely splat symbols and connections on an e-canvas in any way I see fit. Thus, I’m unencumbered by a nanny tool telling me I’m syntactically/semantically “wrong!” and rudely interrupting my thought flow every five minutes.
The 2nd graphic below illustrates an example of one of my typical class diagrams. It models a small, logically cohesive cluster of cooperating classes that represent the “transmit timeline” functionality embedded within a larger “scheduler” component. The scheduler component itself is embedded within yet another, larger scale component composed of a complex amalgam of cooperating hardware and software components; the radar itself.
When fully developed and tested, the radar will be fielded within a hostile environment where it will (hopefully) perform its noble mission of detecting and tracking aircraft in the midst of random noise, unwanted clutter reflections, cleverly uncooperative “enemy” pilots, and atmospheric attenuation/distortion. But I digress, so let me get back to the original intent of this post, which I think has something to do with how and why I use the UML.
The radar transmit timeline is where other necessarily closely coupled scheduler sub-components add/insert commands that tell the radar hardware what to do and when to do it; sometime in the future relative to “now“. As the radar rotates and fires its sophisticated, radio frequency pulse trains out into the ether looking for targets, the scheduler is always “thinking” a few steps ahead of where the antenna beam is currently pointing. The scheduler relentlessly fills the TxTimeline in real time with beam-specific commands. It issues those commands to the hardware early enough for the hardware to be able to queue, setup, and execute the minute transmit details when the antenna arrives at the desired command point. Geeze! I’m digressing yet again off the UML path, so lemme try once more to get back to what I originally wanted to ramble about.
Being an unapologetic UML bender, and not a fan of analysis-paralysis, I never attempt to meticulously show every class attribute, operation, or association on a design diagram. I weave in non-UML symbology as I see fit and I show only those elements I deem important for creating a shared understanding between myself and other interested parties. After all, some low level attributes/operations/classes/associations will “go away” as my learning unfolds and others will “emerge” during coding anyway, so why waste the time?
Notice the “revision number” in the lower right hand corner of the above class diagram. It hints that I continuously keep the diagram in sync with the code as I write it. In fact, I keep the applicable diagram(s) open right next to my code editor as I hack away. As a PAYGO practitioner, I bounce back and forth between code & UML artifacts whenever I want to.
The UML sequence diagram below depicts a visualization of the participatory role of the TxTimeline object in a larger system context comprised of other peer objects within the scheduler. For fear of unethically disclosing intellectual property, I’m not gonna walk through a textual explanation of the operational behavior of the scheduler component as “a whole“. The purpose of presenting the sequence diagram is simply to show you a real case example that “one diagram is not enough” for me to capture the design of any software component containing a substantial amount of “essential complexity“. As a matter of fact, at this current moment in time, I have generated a set of 7+ leveled and balanced class/sequence/activity diagrams to steer my coding effort. I always start coding/testing with class skeletons and I iteratively add muscles/tendons/ligaments/organs to the Frankensteinian beast over time.
In this post, I opened up my trench coat and
showed you my… attempted to share with you an intimate glimpse into the way I personally design & develop software. In my process, the design is not done “all upfront“, but a purely subjective mix of mostly high and low level details is indeed created upfront. I think of it as “Big Design, But Not All Upfront“.
Despite what some code-centric, design-agnostic, software development processes advocate, in my mind, it’s not just about the code. The code is simply the lowest level, most concrete, model of the solution. The practices of design generation/capture and code slinging/testing in my world are intimately and inextricably coupled. I’m not smart enough to go directly to code from a user story, a one-liner work backlog entry, a whiteboard doodle, or a set of casual, undocumented, face-to-face conversations. In my domain, real-time surveillance radar systems, expressing and capturing a fair amount of formal detail is (rightly) required up front. So, screw you to any and all NoUML, no-documentation, jihadists who happen to stumble upon this post. :)