September 29, 2014

Code Inspections: A Powerful Testing Tool

Quick post about code inspections.

Now, when I say "code inspections" to most developers, it conjures up images of teams sitting in airless meeting rooms staring at someone's code and raising issues about style, about formatting, about naming conventions, about architectural patterns and so on.

But this is not what I really mean by "code inspections". For me, first and foremost, inspections are a kind of testing. In fact, statistically speaking, inspections are still the most effective testing technique we know of.

Armed with a clear understanding of what is required of a program (and in this context, "program" means a unit of executable code), we inspect that code and ask if it does what we think it should - "at this point, is what we think should be true actually true?" - and look for possibilities of when it could go wrong and ask "could that ever happen?"



For example, there's a bug in this code. Can you see it?
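Here's a hypothetical Java sketch of the kind of method under discussion (the class and method names, and the 0, 1 starting values, are assumptions; sequenceLength and the summing statement come from the discussion below):

public class Fibonacci {

    // A hypothetical reconstruction, for illustration only.
    public static int[] generate(int sequenceLength) {
        int[] sequence = new int[sequenceLength];
        sequence[0] = 0;
        sequence[1] = 1;

        for (int i = 2; i < sequenceLength; i++) {
            sequence[i] = sequence[i - 1] + sequence[i - 2];
        }

        return sequence;
    }
}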

Instead of writing a shitload of unit tests, hoping the bug will reveal itself, we can do what is often referred to as a bit of whitebox testing. That is, we test the code in full knowledge of what the code is internally. The economics of inspections become clear when we realise the full advantage of not going into testing blind, like we do with blackbox testing. We can see where the problems might be.

There are numerous ways we can approach code inspections, but I find a combination of two specific techniques most effective:

1. Guided Inspection - specifically, inspections guided by test cases. We choose an interesting test case, perhaps by looking at the code and seeing where interesting boundaries might occur. Then we execute the code in our heads, or on paper, showing how program state is affected with each executable step and asking "what should be true now?" at key points in execution. Think of this as pre-emptive debugging.

2. Program Decomposition - break the code down into the smallest executable units, and reason about the contracts that apply to each unit. (Yes, everything in code that does something has pre- and post-conditions.)

In practice, these two techniques work hand in hand. Reasoning about the contracts that apply to the smallest units of executable code can reveal interesting test cases we didn't think of. It also helps us to "fast-forward" to an interesting part of a wider test scenario without having to work through the whole process up to that point.

For example, let's focus on the program statement:

sequence[i] = sequence[i - 1] + sequence[i - 2]

What happens if the result of evaluating sequence[i - 1] + sequence[i - 2] is a number too large to fit in a Java int? The maximum value of an int in Java is 2,147,483,647. This code might only work when the sum of the previous two numbers is less than or equal to that max value.

Fibonacci numbers get pretty big pretty fast. In fact, the sum of the 48th and 49th numbers in the sequence is greater than 2,147,483,647.

And there's our bug. This code will only work up to the 49th Fibonacci number. To make it correct, we have choices; we could add a guard condition that blocks any sequenceLength greater than 49. (Or, for fans of Design By Contract, simply make it a pre-condition that sequenceLength < 50.)

Or we could change the data type of the sequence array to something that will hold much larger numbers (e.g., a long integer - although that, too, has its limits, so we may still need to guard against sequence lengths that go too high eventually.)
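As a sketch of those two options, applied to the hypothetical Fibonacci class above (the method names are invented, and the figure of 50 is simply the boundary quoted in this post):

// Option 1: a guard clause / pre-condition, using the boundary quoted above.
public static int[] generateGuarded(int sequenceLength) {
    if (sequenceLength >= 50) {
        throw new IllegalArgumentException("sequenceLength must be less than 50 to avoid int overflow");
    }
    return generate(sequenceLength);
}

// Option 2: switch to long, which holds much larger values but still overflows eventually.
public static long[] generateAsLongs(int sequenceLength) {
    long[] sequence = new long[sequenceLength];
    sequence[0] = 0;
    sequence[1] = 1;
    for (int i = 2; i < sequenceLength; i++) {
        sequence[i] = sequence[i - 1] + sequence[i - 2];
    }
    return sequence;
}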

Now, of course, unit tests might have revealed this bug. But without exception, when teams tackle the Fibonacci sequence TDD exercise in training, if they spot it at all, they spot it by reading the code. And, in this example, it would be extraordinary luck to choose exactly the right input - sequenceLength = 50 - to tell us where the boundary lies. So we'd probably still have to debug the code after seeing the test fail.

Reading the code, you may well spy other potential bugs.

And, yes, I know. This is a very trivial example. But that only reinforces my point: it's a very trivial example with bugs in it!

Try it today on some of your code. Are there any bugs lurking in there that nobody's found yet? I'll bet you a shiny penny there are.



September 28, 2014

Software Development Isn't Just Programming

I was alerted this morning to a piece on wired.com about a new website, exercism.io, that promises programmers the "deep practice" they need to become good enough to do it for a living.

You won't be at all surprised to learn that I disagree, or why.

Software development, as distinct from programming, requires a wide and complex set of disciplines. The part where we write the code is, in reality, just a small part of what we do.

In particular, how can such sites exercise that most important of developer muscles: collaboration?

The Codemanship Team Dojo, which I occasionally enjoy putting client teams through and have run with some large groups at conferences and workshops, provides a very clear illustration of why a team full of strong programmers is not necessarily a strong team.

Problems that seem, in isolation, trivial for a programmer to solve can quickly become very complicated indeed the minute we add another programmer. (Hell is other programmers.)

Folk have got to understand each other. Folk have got to agree on stuff. Folk have got to coordinate their work. And folk have got to achieve things with limited time and resources.

This, it turns out, is the hard stuff in software development.

I know some technically very strong developers who consistently seem to fall flat on their faces on real teams working on real problems for real customers.

Don't get me wrong; I think the technical skills are very important. But I consider them a foundation for a software developer, and not the be-all-and-end-all.

For my money, a well-rounded learning experience for an aspiring software developer would include collaboration within something approximating a real team, would include something approximating a human customer (or customers, just to make it more interesting), would require the resulting software to be deployed in front of something approximating real users, and would encompass a wider set of technical disciplines that we find are necessary to deliver working software.

And their contention that professional programmers need "hundreds of hours" of practice is missing a trailing zero.






September 25, 2014

Functional Programming Is Great. But It Ain't Magic.

An increasing annoyance in my day-to-day job as a coach and trainer is what I call FPF, or "Functional Programming Fanaticism". Typically, it emanates from people who've discovered FP in the last few years and have yet to realise that - like all programming innovations since the 1940s - it doesn't actually solve all the problems for us.

Putting aside the widely-held perception that functional programs can be considerably less easy to understand, even for experienced FP-ers (and this is no small consideration when you realise that trying to understand program code is where programmers spend at least half of our time), there is the question of side effects.

More specifically, people keep telling me that functional programs don't have any. This is patently not true: a program with no side effects is a program which does nothing of any use to us. Somewhere, somehow, data's got to change. Try writing a word processor that doesn't have side effects.

FP helps us write more reliable code - in particular, more reliable concurrent code - by limiting and localising side effects. But only if you do it right.

It's entirely possible to write functional programs that are riddled with concurrency errors, and, indeed, that's what many teams are doing as we speak.

How can this be so, though, if functions are said to be "clean" - side-effect free? Well, that bank account balance that gets passed from one function to the next may indeed be a copy (of a copy of a copy) of the original balance, but from the external user's perspective, whatever the current balance is, that is the balance (and it has changed.)

The moment we persist that change (e.g., by writing it to the database, or through transactional memory, or however we're handling shared data), the deed is done. Ipso facto: side effect.

Languages like Haskell, Clojure and that other one that sounds like "Camel" don't do our concurrent thinking for us. If joint account holder A checks their balance before trying to use the debit card, but joint account holder B uses their debit card before A does, then - you may be surprised to learn - these languages have no built-in feature for reconciling joint account transaction paradoxes like this. You have to THINK ABOUT HOW YOUR SOFTWARE SHOULD HANDLE CONCURRENT SCENARIOS from the user's perspective.
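To make that concrete, here's a small, hypothetical sketch (in Java rather than an FP language, but the shape is the same): the balance calculation is a pure function of its inputs, the side effect is persisting its result to the shared store, and that's exactly where the check-then-act race lives.

import java.util.concurrent.atomic.AtomicLong;

public class JointAccount {

    // Stands in for the database / transactional store - the shared, persisted reality.
    private final AtomicLong persistedBalance = new AtomicLong(100);

    // "Clean" calculation: takes a balance, returns a new one. No side effects here.
    private static long debited(long balance, long amount) {
        return balance - amount;
    }

    // The side effect lives here: persisting the result of the pure calculation.
    public void spend(long amount) {
        long balance = persistedBalance.get();              // holder checks the balance...
        if (balance >= amount) {                            // ...sees enough money...
            persistedBalance.set(debited(balance, amount)); // ...but the other holder may have spent it in between
        }
    }

    public static void main(String[] args) throws InterruptedException {
        JointAccount account = new JointAccount();
        Thread holderA = new Thread(() -> account.spend(80));
        Thread holderB = new Thread(() -> account.spend(80));
        holderA.start();
        holderB.start();
        holderA.join();
        holderB.join();
        // With an unlucky interleaving, both spends are approved even though there's
        // only enough money for one - immutable balance values didn't save us.
        System.out.println("Balance: " + account.persistedBalance.get());
    }
}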

In non-FP work, we seek to make concurrent systems more reliable and more, well, concurrent, by strictly limiting and localising concurrent access to shared data. FP just embeds this concept within the languages themselves, making that easier and more reliable to do.

Just as workflow frameworks don't decide what should happen in your workflows, functional programs don't decide how your application should handle side-effects. The best they can do is give you the tools to realise the decisions you make.

What I'm seeing, though (and this was the case when we were all prostrating before the Great Workflow Ju Ju In The Sky a decade or so ago), is that teams mistakenly lean on the technology, believing through some kind of magic that it will handle these scenarios for them. But, like all computer programs, they will only do exactly what we tell them to.

It's not magic, folks. It's just code.


September 17, 2014

The 4 C's of Continuous Delivery

Continuous Delivery has become a fashionable idea in software development, and it's not hard to see why.

When the software we write is always in a fit state to be released or deployed, we give our customers a level of control that is very attractive.

The decision when to deploy becomes entirely a business decision; they can do it as often as they like. They can deploy as soon as a new feature or a change to an existing feature is ready, instead of having to wait weeks or even months for a Big Bang release. They can deploy one change at a time, seeing what effect that one change has and easily rolling it back if it's not successful without losing 1,001 other changes in the same release.

Small, frequent releases can have a profound effect on a business' ability to learn what works and what doesn't from real end users using the software in the real world. It's for this reason that many, including myself, see Continuous Delivery as a primary goal of software development teams - something we should all be striving for.

Regrettably, though, many software organisations don't appreciate the implications of Continuous Delivery on the technical discipline teams need to apply. It's not simply a matter of decreeing from above "from now on, we shall deliver continuously". I've watched many attempts to make an overnight transition fall flat on their faces. Continuous Delivery is something teams need to work up to, over months and years, and keep working at even after they've achieved it. You can always be better at Continuous Delivery, and for the majority of teams, it would pay dividends to improve their technical discipline.

So let's enumerate these disciplines; what are the 4 C's of Continuous Delivery?

1. Continuous Testing

Before we can release our software, we need confidence that it works. If our aim is to make the software available for release at a moment's notice, then we need to be continuously reassuring ourselves - through testing - that it still works after we've made even a small change. The secret sauce here is being able to test and re-test the software to a sufficiently high level of assurance quickly and cheaply, and for that we know of only one technical practice that seems to work: automating our tests. It's for this reason that a practice like Test-driven Development, which leaves behind a suite of fast-running automated tests (if you're doing TDD well), is a cornerstone of the advice I give for transitioning to Continuous Delivery.
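For illustration, a fast-running test of the kind TDD leaves behind might look something like this - a hypothetical example, reusing the Fibonacci sketch from the inspections post above and assuming JUnit 4 on the classpath:

import org.junit.Test;
import static org.junit.Assert.assertArrayEquals;
import static org.junit.Assert.assertEquals;

public class FibonacciTest {

    @Test
    public void generatesTheFirstFiveNumbers() {
        assertArrayEquals(new int[] {0, 1, 1, 2, 3}, Fibonacci.generate(5));
    }

    @Test
    public void eachNumberIsTheSumOfThePreviousTwo() {
        int[] sequence = Fibonacci.generate(20);
        for (int i = 2; i < sequence.length; i++) {
            assertEquals(sequence[i - 1] + sequence[i - 2], sequence[i]);
        }
    }
}

A whole suite of tests like these runs in milliseconds, which is what makes re-testing after every small change affordable.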

2. Continuous Integration

As well as helping us to flag up problems in integrating our changes into a wider system, CI is also fundamental to Continuous Delivery. If it's not in source control, it's going to be difficult to include it in a release. CI is the metabolism of software development teams, and a foundation for Continuous Delivery. Again, automation is our friend here. Teams that have to manually trigger compilation of code, or do manual testing of the built software, will not be able to integrate very often. (Or, more likely, they will integrate, but the code in their VCS will, as likely as not, be broken at any given point in time.)

3. Continuous Inspection

With the best will in the world, if our code is hard to change, changing it will be hard. Code tends to deteriorate over time; it gets more complicated, it fills up with duplication, it becomes like spaghetti, and it gets harder and harder to understand. We need to be constantly vigilant to the kind of code smells that impede our progress. Pair Programming can help in this respect, but we find it insufficient to achieve the quality of code that's often needed. We need help in guarding against code smells and the ravages of entropy. Here, too, automation can help. More advanced teams use tools that analyse the code and detect and report code smells. This may be done as part of a build, or the pre-check-in process. The most rigorous teams will fail a build when a code smell is detected. Experience teaches us that when we let code quality problems through the gate, they tend to never get addressed. Implicit in ContInsp is Continuous Refactoring. Refactoring is a skill that many - let's be honest, most - developers are still lacking in, sadly.

Continuous Inspection doesn't only apply to the code; smart teams are very frequently showing the software to customers and getting feedback, for example. You may think that the software's ready to be released because it passes some automated tests. But if the customer hasn't actually seen it yet, there's a significant risk that we end up releasing something that we've fundamentally misunderstood. Only the customer can tell us when we're really "done". This is a kind of inspection. Essentially, any quality of the software that we care about needs to be continuously inspected.

4. Continuous Improvement

No matter how good we are at the first 3 C's, there's almost always value in being better. Developers will ask me "How will we know if we're over-doing TDD, or refactoring?", for example. The answer's simple: hell will have frozen over. I've never seen code that was too good, never seen tests that gave too much assurance. In theory, of course, there is a danger of investing more time and effort into these things than the pay-offs warrant, but I've never seen it in all my years as a professional developer. Sure, I've seen developers do these things badly. And I've seen teams waste a lot of time because of that. But that's not the same thing as over-doing it. In those cases, Continuous Improvement - continually working on getting better - helped.

DevOps in particular is one area where teams tend to be weak. Automating builds, setting up CI servers, configuring machines and dealing with issues like networking and security is low down on the average programmer's list of must-have skills. We even have a derogatory term for it: "shaving yaks". And yet, DevOps is pretty fundamental to Continuous Delivery. The smart teams work on getting better at that stuff. Some get so good at it they can offer it to other businesses as a service. This, folks, is essentially what cloud hosting is - outsourced DevOps.

Sadly, software organisations who make room for improvement are in a small minority. Many will argue "We don't have the time to work on improving". I would argue that's why they don't have the time.







September 16, 2014

Why We Iterate

So, in case you were wondering, here's my rigorous and highly scientific process for buying guitars...

It starts with a general idea of what I think I need. For example, for a couple of years now I've been thinking I need an 8-string electric guitar, to get those low notes for the metalz.

I then shop around. I read the magazines. I listen to records and find out what guitars those players used. I visit the manufacturers' websites and read the specifications of the models that might fit. I scout the discussion forums for honest, uncensored feedback from real users. And gradually I build up a precise picture of exactly what I think I need, down to the wood, the pickups, the hardware, the finish etc.

And then I go to the guitar shop and buy a different guitar.

Why? Because I played it, and it was good.

Life's full of expectations: what would it be like to play one of Steve Vai's signature guitars? What would it be like to be a famous movie star? What would it be like to be married to Uma Thurman?

In the end, though, there's only one sure-fire way to know what it would be like. It's the most important test of all. Sure, an experience may tick all of the boxes on paper, but reality is messy and complicated, and very few experiences can be completely summed up by ticks in boxes.

And so it goes with software. We may work with the customer to build a detailed and precise requirements specification, setting out explicitly what boxes the software will need to tick for them. But there's no substitute for trying the software for themselves. From that experience, they will learn more than weeks or months or years of designing boxes to tick.

We're on a hiding to nothing sitting in rooms trying to force our customers to tell us what they really want. And the more precise and detailed the spec, the more suspicious I am of it. The bottom line is they just don't know. But if you ask them, they will tell you. Something. Anything.

Now let me tell you how guitar custom shops - the good ones - operate.

They have a conversation with you about what guitar you want them to create for you. And then they build a prototype of what you asked for. And then - and this is where most of the design magic happens - they get you to play it, and they watch and they listen and they take notes, and they learn a little about what kind of guitar you really want.

Then they iterate the design, and get you to try that. And then rinse and repeat until your money runs out.

With every iteration, the guitar's design gets a little bit less wrong for you, until it's almost right - as right as they can get it with the time and money available.

Custom guitars can deviate quite significantly from what the customer initially asked for. But that is not a bad thing, because the goal here is to make them a guitar they really need; one that really suits them and their playing style.

In fact, I can think of all sorts of areas of life where what I originally asked for is just a jumping-off point for finding out what I really needed.

This is why I believe that testing - and then iterating - is most importantly a requirements discipline. It needs to be as much, if not more, about figuring out what the customer really needs as it is about finding out if we delivered what they asked for.

The alternative is that we force our customers to live with their first answers, refusing to allow them - and us - to learn what really works for them.

And anyone who tries to tell you that it's possible to get it right - or even almost right - first time, is a ninny. And you can tell them I said that.


September 13, 2014

What Is Continuous Inspection, Anyway?

Following on from my braindump about Continuous Inspection (ContInsp, as I've recently started to abbreviate it), a few people have asked "What is Continuous Inspection?"

I may have been getting ahead of myself, and overestimated how far this idea has reached. So here's a quick lowdown on Continuous Inspection for the uninitiated.

Just as we discovered that it works out better if we test our software frequently throughout development, instead of having one big testing phase near the end, teams have also learned that instead of having occasional Big Code Inspections - and we all know how useful they are - it's better to inspect the code frequently throughout development.

Catching code quality problems earlier can have similar benefits to catching bugs earlier. Firstly, they're cheaper to rectify when there's less code surrounding them (the equivalent of weeding before a thick jungle grows around the weeds that we'll have to hack our way through). And secondly, when we talk about "code quality problems", we're as often as not talking about design issues that can hinder progress later on. So, the earlier we tackle code quality problems, the cheaper they tend to be to fix, and the faster we tend to go because of lower impedance in our code.

There's also the question of focus. It's said that if you show a programmer a line of code and ask "What's wrong with this code?", they'll tell you. If you show them 1,000 lines of code, they'll shrug their shoulders and say "Looks okay to me."

Better to deal with one design issue at a time, and to fix them as soon as they appear.

For this reason, we see a need - well, those of us who give a damn - to be continuously monitoring the code for the appearance of code quality problems, and to be eliminating them as soon as we spot them.

On XP teams, there are a number of techniques that can help in this respect:

1. Pair Programming - when done well (i.e., with sufficient attention paid to code quality), having a second pair of eyes on the code as it's being written can help spot problems earlier. Pair Programming can also help raise awareness of what code quality problems look like among team members.

2. Test-driven Development - for two reasons: a. because it focuses us on making one or two design decisions at a time, and b. because it explicitly provides a reminder after passing each test that tells us to LOOK CAREFULLY AT THE CODE NOW and refactor it until we're happy with the quality before we move on to the next failing test.

3. Automated Code Inspections - this final technique is still relatively rare among teams, but is growing in popularity. It has the same advantage as automated tests, in that it can help us to continually re-inspect the quality of our code, asking many questions about many lines of code, cheaply and quickly. Arguably, ContInsp doesn't really scale until you start using automation for at least some of it.

Teams are increasingly seeing automated code inspections as the best way to guard against code rot. At the very least, it can provide a clear line that developers must not cross when checking in their code, with alarm bells ringing when someone does.

To give you an example, let's say we agree as a team that methods shouldn't make more than one decision or contain more than one loop. We duly program our static code analysis tool to flag up any methods with a cyclomatic complexity (a measure of the number of paths through the code) of more than 2. Bill writes a method with an IF statement inside a FOR loop, and tries to check his code in. The alarm goes off, the team goes to DEFCON ONE, and the problem has to be fixed - by, say, extracting the IF statement into its own self-describing method - before anyone else can check their code in.
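By way of a hypothetical illustration (the names are invented, and the complexity numbers follow the usual counting of one per decision point, plus one):

public class OrderTotals {

    // Cyclomatic complexity 3 (one FOR, one IF) - this is the version that sets off the alarm.
    static double totalOfTaxableItems(double[] prices, boolean[] taxable) {
        double total = 0;
        for (int i = 0; i < prices.length; i++) {
            if (taxable[i]) {
                total += prices[i];
            }
        }
        return total;
    }

    // After extracting the IF into its own self-describing method,
    // each method has a complexity of 2, and the build goes green again.
    static double totalOfTaxableItemsRefactored(double[] prices, boolean[] taxable) {
        double total = 0;
        for (int i = 0; i < prices.length; i++) {
            total += taxableAmount(prices[i], taxable[i]);
        }
        return total;
    }

    // The extracted method holds the single remaining decision.
    private static double taxableAmount(double price, boolean isTaxable) {
        if (isTaxable) {
            return price;
        }
        return 0;
    }
}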

This may seem very strict to you, but at the very least, teams should consider having the early warning, even if it's just to trigger a discussion and a chance to consciously make a decision as to whether to let it through.

The alternative is what we currently have, where developers can merrily check in code riddled with new smells and nobody finds out until the cement has set, so to speak.

On new code, it's a good idea to set the quality bar as high as you comfortably can, because as the code grows and becomes more complex, it's better to knowingly make small concessions and compromises on quality than to realise too late that the quality stinks and it's going to take you a year to make the code anything like as maintainable as you need for adding new features.

Legacy code, though, is a different matter. Likely, it's already crammed full of quality problems. Applying your usual high quality bar will just reveal a yawning chasm that will demoralise the team. (Unless your goal is simply to benchmark, of course.)

I recommend in that instance using inspections in an exploratory way: the goal being to discover what the internal quality of the code is. Once you've established the current reality, draw a line there and set up your ContInsp to enforce it not getting any worse. Then, as you begin to refactor the code, and as the quality improves, raise the bar to meet it, always enforcing the policy that it should get no worse.

Gradually, as your code is rehabilitated back into a condition that can accommodate change, you tighten the screws on the vice of quality metrics that are holding it where it is, until all the dials are back in the green.

And do not, under any circumstances, make these code quality reports available to managers. Let them focus on what matters to the business, while you focus on sustaining the pace of frequent, reliable software deliveries.

Which brings me to my final point about Continuous Inspection: just as we see continuous testing and continuous integration as keys to achieving Continuous Delivery, so too should we consider adding ContInsp to that mix (along with Continuous Improvement), because if our code gets harder and harder to change, then we're going to end up Continuously Delivering software that has no added value.



September 12, 2014

Exothermic vs. Endothermic Change - Why Coaches Should Be A Match, Not The Sun

An analogy I sometimes use to explain my approach to promoting positive change in software development organisations is the difference between exothermic and endothermic reactions.

If you think back to high school chemistry, an exothermic reaction is one that generates heat from within. The combustion of fuels like petrol and wood is an example of an exothermic reaction. Sitting around a campfire on a cold night is one way in which we benefit from exothermic reactions.

Conversely, an endothermic reaction is one that draws energy (heat) in from its surroundings to make it go. For example, photosynthesis in plants is an endothermic reaction powered by the sun.

The key here to understanding the Codemanship way is to appreciate that if the sun stops shining then photosynthesis stops, too. Whereas a campfire may keep on burning until all the useful fuel - in the form of combustible carbohydrates - is used up. In the case of the campfire, the reaction is triggered by an outside force - e.g., a match - but once the fire's going it sustains itself from within. An endothermic reaction needs continued outside stimulation - a constant input of external energy - or it stops.

Projecting that idea - albeit spuriously - on to fostering change in dev teams, as an outside force I would rather be a match lighting a campfire than the sun driving chemical reactions in a plant. (The two are, of course, related. The energy we're burning on the campfire came from the sun via photosynthesis, but that's the problem with analogies.)

My approach is to turn up, inject a big dose of external energy into the system, and try to get a fire started. For that, we need the system to have its own fuel. This is the people, and their energy and enthusiasm for doing things better.

The conditions need to be right, or once I stop injecting energy, the reaction will stop, too. Many development teams are the equivalent of damp wood, their enthusiasm having been dampened by years of hard grind and demotivation. They need some preparing before we can light the fire.

The calories to burn, though, are always there. It's not easy becoming even a mediocre software developer. There would have been a time when anyone who does it for a living was enthused and motivated to work through the pain and learn how to make programs work. That enthusiasm is rarely lost forever, even though it may often be buried deep beneath the battle-scarred surface.

So my focus tends to be on recapturing that joy of programming so that the latent energy and enthusiasm can be more easily ignited, starting a self-sustaining process of change that mostly comes from within the teams and doesn't have to be continually driven from outside.

This is why, putting specific practices and technical knowledge aside, Codemanship is chiefly about addressing developer culture. Workshops on TDD, refactoring, OO design and all manner of goodly Extreme stuff are really just hooks on which to hang that hat: an excuse to have conversations about what being a software developer means to you, about the developer culture in your organisation, and to do more than a little rabble rousing. That you leave having learned something about red-green-refactor is arguably less important than if you leave thinking "I'm as mad as hell, and I'm not going to take it any more!"

This is all because I believe that writing software can be, and for some people, is the best job in the world. Well, maybe not the best - but it's certainly got the potential to be a great way to make a living. I wake up every day thankful that I get to do this. It pains me to see developers who've had that beaten out of them by the school of hard knocks.

Real long-term change seems always to come from within. It has to be a self-sustaining process, driven almost unconsciously by teams who love what they do. Ultimately, it's something teams must do for themselves. All I can do is light the match.




September 11, 2014

Code Metrics, Continuous Inspection & What We Should Tell The Boss

So, just a bit of time during the lunch break to jot down thoughts about a debate we had this morning while running a client workshop on Continuous Inspection. (Yes, that's a thing.)

ContInsp - as I have just decided to call it, because "CI" is already taken - is the practice of frequently monitoring the code quality of our software, usually through a combination of techniques like pair programming, static analysis and more rigorous code reviews, to give us early warning about problems that have recently been introduced. Code quality problems are bugs of a non-functional nature, and bugs have associated costs. Like functional bugs, code quality bugs tend to cost exponentially more to fix the longer we leave them in the code (because the code continues to grow around them, making them harder to fix.) Code smells that end up in the software therefore have a tendency to still be there years later, impeding the progress of developers in the future and multiplying the cost of change. For this reason, we find that it can be better to catch these problems and deal with them earlier - the sooner the better. Hence, in the spirit of Extreme Programming, we turn the code inspections dial up to "eleven" and do them early and as often as we can afford to. Just as it helps with continuous testing and continuous integration, automation also makes ContInsp more affordable and more viable.

A decent ContInsp set-up might incorporate a static code analysis tool into the build cycle, effectively adding a small suite of non-functional tests to the Continuous Integration regimen. If, say, someone checks in code that adds too many branches to a method, a red flag goes up. Some teams will even have it set so that the build will fail until the problem's fixed.

Anyhoo, much discussion of ContInsp, and code analysis, and metrics and what we should measure/monitor. But then the question was raised "What do we tell the boss?"

It's an important question. It seems, from my own experience, to be a potentially costly mistake to report code metrics to managers. Either they don't really understand them - in which case, we may as well be reporting eigenvalues for all the good it will do - or they don't care. Or, even worse, they do care...

I've seen many a time code metrics used as a stick with which to beat development teams. Code quality deteriorates, the managers beat them with the stick, and the code quality deteriorates even more, so they get a bigger stick. And so on.

Really, this kind of information is only practically useful for people who are in a position to make the needles move on the dials. That, folks, is just us.

So the intended audience for things like cyclomatic complexity, class coupling, fan in/out/shake-it-all-about is people working on the code. Ideally, these people will know what to do to fix code quality problems - i.e., refactoring (a sadly rare skill set, even today) - and be empowered to do so when they feel it's necessary (i.e., when the code smell is inhibiting change.)

My own experience has taught me never to have conversations with managers about either code quality or about refactoring. To me, these issues are as fundamental to programming as FOR loops. And when did you last have a conversation with your boss about FOR loops?

But managers do have a stake in code quality. That is to say, they have a stake in the consequences of code quality or the lack thereof.

If code is hard to read, or takes hours to regression test, or has the consistency of spaghetti, or is riddled with chunks of copied-and-pasted logic, then the cost of changing that code will be higher than if it was readable, simple, modular and largely duplication-free. They might not understand the underlying causes, but they will feel the effects. Oh boy, will they feel them?!

So we've been having a lively debate about this question: what do we tell the boss?

But perhaps that's the wrong question. Perhaps the question should really be: what should the boss tell us?

Because the really interesting data - for the boss and for us - is what impact our decisions have in the wider context. For example, what is the relative cost of adding or changing a line of code in the software, and how does it change as time goes on?

I've seen product teams brought to their knees by high cost of change. In actual fact, I've seen multi-billion dollar corporations brought to their knees by the high cost of changing code in a core product.

Another important question might be "How long does it take us to deliver change, from conception to making that change available for actual use?"

Or "How frequently can we release the software, and how reliable is it when we do?"

Or "How much of our money are we spending fixing bugs vs. adding/changing features?"

All of these are questions about the medium-to-long-term viability of the software solution. I think this is stuff we need to know, particularly if we believe - as I strongly do - that most of the real value in software is added after the first release, and that iterating the design is absolutely necessary to solving the problem and reaping the rewards.

The most enlightened and effective teams monitor not just code quality, but the effects of code quality, and are able to connect the dots between the two.

Ah, well. Braindump done. So much for lunch. Off to grab a sandwich.




September 8, 2014

Iterating Is Fundamental

Just like it boggles my mind that, in this day and age of electric telephones and Teh Internets, we still debate whether an invisible man in the sky created the entire universe in 6 days, so too is my mind boggled that - in 2014 - we still seem to be having this debate about whether or not we should iterate our software designs.

To me, it seems pretty fundamental. I struggle to recall a piece of software I've worked on - of any appreciable complexity or sophistication - where getting it right first time was realistic. On my training courses, I see the need to take multiple passes on "trivial" problems that take maybe an hour to solve. Usually this is because, while the design of a solution may be a no-brainer, it's often the case that the first solution solves the wrong problem.

Try as I might to spell out the requirements for a problem in clear, plain English, there's still a need for me to hover over developers' shoulders and occasionally prod them to let them know that was not what I meant.

That's an example of early feedback. I would estimate that at least half the pairs in the average course would fail to solve the problem if I didn't clear up these little misunderstandings.

It's in no way an indictment of those developers. Put me in the exact same situation, and I'm just as likely to get it wrong. It's just the lossy, buggy nature of human communication.

That's why we agree tests; to narrow down interpretations until there's no room for misunderstandings.

In a true "waterfall" development process - bearing in mind that, as I've said many times, in reality there's no such thing - all that narrowing down would happen at the start, for the entire release. This is a lot of work, and requires formalisms and rigour that most teams are unfamiliar with and unwilling to attempt.

Part of the issue is that, when we bite off the whole thing, it becomes much harder to chew and much harder to digest. Small, frequent releases allow us to focus on manageable, bite-sized chunks.

But the main issue with Big Design Up-Front is that, even if we pin down the requirements precisely and deliver a bug-free implementation of exactly what was required, those requirements themselves are open to question. Is that what the customer really needs? Does it, in reality, solve their problem?

With the best will in the world, validating a system's requirements to remove all doubt about whether or not it will work in the real world, when the system is still on the drawing board, is extremely difficult. At some point, users need something that's at the very least a realistic approximation of the real system to try out in what is, at the very least, a realistic approximation of the real world.

And here's the thing: it's in the nature of software that a realistic approximation of a program is, in effect, the program. Software's all virtual, all simulation. The code is the blueprint.

So, in practice, what this means is that we must eventually validate our software's design - which is the software itself - by trying out a working version in the kinds of environments it's intended to be used in to try and solve the kinds of problems the software's designed to solve.

And the sooner we do that, the sooner we learn what needs to be changed to make the software more fit for purpose.

Put "agility" and "business change" to the back of your mind. Even if the underlying problem we want to solve stays completely static throughout, our understanding of it will not.

I've seen it time and again; teams agonise over features and whether or not that's what the customer really needs, and then the software's released and all that debate becomes academic, as we bump heads with the reality of what actually works in the real world and what they actually really need.

Much - maybe most - of the value in a software product comes as a result of user feedback. Twitter is a classic example. Look how many features were actually invented by the users themselves. We invented the Retweet (RT). We invented addressing tweets to users (using @). We invented hashtags (#) to follow conversations and topics. All of the things that make tweets go viral, we invented. Remember that the founders of Twitter envisioned a micro-blogging service in the beginning. Not a global, open messaging service.

Twitter saw what users were doing with their 140 characters, and assimilated it into the design, making it part of the software.

How much up-front design do you think it would have taken them to get it right in the first release? Was there any way of knowing what users would do with their software without giving them a working version and watching what they actually did? I suspect not.

That's why I believe iterating is fundamental to good software design, even for what many of us might consider trivial problems like posting 140-character updates on a website.

There are, of course, degrees of iterativeness (if that's a word). At one extreme, we might plan to do only one release, to get all the feedback once we think the software is "done". But, of course, it's never done. Which is why I say that "waterfall" is a myth. What typically happens is that teams do one very looooong iteration, which they might genuinely believe is the only pass they're going to take at solving the problem, but inevitably when the rubber meets the road and working software is put in front of end users, changes become necessary. LOTS OF CHANGES.

Many teams disguise these changes by re-classifying them as bugs. Antony Marcano has written about the secret backlogs lurking in many a bug tracking system.

Ambiguity in the original spec helps with this disguise: is it what we asked for? Who can tell?

Test-driven design processes re-focus testers on figuring out the requirements. So too does the secret backlog, turning testers into requirements analysts in all but name, who devote much of their time to figuring out in what ways the design needs to change to make it more useful.

But the fact remains that producing useful working software requires us to iterate, even if we save those iterations for last.

It's for these reasons that, regardless of the nature of the problem, I include iterating as one of my basics of software development. People may accuse me of being dogmatic in always recommending that teams iterate their designs, but I really do struggle to think of a single instance in my 30+ years of programming when that wouldn't have been a better idea than trying to get it absolutely right in one pass. And, since we always end up iterating anyway, we might as well start as we will inevitably go on, and get some of that feedback sooner.

There may be those in the Formal Methods community, or working on safety-critical systems, who argue that - perhaps for compliance purposes - they are required to follow a waterfall process. But I've worked on projects using Formal Methods, and consulted with teams doing safety-critical systems development, and what I see the good ones doing is faking it to tick all the right boxes. The chassis may look like a waterfall, but under the hood, it's highly iterative, with small internal releases and frequent testing of all kinds. Because that's how we deliver valuable working software.