April 11, 2014

Reliable Everyday Software London, May 15th

Announcing the first of what I hope will be a regular meet-up in London for folk interested in bringing more reliable software to the mainstream.

Reliable Everyday Software London (#RESL) aims to bring together communities of practitioners, researchers, teachers and other interested folk to think about, talk about, and maybe even do something about bringing some of the techniques and tools we typically associate with more critical software into everyday software development.

Many of us share a belief that some of these techniques have been unwisely overlooked by the vast majority of teams, and know from experience that there are times on every project, and parts of every code base, that would benefit from a more rigorous eye.

We also know that, on a daily basis, we suffer at the hands of software that, while you might not think of it as "critical", can cause a constant low level of pain and inconvenience which every now and again explodes into something serious. Whether it's a utility bill payment that somehow got "lost in the post", or losing that vital piece of data, or a sensitive piece of information accidentally exposed, or getting stuck in an infinite logical loop that means we can't see our email because we forgot our password (and they'll only send a password reminder to that email address we can't see - grrr!), software defects have the potential to ruin our lives. At best, they are a constant low-level annoyance that eats up time and raises blood pressures (thus shortening our lives.)

It doesn't have to be this way. If the communities of practitioners and researchers can get past the "them & us" mentality that currently pervades, with both sides believing the other side are "doing it wrong", we may well see that we have much to learn from each other. The potential benefits of injecting a healthy dose of greater rigour into everyday software development, and a healthy dose of everyday realism into research, cannot be overstated.

So, to foster a spirit of enquiry and co-operation, we'll be meeting at Unruly Media at 6:30-8pm on Thursday May 15th to kick things off. Hope you can join us. But if you can't, please sign up for our meetup.com group anyway and join in the discussion.

April 8, 2014

When Might We Use Algorithms To Calculate Expected Test Answers?

Just a short placeholder post today, inspired by a conversation with a client's team today.

The question raised was: is it ever right to use an algorithm to calculate expected test results?

The advice from TDD veterans is to try to avoid duplicating source code in our tests. This may have been misinterpreted - like much oversimplified advice (and I'm just as guilty of this) - as "don't use algorithms in your tests".

Looking at a very simple example, a program that generates Fibonacci sequences of a specified length, I perhaps can illustrate what the advice should be.

First of all, when might we use an algorithm or general formula to generate expected test answers instead of just supplying the data for specific test cases?

The answer to that question is another question: when might you use an algorithm to generate a Fibonacci sequence instead of just hard-coding the sequence?

If someone said to me "Jason, write me a program to generate the first 8 Fibonacci numbers", my code would look something like:

return new int[] {0, 1, 1, 2, 3, 5, 8, 13};

...because that would be much simpler than an algorithm.

If someone asked me to write a program to generate up to, say, 50 Fibonacci numbers, an algorithm would be much shorter and simpler than typing out that whole sequence.

The same goes for tests; if we have a handful of test cases, it might seem overkill to write an algorithm to generate the expected answers. On the other hand, if we wanted to exhaustively test our implementation (e.g., test all the Fibonacci numbers in a sequence of 50, or test a maths algorithm in a range from 0 -> 10,000 incrementing by one each time), then hard-coding the answers would be a heck of a lot of work.

In those cases, I'd use an algorithm in the tests. But, and this is very important, not the same algorithm as the solution I'm testing.

Uncle Bob Martin sums it up best with his analogy for unit testing of "double-entry book-keeping". If you've ever done your own accounts, you may well have learned that - although it's extra work - it can save a lot of time and heartache later if we take time to double check all our figures.

Double-entry book-keeping works like sign-up forms that require us to type in our email address twice. It compares one piece of information to another piece of information that is in no way derived from the original (e.g., it's typed in twice) on the understanding that they should be, if correct, the same. Of course, we could enter it wrong both times, but the chances of that happening are greatly reduced. Maybe I type it in right 99% of the time, leaving a 1% chance of a mistake. With double-entry, the odds of getting it wrong both times are 0.01%.

Going back to the Fibonacci example; if I wanted to exhaustively test my solution across a large range of numbers, I might choose to use an iterative solution in my source code, and calculate expected answers - maybe as a one-off job for future test runs, if it's computationally very expensive - using a tail-recursive algorithm.

The two algorithms may well have different performance properties, but the answers they produce should be identical. The chances of both algorithms being wrong are dramatically smaller than one of them being wrong by itself.
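As a minimal sketch of that idea in Java (the class and method names here are made up for illustration, and the iterative method is only a stand-in for whatever the real solution looks like), the test's expected answers come from an accumulator-passing recursive function that shares no code with the implementation under test:

```java
import java.util.Arrays;

class FibonacciCrossCheck {

    // Stand-in for the solution under test: a plain iterative implementation.
    static long[] fibonacciIterative(int length) {
        long[] sequence = new long[length];
        for (int i = 0; i < length; i++) {
            sequence[i] = (i < 2) ? i : sequence[i - 1] + sequence[i - 2];
        }
        return sequence;
    }

    // Independent oracle for the test: tail-recursive in form, carrying
    // the current and next Fibonacci numbers along as accumulators.
    static long fibonacciRecursive(int index, long current, long next) {
        return (index == 0) ? current : fibonacciRecursive(index - 1, next, current + next);
    }

    // Builds the expected sequence using only the oracle.
    static long[] expectedSequence(int length) {
        long[] expected = new long[length];
        for (int i = 0; i < length; i++) {
            expected[i] = fibonacciRecursive(i, 0, 1);
        }
        return expected;
    }

    public static void main(String[] args) {
        long[] actual = fibonacciIterative(50);
        long[] expected = expectedSequence(50);
        if (!Arrays.equals(expected, actual)) {
            throw new AssertionError("The two implementations disagree");
        }
        System.out.println("All 50 Fibonacci numbers agree");
    }
}
```

The sketch uses long rather than int because the 50th number in the sequence no longer fits in 32 bits. Java doesn't optimise tail calls, but at a depth of 50 that doesn't matter.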

Just to finish with a cautionary note; be sure to let the test code - and the duplication in it - lead you to a general algorithm, rather than leaping in straight away and writing one in the hope you'll need it for squillions of future tests. There's an art to refactoring tests into general specifications, but that's for another post.

April 5, 2014

Why It's Better To Start With A Test

The art of Test-driven Development is, at its essence, the art of generalising from a set of examples to discover the patterns that bind them.

The TDD practice of triangulation, where we flesh out the design of our code one test case at a time, generalising as we go, has its analogues in other software specification practices.

For example, in the Catalysis approach to model-driven development, practitioners use object diagrams to illustrate the before and after of actions, and generalise from this to static models like class diagrams.

In user experience design, it's recommended to start with examples of the kinds of users who'll be working with our software and the tasks they'll be performing.

Going back to Ivar Jacobson's "usage cases" in the late 1960's, and probably beyond to the 1950's when enlightened teams used tests to drive their implementations, the received wisdom has been to start with examples and work our way back to a general solution that will satisfy all of them.

When we do it the other way around - e.g., we create object diagrams based on our class model to illustrate specific examples - that's what we call "testing" or "verification" (or "validation", if our goal is to validate that our specification really describes what the users need).

With good reason, we frown on starting with generalised designs and then coming up with examples afterwards. Firstly, what are our generalised designs based on? How did we know we needed an X or a Y in our design?

Psychologically, the effect of the order in which we do it - examples first, and then generalise, or generalise first and then test with examples - can be quite profound.

In Codemanship training courses, there'll often be one or two pairs in the group when we do the OO design exercise who skip agreeing acceptance tests* and go straight for a design. They call me over and ask "Is this the right design?" and I say "I don't know; will it pass your acceptance test?" and they look sheepishly back at me and say "Ah, we didn't agree a test". What follows is usually an attempt to agree an acceptance test that doesn't invalidate their design.

We can grow rapidly - almost instantly - attached to our solutions. So much so that I see teams actively avoiding any feedback that might invalidate them. Knowing this, I therefore very strongly recommend that teams start with the examples and work directly towards a simple design that they can be confident will work for all of those scenarios.

For sure, they still grow attached to their designs, but by starting with the test cases, there's far less danger that the designs they grow attached to will fail to give users what they need.

* Yes, even on a course called "Test-driven Development"

March 21, 2014

Software Correctness - How For & Why They?

In recent weeks, this has been coming up regularly. So I think it's probably time we had a little chat about software correctness.

I get a sense that a lot of younger developers have skipped the theory on this, and I felt it would be good to cover that here so I can at least point people at the blog. It's sort of my FAQ.

First of all, what do we mean by "software correctness" (or "program correctness", as us old timers might call it)?

To risk being glib, a program - any executable body of code - is only correct if it does what we expect it should, and only when we expect that it should.

So, a program to calculate the square root of a number is only correct if the result multiplied by itself is equal to the input. And we might expect that such a program will only work correctly if the input isn't zero or less.

Tony Hoare's major contribution to the field of software engineering is a precise definition of program correctness:

{P} C {Q}

C is the program. Q is what must be true after C has successfully executed. (The outcome.) And P is precisely when we can expect C to achieve Q. In other words, P describes when it is valid to invoke the program C.

e.g., {input > 0} Square Root {result * result = input }

These three elements, pre-condition, program/function and post-condition, taken together are called a Hoare Triple. As strikingly simple as this definition of program correctness is, it's turned out to be very powerful logic, forming the basis of a great deal of what we think of as "software engineering".

To see how this translates into something practical, let's take a look at an example:

This is a simple algorithm for calculating the average price of a collection of houses. What might the Hoare Triple look like for this simple program (using pseudo-code)?

{ count of house > 0 } average() { result = sum of house prices/count of house}
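For concreteness, here is a minimal Java sketch of the kind of average() being described, with the triple written as a comment above the method. The House class, its price field and the HousePrices wrapper are illustrative names assumed for this sketch, not the original listing:

```java
class House {
    final double price;

    House(double price) {
        this.price = price;
    }
}

class HousePrices {

    // { houses.length > 0 }
    //   average(houses)
    // { result == sum of house prices / houses.length }
    static double average(House[] houses) {
        double total = 0;
        for (House house : houses) {
            total += house.price;
        }
        return total / houses.length;
    }
}
```

Called with an empty array, this version doesn't blow up: dividing 0.0 by zero in floating point quietly yields NaN, which is arguably worse than an exception.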

You'll notice probably straight away that the outcome could form the basis of unit test assertions. Many of us are already testing post-conditions in this way. So, hooray for our side.

But what about the pre-condition? What does it mean when we say that the house array must not be empty for the program to work?

We have three possible choices;

1. Change the post-condition to handle this scenario.

In other words, the method average() will always handle any input.

For example, here I've changed the initialised count of houses to 1 if there are zero houses. That way, even if the array is empty, the total will always be divided by at least one. (And the average for an empty array will be zero, which kind of makes sense.)

2. Guard the body of average() from inputs that will break it.

This approach is called Defensive Programming.
Bear in mind now that calling code will need to know how to handle this exception meaningfully.

3. Only ever call average() when there's at least one house.

This approach is called Design By Contract.

It puts the responsibility on the calling code to ensure it doesn't break the pre-condition when invoking average().

Typically, developers practicing DbC will use assertions embedded in their code, the checking of which at runtime can be switched on and off, so we can have assertion checking during testing but then switch them off once we're happy to release the software. The distinction between failing assertions and having our code throw exceptions when rules are broken is very clear: in Design By Contract, when assertions fail it's because our code is wrong!

The advantage of DbC is that it tends to allow us to write cleaner, simpler implementations, since we assume that pre-conditions are satisfied and don't have to write extra code to handle a bunch of extra edge cases.

Remember that in strategies 1. and 2., handling the edge case is part of the program's correct behaviour. In DbC, if that edge case ever comes up, the program is broken and needs to be fixed.
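The three strategies can be sketched side by side in Java. This is an illustration under assumptions rather than the original code: it works on a plain double[] of prices, and the class and method names are invented to keep the three variants apart.

```java
class AverageStrategies {

    // Shared helper: adds up all the prices.
    private static double sum(double[] prices) {
        double total = 0;
        for (double price : prices) {
            total += price;
        }
        return total;
    }

    // 1. Widen the post-condition: an empty array is a valid input,
    //    and its average is defined to be zero.
    static double averageHandlingEmpty(double[] prices) {
        int count = (prices.length == 0) ? 1 : prices.length;
        return sum(prices) / count;
    }

    // 2. Defensive programming: guard the body and throw an exception
    //    that calling code must know how to handle.
    static double averageGuarded(double[] prices) {
        if (prices.length == 0) {
            throw new IllegalArgumentException("At least one house price is required");
        }
        return sum(prices) / prices.length;
    }

    // 3. Design By Contract: satisfying the pre-condition is the caller's job.
    //    The assertion is only checked when the JVM is started with -ea.
    static double averageByContract(double[] prices) {
        assert prices.length > 0 : "pre-condition broken: no house prices";
        return sum(prices) / prices.length;
    }
}
```

In the first two versions, a test for the empty array belongs in the test suite for average itself. In the third, a failing assertion points at a bug in whoever called it.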

The important thing to remember is that, whether it's handled in the post-condition (e.g., average price of zero houses = 0), whether it's guarded against before the body of the program/function, or whether it's forbidden to invoke methods when pre-conditions are broken, the interaction between client code and supplier code (the caller and the callee) must be correct overall. It has to be handled meaningfully somewhere.

When it comes to assuring ourselves that our code is correct, there's a cornucopia of techniques we can employ, ranging from testing to mathematical proofs of correctness (often by proving it correct for one case, then for all N+1 cases by induction).

But, regardless of the technique, a testable definition of correctness gives us the formal foundation we need to at the very least ask "What do we mean by 'correct'?", and is the basis for almost all of them.

I'll finish off with one more example to illustrate. Let's think about how our definition of program correctness might be exploited in rigorous code inspections.

Revisit my early implied definition of "program"; what am I really saying? When it comes to testing and verification - especially inspections - I consider a program to be any chunk of executable code, from a software executable, down to individual functions, and even individual program statements and expressions. To me, these are all "programs" that have rules for their correctness - pre- and post-conditions.

If you're using a modern IDE like Eclipse (okay, maybe not that modern...), your refactoring tools can guide you as to where these units of executable code are. Essentially, if you can extract it into its own method (perhaps with a bit of jiggery pokery turning multiple return values into fields etc), then it has pre- and post-conditions.

Just for illustration, mainly, I've refactored the example program into composed methods, each containing the smallest unit of executable code from which the complete program is composed.
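As a rough sketch of what such a decomposition might look like in Java (the method names, and the use of a plain double[] of prices, are assumptions for illustration rather than the original listing), with each tiny method's pre-condition noted in a comment:

```java
class ComposedAverage {

    // pre: prices is not null and not empty
    static double average(double[] prices) {
        return divide(total(prices), count(prices));
    }

    // pre: prices is not null
    static double total(double[] prices) {
        double total = 0;
        for (double price : prices) {
            total = add(total, price);
        }
        return total;
    }

    // pre: none worth stating for ordinary prices
    static double add(double runningTotal, double price) {
        return runningTotal + price;
    }

    // pre: prices is not null
    static int count(double[] prices) {
        return prices.length;
    }

    // pre: divisor is not zero
    static double divide(double dividend, int divisor) {
        return dividend / divisor;
    }
}
```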

Theoretically, every one of these little methods could be tested individually. I'm not suggesting we should write unit tests for them all, of course. But stop and think about the granularity of your testing practices. In a guided inspection, where we walk through the code, guided by test inputs, we could be asking of all these teeny-tiny blocks of code: "What must this do, and when will it work? When won't it work?" You'd be surprised how easy it is to miss pre-conditions.

So there you have it: the theoretical basis for the lion's share of software testing and verification - software correctness.

March 17, 2014

Blind Pair Programming Interviews - Early Experience Report

A week ago, I undertook an experiment for a client. We'd both been discussing the problem of interviewer bias in the recruitment process, and wanted to see what would happen if the interviewer had nothing to go on except for the candidate's code.

Blind interviews are a challenge to undertake. It's very difficult to learn enough about a candidate's employable qualities without personal details filtering in that might reveal, directly or indirectly, their probable age, their gender, their ethnicity, and so on.

Certainly, upon seeing the candidate, all sorts of personal prejudices may come into play. It's very hard in a face-to-face meeting to disguise the fact that you are, say, black or that you are a woman, or that you are middle-aged.

So, in a blind interview, we do not meet the candidate face-to-face.

The voice, too, can come loaded with information that plays to our prejudices. An accent may reveal that they are not from around here. It may reveal that they are possibly of a certain "class". It would almost certainly reveal their gender.

So, in a blind interview, we do not hear the candidate's voice.

This leaves us with the written word. In this modern age, it's possible to hold a conversation in real time using just text. Instant Messaging opens a door to interviewers that allows us to find out more about the candidate without seeing their face or hearing their voice. But even through that medium, personal details and clues about a candidate might be inadvertently revealed. Use of certain colloquialisms might place them from a particular part of the world. A casual mention of a particular movie they like, or a pop record they listen to, might place them in the typical demographic for those products.

This is also true of CVs. If the interviewer gets to read a CV, even if it's been redacted to remove their name, age, address and so on, it can at least reveal how long they might have been working in the industry and their probable age.

CVs also risk other biases; like whether or not they went to university (possibly one of the poorest indicators of ability as a software developer, we've found.)

The whole point of a blind interview is to know none of these things; not who they are, where they come from, what colour their skin is, which god or gods they pray to, what their favourite Pixar movie is... none of these things. We are attempting to remove reasons not to favour a candidate other than how good they are at writing software.

It's unfair to reject a candidate because they're too young or too old. But it's entirely fair to reject them because they used public setters to initialise an object instead of a constructor, or because they never run the tests after they've refactored, (or because they never refactor unless you prompt them to.)

It's also entirely fair if they fail to comprehend requirements, provided you are sure those requirements are expressed clearly enough (which I advise you to test beforehand.) Or if they're argumentative and always find reasons not to do what you're asking them to do. Or if they won't accept feedback, provided it's constructive (e.g., "do you think we should write a test for this?")

And so we set sail on strange waters with a round of 6 pair programming sessions with 6 candidates about whom I knew absolutely nothing, except for what programming languages the candidates claimed to know. (Although even that can be revealing - if they say "FORTRAN, C and Lisp", they may be my age or older.)

Luckily, in this case, they all had to know Java, and I didn't need to know any more than that.

Each pairing session started exactly the same way. They IM'd me on Skype, through an account we'd set up for these interviews, to indicate readiness: "Ready".

I then ask them for details to log into a TeamViewer session hosted on their desktop, so I can see their screen.

I copied and pasted (gasp!) a set preamble to introduce the exercise we were about to do. In this experiment, I chose the Codemanship Refactoring Assault Course, as I tend to find refactoring a good indicator of general programming ability.

I ask them to find and refactor away as many code smells as they can in the assault course code within the next 30 minutes. That's really all the time I need to get the measure of them as refactorers of code.

In the experiment, 2 of our candidates went offline at this point and I didn't hear from them again. This is to be expected. In typical interviews, a sizeable proportion of developers who claim refactoring experience on their CV don't even know what the word means.

I asked the remaining 4 to download the Java code and import it into their development environment.

Of the four remaining, 2 ran the tests as soon as they'd imported the code: an early indicator that proved accurate as to what was to follow.

The rest of the pairing session was pretty normal. I kept the conversation completely focused on the code in front of us, and why the candidates were doing what they were doing.

One candidate seemed unaware of "code smells" and proceeded to add copious comments to make the code more "readable". They also inlined a few methods, because they felt there were "too many methods". At no point did they run the tests. Or even look at the tests.

Two candidates did pretty well, one of them in particular did very well and even tackled the switch statement successfully.

In the end, I wanted to keep things as objective as I could. So, instead of writing up my feelings about each candidate, I simply offered a summary of the code smells they found, whether or not they applied meaningful refactorings to them, and in what ways the code was improved. I also scored them for how frequently they ran the tests, for checking code was tested before attempting a refactoring (e.g., with a code coverage tool, or simply by editing a line of code and seeing if any tests failed - none of them did that), and for their apparent fluency with tools they claim to use every day.

From the session, I recommended two of the candidates - one more highly than the other by a country mile, I have to say - and suggested the other four be rejected.

I have no idea who these candidates were. But the client, of course, does. It will be interesting to see what happens now.

I've yet to find out who got hired, if any of them did. But it will be interesting to see if my recommendation - based purely on how they performed in the pairing session - is followed through.

March 13, 2014

Waterfall, Reality Avoidance & People Who Say "No"

I have this theory.

One of the problems some managers have with iterative software development is that, when it's done well - seeking early and frequent feedback and acting on it, as opposed to just incrementally executing a waterfall plan - it reduces the scope for avoiding reality.

On a waterfall project, reality can be avoided for months or even years. The illusion of progress can be maintained, through form filling and the generation of reams of reports that nobody ever reads, right up until the point that software needs to be seen to be working.

If it were my money, this would scare the shit out of me - not knowing what my money's been spent on until the last moment.

But I can see the attraction for managers. It's not their money. And typically they get rewarded for this illusion of progress, which can go as far as pretending the software is ready the night before it's deployed into a live environment.

One of my early experiences as a freelancer in London was leading a team on the development of a web site that the company had been kicking around as an idea for 2-3 years. Naturally, after 2 years of business analysis, and 6 months of database design - based on what use cases, one can only imagine - the team were given six whole weeks to implement the site.

Our project manager was pretty canny, and understood how much of a squeeze this was going to be. So, back in 1999, I first tried my hand at a new thing called "Extreme Programming", because we felt what we were doing was extremely ambitious, and the right thing to do with such short timescales was to iterate extremely.

But the customer wouldn't play ball. We wanted to show him working software, but he literally refused to come downstairs and take a look at what the company's money was being spent on. He insisted, instead, that the project manager wrote reports. Lots of reports. Daily reports. Detailed reports.

And when the reports said "we are not as far as the plan says we should be", the reports were rejected. And new reports had to be written, saying that, actually, we were on plan.

For doing this, the project manager got promoted to Chief Technology Officer. The developers, who unanimously refused to play along and kicked up quite a fuss, got let go almost immediately afterwards.

A new project manager was appointed, who was more than happy to live with the illusion. I recall distinctly listening to a conversation with the business in which he told them that a piece of work that could take months and had not even been started was actually done, tested and ready to deploy. There's delusion, and then there's DELUSION.

Of course, it didn't deploy. It couldn't. It didn't exist. But, again, months of him telling the business what they wanted to hear, regardless of the reality that unfolded, got him promoted to senior programme manager. I was long gone by this point, thankfully. (On to the next delusional dotcom boom shitstorm.)

This was an important lesson for me - most failure is. I learned that, in many organisations, people aren't rewarded for what they achieve. They're rewarded for what they're prepared to claim they've achieved. And even when it turns out to be an out-and-out lie, and the business is left with egg all over their faces, they may still get rewarded.

To their own detriment, too many businesses reward people for saying "Yes", even when the real answer - the answer they need to hear - is "No".

Software developers - people who actually make software happen - do not have that luxury, though many try. Software either is, or isn't. It either does, or it doesn't. When the chips are down, we can't fake it. We may say "it's ready", but when it ships reality will pull its pants down for all the world to see.

Among other reasons, I believe this is one of the key reasons why managers don't like us. They need us to make things happen, but we have this annoying habit of saying "No" and telling them things they didn't want to hear.

Iterative development throws light in dark corners they'd rather we didn't look, and as such that's why I believe waterfall is a reality avoidance mechanism for many managers. Specifically, they want to avoid having to go back to their bosses or their customers and say "No", because in the game they're playing - which can be a distinctly different game with distinctly different aims to the game developers play - they lose points for doing that.

The only meaningful way to bring reality back into play is to realign the goals of managers and developers and get everyone playing the Let's Actually Deliver Something Of Value game. Managers need to be rewarded for testable achievements, and steered away from peddling illusions. The reason this doesn't happen more often, I suspect, is because the value of illusion increases the further up the ranks you go. If a PM gets a pat on the back for saying "we're on track", the CTO gets a trip to Disneyland, and the CEO gets a new Mercedes. Hence, the delusion gets stronger as we go higher. People running governments tend to be the most delusional of all, such is their power and influence. This effect is what produces the sometimes gargantuan IT failures only governments seem capable of creating.

The Emperor has no clothes. Bah humbug.

March 11, 2014

Bring The Boss - London, May 8th

On May 8th in London, I'll be running a rather unique training event.

Bring The Boss aims to create an opportunity for developers to "have that chat" with their managers about why software craftsmanship matters.

I created this 1/2-day workshop in response to the many times course attendees have commented "I wish my boss was here, he/she needs to hear this".

Superficially designed as a crash course in Test-driven Development, the workshop will draw out key talking points about the value of craftsmanship to software-intensive businesses, what happens to businesses that skimp on the technical disciplines, and why developer culture is the key to unlocking sustained delivery of value.

Don't worry if your boss isn't a programmer. We find that pairing with non-technical stakeholders can be a valuable exercise in communication - both face-to-face and through code.

A place for the 2 of you costs just £99 (plus Eventbrite booking fee), and there's a morning session that starts at 9:30am and an afternoon session at 1:30pm.

So bring the boss, and let's have that chat we've been meaning to have about craftsmanship.

March 8, 2014

Blind Developer Interviews Through Anonymised Remote Pairing - An Experiment

On Monday, I'll be conducting a recruitment experiment for a brave and progressive client who, sadly, wishes at this point to remain anonymous. Which, coincidentally, is exactly how this experiment is intended to work.

I will be pairing remotely, as I often do for clients, with 6 candidates via That Skype That They Have Nowadays. Each candidate will log in at a fixed time to a stock Skype account we've set up especially for this experiment. I will not know who they are, what they look like, where they're from, when they were born, what experience or qualifications they have, or even - thanks to instant messaging - what they sound like.

I'll be completely blind except for what happens in the code, and what they communicate via IM. They are under strict instructions to give nothing personal away during this process. The transcripts of chats will be forwarded to the client along with my interpretation of how the session went, and a video of my desktop while it was happening.

The design of the experiment has been the culmination of a year's thought and research by myself, and it will be interesting to find out what happens.

I will not know the gender, the age, the educational background, the ethnicity, the location or the haircuts of any of the candidates as we pair. All I will have to go on - if they stick to the rule about revealing nothing personal via text - is what they are like when they are programming.

Of course, the client will know all of these things, having selected this shortlist. The second part of the experiment will be blind job advertising, which will be much harder because we'll be relying on candidates to follow strict instructions when applying.

Their CVs will need to be scrubbed of all personal information and presented in a distinctly sterile fashion, listing just their key skills, rated by how good they think they are. They will be expected to tackle a simple 1-2 hour programming problem that we hope will weed out the obvious blaggers. The results of that, along with the anonymised CV, will be the basis for shortlisting candidates.

Naturally, we are mindful that, at some point, we'll need to know more about them. For example, have they contributed to open source projects we can go and look at? Do they have a blog? That sort of thing. So the process can't be completely blind all the way along.

But the hope is that by whittling down candidates purely on technical skill first it will be harder to ignore a great developer for the wrong reasons.

February 25, 2014

Why Code Inspections Need To Be Egalitarian

The debate rages on about Uncle Bob's blog post advocating what he calls a "foreman" on software development teams who takes responsibility for the quality of commits by team members.

Rob Bowley of 7digital goes on to suggest that he no longer needs to inspect code quality, choosing instead to measure the effects of code quality on the reliability, frequency and sustainability of releases. This is per a discussion I had with Rob a few years ago, when he showed me the suite of metrics he'd published - a brave move - on 7digital's software development performance. Would that other software organisations were prepared to be so transparent.

The assertion goes back to my "Software Craftsmanship Imperative" keynote that was doing the rounds in 2010-2011.

The point is this: your customer or boss is not going to care about "clean code", no matter how much you try to persuade him or her. I've always maintained that craftsmanship is not an end in itself. There's a reason why code quality is important, and I've learned from bitter experience that you need to focus on that reason with people who aren't writing the code.

Having said that, smart developers who understand the causal link will inspect their code continuously and be ever-vigilant to things that might hinder progress later. So it's quite common to find practices like pair programming, code inspections, static analysis and all that malarkey going on in high-functioning teams.

But let's be clear about who the audience is for these two distinct but closely related pictures. Code metrics etc. are for the people writing the code, to provide early warning about maintainability "bugs" and as a tool for learning and improving at writing more maintainable code. Release statistics are for everyone (including developers) who cares about the sustainability of innovation.

To use another broken metaphor, code quality is data about the working of your engine, whereas release statistics are about the progress of your journey. High-functioning teams build a picture that can show how tinkering with the engine can improve progress on a long journey, which most software development turns out to be (even when the boss insists it's just going to be a short trip to the shops.)

My own experiences of being asked by managers to impose the wrong kind of picture on the wrong kind of audience have made me extremely wary of doing that, especially in the last 4-5 years.

Most importantly, I've learned that - when it comes to inspections and code quality - you can lead a horse to water, but you can't make it report untested code. The developers have got to want the information, because they believe it will help them, and have got to seek it for themselves. This is perhaps exemplified by the experimental TDD "apprenticeship" scheme we ran at the BBC in their TV Platforms team.

The same applies if one person on the team (call him/her a "foreman" if you like) tries to impose such a regime on the other - probably less willing - members. It just doesn't work.

Not only are the team likely to resent having their code's pants pulled down in such a manner for all to see, but - if they've not been paying attention to code quality as much as they should - the picture revealed is likely to dishearten and impact team morale.

Once you've handed their code its arse, what then? So now they know it sucks. What are they going to do about it? Do they want to do anything about it? Can they do anything about it? Would they know what to do about it?

Much as I'd like to believe I have the power as a coach to make developers who don't care about code quality care about code quality, the reality is that the best I can do is to make them aware of its existence as a thing that some developers care about. And then we're back to the horse and the water and the drinking.

As a software development coach in the same organisation where Rob Bowley and I met, I kind of did both. I made the mistake of imposing code quality metrics on some teams. But I also discovered something that has completely changed my whole outlook on what it is I do for software organisations.

I made a deliberate choice right at the start of my time in that organisation to focus more on developer culture. I immediately instigated internal events - totally voluntary - aimed at developers, roped in some inspiring names to come in and rally the troops, and gradually encouraged the developers there to see themselves as a community. More importantly, as a community that cares about software development.

I was told unequivocally by the people who hired me that there was no point. These people were not motivated. They didn't care, and couldn't be made to care. And they were right. From above, or from the outside, you cannot make people care. But you can build a culture in which it's easier to care than it is not to care.

From that wellspring came much nascent talent that had been festering in a command-and-control culture. Some have become software development coaches themselves in the intervening years, others lead successful development organisations. So much for "don't care, won't care".

Ultimately, my point is this; as with all technical decisions that can be made for a development team, it works best when the team makes it. You can't force people, con people, bribe people or blackmail them into caring. And if they don't care, you can point out all their code quality shortcomings as much as you like, because they're not going to fix them.

February 24, 2014

Why Development Teams Need To Be Egalitarian

In a recent blog post, Robert C. Martin proposes that teams need a technical authority who guards the code against below-par commits.

Uncle Bob argues that you wouldn't fly on a plane without a captain, or live in a house built without a general contractor overseeing the work to make sure it's up to snuff.

I feel I'm a little closer to this problem, having specialised in these "technical authority" roles for most of my freelance career, and having been embedded with teams in the recent past.

The statistics teach us that once code's checked in, the ultimate cost of maintaining it could be 7 times higher than it cost to write in the first place. So Uncle Bob's right, of course; someone should be casting a helpfully critical eye over commits and protecting the code from work that is, for whatever reason, not good enough.

Where we differ is on who that someone should be.

Here's the thing; leading software developers is like herding cats (except that cat herders would probably count themselves lucky if they knew how difficult it is to lead developers, frankly.)

Even the most junior software developers tend to be highly educated, highly intelligent people. These are not bricklayers, or co-pilots. What we do is orders of magnitude more complex and intellectually demanding. Better, I think, to compare software developers to, say, scientists.

Who keeps scientists in check? While professional scientists no doubt have supervisors (e.g., in a laboratory, or the head of a faculty), these individuals are not the arbiters of "good science", or of what's true and what isn't.

The equivalent of a software commit might be submitting a paper to a peer-reviewed journal. There, the key word is "peer". The claims made by a scientist are subjected to exactly the kind of egalitarian scrutiny Bob rails against (no pun intended.)

My own experience, and what I've observed of other software development "leaders", is that when someone takes that responsibility, it can lead to one of two outcomes:

1. The team resents it (or, at least, some of them do), and much of your time is spent dealing with the politics of being the guy who says "no". Don't underestimate how much time this costs.

2. The team accepts it, and resigns responsibility. This is no longer their code, no longer their design. It's your code, your design, and they're "just following orders". Again, never underestimate how readily even the most educated and intelligent people will let themselves become "institutionalised", losing the desire to take charge of their own work.

As others responding to Uncle Bob's blog post have already suggested, it can work better if the technical authority is the team.

As the team leader, my effort is better spent marshalling and organising the team as a democratic body. I assume - rightly or wrongly - that, collectively, the team knows more than I do. When questions need answering, and decisions need making, put it to the team. Have deliberate mechanisms in place for these kinds of decisions. Be clear what's a team decision and what's a decision for the individual. If it helps, agree a team contract right at the start that sets the "constitution" of the team.

If it all gets a bit "he said, she said", then the team can also collect evidence upon which to base decisions. In particular, they can collect data about the effects of the decisions they make as a team. In this respect, I encourage teams to treat everything they do - every decision they make - as an experiment from which they can potentially learn.

But all of this is highly distinct from being a coach to teams. I absolutely 100% reject the idea that software development coaches should be acting in the way Bob suggests, because I absolutely 100% reject the notion of the coach being part of the team.

Been there, done that. The moment the coach starts making decisions for a team, they become the "go-to guy" for those kinds of decisions. There's a world of difference between coaching a team on, say, automated builds, and being "the build guy" on the team. Because that's what happens to coaches. I've seen it hundreds of times, literally. Coach or consultant comes in to help a team solve a problem, coach or consultant becomes the solution.

I like to think of what I do as being a driving instructor (and, warning, here comes another broken metaphor):

When I was 17, my Dad paid for me to have driving lessons. My instructor was a nervous man, and just kept grabbing the wheel every time he saw me doing something even slightly wrong. I had 6 lessons with this guy, and barely learned a thing. Because you cannot learn to drive by being told or being shown how to drive. You must drive yourself. A few years later, when I really needed my license for work, I found a local instructor who had a completely different style. I would drive, and we would chat. Occasionally, he would point out something I could improve on. Even after just one lesson, my confidence grew enormously. I passed my test first time.

That has always stuck with me. The instructor was there to grab the wheel if I did something really stupid or dangerous, but mostly he kept his hands off the wheel and let me drive.

I'm the same way with teams as a coach. I try not to grab the wheel on their projects, letting them drive as much as possible. This is their code, their responsibility.

And if commits aren't up to snuff, that is their problem.