May 31, 2016

Squaring The Circle of "Tell, Don't Ask" vs. Test As Close To Implementation As Possible

One of the toughest circles to square in a test-driven approach to designing software is how to achieve loosely-coupled modules while keeping test assertions as close to the code they apply to as possible, so that they: a. are better at pinpointing failures, b. run faster, and c. can travel wherever the code they exercise travels.

The interaction-focused approach to TDD that allows us to write unit tests that are more about the telling and less about the asking (Tell, Don't Ask) can indeed lead to designs that are more effectively modular. But, as Martin Fowler also mentions in his post on Tell, Don't Ask, it can lead to developers becoming somewhat ideological "getter eradicators".

These example classes from a community video library system break our Tell, Don't Ask principle by exposing internal data through getters.

But is that, by itself, a bad thing? Remember that our goal is to share as little information about objects as possible among the classes in our system. So our focus should be on the clients that use Title and Member, and on what they know about them.

Library, in this example, isn't bound to the implementation classes that have the getters. It only knows about the interfaces Copyable and Rewardable, which only include the methods Library uses. (See the Interface Segregation Principle).

The tests for Library know nothing of Member or Title.

And the assertions about the work that Member and Title do are isolated in test classes that are specifically about those implementations.
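The original code samples aren't reproduced here, but a minimal sketch of the shape of that design might look something like this. Only the type names Library, Title, Member, Copyable and Rewardable come from the text; the method names (registerCopy, rewardPoints, donate) are my assumptions for illustration:

```java
// Hypothetical sketch of the design described above. Method names are assumed.
interface Copyable {
    void registerCopy();
}

interface Rewardable {
    void rewardPoints(int points);
}

class Title implements Copyable {
    private int copies;
    public void registerCopy() { copies++; }
    // The getter a "getter eradicator" would object to - but only
    // Title's own tests need to know about it.
    public int getCopies() { return copies; }
}

class Member implements Rewardable {
    private int points;
    public void rewardPoints(int points) { this.points += points; }
    public int getPoints() { return points; }
}

class Library {
    // Library sees only the narrow interfaces, never Title or Member
    public void donate(Copyable title, Rewardable donor) {
        title.registerCopy();
        donor.rewardPoints(10);
    }
}
```

Library's tests can stub Copyable and Rewardable, while the getters are exercised only by the tests for Title and Member themselves.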

In practice, this turns out to be preferable to the favoured Tell, Don't Ask approach in TDD that puts all the assertions about the work in acceptance tests and relies mostly on interaction testing at the unit level. Just because Library told Copyable to register one copy, that doesn't mean Title actually registered one copy. Someone, somewhere, has to ask that question. Ask. And to ask that question, the number of copies has to be accessible somehow. Of course, we could cheat and preserve our getter-less implementation by accessing state by the back door (e.g., going direct to a database), but this is far worse than using a getter, IMHO. We're still sharing the knowledge, but we're pushing that dependency out of the code where our compiler can see it.

Just as functional programming's "dirty little secret" is that - somewhere, somehow - state changes, Tell, Don't Ask's is that eventually we have to ask about the internal state of objects (even if we're not going to those classes in our code to do it). It may be for use in an acceptance test, or in the user interface, or in a report - but if we don't make object state accessible somehow, then our OO applications can have no O in their I/O.

Having said all that, I do find that Tell, Don't Ask and interaction tests can lead us to more decoupled designs, in that they can lead us to minimal interfaces through which object collaborations happen.

One last thought on this: if the only reason we have methods for accessing state is for testing implementations (which, as mentioned, is not really the case, but let's run with it for the sake of argument), greater information hiding could be achieved by internalising those assertions.

This would satisfy both needs of having the work tested as close to where it's done as possible, and exposing as little internal state as possible. Our tests would not actually ask any questions, they would simply be drivers that exercise our objects with various inputs.
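A sketch of what internalising an assertion might look like. The class and check are invented for illustration, and a plain runtime check stands in for an assertion so it works without enabling Java's `-ea` flag; the key point is that the test never asks a question, it just drives the object:

```java
// Hypothetical example: the postcondition lives inside the object,
// right where the work is done, so no getter is needed for testing.
class SelfCheckingTitle {
    private int copies;

    public void registerCopy() {
        int before = copies;
        copies++;
        // internalised assertion: checked at the point of the work
        if (copies != before + 1)
            throw new IllegalStateException("registering must add exactly one copy");
    }
}
```

A test would simply call registerCopy() with various inputs; if the internal check ever fails, the test fails, without any state being exposed.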

May 25, 2016

How Many Bugs In Your Code *Really*?

A late night thought, courtesy of the very noisy foxes outside my window.

How many bugs are lurking in your code that you don't know about?

Your bug tracking database may suggest you have 1 defect per thousand lines of code (KLOC), but maybe that's because your tests aren't very thorough. Or maybe it's because you deter users from reporting bugs. I've seen it all, over the years.

But if you want to get a rough idea of how many bugs there are really, you can use a kind of mutation testing.

Create a branch of your code and deliberately introduce 10 bugs. Do your usual testing (manual, automated, whatever it entails), and keep an eye on bugs that get reported. Stop the clock at the point you'd normally be ready to ship it. (But if shipping it *is* your usual way of testing, then *start* the clock there and wait a while for users to report bugs.)

How many of those deliberate bugs get reported? If all 10 do, then the bug count in your database is probably an accurate reflection of the actual number of bugs in the code.

If 5 get reported, then double the bug count in your database. If your tracking says 1 bug/KLOC, you probably have about 2/KLOC.
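The arithmetic is just scaling the reported rate by the fraction of seeded bugs that were found. A quick sketch (the method name is mine):

```java
// Defect-seeding estimate: scale the reported bug rate by
// (bugs seeded / seeded bugs that were found).
public class BugEstimate {
    static double estimatedBugsPerKloc(double reportedPerKloc, int seeded, int found) {
        return reportedPerKloc * ((double) seeded / found);
    }
}
```

So with 1 reported bug/KLOC and 5 of 10 seeded bugs found, the estimate is 1 × (10 ÷ 5) = 2 bugs/KLOC.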

If none get reported, then your code is probably riddled with bugs you don't know about (or have chosen to ignore.)

May 23, 2016

Experimental Draft Book Chapter - TDD-ing Integration Code

So, it's like the worst-kept secret that I've been brain-dumping my thoughts on TDD into convenient A5 book form, for various purposes.

It's still very much in draft form, and some of the chapters so far make more sense than others (probably).

One topic I was very keen to cover is TDD and integration code, because it comes up on every Codemanship course.

This is a very rough early draft of my chapter on Test-driving Integration Code that I'd like to test on a few guinea pigs. Have a butcher's, and if you've got feedback or suggestions, please send them to me.

May 21, 2016

Classic TDD Mistake - Writing Design-Driven Tests

Just a quick post for a grey and windy Saturday morning about a classic mistake developers make when learning TDD. Vexingly, this is something that's sometimes even taught in popular TDD tutorials. So I want to set the record straight.

People learn that the Golden Rule of TDD is that you don't declare any production code unless there's a failing test that requires it. They may misinterpret this to mean that if they plan to have a class or a method or a field, they should write a test specifically to require that.

Imagine we're TDD-ing some code that checks if a string is a palindrome (i.e., the same backwards as forwards). I've actually seen this example done in a tutorial video. The demonstrator starts by writing a test like this:
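Something along these lines - a minimal sketch, rendered without a test framework so it stands alone; the exact code in the video may differ:

```java
// A sketch of the kind of redundant, design-driven test described.
// All it requires is that PalindromeChecker can be declared and
// instantiated - it says nothing about behaviour.
class PalindromeChecker {
}

class PalindromeCheckerExistsTest {
    static void checkerExists() {
        Object checker = new PalindromeChecker();
        // "passes" as long as the class merely exists
    }
}
```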

This test forces them to declare PalindromeChecker. But it's a redundant test. All we're doing here is "cheating" at TDD because we're not discovering the need for a PalindromeChecker in order to be able to check if strings are palindromes.

It's supposed to work the other way around; this isn't test-driven design, it's design-driven testing, because the only reason we wrote this test is that we wanted to declare a PalindromeChecker class.

TDD should focus on the work that our objects do, and the classes and methods will reveal themselves as places to put that work.

If we started with this test instead:
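A behaviour-first sketch, again framework-free; the implementation detail is mine, but the shape of the test matches the description - assert what the checker does, and let the class and method fall out of that need:

```java
// The test asserts behaviour; PalindromeChecker and isPalindrome()
// exist only because this test needs them to.
class PalindromeChecker {
    boolean isPalindrome(String candidate) {
        // one possible implementation - the test doesn't care which
        return new StringBuilder(candidate).reverse().toString().equals(candidate);
    }
}

class PalindromeBehaviourTest {
    static boolean abbaIsAPalindrome() {
        return new PalindromeChecker().isPalindrome("abba");
    }
}
```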

...then what's explicitly asserted in the first test is now implicitly required for this test to pass. If PalindromeChecker doesn't exist, then the code won't compile. If we don't instantiate PalindromeChecker, calling isPalindrome() will cause a null reference exception.

Most importantly, I decided I needed a method that checks if strings are palindromes - which I gave a (hopefully) self-explanatory name - and I decided to attach that method to a class that also has a self-explanatory name, purely so I could test if "abba" is a palindrome.

More generally, don't write tests about design structure: "There should be a class called Zoo, and it should have a collection of enclosures, and a keeper, who has a name." etc etc. Start with "What does this zoo do? What are the rules when it does it?", and the structure will reveal itself.

May 10, 2016

60% Off Codemanship Intensive TDD 1-day Workshop

Just a quick note to mention that this month I'm celebrating 7 years since founding Codemanship - my personal passion project.

To mark the occasion, and to say thanks for 7 years of doing what I love, I'm offering a whopping 60% discount on the popular Intensive TDD workshop for the next 7 clients who book it.

Usually, the 1-day workshop costs £3,000 for up to 20 developers. For the next 7 bookings, it'll be just £1,200. That's as little as £60 per person for a full day's hands-on TDD training.

May 4, 2016

Scaling Kochō for the Enterprise

Unless you've been living under a rock, you'll no doubt have heard about Kochō. It's the new management technique that's been setting the tech world on fire.

Many books, blogs and Hip Hop ballets have been written about the details of Kochō, so suffice it for me to quickly summarise it here for anyone who needs their memory refreshing.

Kochō is an advanced technique for scheduling and tracking work that utilises hedgehogs and a complex network of PVC tubes. Task cards are attached to the hedgehogs - by the obvious means - and then they're released into the network to search for cheese or whatever it is that hedgehogs eat. The tubes have random holes cut out above people's desks. When a hedgehog falls through one of these holes, the person at that desk removes the task card and begins work. Progress is measured by asking the hedgehogs.

So far, we've mainly seen Kochō used successfully on small teams. But the big question now is: does it scale?

There are many practical barriers to scaling Kochō to the whole enterprise, including:

* Availability of hedgehogs
* Structural weakness of large PVC tube networks
* Infiltration of Kochō networks by badgers
* Shortage of Certified Kochō Tubemasters

In this blog post, I will outline how you can overcome these hurdles and scale Kochō to any size of organisation.

Availability of hedgehogs

As Kochō has become more and more popular, teams have been hit by chronic hedgehog shortages. This is why smart organisations are now setting up their own hedgehog farms. Thankfully, it doesn't take long to produce a fully-grown, Kochō-ready hedgehog. In fact, it can be done in just one hour. We know it's true, because the organiser of the Year Of Hedgehogs said so on TV.

Structural weaknesses of large PVC tube networks

Steel-reinforce them.

Infiltration of Kochō networks by badgers

Regrettably, some managers have trouble telling a badger from a hedgehog. Well, one mammal is pretty much the same as another, right? Weeding out the badgers on small Kochō teams is straightforward. But as team sizes grow, it becomes harder and harder to pay enough attention to each individual "hedgehog" to easily spot imposters.

Worry not, though. If you make the holes bigger, badgers can work just as well.

Carry on. As you were.

Shortage of Certified Kochō Tubemasters

Many teams employ CKTs to keep an eye on things and ensure the badgers - sorry, "hedgehogs" - are following the process correctly. But, if hedgehogs are in short supply these days, CKTs are like proverbial hen's teeth.

Only a few teams dare try Kochō without a CKT. And they have learned that you don't actually need one... not really.

In fact, Kochō can work perfectly well without CKTs, tube networks, hedgehogs, or Kochō. Indeed, we're discovering that not doing Kochō scales best of all.

April 30, 2016

Goals vs. Constraints

A classic source of tension and dysfunction in software teams - well, probably all kinds of teams, really - is the relationship between goals and constraints.

Teams often mistake constraints for goals. A common example is when teams treat a design specification as a goal, and lose sight of where that design came from in the first place.

A software design is a constraint. There may be countless ways of solving a problem, but we chose this one. That's the very definition of constraining.

On a larger scale, I've seen many tech start-ups lose sight of why they're doing what they're doing, and degenerate into focusing 100% on raising or making the money to keep doing whatever it is they're doing. This is pretty common. Think of those charities that started out with a clear aim to "save the cat" or whatever, but fast-forward a few years and most - if not all - of the charities' efforts end up being dedicated to raising the funds to pay everybody and keep the charity going.

Now, you could argue that a business's goal is to make money, and that they make money in exchange for helping customers to satisfy their goals. A restaurant's goal is to make money. A diner's goal is to be fed. I give you money. You stop me from being hungry.

Which is why - if your organisation's whole raison d'être is to make a profit - it's vitally important to have a good, deep understanding of your customer's goals or needs.

That's quite a 19th century view of business, though. But even back then, some more progressive industrialists saw aims above and beyond just making a profit. At their best, businesses can provide meaning and purpose for employees, enrich their lives, enrich communities and generally add to the overall spiffiness of life in their vicinity.

But I digress. Where was I? Oh yes. Goals vs. constraints.

Imagine you're planning a trip from your home in Los Angeles to San Francisco. Your goal is to visit SF. A constraint might be that, if you're going to drive, you'll need enough gasoline for the journey.

So you set out raising money for gas. You start a lemonade stall in your front yard. It goes well. People like your lemonade, and thanks to the convenient location of your home, there are lots of passers-by with thirsts that need quenching. Soon you have more than enough money for gas. But things are going so well on your lemonade stall that you've been too busy thinking about that, and not about San Francisco. You make plans to branch out into freshly squeezed orange juice, and even smoothies. You get a bigger table. You hire an assistant, because there's just so much to be done. You buy a bigger house on the same street, with a bigger yard and more storage space. Then you start delivering your drinks to local restaurants, where they go down a storm with diners. 10 years later, you own a chain of lemonade stalls spanning the entire city.

Meanwhile, you have never been to San Francisco. In fact, you're so busy now, you may never go.

Now, if you're a hard-headed capitalist, you may argue "so what?" Surely your lemonade business is ample compensation for missing out on that trip?

Well, maybe it is, and maybe it isn't. As I get older, I find myself more and more questioning "Why am I doing this?" I know too many people who got distracted by "success" and never took those trips, never tried those experiences, never built that home recording studio, never learned that foreign language, and all the other things that were on their list.

For most of us - individuals and businesses alike - earning money is a means to an end. It's a constraint that can enable or prevent us from achieving our goals.

As teams, too, we can too easily get bogged down in the details and lose sight of why we're creating the software and systems that we do in the first place.

So, I think, a balance needs to be struck here. We have to take care of the constraints to achieve our goals, but losing sight of those goals potentially makes all our efforts meaningless.

Getting bogged down in constraints can also make it less likely that we'll achieve our goals at all.

Constraints constrain. That's sort of how that works. If we constrain ourselves to a specific route from LA to San Francisco, for example, and then discover half way that the road is out, we need other options to reach the destination.

Countless times, I've watched teams bang their heads against the brick wall trying to deliver on a spec that can't - for whatever reason - be done. It's powerful voodoo to be able to step back and remind ourselves of where we're really headed, and ask "is there another way?" I've seen $multi-million projects fail because there was no other way - deliver to the spec, or fail. It had to be Oracle. It had to be a web service. It had to be Java.

No. No it didn't. Most constraints we run into are actually choices that someone made - maybe even choices that we made for ourselves - and then forgot that it was a choice.

Yes, try to make it work. But don't mistake choices for goals.

April 26, 2016

SC2016 Codemanship Stocking Filler Mini-Projects

Software Craftsmanship 2016 - Codemanship Stocking Filler Mini-Projects

Screencast A Habit

Estimated Duration: 30-60 mins

Author: Jason Gorman, Codemanship

Language(s)/stacks: Any that's screencast-able


Record a screencast to demonstrate a single good coding habit you believe is important (e.g., "always run the tests after every refactoring").

Demonstrate how not to do it (and what the consequences of not doing might be), as well as how to do it well.

Test your screencast on your fellow SC2016 participants.

Please do NOT upload your screencast to the web until after the event. The Wi-Fi won't take it!

Learn Your IDE's Shortcuts - The Hard Way

Estimated Duration: 30-60 mins

Author: Jason Gorman, Codemanship

Language(s)/stacks: Any


Disable the mouse and/or tracker pad on your computer, and attempt a TDD kata (here are some the Web made earlier), using only the keyboard.

Adversarial Pairing - Programmers At War!!!

Estimated Duration: 30-60 mins

Author: Jason Gorman, Codemanship

Language(s)/stacks: Any


Choose a TDD kata (here are some the Web made earlier).

Person A starts by writing the first failing test.

Person B takes over and writes the simplest evil code that passes the test, but is obviously not what was intended.

Person A must improve the test - or add another test - to steer it back to the original intent.

If Person B can still see a way to do evil with the improved tests, they should do it.

And rinse and repeat until the intended behaviour or rule has been correctly implemented - that is to say, for any valid input, the code will produce the correct result for that behaviour or rule.

Then swap over for the next behaviour or rule in the kata (if there is one), so Person B starts by writing a test, and Person A writes the simplest evil code to pass it.

If time allows, and you have access to a mutation testing tool for your programming language, subject your project code to mutation testing to see how well tested it is.

If I Ruled The (Coding) World...

Estimated Duration: 30-60 mins

Author: Jason Gorman, Codemanship

Language(s)/stacks: Any


The programmer ballots have been counted, and you have been elected the Programmer President of The World.

You can - by decree - change any one thing about the way we all write software. JUST ONE THING.

What would it be?

Use pictures, sample code, fake screenshots, plasticine expression or interpretative dance to illustrate your single decree as Programming President. Stick them on the web where we can see them.

Extra Spaces At Software Craftsmanship 2016 (London)

Good news for code crafters in the London area; we've made extra space for 10 more people at SC2016 on Saturday May 14th.

Bring a laptop, find someone to pair with (or someones to mob with if a projector is free), pick a mini-project from our menu of fun, challenging and crafty code excursions, and do what you do best!

Details and tickets can be found at

April 25, 2016

Mutation Testing & "Debuggability"

More and more teams are waking up to the benefit of checking the levels of assurance their automated tests give them.

Assurance, as opposed to coverage, answers a more meaningful question about our regression tests: if the code was broken, how likely is it that our tests would catch that?

To answer that question, you need to test your tests. Think of bugs as crimes in your code, and your tests as police officers. How good are your code police at detecting code crimes? One way to check would be to deliberately commit code crimes - deliberately break the code - and see if any tests fail.

This is a practice called mutation testing. We can do it manually, while we pair - I'm a big fan of that - and we can do it using one of the increasingly diverse (and rapidly improving) mutation testing tools available.

For Java, for example, there are tools like Jester and PIT. What they do is take a copy of your code (with unit tests), and "mutate" it - that is, make a single change to a line of code that (theoretically) should break it. Examples of automated mutations include turning a + into a -, or a < into <=, or ++ into --, and so on.
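For example (the class here is invented; the + to - swap is the kind of change PIT's math mutator makes):

```java
class Till {
    static int total(int price, int tax) {
        return price + tax;
        // mutant version: return price - tax;
        // Any test asserting total(10, 2) == 12 would "kill" that mutant;
        // a test that only checked total(10, 0) would let it survive.
    }
}
```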

After it's created a "mutant" version of the code, it runs the tests. If one or more tests fail, then they are said to have "killed the mutant". If no test fails, then the mutant survives, and we may need to have a think about whether that line of code that was mutated is being properly tested. (Of course, it's complicated, and there will be some false positives where the mutation tool changed something we don't really care about. But the results tend to be about 90% useful, which is a boon, IMHO.)

Here's a mutation testing report generated by PIT for my Combiner spike:

Now, a lot of this may not be news for many of you. And this isn't really what this blog post is about.

What I wanted to draw your attention to is that - once I've identified the false positives in the report - the actual level of assurance looks pretty high (about 95% of mutations I cared about got killed.) Code coverage is also pretty high (97%).

While my tests appear to be giving me quite high assurance, I'm worried that may be misleading. When I write spikes - intended as proof of concept and not to be used in anger - I tend to write a handful of tests that work at a high level.

This means that when a test fails, it may take me some time to pinpoint the cause of the problem, as it may be buried deep in the call stack, far removed from the test that failed.

For a variety of good reasons, I believe that tests should stick close to the behaviour being tested, and have only one reason to fail. So when they do fail, it's immediately obvious where and what the problem might be.

Along with a picture of the level of assurance my tests give me, I'd also find it useful to know how far removed from the problem they are. Mutation testing could give me an answer.

When tests "kill" a mutant version of the code, we know:

1. which tests failed, and
2. where the bug was introduced

Using that information, we can calculate the depth of the call stack between the two. If multiple tests catch the bug, then we take the shallowest depth out of those tests.

This would give me an idea of - for want of a real word - the debuggability of my tests (or rather, the lack of it). The shallower the depth between bugs and failing tests, the higher the debuggability.
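A sketch of how that metric might be computed. No mutation testing tool reports this today, as far as I know; the data structure and numbers below are invented to show the calculation:

```java
import java.util.*;

// For each killed mutant, take the shallowest call-stack depth among
// the tests that caught it; average those for a crude "debuggability"
// score (lower average depth = higher debuggability).
class Debuggability {
    static double averageShallowestDepth(Map<String, List<Integer>> depthsByMutant) {
        double total = 0;
        for (List<Integer> depths : depthsByMutant.values()) {
            total += Collections.min(depths);   // shallowest catching test wins
        }
        return total / depthsByMutant.size();
    }
}
```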

I also note a relationship between debuggability and assurance. In examining mutation testing reports, I often find that the problem is that my tests are too high-level, and that if I wrote more focused tests closer to the code doing that work, they would catch edge cases I didn't think about at that higher level.