August 19, 2014

Programming In Schools - My Final Word (Honest!)

So, the nights are drawing in as we drift towards autumn and a new school year. And my thoughts turn to the teachers and children from ages five upwards who will be doing computer programming starting in September.

This is the culmination of several years of campaigning by groups like Computing At School to bring programming and computer science (well, let's be honest: mainly computer science) back into schools after a couple of decades of unchallenging "ICT".

While the return of programming to British schools is very welcome, I'm afraid that, when it comes to the question of how we're solving the problem, I cannot escape the feeling that the Emperor has no clothes.

And I'll tell you for why:

Firstly, computing has been shoehorned into an already bulging syllabus as an academic subject like mathematics. And we all know how much five-year-olds love maths.

This is a mistake, in my opinion. I try to imagine myself as a five year old being taught computer science and "computational thinking". I suspect it might have put me off programming for life.

Programming should begin with creativity and fun. Make games, make noises, be silly, do cool stuff, impress your friends, take it home to show your parents. It should be treated in a similar vein to making bread, or collages, or space man costumes. Freely available tools like MIT's Scratch and Microsoft's Kodu present teachers with an opportunity for supervised creativity, during which kids can wrap their heads around basic programming concepts and accustom themselves to the whole business of giving instructions to a computer and seeing what the results are. That's actually how most professional programmers learned; trying stuff out and seeing what happens.

So, the emphasis - especially with younger children - should be on shits & giggles. Not theory and maths.

Secondly, let's look at the money and the time being invested.

Schools haven't known for more than a few months that computing will now be a mandatory part of the syllabus. There's been precious little time to prepare, and the help that's available to teachers is being spread very thin.

The total being invested in training teachers across the UK - bearing in mind that now all state primary and secondary schools must teach this from September (many tens of thousands of teachers) - is less than £4 million.

That's off by a couple of trailing zeros, when you consider the sheer size of the undertaking and how long it really takes to learn how to program a computer well enough to teach others.

Our government - not just today's, but going back decades - quite merrily blows billions on failed projects. We blew £10 billion+ on 2 weeks of running and jumping, for example. We're blowing hundreds of millions on aircraft carriers we can't use.

After all the talk of a high-skill, high-tech future economy, it seems we're not prepared to put our money where our mouth is. For this reason, I believe teachers and schools are being set up to fail.

Finally, let's assume that all of these little wrinkles get ironed out, and the path from primary education through to a computer science degree is smoothed to the extent that CS courses are full-to-bursting and their cups runneth over with eager young minds...

Does this solve the problem of there being not enough good software developers to stoke the boiler of the British digital engine?

Probably not.

There's been an underlying assumption running through all of this that computer scientists and software developers are the same thing; that young people who study the theory at college or university will be exactly the kind of employees software organisations are screaming out for more of.

Nope. That's not what they teach at university - with very few exceptions. I hear it time and again: "I learned more useful stuff in my first 6 weeks in this job than in my entire degree."

And that's not to denigrate CS degree courses. They're quite right not to devote much time to software development vocational skills, for the same reason that physics courses are quite right not devote much time to building fibre optic telephone exchanges.

We have time to address that question, though. While a new generation of programmers finds their feet in our schools, we must figure out how best to train and nurture great software developers.

So, for what it's worth, here are my predictions:

1. In the short term, computing in schools is going to be a bit of a train wreck. Many teachers and schools will not be ready to deliver this in September. It will hurt when they try.

2. The government - probably not this current one - will figure out that more time and much, much more money needs to be invested. Hey, it turns out this stuff isn't as easy as they told us. Go figure!

3. The emphasis of computing, especially from ages 5-11 will be forced to shift from academic subject to practical fun creative la la time. This will be welcomed by anyone who had their own practical fun creative la la time programming experience when they were kids, leading to a lifelong love of doing it.

4. (not so much a prediction as a commitment) As a profession, we need to get our shit together on apprenticeships and other vocational routes into software development, before a horde of crappy Python programmers fresh out of school descends upon us.

August 12, 2014

TDD is TDD (And Far From Dead)

Now, it would take enormous hubris for me to even suggest that this blog post that follows is going to settle the "What is TDD?" "Is TDD dead?" "Did weazels rip my TDD?" debates that have inexplicably sprung up around and about the countryside of late.

But it will. (In my head, at any road.)

First of all, what is TDD? I'm a bit dismayed that this debate is still going on, all these years later. TDD is what it always was, right from the time the phrase appeared:

Test-driven Development = Test-driven Design + Refactoring

Test-driven Design is the practice of designing our software to pass tests. They can be any kind of tests that software can pass: unit tests, integration tests, customer acceptance tests, performance tests, usability tests, code quality tests, donkey jazz hands tests... Any kind of tests at all.

The tests provide us with examples of how the software must be - at runtime, at design time, at tea time, at any time we say - which we generalise with each new test case to evolve a design for software that does a whole bunch of stuff, encompassed by the set of examples (the suite of tests) for that software.

We make no distinction in the name of the practice as to what kind of tests we're aiming to pass. We do not call it something else just because the tests we're driving our design with happen not to be unit tests.

Refactoring is the practice of improving the internal design of our software to make it easier to change. This may mean making the code easier for programmers to understand, or generalising duplicate code into some kind of reusable abstraction like a parameterised method or a new module, or unpicking a mess of dependencies to help localise the impact of making changes.

As we're test-driving our designs, it's vitally important to keep our code clean and maintainable enough to allow us to evolve it going forward to pass new tests. Without refactoring, Test-driven Design quickly becomes hard going, and we lose the ability to adapt to changes and therefore to be agile for our customer.

The benefits of TDD are well understood, and backed up by some hard data. Software that is test-driven tends to be more reliable. It tends to be simpler in its design. Teams that practice TDD tend to find it easier to achieve continuous delivery. From a business perspective, this can be very valuable indeed.

Developers who are experienced in TDD know this to be true. Few would wish to go back to the Bad Old Days before they used it.

That's not say that TDD is the be-all and end-all of software design, or that the benefits it can bring are always sufficient for any kind of software application.

But it is very widely applicable in a wide range of applications, and as such has become the default approach - a sort of "start for ten" - for many teams who use it.

It is by no means dead. There are more teams using it today than ever before. And, as a trainer, I know there are many more that aspire to try it. It's a skill that's highly in demand.

Of course, there are teams who don't succeed at learning TDD. Just like there are people who don't succeed at learning to play the trombone. The fact that not everybody succeeds at learning it does not invalidate the practice.

I've trained and coached thousands of developers in TDD, so I feel I have a good overview of how folk get on with it. Many - most, let's be honest - seriously underestimate the learning curve. Like the trombone, it may take quite a while to get a tune out of it. Some teams give up too easily, and then blame the practice. Many thousands of developers are doing it and succeeding with it. I guess TDD just wasn't for you.

So there you have it, in a nutshell: TDD is what it always was. It goes by many names, but they're all pseudonyms for TDD. It's bigger today than it ever was, and it's still growing - even if some teams are now calling it something else.

There. That's settled, then.

July 31, 2014

My Top 5 Most Under-used Dev Practices

So, due to a last-minute change of plans, I have some time today to fill. I thought I'd spend it writing about those software development practices that come highly recommended by some, but - for whatever reason - almost no teams do.

Let's count down.

5. Mutation Testing - TDD advocates like me always extol the benefits of having a comprehensive suiite of tests we can run quickly, so we can get discover if we've broken our code almost immediately.

Mutation testing is a technique that enables us to ask the critical question: if our code was broken, would our tests show it?

We deliberately introduce a programming error - a "mutation" - into a line of code (e.g., turn a + into a -, or > into a <) and then run our tests. If a test fails, we say our test suite has "killed the mutant". We can be more assured that if that particular line of code had an error, our tests would show it. If no tests fail, that potentially highlights an gap in our test suite that we need to fill.

Mutation testing, done well, can bring us to test suites that offer very high assurance - considerably higher than I've seen most teams achieve. And that extra assurance tends to bring us economic benefits in terms of catching more bugs sooner, saving us valuable time later.

So why do so few teams do it? Well, tool support is one issue. The mutation testing tools available today tend to have a significant learning curve. They can be fiddly, and they can throw up false positives, so teams can spend a lot of time chasing ghosts in their test coverage. It takes some getting used to.

In my own experience, though, it's worth working past the pain. The pay-off is often big enough to warrant a learning curve.

So, in summary, reason why nobody does it : LEARNING CURVE.

4. Visualisation - pictures were big in the 90's. Maybe a bit too big. After the excesses of the UML days, when architects roamed the Earth feeding of smaller prey and taking massive steaming dumps on our code, visual modeling has - quite understandably - fallen out of favour. So much so that many teams do almost none at all. "Baby" and "bathwater" spring to mind.

You don't have to use UML, but we find that in collaborative design, which is what we do when we work with customers and work in teams, a picture really does speak a thousand words. I still hold out hope that one day it will be commonplace to see visualisations of software designs, problem domains, user interfaces and all that jazz prominently displayed in the places where development teams work. Today, I mainly just see boards crammed with teeny-weeny itty-bitty index cards and post-it notes, and the occasional wireframe from the UX guy, who more often than not came up with that design without any input at all from the team.

The effect of lack of visualisation on teams can be profound, and is usually manifested in the chaos and confusion of a code base that comprises several architectures and a domain model that duplicates concepts and makes little to no sense. If you say you're doing Domain-driven Design - and many teams do - then where are your shared models?

There's still a lot of mileage in Scott Ambler's "Agile Modeling" book. Building a shared understanding of a complex problem or solution design by sitting around a table and talking, or by staring at a page of code, has proven to be uneffective. Pictures help.

In summary, reason why so few do it: MISPLACED AGILE HUBRIS

3. Model Office - I will often tell people about this mystical practice of creating simulated testing environments for our software that enable us to see how it would perform in real-world scenarios.

NASA's Apollo team definittely understood the benefits of a Model Office. Their lunar module simulator enabled engineers to try out solutions to systemm failures on the ground before recommending them to the imperilled astronauts on Apollo 13. Tom Hanks was especially grateful, but Bill Paxton went on to star in the Thunderbirds movie, so it wasn't all good.

I first heard about them doing a summer stint in my local W H Smith in the book department. Upstairs, they had a couple of fake checkouts and baskets of fake goods with barcodes.

Not only did we train in their simulated checkout, but they also used them to analyse system issues and to plan IT changes, as well as to test those changes in a range of "this could actually happen" scenarios.

A Model Office is a potentially very powerful tool for understanding problems, for planning solutions and for testing them - way more meaningful than acceptance tests that were agreed among a bunch of people sitting in a room, many of whom have never even seen the working environment in which the software's going to be used, let alone experienced it for themselves.

There really is no substitute for the real thing; but the real thing comes at a cost, and often the real thing is quite busy, actually, thank you very much. I mean, dontcha just hate it when you're at the supermarket and the checkout person is just learning how it all works while you stand in line? And mistakes that get made get made with real customers and real money.

We can buy ourselves time, control and flexibility by recreating the real thing as faithfully as possible, so we can explore it at our leisure.

Time, because we're under no pressure to return the environment to business use, like we would be if it was a real supermarket checkout, or a real lunar module.

Control, because we can deliberately recreate scenarios - even quite rare and outlandish ones - as often as we like, and make it exactly the same, or vary it, as we wish. One of the key reasons I believe many business systems are not very robust is because they haven't been tested in a wide-enough range of possible circumstances. In real life, we might have to wait weeks for a particular scenario to arise.

Flexibility, because in a simulated environment, we can do stuff that might be difficult or dangerous in the real world. We can try out the most extraordinary situations, we can experiment with solutions when the cost of failure is low, and we can explore the problem and possible solutions in ways we just couldn't or wouldn't dare to if real money, or real lives, or real ponies were at stake.

For this reason,, from me, Model Offices come very highly recommended. Which is very probably why nobody uses them.

Reason why nobody does it - NEVER OCCURRED TO THEM

2. Testing by Inspection - This is another of those blind spots teams seem to have about testing. Years of good research have identified reading the code to look for errors as one of the most - if not the most - effective and efficient ways of finding bugs.

Now, a lot of teams do code reviews. It's a ritual humiliation many of us have to go through. But commonly these reviews are about things like coding style, naming conventions, design rules and so forth. It's vanishingly rare to meet a team who get around a computer, check out some code and ask "okay, will this work?"

Testing by inspection is actually quite a straightforward skill, if we want it to be. A practice like guided inspection, for example, simply requires us to pick some interesting test cases, and step through the code, effectively executing it in our heads, asking questions like "what should be true at this point?" and "when might this line of code not work?"

If we want to, we can formalise that process to a very high degree of rigour. But the general pattern is the same; we make assertions about what should be true at key points during the execution of our code, we read the code and dream up interesting test cases that will cause those parts of the code to be executed and ask those questions at the appropriate times. When an inspection throws up interesting test cases that our code doesn't handle, we can codify this knowledge as, say, automated unit tests to ensure that the door is closed to that particular bug permanently.

Do not underestimate the power of testing by inspection. It's very rare to find teams producing high-integrity software who don't do it. (And, yes, I'm saying it's very rare to find teams producing high-integrity software.)

But, possibly because of associations with the likes of NASA, and safety-critical software engineering in general, it has a reputatioon for being "rocket science". It can be, if we choose to go that far. But in most cases, it can be straightforward, utilising things we already know about computer programming. Inspections can be very economical, and can reap considerable rewards. And pretty much anyone who can program can do them. Which is why, of course, almost nobody does.

Reason why nobody does it - NASA-PHOBIA

1. Business Goals - Okay, take a deep breath now. Imminent Rant Alert.

Why do we build software?

There seems to be a disconnect between the motivations of developers and their customers. Customers give us money to build software that hopefully solves their problems. But, let's be honest now, a lot of developers simply could not give two hoots about solving the customer's problems.

Which is why, on the vast majority of software teams, when I ask them what the ultimate business goals of what they're doing are, they just don't know.

Software for the sake of software is where our heads are mostly at. We buiild software to build software.

Given a free reign, what kind of software do developers like to build? Look on Github. What are most personal software projects about?

We don't build software to improve care co-ordination for cancer sufferers. We don't build software to reduce delivery times for bakeries. We don't build software to make it easier to find a hotel room with fast Wi-Fi at 1am in a strange city.

With our own time and resources, when we work on stuff that interests us, we won't solve a problem in the real world. We'll write another Content Management System. Or an MVC framework. Or another testing tool. Or another refactoring plug-in. Or another VCS.

The problems of patients and bakers and weary travelers are of little interest to us, even though - in real life - we can be all of these things ourselves.

So, while we rail at how crappy and poorly thought-out the software we have to use on a daily basis tends to be ("I mean, have they never stayed in a hotel?!"), our lack of interest in understanding and then solving these problems is very much at the root of that.

We can be so busy dreaming up solutions that we fail to see the real problems. The whole way we do development is often a testament to that, when understanding the business problem is an early phase in a project that, really, shouldn't exist until someone's identified the problem and knows at least enough to know it's worth writing some software to address it.

Software projects and products that don't have clearly articulated, testable and realistic goals - beyond the creation of software for its own sake - are almost guaranteed to fail; for the exact same reason that blindly firing arrows in random directions with your eyes closed is almost certainly not going to hit a valuable target. But this is what, in reality, most teams are doing.

We're a solution looking for a problem. Which ultimately makes us a problem. Pretty much anyone worth listening to very, very strongly recommends that software development should have clear and testable business goals. So it goes without saying that almost no teams bother.

Reason why so few teams do it - APATHY

July 29, 2014

Please Support #DevConDebut & Help Us Bring New Voices Into Software Development

Are you tired of hearing from the same old faces at dev conferences (including mine)?

#DevConDebut is a brand new conference designed to promote new talent. Every speaker will be someone who has never spoken at a conference before.

We're aiming to take a few risks to bring fresh ideas and new voices into the developer community. But we need your support to do this.

Conference organisers like to use experience speakers; they're a safer bet and they can make events easier to promote.

If, like me, you want to see new talent being encouraged, then you can help in one of 3 ways:

1. Buy a ticket - come along and support our new speakers

2. Spread the word - tell friends, colleagues and complete strangers in the checkout queue about #DevConDebut

3. Speak at #DevConDebut - we're open to all ideas from new speakers on any topic in software development

All the proceeds from ticket sales will help fund educational programmes in maths and computer programming at Bletchley Park and The National Museum of Computing. It's all in a very good cause.

I know folk are busy, and you've probably got 1,001 other things to worry about; but your support will be vitally important to get this off the ground. If, like me, you want to get behind new talent and hear new voices, please lend us your support. It can't happen without you.

Many thanks.

July 28, 2014

More Rage On ORM Abuses & How I Do It

So, a day later, and the rage about my last blog post, about how I see many developers abusing Object-Relational Mapping frameworks like Hibernate to build old-fashioned database-driven architectures, continues.

My contention is that we don't use what is essentially a persistence layer to build another persistence layer. We've already got one; we just have to know how to use it.

The litmus test for object-relational persistence - well, any kind of object persistence - is whether we're able to write our application in such a way that if we turned persistence off, the application would carry on working just dandy - albeit with a memory that only lasts as long as the running process.

If persistence occurs at the behest of objects in the core application's logic - and I'm not just talking about domain objects, here - then we have lost that battle.

Without throwing out a platform-specific example - because that way can lead to cargo cults ("Oh, so that's THE way to do it!" etc) - let's illustrate with a pseudo-example:

Mary is writing an application that runs the mini-site for a university physics department. She needs it to do things like listing academic staff, listing courses and modules, and accepting feedback from students.

In a persistence-agnostic application, she would just use objects that live in memory. Perhaps at the root there is a Faculty object. Perhaps this Faculty has a collection of Staff. Perhaps each staff member teaches specific Course Modules. Hey, stranger things have happened.

In the purely OO version, the root object is just there. There's only one Faculty. It's a singleton. (Gasps from the audience!)

So Mary writes it so that Faculty is instantiated when the web application starts up as an app variable.

She adds staff members to the faculty, using an add() method on the interface of Faculty. Inside, it inserts the staff member into the staff collection that faculty holds.

The staff listing page just iterates through that collection and builds a listing for each member. Simples.

Clicking on the link for a staff member takes you to their page, which lists the course modules they teach.

So far, no database. No DAOs. No SQL. No HQL. It's all happening in memory, and when we shut the web app down, all that data is lost.

But, and this is the important point, while the app is running, it does work. Mary knocks up a bit of code to pre-populate the application with staff members and course modules, for testing purposes. To her, this is no different to writing a SQL script that pre-populates a database. Main memory is her database - it's where the data is to be found.

Now, Mary wants to add persistence so she can deploy this web app into the real world.

Her goal is to do it, ideally, without changing a single line of the code she's already written. Her application works. It just forgets, is all.

So now, instead of building the test objects in a bit of code, she creates a database that has a FACULTY table (with only one record in it, the singleton), a STAFF_MEMBER table and a COURSE_MODULE table with associated foreign keys.

She creates a mapping file for her ORM that maps these tables onto their corresponding classes. Then she writes a sliver of code to fetch Faculty from the database. And, sure, if you want to call the object where that code resides a "FacultyRepository" then be my guest. Importantly, it lives aside from the logic of the application. It is not part of the domain model.

That's pretty much it, bar the shouting. The ORM takes care of the navigations. If we add a staff member to the faculty, another sliver of code executed, say, after server page processing persists the Faculty and the ORM cascades that down through it's relationships (providing we've specified the mapping that way.)

By using, say an Http filter (or an HttpModule, if that's what your tech architecture calls it) we can remove persistence concerns completely from the core logic of the application and achieve it as a completely separate done-in-one-place-only concern.

Hibernate's session-per-request pattern, for example, allows us to manage the scope of persistent transactions - fetching, saving, detaching and re-attaching persistent objects and so on - before and after page processing. Our Http filters just need to know where to find the persistent root objects. (Where is Faculty? It's in application state.) The ORM and our mapping takes care of the rest.

And so it is, without changing a line of code in any of her server pages, or her domain model, and without writing an DAOs or that sort of nonsense, Mary is able to make her application state persistent. And she can even, with a quick change to a config file, turn persistence on and off, and swap one method of persistence with another.

The controllers - the server pages - in Mary's web app need know nothing about persistence. They access state in much the same way they would anyway, either locally as variables on each server page, or as session and application variables. For them, the objects are just there when they're needed. And new objects they trigger the creation of are automatically inserted into the database by the ORM (ditto deletions and updates).

Now, we can mix and match and slice and dice this approach. Mary could have used a pre-processing Http filter to load Faculty into session state at the start of a session, making it a thread-specific singleton, and the persistence Http filters could be looking in session state for persistent objects. Or she could load it anew on each page load.

The important thing to remember is that the server page is non the wiser. All it needs to know is where to look to find these objects in memory. They are just there.

This is my goal when tackling object persistence. I want it to look like magic; something that just happens and we don't need to know how. Of course, someone needs to know, and, yes, there is more to it than the potted example I've given you. But the principles remain the same: object persistence should be a completely separate concern, not just from the domain model, but from all the core application code, including UI, controllers, and so on.

July 27, 2014

Object-Relational Mapping Should Be Felt & Not Seen

Here's a hand-wavy general post about object-relational persistence anti-patterns that I still keep seeing popping up in many people's code.

First, let me set out what the true goal of ORM's should be: an ORM is designed to allow us to build good old-fashioned object oriented applications where the data in our objects can outlive the processes they run in by storing said persistent data in a relational database.

Back in the bad old days, we did this by writing what we called "Data Access Objects" (DAO's) for each type of persistent object - often referred to as entities, or domain objects, or even "Entity Beans" (if your Java code happened to have disappeared up that particular arse in the late 1990's.)

This was very laborious, and often took up half the effort of development.

Many development teams working on web and "enterprise" applications were coming from a 2-tier database-driven background, and were most familiar and comfortable with the notion that the "model" in Model-View-Controller was the database itself. Hence, their applications tended to treat the SQL Server, Oracle or wotnot back-end as main memory and transact every state change of objects against it pretty much immediately. "Middle tier" objects existed purely as gateways to this on-disk relational memory. Transactions and isolation of changes was handled by the database server itself.

Not only did this lead to applications that could only be run meaningfully - including for testing - with that database server in place, but it also very tightly coupled the database to the application's code, making it rigid and difficult to evolve. If every third line of code involves a trip to the database, and if objects themselves aren't where the data is to be found most of the time - except to display it on the user's screen - then you still have what is essentially a database-driven application, albeit with a fancy hifalutin "middle tier" to create the illusion that it isn't.

Developers coming from an object oriented background suffered exactly the opposite problem. We knew how to build an application using objects where the data lived in memory, but struggled with persisting that data to a relational database. quite naturally, we just wanted that to sort of happen by magic, without us having to make any changes to our pristine object oriented code and sully it with DAOs and Units of Work and repositories and SQL mappers and transaction handling and blah blah blah.

And, whether you've heard or not, frameworks like Hibernate allow us to do pretty much exactly that; but only if we choose to do it that way.

Sadly, just as a FORTRAN programmer can write FORTRAN code in any programming language you give them, 2-tier database-driven programmers can write 2-tier database-driven code with even the most sophisticated ORMs.

Typically, what I see - and this is possibly built on a common misinterpretation of the advice given in Domain-driven Design about persistence architectures - is developers writing DAO's using ORMs. So, they'll take a powerful framework like Hibernate - which enables us to write our persistent objects as POJO's that hold application state in memory (so that the logic will work even if there's no database there), just like in the good old days - and convert their SQL mappers into HQL mappers that use the Hibernate Query Language to access data in the same chatty, database-as-main-memory way they were doing it before. Sure, they may be disguising it using domain root object "repositories", and that earns them some protection; for example, allowing us to mock repositories so we can unit test the application code. But when they navigate relationships by taking the ID of one object and using HQL to find its collaborator in the database, it all starts to get a bit dicey. A single web page request can involve multiple trips to the database, and if we take the database away, depending on how complicated the object graph is, we can end up having to weave a complex web of mock objects to recreate that object graph, since the database is the default source of application state. After all, it's main memory.

Smarter developers rely on an external mapping file and the in-built capabilities of Hibernate to take care of as much of that as possible.

They also apply patterns that allow them to cleanly separate the core logic of their applications from persistence and transactions. For example, the session-per-request pattern can be implemented for a web application by attaching persistent objects to a standard collection in session state, and managing the scope of transactions outside of main page request processing (e.g., using Http modules in ASP.NET to re-attach persistent objects in session state before page processing begins, and to commit changes after processing has finished.)

If we allow it, the navigation from a customer to her orders can be as simple as customer.orders. The ORM should take care of the fetching for us, provided we've mapped that relationship correctly in the configuration. If we add a new order, it should know how to take care of that, too. Or if we delete an order. It should all be taken care of, and ideally as a single transaction that effectively synchronises all the changes we made to our objects in memory with the data stored in the DB.

The whole point of an ORM is to generate all that stuff for us. To take something like Hibernate, and use it to write a "data access layer" is kind of missing that point.

We should not need a "CustomerRepository" class, nor a "CustomerDAO". We should need none of that, and that's the whole point of ORMs.

As much as possible in our code, Object-Relational Mapping should be felt, and not seen.

July 16, 2014

What Level Should We Automate Most Of Our Tests At?

So this blog post has been a long time in the making. Well, a long time in the procrastinating, at any rate.

I have several clients who have hit what I call the "front-end automated test wall". This is when teams place greatest emphasis on automating acceptance tests, preferring to verify the logic of their applications at the system level - often exercised through the user interface using tools like Selenium - and rely less (or not at all, in some cases) on unit tests that exercise the code at a more fine-grained level.

What tends to happen when we do this is that we end up with large test suites that require much set-up - authentication, database stuff, stopping and starting servers to reset user sessions and application state, and all the fun stuff that comes with system testing - and run very slowly.

So cumbersome can these test suites become that they slow development down, sometimes to a crawl. If it takes half an hour to regression test your software, that's going to make the going tough for Clean Coders.

The other problem with these high-level tests is that, when they fail, it can take a while to pin down what went wrong and where it went wrong. As a general rule of thumb, it's better to have tests that only have one reason to fail, so when something breaks it's alreay pretty well pinpointed. Teams who've hit the wall tend to spend a lot time debugging.

And then there's the modularity/reuse issue: when the test for a component is captured at a much higher level, it can be tricky to take that chunk and turn it into a reusable chunk. Maybe the risk calculation component of you web application could also be a risk calculation component of a desktop app, or a smartwatch app. Who knows? But when its contracts are defined through layers of other stuff like web pages and wotnot, it can be difficult to spin it out into a product in its own right.

For all these reasons, I follow the rule of thumb: Test closest to the responsibility.

One: it's faster. Every layer of unnecessary wotsisname the tests have to go through to get an answer adds execution time and other overheads.

Two: it's easier to debug. Searching for lost car keys gets mighty complicated when your car is parked three blocks away. If it's right outside the front door, and you keep the keys in a bowl in the hallway, you should find them more easily.

Three: it's better for componentising your software. You may call them "microservices" these days, but the general principles is the same. We build our applications by wiring together discrete components that each have a distinct responsibility. The tests that check if a component fulfils its reponsibility need to travel with that components, if at all possible. If only because it can get horrendously difficult to figure out what's being tested where when we scatter rules willy nilly. The risk calculation test wants to talk to the Risk Calculator component. Don't make it play Chinese Whsipers through several layers of enterprise architecture.

Sometimes, when I suggest this, developers will argue that unit tests are not acceptance tests, because unit tests are not written from the user's perspective. I believe - and find from experience - that this is founded on an artificial distinction.

In practice, an automated acceptance test is just another program written by a programmer, just like a unit test. The programmer interprets the user's requirements in both cases. One gives us the illusion of it being the customer's test, if we want it to be. But it's all smoke and mirrors and given-when-then flim-flam in reality.

The pattern, known of old, of sucking test data provided by the users into parameterised automated tests is essentially what our acceptance test automation tools do. Take Fitnesse, for example. Customer enters their Risk Calculation inputs and expected outputs into a table on a Wiki. We write a test fixture that inserts data form the table into program code that we write to test our risk calculation logic.

We could ask the users to jot those numbers down onto a napkin, and hardcode them into our test fixture. Is it still the same test? It it still an automated acceptance test? I believe it is, to all intents and purposes.

And it's not the job of the user interface or our MVC implementation or our backend database to do the risk calculation. There's a distinct component - maybe even one class - that has that responsibility. The rest of the architecture's job is to get the inputs to that component, and marshall the results back to the user. If the Risk Calculator gets the calculation wrong, the UI will just display the wrong answer. Which is correct behaviour for the UI. It should display whatever output the Risk Calculator gives it, and display it correctly. But whether or not it's the correct output is not the UI's problem.

So I would test the risk calculation where the risk is calculated, and use the customer's data from the acceptance test to do it. And I would test that the UI displays whatever result it's given correctly, as a separate test for the UI. That's what we mean by "separation of concerns"; works for testing, too. And let's not also forget that UI-level tests are not the same thing as system or end-to-end tests. I can quite merrily unit test that a web template is rendered correctly using test data injected into it, or that an HTML button is disabled running inside a fake web browser. UI logic is UI logic.

And I know some people cry "foul" and say "but that's not acceptance testing", and "automated acceptance tests written at the UI level tend to be nearer to the user and therefore more likely to accurately reflect their requirements."

I say "not so fast".

First of all, you cannot automate user acceptance testing. The clue is in the name. The purpose of user acceptance testing is to give the user confidence that we delivered what they asked for. Since our automated tests are interpretations of those requirements - eevery bit as much as the implementations they're testing - then, if it were my money, I wouldn't settle for "well, the acceptance tests passed". I'd want to see those tests being executed with my own eyes. Indeed, I'd wanted to execute them myself, with my own hands.

So we don't automate acceptance tests to get user acceptance. We automate acceptance tests so we can cheaply and effectively re-test the software in case a change we've made has broken something that was previously working. They're automated regression tests.

The worry that the sum total of our unit tests might deviate from what the users really expected is mitigated by having them manually execute the acceptance tests themselves. If the software passes all of their acceptance tests AND passes all of the unit tests, and that's backed up by high unit test assurance - i.e., it is very unlikely that the software could be broken from the user's perspsctive without any unit tests failing - then I'm okay with that.

So I still have user acceptance test scripts - "executable specifications" - but I rely much more on unit tests for ongoing regression testing, because they're faster, cheaper and more useful in pinpointing failures.

I still happily rely on tools like Fitnesses to capture users' test data and specific examples, but the fixtures I write underneath very rarely operate at a system level.

And I still write end-to-end tests to check that the whole thing is wired together correctly and to flush out configuration and other issues. But they don't check logic. They just the engine runs when you turn the key in the ignition.

But typically I end up with a peppering of these heavyweight end-to-end tests, a feathering of tests that are specifically about display and user interaction logic, and the rest of the automated testing iceberg is under the water in the form of fast-running unit tests, many of which use example data and ask questions gleaned from the acceptance tests. Because that is how I do design. I design objects directly to do the work to pass the acceptance tests. It's not by sheer happenstance that they pass.

And if you simply cannot let go of the notion that you must start by writing an automated acceptance test and drive downwards from there, might I suggest that as new objects emerge in your design, you refactor the test assertions downwards also and push them into new tests that sit close to those new objects, so that eventually you end up with tests that only have one reason to fail?

Refactorings are supposed to be behaviour-preserving, so - if you're a disciplined refactorer - you should end up with a cluster of unit tests that are logically directly equivalent to the original high-level acceptance test.

There. I've said it.

July 9, 2014

What Problem Does This Solve?

It seems that, every year, the process of getting started with a new application development becomes more and more complicated and requires ever steeper learning curves.

The root of this appears to be the heterogenity of our development tools, which grows exponentially as more and more developers - fuelled by caffeine-rich energy drinks and filled with the kind of hubris that only a programmer seems to be capable of - flex their muscles by doing what are effectively nothing more than "cover versions" of technologies that already exist and are usually completely adequate at solving the problem they set out to solve.

Take, for example, Yet Another Test Automation Tool (YATAT). The need for frameworks that remove the donkey work of wiring together automated tests suites and running all the tests is self-evident. Doing it the old-fashioned way, in the days before xUnit, often involved introducing abstractions that look very xUnit-ish and having to remember to write the code to execute each new test.

Tools like JUnit - which apply convention over that kind of manual configuration - make adding and running new tests a doddle. Handy user-friendly test-runner GUIs are the icing on the cake. Job done now.

For a bit of extra customer-centric mustard, add on the ability to suck test data for parameterised tests out of natural language descriptions of tests written by our customers. We cracked that one many moons ago, when heap big Turbo C++ compilers roamed the earth and programmer kill many buffalo etc. Ah yes, the old "merge the example data with the parameterised test" routine...

Given that the problem's solved, and many times over, what need, asks I, to solve it again, and again? And then solve it again again?

The answer is simple: because we can. Kent Beck learns new programming languages by knocking up a quick xUnit implementation in it. Pretty much any programmer beyond a certain rudimentary ability can do it. And they do. xUnit implementations are the Stairway To Heaven of programming solutions.

Likewise, MVC frameworks. They demonstrate a rudimentary command of a programming language and associated UI frameworks. Just as many rock guitar players have at some point a few weeks into learning the instrument mastered "The Boys Are Back In Town", many developers with an ounce of technical ability have gone "Look, Ma! I done made a MVC!" ("That's nice, dear. Now run outside and rig up an IoC container with your nice friends.")

But most cover versions of Stairway To Heaven (and The Boys Are Back In Town) are not as good as the originals. And even if they were, what value do they add?

Unless you're embuing your xUnit implementation with something genuinely new, and genuinely useful, surely it's little more than masturbation to do another one?

Now, don't get me wrong: masturbation has a serious evolutionary purpose, no doubt. It's practice for the real thing, it keeps the equipment in good working order, and it's also enjoyable in its own right. But what it's not any good for is making babies. (Unless it's immediately proceeded by some kind of turkey baster-type arrangement.)

It's actually quite satisifying to put together something like an xUnit implementation, or an MVC framework, or a Version Control System, or a new object oriented programming language that's suspiciously like C++.

The problems start when some other developers say "Oh, look, a new shiny thing. Let's ditch the old one and start using this one that does exactly the same thing and no better, so we shall."

Now, anyone looking to work with that team has got to drop X and start learning X', so they can achieve exactly what they were achieving before. ("But... it's got monads...")

And thusly we find ourselves climbing a perpetually steepening learning curve, but one that doesn't take us any higher. I shudder to think just how much time we're spending learning "new" technologies just to stand still.

And, yes, I know that we need an xUnit implementation for x=Java and x=C# and x=Object Pascal and so on, but aren't these in themselves self-fulfilling prophesies? A proliferation of sort-of-similar programming languages giving rise to the need for a proliferation of Yet Another 3rd Generation Programming Language xUnit ports?

Genuinely new and genuinely useful technologies come by relatively rarely. And while there are no doubts tweaks and improvements that could be made to make them friendlier, faster, and quite possibly more purple, for the most part the pay-off is at the start when developers find we can do things we were never able to do before.

And so I respectfully request that, before you inflict Yet Another Thing That's Like The Old Thing Only Exactly The Same (YATTLTOTOETS - pronounceed "yattle-toe-totes"), you stop and ask yourself "What problem does this solve? How do this make things better?" and pause for a while to consider if the learning curve you're about to subject us to is going to be worth the extra effort. Maybe it's really not worth the effort, and the time you spend making it and the cumulative time we all spend learning it would be better spent doing something like - just off the top of my head - talking to our customers. (Given that lack of customer involvement is the primary cause of software development failure. Unless you've invented a tool that can improve that. and, before anybody says anything, I refer you back to the "sucking customer's test data into parameterised tests" bit earlier. Been there. Done that. Got a new idea?)

Brought to you by Yet Another Blog Management System Written In PHP That's Not Quite As Good As The Others

June 23, 2014

What's My Problem With Node.js?

So you may have guessed by now, if you follow me on The Twitters, that I'm not the biggest fan of Node.js.

Putting aside that it's got ".js" on the end, and is therefore already committing various cardinal sins in my book - the chief one being that it's written in JavaScript, the programming language equivalent of a Victorian detective who falls through a mysterious space-time warp into 1970's New York and has to hastily adapt to hotpants, television and disco in order to continue solving crimes - my main problem with Node.js is that it makes it easier to do something that most development teams probably shouldn't ought to be. Namely, distributed concurrent programming.

If programming is hard to get right, then distributed concurrent programming is - relatively speaking - impossible to get right. You will almost certainly get it wrong. And the more you do of it, the more wronger what it do be.

The secret to getting concurrency right is to do as little of it as you can get away with. Well-designed applications that achieve this tend to have small, isolated and very heavily tested islands of concurrency. Often they have signs on the shore warning travellers to "turn back, dangerous waters!", "beware of rabid dogs!", "danger: radiation!" and "Look out! Skeletor!" You know; stuff that tends to send right-minded folk who value their lives running in the opposite direction.

Node.js is a great big friendly sign that says "Come on in. Hot soup. Free Wi-Fi.", and it's left to salvage specialists like me to retrieve the broken wrecks.

So, yes, Node.js does make it easier to do distributed concurrency, in much the same way that a hammer makes it easier to drive nails into your head. And both are liable to leave you with a hell of a headache in the morning.

June 15, 2014

A Hippocratic Oath for Software Developers - What Would Yours Be?

Good folk who take the whole notion of a profession of software development seriously are fond of comparing us to medical doctors.

For sure, a professional developer needs to keep abrest of a very wide and growing body of knowledge on tools, techniques, principles and practices, just like a good doctor must.

And, for sure, a professional developer - a true professional - would take responsibility for the consequences of their decisions and the quality of their work, just like doctors must - especially in countries where patients have access to good lawyers who charge on a "no win, no fee" basis.

To my mind, though, what would truly set us apart as a profession would be a strong sense of ethics.

Take, for example, the whole question of user privacy: we, as a profession, seem to suffer from extreme cognitive dissonance on this issue. Understandable, when you consider that we're simultaneously users and creators of systems that might collect user data.

As users, we would wish to choose what information about us is collected, stored and shared. we would want control, and want to know every detail of what's known about us and who gets to see that data. Those of us who've had our fingers burned, or have seen those close to us get burned, by a lax attitude to user privacy, tend to err on the side of caution. We want to share as little personal information as possible, we want as few eyes looking at that data as possible, and we want to know that those eyes are attached to the brains of trustworthy people who have our best interests at heart.

What we've learned in recent years is that none of this is true. We share far more data about ourselves than we realise, and that data seems to be attracting the gaze of far too many people who've shown themselves to be untrustworthy. So, quite rightly, we rail against it all and make a big fuss about it. As users.

But, wearing our other hats, as developers, when we're asked to write code that collects and shares personal data, we don't seem to give it a second thought. "Duh, okay." seems to be the default answer.

I've done it. You've done. We've all done it.

And we did it because, in our line of work, we pay scant attention to ethics and the public good most of the time. At best, we're amoral: too wrapped up in the technical details of how some goal can be achieved for our employers to step back and ask whether it should be achieved at all. Just because we can do it, that doesn't mean that we should.

When was the last time your team had a passionate debate about the ethics of what you were doing? I've watched teams go back and forth for hours over, say, whether or not they should use .NET data binding in their UI, while blithely skimming over ethical issues, barely giving them a second thought.

And just because the guy or gal writing the cheques told us to do it, that doesn't mean that we must. Sure; one way to interpret "professional" is to think it's just someone who does something for money. But some of choose to interpret it as someone who conforms to the standards of their profession. The only problem being that, in software development, we don't have any.

So, if we could take such a thing as a Hippocratic Oath for software development, what would it be?

I suspect, given recent revelations, privacy might figure quite largely in it now. As might user safety in those instances where the software we write might cause harm - be it physical or psychological. You could argue that applications like Twitter and Facebook, for example, have the potential to cause psychological harm; an accidental leak of personal information might ruin someone's life. And then we're back to privacy again.

But what other ethical issues might such an oath need to cover? Would it have anything to say about - just off the top of my head - arbitrarily changing or even withdrawing an API that hundreds of small businesses were relying on? Would it have anything to say about having conflicting financial interests in a development project? Should someone who profits from sales of licenses for Tool X be allowed to influence the decision of whether to buy Tool X? And so on.

What would your oath be?