August 22, 2007

Towards A Unified Model of Dependency Management

Much of the thinking that goes into software design revolves around where to put stuff.

We might know that our code needs to create a new Video Rental, and set the date and time on it to now, and link it to the selected video title and the video library member who wants to borrow it, for example. That's probably all been agreed in our acceptance test.

The next question is "where does this code go?" In what methods? On what objects? In which packages? And, for the enterprise architects among us, in which systems (and in whose businesses, even)?

One of the primary goals of software design is to make change easier. And one of the crappier things about software that makes it so hard to change is coupling. If line of code L references member variable V, then renaming V will require a change to L.

Change propagates backwards along dependencies. If B depends on C, then changing C might force us to change B. Simple.

But it gets worse. If A depends on B, then that change forced in B might force a change in A, too.
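To make the A, B, C example concrete, here is a minimal Python sketch that walks the dependencies backwards to find everything a change might reach. The `depends_on` map and the `potentially_affected` function are names made up for illustration, and the sketch assumes the worst case, where every change propagates.

```python
from collections import deque

# Hypothetical dependency map: each module lists the modules it depends on.
# A depends on B, and B depends on C, as in the example above.
depends_on = {"A": ["B"], "B": ["C"], "C": []}


def potentially_affected(changed, depends_on):
    """Return every module that might need to change when `changed` changes."""
    # Invert the edges so we can ask "who depends on this module?"
    dependents = {module: [] for module in depends_on}
    for module, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, []).append(module)

    # Breadth-first walk backwards along the dependencies.
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent != changed and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


print(sorted(potentially_affected("C", depends_on)))
```

Changing C can reach both B and A, while changing A reaches nothing, because nothing depends on it.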

Change can propagate along dependencies, spreading beyond neighbouring code into other parts of the software. And whenever we change so much as a single character in our software, we might have to rebuild and retest and redeploy the whole thing, since dependent code either works as a whole or it doesn't. Even with automated unit tests and automated builds and continuous integration, this is still a big deal.

So it makes sense to put an appropriate amount of effort into managing those dependencies, trying to ensure that these ripples are as localised as possible. In qualitative terms, design principles guru Robert C Martin tells us that "classes that change together, belong together". In quantitative terms, we talk about coupling and cohesion of packages of classes. Other experts talk about coupling and cohesion at the class level, too - suggesting that methods and the data they act upon should be organised into the same classes for exactly the same reasons.

I would take this even further, and suggest that it applies to any kind of coupling in software, at all levels. I'm attracted to self-similarity in architecture, and I believe that the same principles apply whether we're deciding what statements belong in what methods, all the way up to what systems belong in what business units.

If we visualise our code as a network of nodes and connections, we can begin to get a handle on how this ripple effect might spread through the network, and what properties of the network might help to localise the effect.

More importantly, we might be able to relate these network properties to software design principles and begin to build a general theory of dependency management that could be applied at the code level all the way up to the enterprise level.

As I did with the problem of project and portfolio management, I'm envisioning a game that will allow me to explore the behaviour of dependency networks and hopefully relate what I learn back to software in as direct a way as possible.

Already I'm thinking about the very simple simulation that models the spread of forest fires. A code ripple is a bit like a forest fire, in that it spreads from tree to neighbouring tree. Just as very large forest fires are rare, so too are very large ripples or "code quakes". Nobody, as far as I'm aware, has measured the distribution of the relative sizes of these ripples. But I suspect if we did - and we probably should - we'd see a power-law distribution much like that for forest fires and other network phenomena.

Imagine a game where we have a grid of modules, with random dependencies between them. We randomly drop a change on one module, and it has a 50/50 chance of spreading to each dependent module. Now if we measured the average degree of the modules - the average number of dependent neighbours - and the average size of the ripples (the number of dependent modules to which the change propagated), I wonder what the relationship would be.
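A rough Python sketch of how such a game might be coded follows. It is one possible reading of the rules, not a worked-out model: it uses a random graph rather than a literal grid, gives each dependency a single independent chance to carry the change, and all of the function names and parameters are invented for illustration.

```python
import random


def random_dependencies(n_modules, n_edges, rng):
    """Build a random graph; dependents[m] is the set of modules that depend on m."""
    if n_edges > n_modules * (n_modules - 1):
        raise ValueError("too many dependencies for this many modules")
    dependents = {m: set() for m in range(n_modules)}
    added = 0
    while added < n_edges:
        a, b = rng.randrange(n_modules), rng.randrange(n_modules)
        if a != b and a not in dependents[b]:
            dependents[b].add(a)  # module a depends on module b
            added += 1
    return dependents


def ripple_size(dependents, start, spread_probability, rng):
    """Drop a change on `start` and count the other modules it reaches."""
    changed = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for dependent in dependents[current]:
            # Each dependency gets one chance to carry the change.
            if dependent not in changed and rng.random() < spread_probability:
                changed.add(dependent)
                frontier.append(dependent)
    return len(changed) - 1


def average_ripple(n_modules, n_edges, trials, spread_probability=0.5, seed=0):
    """Return (average dependents per module, average ripple size)."""
    rng = random.Random(seed)
    dependents = random_dependencies(n_modules, n_edges, rng)
    total = sum(
        ripple_size(dependents, rng.randrange(n_modules), spread_probability, rng)
        for _ in range(trials)
    )
    return n_edges / n_modules, total / trials


# Sweep the number of dependencies to see how ripple size responds.
for n_edges in (50, 100, 200, 400):
    degree, ripple = average_ripple(100, n_edges, trials=1000)
    print(f"average degree {degree:.1f}: average ripple {ripple:.2f}")
```

Running the sweep with different seeds and module counts would give a first feel for the relationship between average degree and average ripple size.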

Also, imagine that some of the modules were less likely to have a random change dropped on them - simulating abstractions in our code that are less likely to change. I wonder if the average size of the ripples would change if more of the dependencies went towards these abstract modules.
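One way to simulate this is to give each module a weight for how likely it is to be hit by a change, and pick the changed module with those weights. The sketch below assumes a hypothetical `volatility` map. The module names and numbers are made up for illustration, not measured from real code.

```python
import random

# Hypothetical relative likelihood of a change landing on each module.
# Abstractions get a low weight, concrete modules a high one.
volatility = {
    "RentalPolicy (abstract)": 0.1,
    "VideoLibrary": 1.0,
    "RentalScreen": 2.0,
}


def pick_changed_module(volatility, rng):
    """Choose the module a change lands on, weighted by its volatility."""
    modules = list(volatility)
    weights = [volatility[m] for m in modules]
    return rng.choices(modules, weights=weights, k=1)[0]


rng = random.Random(42)
hits = {module: 0 for module in volatility}
for _ in range(10_000):
    hits[pick_changed_module(volatility, rng)] += 1
print(hits)
```

In a fuller simulation, this weighted pick would replace the uniformly random choice of starting module, so we could compare ripple sizes as more dependencies point at the low-volatility modules.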

And what if we extended our model to include packages of modules? Could we use this to explore package cohesion and the stable abstractions design principle?

I'm attracted to the idea of a simple, tangible mathematical model to help me explore these ideas. It might even be possible to somehow prove that certain design principles really work.