Filed under: Clean code, Java, — Tags: Divide and concur, Law of Demeter, Regular expression, Train wreck — Thomas Sundberg — 2011-12-30
Train accidents are mostly considered to be bad things. People tend to get hurt when trains have accidents. Never the less, it is not so uncommon with train wrecks in software development.
Train wreck code is code that calls a method on the return value of another method call. This chain can be very long if a value is excavated deep down in a object graph.
A really small example of code that does this is:
cucumberFeature.getFeature().getTags()
A regular expression that can be used to find candidates for a train wreck could be this:
\)\..*\(
This regular expression will start with looking for all right hand parenthesis. Methods in Java are always defined with a parenthesis pair so searching for a right parenthesis is a good start. The code we are looking for will always be followed by a dot, indicating that we will try either get access to an instance variable or execute a method on the return value. If we continue to search for any characters followed by a left parenthesis, then we have a candidate for a train wreck.
The code base I currently work with has much more than 1000 examples. IntelliJ IDEA stops counting after 1001 hits so we have more train wreck candidates. That is a closed source and therefore I can't show code from it. I searched in another code base, the Cucumber JVM code base. I found 132 candidates on the production code. One is the example below:
cucumberFeature.getFeature().getTags()
I would probably have divided this it into two lines. This is really not a bad example and it is easy to fix. But when you have 10 levels, which is where I started just the other day, then it gets kind of ugly.
How should you write the code instead?
To start with, it would be nice if you have your system modularized so you never feel that you have an urge to call a method on the result of another method. When I work on a code base that isn't that well divided, I prefer to split the train wrecks into more, smaller things. I prefer to read code that looks like this:
Feature feature = cucumberFeature.getFeature(); feature.getTags();
The files get longer, but the rows gets shorter. Long methods can be divided a lot easier than long rows. But just leaving it here doesn't really solve any problem, it merely hides it somewhat. Just like using makeup. Makeup only hides issues, it will never fix them.
The fact that I can divide the sample above into two lines may be seen as an indication that the method getTags() is missing on the class Feature. The example above could, and probably should, be refactored to
cucumberFeature.getTags();
Instead of reaching into an object and poke around, you ask for what you want and get it without hassle. There is no need to apply makeup because there isn't any wrinkles that needs hiding any more.
An API that is easy to use is one obvious benefit of the refactoring above. But even more important is of course the fact that it is a lot more modular now and less tangled together. It is actually possible to replace one implementation with another without too much effort. A colleague said recently during a sprint planning session that small, separated tasks were an utopia. Maybe it is sometimes. But if you don't even try, then it will never happen. The same reasoning goes of course with source code, if you don't even try then you will never make it.
Object oriented software should be built as a jigsaw puzzle, all connections between classes should be shallow.
The train wreck above is similar to pieces in a jigsaw puzzle that reaches beyond it's closest pieces. It is reaching one layer away. Maybe not a big problem, but not necessary. Reaching deeper is like mounting something on a wall in a building. You drill a hole and put a plastic plug so a screw will get a really god grip. The deeper you drill and the more layers in the wall you penetrate, the better mechanical behaviour you will get. The result will be a really good place to mount something heavy on. But this mechanical strength is never needed in software. We will never test our result by pulling hard on it and hope it doesn't break. Instead, we prefer to be able to replace pieces in our solution if we need without to much work.
The building metaphor is often used in software development, but it lacks some properties. Adding an extra floor in a building is difficult when you work on an actual building. But it is easy in a computer program if you haven't placed a large bolt between the floors and tightened it really hard. With the right abstractions, this bolt will never be allowed to be placed in our house built in software. This is where we so often go wrong, developers add bad or wrong abstractions and end up with trouble.
Talking to strangers may be a good idea sometimes. But there is a reason why children sometimes are told not to talk to strangers.
Law of Demeter is about not talking to strangers. If a class need the services of another class, it should be available from the other class. You should not have to create a train wreck to get hold of the information you need. If you do, then you have probably missed an abstraction or something encapsulating the information needed.
This is, as with a lot of things in computer science, nothing new. Law of Demeter was first formulated by Ian Holland in 1987. But it is still common that developers break it.
Why? I assume because they have too much knowledge about the system they are developing and perhaps not even recognise the problems new developers will face when exposed to the code.
One popular example used to explain this is The Paper Boy example. A customer buys a paper from a paper boy. The paper boy can either collect money from the customers wallet or the customer could just be told to give the boy money. Most people would probably give the boy the amount he want instead of handing him their wallet. But in computer programs, it is very common to expect a caller to get the wallet and take the amount they want from the it.
One live example of a violation is how stores in Finland tend to to ask you if you will be using your bank account or credit account when you pay with your credit card. I have always found this very strange. They ought to be satisfied wth the knowledge that they will get the money. My affairs with the bank is something that they should not care about.
My conclusion is that if we find train wrecks that are lurking in the code, try to push the functionality away from where you found it. If you can move functionality closer to the interesting end of the wreck, you will probably make the API a little easier to use. Divide the problem into a smaller problem and finally concur it.
This post has been reviewed by some people who I wish to thank for their help
Thank you very much for your feedback!