Smelly Code
I have recently been involved in a distributed SaaS application that was originally built by an offshore team; the chosen technologies (REST, Java, MongoDB) were actually valid technical choices for the problem at hand: they then proceeded to get it spectacularly wrong with a bloated (and unmanageable) data schema and an even worse implementation.
Unfortunately, due to business pressure, we cannot replace entirely the backend; and the API (also, spectacularly RESTless) cannot be altered without a significant investment in development resources in re-writing the clients (Web, iOS and Android).
In this post, I’d like to dissect the many “smells” (see Martin Fowler’s book) that a particular class was emitting, how this could have been avoided and the many anti-patterns encountered: I’m hoping that by doing so, you will avoid making the same mistakes and others like me will not be required to come over and fix your stinkin’ code [1]
First Smell – no style
I have previously ranted about the need to follow adequate (and widely accepted) code styles: new developers joining the team will have a less steep learning curve, automated tools will be better able to provide insights and, most importantly, trivial, avoidable bugs will be easy to spot.
And, yes, your code will look nicer and your self-respect will improve too: great software developers are also talented craftsman: it really is difficult to believe that you are able to express abstract ideas and brilliant insights in code, if your craft looks like you’ve been bashing a caveman’s club on your keyboard.
This code presented all the hallmark of the “keyboard monkey”:
- inconsistent use of spaces (sometimes before, sometimes after parentheses and around operators, or none at all);
- inconsistent use of blank lines: sometimes one, at random, sometimes two or more, then none;
- no regard for the line-length (many many lines longer than 100 columns, several longer than 200, and a 257 columns record-breaking line);
- no use of the “diamond pattern,” and some (random) uses of raw collections;
- variable names with widely variable level of congruence with their true meaning (and sometimes outright misleading);
- inconsistent use of constants and hard-coded strings – sometimes, the same hard-coded string repeated several times within a few lines of code (sometimes clearly the result of code-by-copy & paste)
and on and on…
Takeaways
- pick a good, widely know style guide [2] and stick to it – even better, have your IDE automatically enforce the coding style (both Eclipse and IntelliJ are very good at this);
- if your language supports it consider using an automated tool to find out code style infractions (such as pylint, jslint, etc.): even better, make style checks part of your automated tests (and prevent commits until those pass);
- regardless, be consistent in the use of spaces, blank lines and line length: you never know when a project-wide, regex-driven search & replace will be necessary!
- never (ever!) use raw collections: generics are here for a reason;
- spare a thought for the poor sod who’ll have to fix your stinkin’ code; he may even be you in a few months’ time!
Second Smell – untestable code
This class was implemented to serve a particular API call that, in response to a client query would serve a very complex object, with deep nesting of sub-objects, and a mix of business data and specific drawing coordinates [3]
When serving such a complex object back, one would want to test it under several different scenarios and ensure that the returned object conforms to the API specs.
I guess the good news was that there was virtually no documented API because, in fact, not only there was also no unit testing; but the class was (and still remains) untestable:
public static Map<String, Object> getSomeView(Map<String, Object> queryParams){ List<Map<String,Object>> results = SomeSearch.find(queryParams);
As you can see, this method (that serves an untyped, completely undefined Map of Objects) is static – as is the very first method called, that will eventually and execute the query against MongoDB.
This makes it impossible to mock the call, or the database, or to construct a set of data-driven tests, which would have enabled us to test the data transformation and view generation logic in the method under several different scenarios.
It’s also worth noting that the queryParams could be pretty much anything – from a paging/sorting set of options (which they are – even though, the only way to find out is to run the actual server in debug mode, and execute the API call from a client, and then see what comes up at this end).
Because, yes, you guessed it: there was absolutely no javadoc to explain what the method does.
Takeaways
- use injections wherever possible: Spring or Guice are really valid frameworks and are critical in keeping your code clean, responsive to change and testable;
- do not use static methods, unless you absolutely have to: they make your code difficult to test, and really difficult to mock out – they also render your clients (the code that will use your classes/methods) extremely brittle and equally untestable;
- Java is a strongly typed language – there is a good reason for that: let the compiler and the JVM help you catch your (and others’) mistakes: using String and Object everywhere is a poor man escapism into trying to make Java a dynamic language: if you really cannot devise a data object model for your business entities, you should actually consider a true dynamic language (Python being an excellent choice);
- document your code make sure all public methods have adequate javadoc for others to understand and use; that your classes’ javadoc explains clearly what the class does, how to use it and also examples of correct (and, potentially, incorrect too) usage; and, most importantly, document you APIs, what the method expects to receive and what is supposed to return.
Third Smell – Confusing code
It took me the bast part of an hour to reverse-engineer the following method; then I passed it on to my colleagues as a “Java Puzzler” and their guess were wildly confused too (I have “obfuscated” some of the code to preserve confidentiality and avoid embarassment; but, trust me, the obfuscation does indeed makes this version less obscure than the original):
// Much as it pains me, I've left the code style (or lack thereof) untouched // PLEASE DON'T DO THIS AT HOME private static String getSomething(boolean isOnRoute,List<Map<String,Object>> trip, String primarySomething, String id){ boolean somethingKnown = isOnRoute; if(!somethingKnown && trip!=null){ for(Map<String,Object> segment: trip){ Map<String,Object> line = (Map<String, Object>) thisLeg.get(Constants.LINE); if(line!=null){ String something = (String) carrierLane.get(Constants.SOMETHING); LOG.debug("Line something: "+something); if(primarySomething.equals(something)){ somethingKnown = true; break; } } } } if(somethingKnown){ return primarySomething; } LOG.debug("Unknown something:: isOnRoute:'" + isOnRoute + "' ; primarySomething:'" + primarySomething + "' ; id:'" + id + "'"); return null; }
Taking you out of your misery, all this method does is to return the value (primarySomething) that was originally given to it, if the trip isOnRoute; if not, it will tell you that and return null – and let us gloss for a moment on the fact that it requires the caller to pass in the id only to use it in a debug statement; also, note the unnecessary use of a boolean variable (somethingKnown) only to use it in the break case, and then fall into a return statement; and so many other issues there (for example, why assign the value of a flag that indicates if the trip is “on route” to a flag that indicates whether that elusive something was found?).
In the end, we figured out that this method was entirely pointless, and we removed it entirely.
Sad as it sounds, hitting the DEL key on that pile of crap was the high point of my day.
Takeaways
-
try and make the intent of your method(s) obviously clear to the readers of your code: naming conventions, clear arguments lists, clear algorithms implementations, all help your co-workers;
-
even with that, always add a Javadoc to explain what the intent is; what the arguments are supposed to be; and what the expected outcome (including side-effects) should be;
-
avoid shortcuts that will render the logic of your code obscure to other developers;
-
and, for God’s sake, avoid concatenating strings: String.format() and lsf4j string composition pattern are here for a reason:
// Log4J: LOG.debug(String.format("The result is: %d", result)); // logback or lsf4j: LOG.error("The gizmo [{}] was in {}, but then fritzed at {}, {}", gizmoId, state, foo, bar);
Fourth Smell – Coding by Copy & Paste
You remember when you were a kid and you were asked to “spot the difference” between two pictures seemingly identical?
Let’s play the opposite game: can you spot the similarities:
// Again, I've left the code style (or lack thereof) untouched // PLEASE DON'T DO THIS AT HOME List<Map<String, Object>> originSiteEvents = null; List<Map<String, Object>> destinationSiteEvents = null; List<Map<String, Object>> inventoryEvents = null; try{ originSiteEvents = (List<Map<String, Object>>) ((Map<String, Object>) ((Map)entry.get(ModelConstants.INVENTORY)).get(ModelConstants.ORIGIN)).get(ModelConstants.EVENTS); } catch (Exception e){ //No origin events avaislable } try{ inventoryEvents = (List<Map<String, Object>>) ((Map)entry.get(ModelConstants.INVENTORY)).get(ModelConstants.EVENTS); } catch (Exception e){ //No inventory events available } try{ destinationSiteEvents = (List<Map<String, Object>>) ((Map<String, Object>) ((Map)entry.get(ModelConstants.INVENTORY)).get(ModelConstants.DESTINATION)).get(ModelConstants.EVENTS); } catch (Exception e){ //No destination events available } if(events != null){ List<Map<String, Object>> eventListForLeg = (List<Map<String, Object>>) events; for(int j=eventListForLeg.size()-1; j>=0; j--){ Map<String, Object> event = eventListForLeg.get(j); if(event.get("category")!=null && event.get("category").equals("GPS")){ event.put("isLastGPSEvent", true); break; } } } if(destinationSiteEvents != null){ for(int j=destinationSiteEvents.size()-1; j>=0; j--){ Map<String, Object> event = destinationSiteEvents.get(j); if(event.get("category")!=null && event.get("category").equals("GPS")){ event.put("isLastGPSEvent", true); return; } } } if(inventoryEvents != null){ for(int j=inventoryEvents.size()-1; j>=0; j--){ Map<String, Object> event = inventoryEvents.get(j); if(event.get("category")!=null && event.get("category").equals("GPS")){ event.put("isLastGPSEvent", true); return; } } } if(originSiteEvents != null){ for(int j=originSiteEvents.size()-1; j>=0; j--){ Map<String, Object> event = originSiteEvents.get(j); if(event.get("category")!=null && event.get("category").equals("GPS")){ event.put("isLastGPSEvent", true); return; } } }
As everyone knows (or should know), coding by copy & paste is a blatant violation of the DRY principle (“Don’t Repeat Yourself”) – not to mention, it’s pretty awful to look at; it will also make you look like a lazy chump to anyone who knows the first thing about software development.
Worth mentioning that the same identical pattern (navigating nested maps according to sequence of strings) was scattered all over the 600+ lines of code of this class (did I mention that?) – so, one could have been forgiven for coming up with an utility method like the one I hacked together in less than 10 minutes to replace the abomination above:
/** * Navigates the {@literal path} in the given map and tries to retrieve * the value as a list of objects. * * <p>This is safe to use, whether the path exists or not, and relatively safe against * {@link java.lang.ClassCastException} errors (to the extent that this kind of code can be) * * @param map the tree-like JSON * @param path the path to the desired object * @return the list of objects, or {@literal null} if any of the segments does not exist */ @SuppressWarnings("unchecked") public static List<Map<String, ?>> tryPath(Map<String, ?> map, List<String> path) { List<Map<String, ?>> result = null; Map<String, ?> intermediateNode = map; String tail = path.remove(path.size() - 1); for (String node : path) { if (intermediateNode.containsKey(node)) { Object o = intermediateNode.get(node); if (o instanceof Map) { intermediateNode = (Map<String, ?>) o; } else { LOG.error("Intermediate node {} cannot be converted to a Map ({})", node, o); return null; } } else { return null; } } if (intermediateNode.containsKey(tail)) { Object maybeList = intermediateNode.get(tail); if (maybeList instanceof List) { return (List<Map<String, ?>>) maybeList; } } return null; }
The result is that the sequence of lookups (which, again, when first encountered looked like a riddle wrapped in an enigma) now looks something like:
List<List<String>> paths = Lists.newArrayList(); paths.add(Lists.newArrayList(Lists.asList( ModelConstants.INVENTORY, new String[] { ModelConstants.EVENTS} ))); paths.add(Lists.newArrayList(Lists.asList( ModelConstants.INVENTORY, new String[] { ModelConstants.ORIGIN, ModelConstants.EVENTS} ))); paths.add(Lists.newArrayList(Lists.asList( ModelConstants.INVENTORY, new String[] { ModelConstants.DESTINATION, ModelConstants.EVENTS} ))); List<Map<String, ?>> events = null; for (List<String> path : paths) { events = tryPath(entry, path); if (events != null) { break; } } if (events != null) { for (int j = events.size() - 1; j >= 0; --j) { Map<String, Object> event = (Map<String, Object>) events.get(j); if (event.contains("category") && Constants.GPS.equals(event.get("category"))) { event.put(Constants.isLastGPSEvent, true); return; } } }
This is still not as clean as I’d like it to be, but mostly blame Java for lacking a facility similar to Scala’s Lists:
val paths = List(List(ModelConstants.INVENTORY, ModelConstants.EVENTS, List(ModelConstants.INVENTORY, ModelConstants.ORIGIN, ModelConstants.EVENTS), List(ModelConstants.INVENTORY, ModelConstants.DESTINATION, ModelConstants.EVENTS))
and refactoring the (abominable) use of Map<String, Object> is an ongoing activity that will take us several sprints.
Also note the anti-pattern of using a try-catch block to filter out potentially valid code paths, but avoid writing null and key-existence checks: by factoring out the checks and the class casts (with proper type safety checks) in one place, we avoid having to choose between two bad options: scatter the same repetive, tedious checks all over the place or mask out possible error codes by casting a very wide net (by letting through potentially erroneous conditions, we will make it exceedingly difficult to find out root causes of unexpected bugs).
There’s an entire blog post to be written about “swallowing” exception and error conditions.
Takeaways
- don’t copy & paste code: if you see commonality, factor out the common parts and re-use them (in a utility class or a helper method);
- don’t avoid safety checks in your code by indiscriminately “swallowing” exceptions: try-catch blocks should be there for a reason, and the reason should be (the giveaway being in the name) exceptional.
Conclusions
Code is still largely a craft – while a deep understanding of issues related to distributed computing; a capacity for critical reasoning and abstract thinking; and an understanding of operational issues related to scalable systems are critical these days to be a great software developer, still the mastery of the craft is as important as it used to be in Reinassance Italy.
Write well-designed, clearly structured and properly documented code – and be proud of your craft!
NOTES
[1] | This is now probably the fourth or fifth time that I’m involved in a startup where the technical implementation of the early teams was incredibly amateurial and demonstrated a profound disregard for the most basic principles of good software development: the hope, in writing these notes (and this blog) is that more and more people will come to appreciate the business costs of poor technical implementations. As they say, hope springs eternal. |
[2] | For Java, I favor Google’s Style Guide; for Python, not using PEP8 would be idiotic. With a bit of self-discipline getting familiar with those is almost trivial; even better, having the IDE highlight (or even fix) style infractions is the equivalent of having automatic spell-checking for your emails: only fools (and cavemen) still persist in not doing it, and you want neither as co-workers. |
[3] | Yes, really. The backend sends the UI coordinates (dependent on zoom levels, too) as to where to draw certain objects: do I need to add that this is cause of some hilarious and very hard-to-fix bugs? To notch up the level of comedy yet higher, particularly crafted queries send back “the database:” quite literally so; we have seen payloads in excess of 64MB sent back to the Web client. Care to guess what happens next? |
Leave a Reply