The Root of All Evil (in software)
This is another post I originally wrote in 2011 before I accidentally deleted my website. I had to deal with a lot of legacy code at Wells Fargo at that time, but tbh all developers struggle with technical debt to a degree regardless of how modern their stack is.
What are you scared of? Snakes? Spiders? Sith Teddy Bears?
I am pretty desensitized, having grown up in the era of great horror flicks in the ’80s, but there is one thing that never fails to create a pit deep in my stomach. It is when complex code changes are deployed into production for a system that already has a lot of outstanding issues. I hold my breath until those first users get onto the system and start using the modified code. To some degree, I feel this way with every production release, but the fear is more tangible when the target system is already messed up.
“Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. Hate leads to suffering.” – Yoda
To get to the root of all evil in software, we need to analyze where this fear comes from.
Q: Why do you feel fear in this situation?
A: I worry that something will go wrong and the business will be adversely affected.
Q: Why do you think something will go wrong?
A: There have been times in the past when we thought we were implementing a simple, non-intrusive change, but once the change got into production there were major issues. So the fear is not specific; it is just a general feeling that something could go wrong.
Q: Why not fix the root cause of those previous issues?
A: We do make small, quick changes to fix specific issues that occur, but the problem is that there are bigger, more general deficiencies within the system that continue to start new fires even as we put out some of the old ones.
Q: Why not fix these bigger issues before additional changes are made?
A: It is a matter of technical debt. The business wants something and doesn’t want to wait, so it accepts the risk (i.e. adds technical debt) in order to get its change in faster.
Q: What specifically is it about the existing system issues that makes it so difficult to fix them?
A: Because in order to resolve the issue we will need to change a lot of code and touch a lot of components that would likely then require a lot of testing.
Q: Why do you need to change a lot of code at once?
A: Because it is not possible to change one section of code without affecting many other components. Everything is intricately tied together like one big ball of twine.
What my internal Socratic dialog reveals here is that the true root cause of the challenges I face is the extreme tight coupling of code within the system. For the non-dorks out there, you can understand the concepts of tight and loose coupling by thinking about the difference between building a robot by welding pieces of scrap metal together (i.e. tight coupling) versus using Legos (i.e. loose coupling). When you weld metal together, you can do it quickly (assuming that you know how to use a blow torch) and you can customize the shape into whatever you want. The only problem is that once you are done, the robot may be functional, but kids would not be able to make any changes to it, like putting on a new head or upgrading the robot’s arm cannon. The scrap metal robot is what it is, and if you want something different you likely need to toss it and get a new one. The Lego robot, on the other hand, can still be built relatively quickly, but it is designed so that kids can customize the robot to their heart’s content. They simply switch out a couple of blocks and they have a revamped, cutting-edge robot.
It should be noted that not all tight coupling is bad, though. Each Lego piece (i.e. smaller component within a system) is made up of material that is tightly coupled together. The key is that at the larger system level there needs to be loose coupling, at least among the major components, so that they can be upgraded, replaced, or removed without major impacts on the other parts of the system. Many legacy systems don’t have loose coupling even at this high level, and that is why there are still so many COBOL-based programs out there today.
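For the dorks out there, here is a minimal Python sketch of the same robot analogy. Every name in it (Arm, BasicArm, ArmCannon, WeldedRobot, LegoRobot) is made up for illustration and not taken from any real system. The welded robot builds its own arm internally, so upgrading the arm means editing the robot itself. The Lego robot only assumes that whatever gets snapped in has a fire() method.

```python
from typing import Protocol


class Arm(Protocol):
    """The 'stud pattern': the only thing a robot needs to know about an arm."""

    def fire(self) -> str: ...


class BasicArm:
    def fire(self) -> str:
        return "pew"


class ArmCannon:
    def fire(self) -> str:
        return "BOOM"


class WeldedRobot:
    """Tight coupling: the arm is created inside the robot, so swapping it
    means changing (and retesting) this class."""

    def __init__(self) -> None:
        self.arm = BasicArm()

    def attack(self) -> str:
        return self.arm.fire()


class LegoRobot:
    """Loose coupling: any object with a fire() method can be snapped in."""

    def __init__(self, arm: Arm) -> None:
        self.arm = arm

    def attack(self) -> str:
        return self.arm.fire()


robot = LegoRobot(BasicArm())
robot.arm = ArmCannon()  # upgrade the arm without touching LegoRobot
```

Notice that each arm class is still tightly coupled on the inside, just like the plastic in a single Lego piece. The looseness only has to exist at the seam between the robot and its parts.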
In 2008 the Governator tried to cut the salaries of 200,000 state workers, but he was unable to do it because California’s COBOL-based payroll system couldn’t be changed fast enough. The state controller, John Chiang, basically said that the legacy “constraints” (i.e. tightly coupled code that couldn’t be modified quickly and easily) prevented them from achieving their goals. You can see this type of situation in many big businesses where the emphasis is usually on cutting corners to get to the quickest possible solution.
The root of all evil in software is the tight coupling within a system that slows down or prevents system changes. A system may have many other problems that come up from time to time, but the point is that most issues should be quick to fix and thus not cause any heartache in the long run. If you find yourself saying over an extended period of time that something cannot be easily fixed, then it almost certainly means the system is tightly coupled in some way. The reason I can say this with a high degree of confidence is that if a system has the appropriate level of loose coupling, an issue within any individual component should not affect the other components. Problems will always be somewhat contained and limited in scope.
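To make the containment idea a bit more concrete, here is a rough Python sketch under some loud assumptions: the run_isolated helper and the payroll and reporting components are hypothetical, and a real system would put far more than a try/except at each boundary. The point is only that when components are not tangled together, a failure in one can be recorded and fixed on its own while the others keep working.

```python
from typing import Callable


def run_isolated(components: dict[str, Callable[[], str]]) -> dict[str, str]:
    """Run each component behind its own boundary and record the outcome."""
    results: dict[str, str] = {}
    for name, component in components.items():
        try:
            results[name] = component()
        except Exception as exc:
            # The failure is noted here and does not stop the other components.
            results[name] = f"failed: {exc}"
    return results


def payroll() -> str:
    # Hypothetical broken component.
    raise RuntimeError("rate table missing")


def reporting() -> str:
    # Hypothetical healthy component that does not depend on payroll.
    return "ok"


outcome = run_isolated({"payroll": payroll, "reporting": reporting})
print(outcome)
```

This only works because reporting() never reaches into payroll(). If it did, no amount of error handling at the boundary would keep the problem contained.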