I’m proposing a method for tracking how difficult it is for software engineers to work with code. Existing methods such as code smells, cyclomatic complexity, best practices, and test coverage seem to work okay as a coarse-grained way of determining how difficult a code base may be to work with, but I don’t feel that they capture enough detail, and they often don’t provide sufficient insight into how to structure code to avoid issues.
Why does it matter, though? At first pass, as long as the customer and end users are happy with the output, the difficulty software engineers have navigating the code base should be irrelevant. It’s what they’re paid for, after all.
I heard a fable about a wizard who was tired of the townspeople constantly asking for assistance. So the wizard created a golem that could listen to the townspeople and accomplish their tasks. One day the townspeople became concerned about thieves, so they asked the golem to create a box that could only be opened with one key. The golem created the box and the townspeople were grateful. But a young boy grabbed the one key that could open the box and jumped inside. The townspeople pleaded with the golem to open the box, but the box could not be opened without the key that was now stuck inside.
This story shows that just because you want something doesn’t mean that you fully appreciate the implications. You don’t always want exactly what you ask for; sometimes you need an expert to help you get to the best solution. This is the difference between a software engineer and a programmer. And it’s something that I’ve seen again and again at Software Engineering Professionals. Don’t just provide code to spec; try to understand the problem the customer has so that the resulting deliverable does the most good.
And what does this have to do with code difficulty? What happens when the code is so difficult to understand that the software engineer has to spend most of their time dealing with the code instead of understanding the customer’s problem? Take a look at the current security landscape: the Heartbleed exploit, the goto fail bug, the Debian random number bug, cars being hacked over the internet, smart lightbulbs being used as an attack vector for compromising Wi-Fi. When the code base becomes problematic, it can be difficult to ensure that the deliverable will meet the customer’s need without also causing new problems. And security is just the tip of the iceberg. When the code base is the problem, achieving the customer’s goal might be compromised. Dealing with end-user complaints may be difficult. And even the ability to quickly implement features and maintain existing code can be hindered.
Measuring the quality of a code base isn’t easy, though. I mentioned before that the existing methodologies don’t quite seem to achieve what I’m hoping for. Cyclomatic complexity is a good example. This metric counts the linearly independent paths through a given section of code. Bad variable names are generally considered to be a problem, yet they will not affect the cyclomatic complexity of code at all. So there are details that are getting missed, and I want something that will do a better job.
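To make the variable-name point concrete, here is a minimal sketch in Python. It is not a full cyclomatic complexity tool: the `cyclomatic_complexity` helper is my own simplified approximation (one plus the number of decision points it finds in the syntax tree), and the two sample functions, `count_overdue` and `f`, are made up for illustration. They share the same control flow and differ only in naming.

```python
import ast

# Node types counted as one decision point each.
# This is a simplified approximation, not a complete McCabe implementation.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands, so two extra branches.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical example with descriptive names.
CLEAR = '''
def count_overdue(invoices, today):
    overdue = 0
    for invoice in invoices:
        if invoice.due_date < today and not invoice.paid:
            overdue += 1
    return overdue
'''

# The same control flow with uninformative names.
OBSCURE = '''
def f(a, b):
    x = 0
    for q in a:
        if q.d < b and not q.p:
            x += 1
    return x
'''

print(cyclomatic_complexity(CLEAR), cyclomatic_complexity(OBSCURE))
```

Both versions get the same score, because the metric only sees the loop, the branch, and the boolean operator. Nothing in the number reflects that the second version is much harder to read.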
Unfortunately, I’m going to need multiple blog entries in order to go over my method. There are several aspects of code that need to be analyzed in order to understand when code becomes problematic. This entry is going to serve as the preface and index for the rest of the entries.
Problem Analysis
- Introduction
- Blob Structure
- Blob Structure Examples
- High Dimensional Spaces
- Topological Holes
- Path Connected
- Continuous Functions
- System Complexity
- Overlapping Blobs
Code Analysis