Architecting Complexity

January 30, 2015

As a contractor, I come across a lot of other people’s projects and may I just say, there’s a lot of crap out there. And I mean that. Not just “Oh, this developer could have done better here and there”, but more along the lines of “Sorry, I can’t work with this. Your best option is to rewrite the whole damn thing.” To be clear, I’m not necessarily talking about a person’s coding skills. Certainly, I’ve come across some questionable code and downright stupid approaches to solving problems, but this is not what I’m talking about. I’m talking about a complete lack of any kind of architecture in the codebase. Too often I come across projects that seem to be trying to typify the “anti-design pattern”, usually by stuffing quite literally everything into the View Controllers.

So I want to explain here why Software Architecture is important and, if you’re a developer, why you should be learning it or, if you’re a client, why you want it in your project.

So let’s start with the assumption that software is complex. I don’t think anyone with any reasonable awareness of software development would deny this. Truly, I’ve heard arguments that software development is the most complex undertaking humanity has ever attempted. Whether that’s true or not, I do think that most non-developers don’t realize how complex software actually is. And honestly, I think quite a few developers may not realize this either. The problem isn’t that people can look at a bunch of lines of code and fail to see that the gibberish in front of them is complex. It’s that there seems to be a lack of awareness of the combinatorial increase in complexity inherent in the interdependent parts of a system. More importantly, there is a lack of awareness (especially on the part of clients) that an interdependent system as a whole will naturally become more unstable and failure-prone as you increase its complexity.

So what do I mean by “combinatorial increase in complexity”? To explain, let’s abstract to simple, indivisible components. We don’t need to know what they do. We just need to know that they’re interdependent. Let’s start out with component A. A is the system. If A fails, the system fails. Easy enough. If the failure rate of A is .02, then the system will fail 2% of the time or, to put it another way, the system is 98% reliable. Now, let’s add component B. Components A and B now interact with each other. They’re interdependent. Let’s say component B has a failure rate of .03 (97% reliable). What is the reliability of the system? The answer is easy, just multiply the reliability of the two components: 98% * 97% ≈ 95%. Again, kind of intuitive. The system fails whenever either component fails. But now let’s add yet another component, C. C is also 98% reliable and it’s interdependent with A and B. What is the reliability of the system? Did you think 93% (A * B * C)? You’d be wrong. The reliability of the system isn’t just dependent on each individual component, but also on the interdependencies they have with each other. So the formula isn’t A * B * C. The formula is (A * B) * (A * C) * (B * C), which comes to about 86.8%. Whoa… that’s a lot lower. But can you see why? The system is a series of interactions between fallible components. If there’s only one component, the system is interacting with itself and so its failure rate is that of the component. If there are two components, then the system is the interaction between those two components. Add yet another component and now the system is the interaction of each component with every other component. Thus the reliability of the system is multiplied along those interactions. For hoots, add another component, D, that’s 99% reliable and interdependent with the other components and let’s see what happens: (A * B) * (A * C) * (A * D) * (B * C) * (B * D) * (C * D) = 78.4% reliable.

Damn, by adding a component that’s 99% reliable to the system we dropped the reliability of the whole system from 86.8% to 78.4%. To add insult to injury, this is a simple example. The probabilities are multiplied along the interdependent interactions, and so far we’ve assumed each component is interdependent with each other component in only one way. But the reality is, especially in software development, components can be interdependent in multiple ways, further compounding the reliability issues. And now I’m going to add one more complication: these probabilities need to be weighted by the probability distribution of component usage. Uhh… essentially, this means if you add component D, but the probability that it will be used in the system is 1%, it won’t have quite such a drastic effect on system reliability because it’s not used as much. I will not be going over that math.
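To make the arithmetic above easy to check, here is a minimal Python sketch of the pairwise model as described: the reliabilities of both components are multiplied together for every pair of interdependent components. The function name `system_reliability` and the dictionary of components are my own choices for illustration, and the sketch assumes every component is interdependent with every other in exactly one way. It deliberately ignores the usage weighting mentioned above.

```python
from itertools import combinations
from math import prod


def system_reliability(reliabilities):
    """Reliability under the pairwise model described in the text.

    `reliabilities` maps a component name to its reliability (0.0 to 1.0).
    A single component is the whole system; otherwise the reliabilities of
    both members of every pair are multiplied together.
    """
    values = list(reliabilities.values())
    if len(values) == 1:
        return values[0]
    return prod(a * b for a, b in combinations(values, 2))


components = {"A": 0.98}
print(f"A       -> {system_reliability(components):.1%}")

# Add one component at a time and watch the system reliability fall.
for name, reliability in [("B", 0.97), ("C", 0.98), ("D", 0.99)]:
    components[name] = reliability
    label = ", ".join(components)
    print(f"{label} -> {system_reliability(components):.1%}")
```

With n components there are n(n-1)/2 pairs, so each component’s reliability ends up in the product n-1 times. That is why even a 99% reliable component drags the total down noticeably.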

Of course, what I’ve demonstrated so far is reliability, not complexity. But I don’t think I need to demonstrate that a software system’s complexity grows along the same kind of combinatorial lines. However, I do want to add one more caveat to software complexity: perceived vs real complexity. Let’s say we have two components, A and B, whose individual failure rates depend on their own internal systems of interdependent components. Now A and B are interdependent in only one way, and the reliability and complexity of the system are easy to see. However, let’s say that each component exposes many more possible interdependencies (probably by exposing its own internal components/systems to the containing system). The reliability of the system hasn’t changed because the components are still actually interdependent in only one way. However, the perceived complexity of the system has just skyrocketed. Each exposed, possible interdependency represents yet another layer of complexity in the system, regardless of whether that interdependency is being used. One can only separate the real from the perceived complexity of the system at great cost. And I would even say that for larger projects it’s virtually impossible to do. The result is that even though the system may have a simple set of interdependencies, the perceived complexity of the system makes it impossible to know what they are. This makes it impossible to reliably modify the system. Because the perceived complexity is so high, you can’t know if one change made to the system will break an interdependency. In this sense, the system’s actual complexity doesn’t matter. It’s the perceived complexity that affects attempts at alterations. ¹ It’s the perceived complexity that makes things so difficult, especially if you’re trying to understand the system as a whole.
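As a rough illustration of that gap, the Python sketch below counts how many A-to-B links a maintainer would have to rule out if any exposed member of one component might be touching any exposed member of the other. This is only a back-of-the-envelope count under that assumption, not a formal complexity metric. The name `candidate_links` and the member counts are hypothetical.

```python
def candidate_links(exposed_by_a: int, exposed_by_b: int) -> int:
    """Upper bound on the A-to-B links a maintainer has to rule out.

    Assumes any exposed member of A could be used with any exposed
    member of B, whether or not it actually is.
    """
    return exposed_by_a * exposed_by_b


REAL_LINKS = 1  # in the scenario above, A and B really interact in one way

for exposed in (1, 5, 20):
    possible = candidate_links(exposed, exposed)
    print(f"{exposed:>2} exposed members each: "
          f"{possible:>3} possible links, {REAL_LINKS} real")
```

The real number of links never moves, but the number of things you have to check before making a change grows with everything each component leaves exposed.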

Here’s where Software Architecture comes in. How you architect a system will have a huge impact on the software’s complexity:

By creating modular components, you hide possible interdependencies, revealing only the necessary interdependencies.
By creating types of components, you clearly define roles within the system, and the types of interdependencies become clear. Moreover, by using common types, a system can be easily understood by others who use those types.
By defining clear interfaces you further define the available interdependencies and make it obvious what they are as well as define the expected behavior of each component.
By hiding an object’s complexity, you also disincentivize hidden interdependent behavior. I’ve seen this a lot where a component depends on another component to pass in arbitrary data in a very specific format (Ex: passing in a dictionary with very specific key: value pairs… such a bad, bad idea).
By using protocols you can make the underlying complexity of a component irrelevant. All that matters is how that component interacts with the rest and how reliable it is.
By limiting the responsibilities of any single component, you can more easily increase the reliability of said component. It’s much easier to increase a simple component’s reliability than a complex one for obvious reasons. This is called SRP (single responsibility principle) and it means that any component in the system should have only one responsibility.
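
Here is a small Python sketch of how a few of the points above might look in code. Everything in it (`User`, `UserStore`, `InMemoryUserStore`, `Greeter`) is hypothetical and made up for illustration. A typed value object replaces the “dictionary with very specific keys”, a protocol states the one interdependency the rest of the system is allowed to use, and each class has a single responsibility.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class User:
    """Typed data instead of a dictionary with magic keys."""
    name: str
    email: str


class UserStore(Protocol):
    """The only interdependency other components may rely on."""

    def find(self, email: str) -> Optional[User]:
        ...


class InMemoryUserStore:
    """One responsibility: storing users. How it stores them is hidden."""

    def __init__(self) -> None:
        self._users = {}  # private detail, keyed by email

    def add(self, user: User) -> None:
        self._users[user.email] = user

    def find(self, email: str) -> Optional[User]:
        return self._users.get(email)


class Greeter:
    """One responsibility: building a greeting.

    It depends on the UserStore protocol, not on any concrete store.
    """

    def __init__(self, store: UserStore) -> None:
        self._store = store

    def greeting_for(self, email: str) -> str:
        user = self._store.find(email)
        return f"Hello, {user.name}!" if user else "Hello, stranger!"


store = InMemoryUserStore()
store.add(User(name="Ada", email="ada@example.com"))
greeter = Greeter(store)
print(greeter.greeting_for("ada@example.com"))
print(greeter.greeting_for("nobody@example.com"))
```

Because `Greeter` only knows about `find`, the store could be swapped for a database-backed one without `Greeter` changing, and anyone reading `Greeter` can see its single dependency at a glance.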

Software architecture can not only make the perceived complexity of the system approach its real complexity, but it can also help reduce the real complexity and increase the system’s reliability. It also makes the system much more understandable. And when you understand a system, you can change it with confidence.

This isn’t, of course, the only reason to architect your software. Testability, for instance, plays a heavy role. But I hope I’ve demonstrated why it’s so necessary. Good Software Architecture is essential for managing complexity. Without it, the software’s complexity is bound to spiral out of control, and with that spiraling complexity will come spiraling costs.

¹ Note: I know this is not actually true: the ratio of perceived vs actual complexity determines the probability that a change will result in a break.