Of course your Jira backlog keeps growing. It's math.

Programmers generally understand mathematical limits, at least in terms of "Big O" notation for complexity, even if they don't love digging into the formalisms about limits from calculus class. Some algorithms scale better than others. Knowing how things scale is an important part of being able to shout buzzwords like "Web Scale!" in meetings. Scaling organizations is at least as hard as scaling technology.

[Image: stolen from Twitter.]
We all learned what exponential growth is during the pandemic.

What is the limit of the logarithmic function log(t) as t approaches infinity? (The limit does not exist / infinity.)

What is the limit of the exponential function e^t as t approaches infinity? (The limit does not exist / infinity. But it grows much faster than the log function.)

What is the limit of the constant function C as t approaches infinity? (The limit is C.)

And finally… It costs log(t) to remove an item from a queue. It costs C to add an item to that queue. What is the upper bound on storage required for that queue to hold all items as t approaches infinity?

Even junior developers have no problem answering these questions. But empirically, it seems like architects and team leads have no idea. People are often surprised by how much Jira queues grow over time, and they don't expect to have to scale organizations the way they scale technology. The math above is more than enough to explain why, in the long term, you will never, ever catch up on your Jira queue. Let's dig in.
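If you want to watch that last quiz question play out, here's a toy simulation. It's a sketch, not a measurement: the arrival rate, the daily work budget, and the log-shaped cost curve are all assumptions chosen for illustration.

    # Toy model: tickets arrive at a constant rate, but the cost of
    # closing one grows like log(t) as the org and code base age.
    # All constants here are made up for illustration.
    import math

    arrival_rate = 5.0    # tickets filed per day, constant over time
    work_budget = 10.0    # person-hours per day spent closing tickets

    backlog = 0.0
    for day in range(1, 3651):                    # simulate ten years
        cost_per_ticket = math.log(day + 1)       # hours per ticket
        closed = work_budget / cost_per_ticket    # tickets closed today
        backlog = max(backlog + arrival_rate - closed, 0.0)
        if day in (30, 365, 1825, 3650):
            print(f"day {day:4d}: backlog ~ {backlog:,.0f} tickets")

In this toy model the backlog clears completely for about the first week, then starts growing and never looks back. Keep that in mind for the startup story below.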

Disclaimer

None of this is specific to Jira. But you'll probably have to use Jira at some point, and writing "Jira" is a lot more convenient than writing "Jira, or whatever work queue system you use to keep track of things like feature requests and bug reports." It's a given that developers hate Jira, but ultimately we would hate any other tool that does the same job, no matter how imperfect Jira's tools may be. So, I am not criticizing Jira, per se. I am talking about mathematical laws that you'll probably notice while using Jira.

Popping The Queue

On day one of your new startup, you have no tickets in your queue. Glorious! But, you need to get to work, so you have to start filing tickets. You haven’t written any code, so you don’t have any bugs yet. It’s pretty easy to think of feature requests for the new product.

And those first few features are pretty easy to add. Part of that is because the existing code base is small or nonexistent. Part of that is because there's no process. Part of that is because you already know every stakeholder, and they are probably sitting in the same [chat] room as you. During this early stage, you may even get the queue back down to zero a few times. This sets the expectation that clearing the queue entirely is normal and achievable.

Over time, adding a new feature gets more difficult. There are many reasons for this. One is that you just have more code. Companies brag about the size of their code bases. (The fools!) But a large code base is more of a cost than an asset. Having more code means each build takes longer. Having more tests means each test run takes longer. But having more code also means that it's harder to find the right place to add a feature. You have to read more code to understand the existing systems. In the long run, you physically can't just "know" all of the code. You can't read all of it. And even if you could, parts of it will have been changed by other developers since the last time you read them.

The once-elegant architecture needed to become more complex because some asshole in sales absolutely had to land a customer, and insisted the feature would be super popular. (He got a car out of landing the customer. You got to work late. No other customer ever used the feature you added, just like you said would happen. Whatever. It's done now. It's not like you'll keep bringing it up in blog posts years later.)

Aside from the actual complexity of the code itself, the company adds more departments over time, more process, and more stakeholders. To add a feature in a more mature org, you need to reach out to another department and coordinate how they’ll handle the front-end for your back-end feature. You need to warn customers about a change in behavior. You need more layers of review, etc.

So, in the best-case scenario, the difficulty of adding a feature grows roughly like log(t). And that really is the best case: a well-run organization, a code base with a lot of focus on mitigating and minimizing tech debt, and a team where you already know most of the stakeholders and key players you need to interact with.

In anything worse than the best-case scenario, the technical complexity and the organizational complexity each grow worse than linearly. When those two factors multiply each other, the scaling becomes exponential. To be clear, I am not using the term exponential in a non-literal way. I am saying that the cost of adding a feature under bad conditions literally grows exponentially. I've heard horror stories from people who started pushing out a conceptually simple change to a single dialog box when they joined a company, and the fixed UI still had not shipped to customers by the time they left. After a few decades of increasing inertia, the time required for simple UI changes can literally be measured in careers rather than weeks, even when the change is conceptually the same "add a button to a UI" it always was.

So, the lower bound for shipping a feature (or fixing a bug) is roughly log(t), and the upper bound for it is exponential.
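To put rough numbers on those two bounds, here's a sketch. The constants (the /20 in the exponent, costs measured in days) are arbitrary assumptions picked to show the shape of the curves, not data from any real org.

    import math

    def best_case(n):
        # well-run org: cost of the n-th feature grows like a log
        return math.log(n + 1)            # days

    def worst_case(n):
        # compounding tech debt times org complexity: exponential
        return math.exp(n / 20)           # days

    for n in (1, 10, 50, 100, 200):
        print(f"feature {n:3d}: best ~ {best_case(n):4.1f} days, "
              f"worst ~ {worst_case(n):9,.0f} days")

Under the log regime, feature 200 costs about five days. Under the exponential regime, it costs about 22,000 days, which is exactly the "measured in careers" territory from the horror story above.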

Pushing To The Queue

What about filing a feature request or filing a bug report?

I assume pictures make blog posts more exciting for the reader. Anyhow, here's a screenshot of a bug I fixed recently. See the mojibake in the keyboard shortcuts in the "Player" menu? Yeah, it didn't take you very long to spot it. And taking a screenshot when you see a bug takes the same amount of time regardless of how complex the application is.

Filing a bug report is pretty easy. More importantly, it has very few scaling factors. You log into Jira. You have to find the right queue to file it in. If there are a zillion queues, it might take a moment to find the right one. But even if you pick the wrong one, you can still make a new ticket, and somebody else can forward it to the right queue.

Dreaming up new features is pretty easy. If you run out of ideas, you can always file a ticket for arbitrary changes. Or even file a ticket to remove a feature that you think clutters the UI but isn’t very popular. Or if you work at Google, file a ticket saying somebody should create a new chat app. See, I just came up with some great product feature requests while writing this paragraph and I didn’t even bother having a specific product in mind.

The upper bound for adding items to a Jira queue is pretty much "it happens in constant time." Or, in a very badly run org, maybe something like log(log(t)) if a stupid amount of process growth happens over time. But there is usually some practical upper limit to the number of custom fields you have to fill out to make a ticket. It may be annoying, but it doesn't literally grow without bound toward infinity.
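For a concrete feel of why log(log(t)) still counts as "pretty much constant," here's a quick sketch:

    import math

    # log(log(t)) is nearly flat even across absurd ranges of t
    for t in (1e3, 1e6, 1e12, 1e100):
        print(f"t = {t:.0e}: log(log(t)) ~ {math.log(math.log(t)):.2f}")

Ninety-seven orders of magnitude of growth in t slows down filing a ticket by less than a factor of three. The pop side of the queue gets no such mercy.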

The Thrilling Conclusion

So, what does this get us? Not much. Look, I don’t actually have a solution here. I didn’t claim to. I am basically just filing a metaphorical bug report about the way organizations try to scale. (And as I said, filing a bug report is a lot easier than shipping a fix.)

But, I think it’s worth thinking really hard about the expectations around queue length over time. Lots of organizations have metrics about “getting behind.” Many will have backlog grooming processes that are dedicated to keeping queue lengths “reasonable.”

The founders who remember the early days, when they got their queue lengths down to zero, expect their underlings to accomplish the same thing. But they don't fully understand that those early victories over the queue were effectively a sampling error: the numbers were small in those days. Expecting to repeat the past success of getting to queue = 0 becomes less and less reasonable over time. This very much relates to my previous post about Corporate Event Horizons. But it requires senior people to really appreciate that the present organization they are running is completely unlike the past organization they worked in. And the differences are wildly nonlinear. Humans are consistently bad at intuitively understanding nonlinear behaviors. That's why we use things like Big-O notation to study scaling behaviors. Or, at least, we sometimes use Big-O notation.

Just understanding that people need to be thinking about scaling factors goes a long way. It drives things like how much time you spend paying off tech debt. The return on that investment is much clearer once you can see that the organization will grind to a very predictable halt if nobody addresses the technical complexity. It also helps inform thinking about when you should do things like split up teams. If you have a dozen people talking in a standup meeting every day, that's going to take more time than five people and less than 20 people. The more people who have opinions on a feature you are implementing, the slower it will be to work on.
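One common way to model that meeting math (a simplification, and an assumption on my part, not something measured from real standups) is to count pairwise communication channels: n people have n * (n - 1) / 2 possible conversations.

    def channels(n):
        # pairwise communication channels among n people: n choose 2
        return n * (n - 1) // 2

    for n in (5, 12, 20):
        print(f"{n:2d} people in the standup -> {channels(n):3d} channels")

Going from 5 people to 20 quadruples the head count but multiplies the channels by 19. That is exactly the kind of nonlinear behavior humans are bad at intuiting.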

Like most things in organizational scaling, infinite Jira queues are entirely predictable, and entirely shocking when you suddenly realize you've had them for a very long time. Shock is disruptive in bad ways, and that disruption always slows down work. Panicked "oh no, fire alarm, all hands on deck" efforts introduce context-switching overhead for everybody involved. Sometimes panic happens for unforeseeable reasons. When panic happens for entirely foreseeable reasons, it's waste.
