17 Jun 2025

What even is Cyclomatic Complexity?
Ever spend 20 minutes trying to figure out why your bug fix or feature code isn’t triggering or being executed — only to realize you missed a buried branch in someone’s 10-path function? That’s Cyclomatic Complexity in action. Intuitively, you can think of Cyclomatic Complexity as the number of possible paths a single execution of a function can take.
For example, a = b + c has a cyclomatic complexity of one, and a = b + c if foo else d + e has a cyclomatic complexity of two: one path is when foo is True and the effective logic is a = b + c, and the other path is when foo is False and the effective logic is a = d + e.
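If you'd rather not count paths by hand, a complexity checker can do it for you. Here's a minimal sketch using radon, a third-party Python tool (pip install radon); the toy function pick is my own name for the example above:

from radon.complexity import cc_visit

source = '''
def pick(b, c, d, e, foo):
    return b + c if foo else d + e
'''

for block in cc_visit(source):
    # Each block reports the function's name and its cyclomatic complexity.
    print(block.name, block.complexity)  # expected: pick 2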
Ain’t got no time? Here’s the goods.
If you take just one thing away from this note, then let it be this.
Strive to reduce the Cyclomatic Complexity of your code; your team and your future self will thank you!
Time to hit the brain gym, bro
As an exercise, I will let you figure out the cyclomatic complexity of the following piece of code:
env_val = os.environ.get('...')
switcher_val = False
if env_val is not None:
    jk_val = True
    if env_val.lower() in ["true", "1", "yes"]:
        env_val = True
    else:
        env_val = False
else:
    env_val = True

switch_name = "/switch/name/from/config"
switcher_val = switcher.check(switch_name, switchval=region)

if env_val or switcher_val:
    apply_some_config(job)
I’ll wait… (Spoiler: It’s not pretty.)
Give up? Turns out, it is 4: the three if-checks contribute three branching points, and cyclomatic complexity is one more than the number of branching points; ergo, 4.
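To make the counting concrete, here's the same snippet again with each decision point marked:

env_val = os.environ.get('...')
switcher_val = False
if env_val is not None:                            # decision point 1
    jk_val = True
    if env_val.lower() in ["true", "1", "yes"]:    # decision point 2
        env_val = True
    else:
        env_val = False
else:
    env_val = True

switch_name = "/switch/name/from/config"
switcher_val = switcher.check(switch_name, switchval=region)

if env_val or switcher_val:                        # decision point 3
    apply_some_config(job)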
Next, by spending no more than 60 seconds looking at this code, can you tell me exactly what it is doing? BTW, this is real production code that I ran across while debugging an issue, and it took me a long while to be sure I knew exactly when and how the config is applied. It wasn't obvious at all. If you can grok this in 60 seconds, take a bow!
Reeling yet?
Anyway, making sense of functions with high cyclomatic complexity is annoying. It’s notoriously difficult to write tests with good coverage for these functions, and in general, they tend to be bug factories.
And yet, somehow, a lot of senior software engineers don't seem to grok this. I keep seeing deeply nested if-else blocks, sometimes inside loops with breaks and continues, and it doesn't seem to bother anyone! It's like we've collectively normalized this cognitive overhead.
Why?! Why are we putting up with this crap? It’d never fly in an interview.
Yo, let’s fix it up!
Coming back to the above example, the confusion and ugliness of this code really got to me. It got so bad I considered dusting off a Karnaugh map. After some much needed grokking, I managed to simplify it down to a cyclomatic complexity of 2! :)
In the end, here’s what that poor little code snippet was trying to do:
# Apply config when '...' environment variable is True, else check the switch
import os

__ENV_VARIABLE = '...'
__SWITCHER_KEY = '/switch/name/from/config'

def has_env_override():
    val = os.environ.get(__ENV_VARIABLE)
    return val is not None and val.lower() in {"true", "1", "yes"}

if (
    has_env_override() or
    switcher.check(__SWITCHER_KEY, switchval=region)
):
    apply_some_config(job)
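As a bonus, the extracted helper is now trivial to test on its own. Here's a minimal sketch using the standard library's unittest.mock, assuming has_env_override is importable from a module like the one above (my_module is a hypothetical name, and ENV_VAR stands in for the elided variable name):

import os
import unittest
from unittest.mock import patch

from my_module import has_env_override  # hypothetical module name

ENV_VAR = '...'  # stands in for the elided environment variable name

class TestHasEnvOverride(unittest.TestCase):
    def test_truthy_value_is_an_override(self) -> None:
        # patch.dict injects the variable for this test only.
        with patch.dict(os.environ, {ENV_VAR: "YES"}):
            self.assertTrue(has_env_override())

    def test_falsy_value_is_not_an_override(self) -> None:
        with patch.dict(os.environ, {ENV_VAR: "nope"}):
            self.assertFalse(has_env_override())

    def test_missing_variable_is_not_an_override(self) -> None:
        # clear=True empties os.environ for the duration of the test.
        with patch.dict(os.environ, {}, clear=True):
            self.assertFalse(has_env_override())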
Fewer paths, fewer bugs. Cleaner code. Happier teammates. What’s not to love?
11 Jun 2025

I have seen way too many ‘senior’ engineers get bug fixing wrong. It is common to see an engineer send a pull request titled “bug fix: …” where the PR has changes to the functional code that fix the bug and a corresponding test case that shows the bug is fixed. If that sounds reasonable, THINK AGAIN: you’ve walked right into the classic trap!
If you are sending PRs for bug fixes with functional code change and an added test case in the same PR/commit, then you are doing it wrong!
The crux of the problem is this: HOW DO YOU KNOW YOU’RE SMASHING THAT BUG? HOW CAN YOU BE SURE YOUR TEST ISN’T A DUD?! Your answer had better not be VIBE CHECKS or just STARING REALLY HARD! If you have to deploy your entire service/library and run an end-to-end test to demonstrate correctness, then you are doing too much, and you still haven’t demonstrated that the unit test actually captures the previously erroneous behavior.
There is this shiny little concept called Test-Driven Development (TDD) that is mighty useful here. You can peruse the Wikipedia link to figure out exactly what TDD is. This note will show you how to use TDD for bug fixes.
Here are the simple steps to fixing bugs using TDD:
- 🕵️ Discover the bug. BAM! There it is! Your nemesis!
- 🧪 Create a PR with a new unit test that exposes the bug. YOWZA!
- 🔧 Create a second PR on top of the first that makes the functional code change and updates the expectation in the unit test accordingly. That should squash the bug! KAPOW!
- 💰 Justice is served! PROFIT!

Still not sure? Let’s demonstrate this with an example. Say there is a bug that you discovered, and you know how to fix it.
First, you create a PR that demonstrates the bug by invoking your SUT (system under test) with the offending input and setting the expected value to the incorrect output, so that the test case actually passes with this incorrect value, thus demonstrating the bug.
import unittest

class TestSUT(unittest.TestCase):
    ...
    def test_bug_b12345(self) -> None:
        '''
        Test to expose bug b12345.
        '''
        # Arrange
        sut = SUT(...)
        # Act
        actual = sut.test_method(input="bad-input")
        # Assert
        self.assertEqual(actual, "bad buggy output")
        # The assertion above demonstrates the bug b12345.
        # The right expected value should be "correct output":
        # self.assertEqual(actual, "correct output")
You can send that PR out for review and merge it in. Now you have solid proof that you have found a bug and reproduced it.
Next, you have a new PR that fixes that bug. If your bug fix is correct, then the test test_bug_b12345 should now start failing: the output of sut.test_method(input="bad-input") should be "correct output" and not "bad buggy output". So, in that same PR, you modify the unit test test_bug_b12345 to look as follows:
def test_bug_b12345(self) -> None:
    '''
    Test to expose bug b12345.
    '''
    # Arrange
    sut = SUT(...)
    # Act
    actual = sut.test_method(input="bad-input")
    # Assert
-   self.assertEqual(actual, "bad buggy output")
-   # The assertion above demonstrates the bug b12345.
-   # The right expected value should be "correct output":
-   # self.assertEqual(actual, "correct output")
+   self.assertEqual(actual, "correct output")
Now your test should pass. This second PR is conclusive proof that your diff now fixes the bug! So, merge it in. Deploy with confidence. BOOM — PROFIT!
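If you want to see the whole dance end to end, here's a toy, self-contained illustration; the function, the bug, and the bug number are all made up:

import unittest

def normalize_email(raw: str) -> str:
    # Toy SUT with a planted bug: surrounding whitespace is not stripped.
    return raw.lower()  # the fix will be: raw.strip().lower()

class TestNormalizeEmail(unittest.TestCase):
    def test_bug_b00001(self) -> None:
        # PR 1: pin the buggy behavior so the bug is reproducible.
        actual = normalize_email("  Bob@Example.COM ")
        self.assertEqual(actual, "  bob@example.com ")  # wrong on purpose
        # PR 2 flips the expectation to the correct value:
        # self.assertEqual(actual, "bob@example.com")
        # ...and changes normalize_email to return raw.strip().lower()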
07 Jun 2025
At work, I had a customer team that aspired to be “customer first.” To them, that meant fixing issues before they became SEVs. That was all well and good, except that the way they went about it was to fire alerts well before their SLOs were close to being breached. Of course, I knew nothing about it until I was on the receiving end of their ‘aspiration’.
It’s 4 AM, and I am in deep sleep. Suddenly my phone, overriding all silencing settings, starts ringing like there is no tomorrow. Naturally, I am being paged. I wake up bleary-eyed, acknowledge the page, and join the team channel. Helpfully, the customer team’s oncall has a message for me: “Your service has a latency spike. Please look into it.”
I drag myself to a laptop, check the graphs, and yes, there was a p99 latency spike; it lasted about half an hour and is already waning. Our SLOs were fine; at these latency levels, our latency SLOs wouldn’t breach for another 30 minutes. I double-checked their SEV criteria, and they were also still green! So why the 4 AM fire drill?
Turns out, they’d set up their alerts to go off when their p99 latency stayed above normal limits for 30 minutes, but their SLO wouldn’t be breached until the elevated p99 persisted for 60 minutes. A twitchy alert, if you ask me!
Their on-call had no idea what to do with the alert, saw my service mentioned, and did the classic move:
“When in doubt, escalate!”
So now I’m awake, trying to make sense of a 30-minute p99 latency increase that is fixing itself. I asked:
“Where’s the SEV?”
Silence. Five minutes later: “Here is the SEV number…” The SEV had been created two minutes earlier. Facepalm!
Here’s what actually happened:
- The latency spike lasted about 30 minutes.
- The system auto-healed.
- The affected service was user-facing, but this was deep in the off-hours.
- Total estimated user impact: somewhere between “negligible” and “none.”
We could’ve all just slept through it and looked at it with fresh eyes in the morning. Instead, two engineers got pulled into zombie mode to stare at graphs that improved all by themselves. It was like debugging a ghost.
Moral of the story:
If your alert is going to wake someone up at 4 AM, it better be for something that actually matters. If there’s no SEV, no SLO breach, and no clear user impact — maybe let sleeping engineers lie.
22 Jul 2022
The Law of Demeter essentially says that each unit should only talk to its ‘immediate friends’ or ‘immediate dependencies’; in spirit, it points to the principle that each unit should have only the information it needs to meet its purpose. In that spirit, violations of the Law of Demeter take two forms that are relevant to making your code more testable: (1) object chains, and (2) fat parameters.
Object Chains
This is the more classic violation of the Law of Demeter. It happens when a class C has a dependency D, and D has a method m that returns an instance of another class A. The violation occurs when C accesses A and calls a method on A. Note that only D is the ‘immediate’ collaborator/dependency of C, not A. The Law of Demeter says that C should not be calling methods on A.
# A violation of the Law of Demeter looks as follows.
## Example 1:
c.d.m().methodInA()
## Example 2:
d: D = c.d
a: A = d.m()
a.methodInA()
What is the problem with violating the Law of Demeter? Consider the following production code:
class UpdateKVStore:
    def __init__(self, client: KVStoreClient) -> None:
        self.client = client

    def update_value(self, new_content: Content) -> Status:
        transaction: KVStoreClient.Transaction = self.client.new_transaction()
        if transaction.get_content() == new_content:
            # Nothing to update.
            transaction.end()
            return Status.SUCCESS_UNCHANGED
        mutation_request: KVStoreClient.MutationRequest = (
            transaction.mutation_request().set_content(new_content)
        )
        mutation: KVStoreClient.Mutation = mutation_request.prepare()
        status: Status = mutation.land()
        return status
Now how would you unit test this? The test doubles for testing this code will look something like this:

from unittest.mock import MagicMock

mock_client = MagicMock(spec=KVStoreClient)
mock_transaction = MagicMock(spec=KVStoreClient.Transaction)
mock_mutation_request = MagicMock(spec=KVStoreClient.MutationRequest)
mock_mutation = MagicMock(spec=KVStoreClient.Mutation)

mock_client.new_transaction.return_value = mock_transaction
mock_transaction.mutation_request.return_value = mock_mutation_request
mock_mutation_request.prepare.return_value = mock_mutation
Now you can see how much the class UpdateKVStore and its unit tests need to know about the internals of the KVStoreClient. Any changes to how KVStoreClient implements the transaction will cascade into test failures in all its clients! That’s a recipe for a low-accuracy test suite.
There are a few ways to address this. If, instead, KVStoreClient could be recast as a Transaction factory, with all operations associated with a transaction encapsulated within the Transaction class, then UpdateKVStore could be modified as follows:
class UpdateKVStore:
    def __init__(self, client: KVStoreClient) -> None:
        self.client = client  # Now a factory class for Transaction.

    def update_value(self, new_content: Content) -> Status:
        transaction: KVStoreClient.Transaction = self.client.new_transaction()
        if transaction.get_content() == new_content:
            # Nothing to update.
            transaction.end()
            return Status.SUCCESS_UNCHANGED
        status = transaction.update_and_land(new_content)
        return status
When testing the new UpdateKVStore, you only need to replace the KVStoreClient and the Transaction, both of which are (explicit or implicit) direct dependencies, with test doubles. This makes the code much easier and more straightforward to test.
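For contrast, here's roughly what the test-double setup shrinks to after the refactor; only the two direct collaborators need doubles now:

from unittest.mock import MagicMock

mock_client = MagicMock(spec=KVStoreClient)
mock_transaction = MagicMock(spec=KVStoreClient.Transaction)
mock_client.new_transaction.return_value = mock_transaction
# The chain of MutationRequest and Mutation doubles is gone entirely.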
Fat Parameters
While the anti-pattern of ‘fat parameters’ does not follow directly from the letter of the Law of Demeter, it does follow from its spirit of passing in only the information that a class needs to perform its function. So, what are fat parameters? They are data objects that are passed as arguments to a class and contain more information than the class needs.
For instance, say you have a class EmailDispatcher whose method setRecipient only needs a customer name and email address. The method signature for setRecipient should only require the name and email, and not the entire Customer object, which contains a whole lot more.
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:
    ...  # data class members.

    def getFullName(self):
        ...
    def getEmail(self):
        ...
    def getPhysicalAddress(self):
        ...
    def getPostalCode(self):
        ...
    def getCountry(self):
        ...
    def getState(self):
        ...
    def getCustomerId(self):
        ...
    # and so on.

class EmailDispatcher:
    ...

    def setRecipient(self, name: str, email: str):
        ...

    def setRecipientWithFatParameter(self, customer: Customer):
        ...

    def sendMessage(self, message: Message):
        ...
In the pseudocode above, the class EmailDispatcher has two methods, setRecipient and setRecipientWithFatParameter. The former takes only the information it needs; the latter takes the entire Customer object as a fat parameter.
The convenience of passing in the entire Customer object is straightforward. It gives you a simple method signature. It makes it easier for the method to evolve to use richer information about the customer without changing its API contract. It lets you define a common Dispatcher interface with multiple Dispatchers that use different properties of the Customer class.
However, when it comes to unit testing, such fat parameters present a problem. Consider how you would test EmailDispatcher’s setRecipientWithFatParameter method. The tests will need to create fake Customer objects, so your fake Customers might look like this:
fakeCustomer = Customer(
    first_name="bob",
    last_name="marley",
    email="bob@doobie.com",
    address=Address(
        "420 High St.",
        "",
        "Mary Jane",
        "Ganga Nation",
        "7232"
    ),
    id=12345,
    postal_code="7232",
    ...
)
When someone reads this unit test, do they know what is relevant here? Does it matter that the second parameter of address is an empty string? Should the last parameter of address match the value of postal_code? While we might be able to guess in this case, it gets more confusing when the fat parameter encapsulates a much more complicated entity, such as a database table.
When refactoring or making changes to the EmailDispatcher, if the unit test fails, figuring out why becomes a non-trivial exercise and could end up slowing you down a lot more than you expected. All this just leads to high maintenance costs for tests, low readability, poor DevX, and limited benefits.
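Compare that with exercising the thin setRecipient, where every value in the test is obviously load-bearing (a sketch; the dispatcher's constructor arguments are elided):

dispatcher = EmailDispatcher(...)
dispatcher.setRecipient(name="bob marley", email="bob@doobie.com")
# Two arguments, both clearly relevant. Nothing to second-guess.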
11 Jul 2022
Your service may be massive, but its public API surface is pretty small; it has just a handful of APIs/endpoints. Everything else behind those APIs is ‘private’ and ‘implementation details’. It is highly advisable to follow this pattern even when designing the implementation of your service, almost like a fractal. This will pay dividends in the quality of your test suite.
For instance, your service’s implementation should be split into ‘modules’, where each module has a well-defined API through which other modules interact with it. This API boundary has to be strict. Avoid the temptation to break the abstraction because your module needs this ‘one tiny bit’ of information that is available inside the implementation of another module. You will regret breaking encapsulation, I guarantee it!
If you follow this pattern, you will eventually reach a class that has a public API, has all of its external/shared dependencies injected, and delegates a lot of its business logic and complex computation to multiple ‘private’ classes that are practically hermetic and have no external/shared dependencies. At this point, treat all these ‘private’ classes as, well, private. That is, DO NOT WRITE UNIT TESTS FOR SUCH CLASSES!
Yes, that statement seems to fly in the face of all things sane about software testing, but it is a sane statement, nonetheless. These private classes should be tested indirectly via unit tests for the public class that they serve/support. This will make your tests a lot more accurate. Let me explain.
Say you have a public class CallMe that uses a private class HideMe; furthermore, HideMe is used only by CallMe, and the software design enforces this restriction. Assume that both CallMe and HideMe have their own unit tests, and the tests do an excellent job. At this point, a new requirement necessitates that we refactor CallMe’s implementation, and as part of that refactoring, we need to modify the API contract between CallMe and HideMe. Since HideMe’s only caller is CallMe, it is completely safe to treat this API contract as an implementation detail and modify it as we see fit. Since we are modifying the specification of HideMe, we have to change the tests for HideMe as well.
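To make that concrete, here's a minimal sketch of the arrangement; the method names and logic are invented for illustration:

class _HideMe:
    # Private helper: hermetic, no external or shared dependencies.
    def compute(self, data: list[int]) -> int:
        return sum(data)

class CallMe:
    # Public API; _HideMe is an implementation detail behind it.
    def __init__(self) -> None:
        self._helper = _HideMe()

    def total(self, data: list[int]) -> int:
        return self._helper.compute(data)

# Unit tests target CallMe.total() only; _HideMe.compute() is exercised
# through it and never tested directly.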
Now, you run the tests, and the tests for HideMe fail. What information does that give you? Does it mean that there is a bug in HideMe, or does it mean that we did not modify the tests correctly? You cannot determine this until you either manually inspect HideMe’s test code or run the tests for CallMe. If CallMe’s tests fail, then (since this is a refactoring diff) there must be a bug in HideMe and/or CallMe; but if the tests don’t fail, then it must be an issue in HideMe’s tests.
Thus, it turns out that a failure in HideMe’s tests gives you no additional information compared to a failure in CallMe’s tests. Tests for HideMe have zero benefit and a non-zero maintenance cost! In other words, testing HideMe directly is useless!
By aggressively refactoring your code to push as much of your logic as possible into private classes, you limit the API surface of your software that needs direct testing while ensuring that your test suite is not too large and has very high accuracy with reasonable completeness.