In the previous three parts of this series, I talked about how unit tests are useful for a lot more than just verifying that your code works. We've talked about their uses for documentation, refactoring and code health, and writing better software. Next, we'll see how unit tests help you debug issues in production.
Debugging
A single unit test is supposed to test a single code path within your
class. I don't always follow this maxim, but it is nevertheless a very
good rule of thumb. Given that the number of code paths within a given
piece of code often increases exponentially with the size of the code, unit
tests are often a lot more lines of code than the actual production code
itself. This is a good thing when it comes to debugging.
If your unit tests cover sufficiently many code paths (which any good
unit test suite should do), then when an issue arises in production and
you narrow it down to your code, you know that the offending code path
is very unlikely to be one of the paths that your unit tests already
cover. This pruning makes your debugging a lot simpler.
Unit tests can prune the possible set of offending code paths to make debugging tractable.
EXAMPLE
Let's go back to the example I gave you in Part 3.
Here is that piece of code. Recall that it takes a large query, splits
it up into multiple subqueries, sends them off in parallel, collects
their responses back, munges them and returns the munged response to the
caller via a callback.
    public class QueryManager {
      void sendRequest(Query query, Callback queryCallback) {
        List<Subquery> subqueries = splitQuery(query);
        for (Subquery subquery : subqueries) {
          sendQuery(subquery, new Subquery.Callback() {
            @Override
            public void onSuccess(Response response) {
              // Do some processing.
              ...
              if (allResponses()) {
                queryCallback.finalResult();
              } else {
                queryCallback.incrementalResult();
              }
            }

            @Override
            public void onFailure(Error e) {
              // Do some error handling.
              ...
              if (allResponses()) {
                queryCallback.finalResult();
              }
            }
          });
        }
      }

      // Other methods.
      ...
    }
I ran into this code because of an issue that we were seeing in production.
Every so often the logs showed a really really long query that timed
out, but it did manage to serve the response back to the user. Digging
into it some more, I managed to narrow it down to this class. But beyond
that, things were a mystery. Recall that in my last post I mentioned
how there were no unit tests here, and the code actually needed
refactoring to pull out the anonymous class. We pick up the story here
after all that.
Once I had all the unit tests put in, I discovered that the root cause of the bug was actually a race condition.
The unit tests had ruled out sufficiently many code paths to lead me
to a strong suspicion that it was a race condition in the allResponses()
function, causing two overlapping 'last' responses to both trigger the
incrementalResult() callback function, so that the finalResult() method
was never invoked.
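To make that suspicion concrete, here is a minimal sketch in Java. The post does not show allResponses(), so both tracker classes below, and every name in them, are hypothetical stand-ins rather than the production code. The first version can lose an update when two responses arrive together, so neither response sees itself as the last one; the second folds the count and the check into a single atomic step.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical racy bookkeeping: the increment is an unsynchronized
// read-modify-write on shared state.
final class RacyResponseTracker {
    private final int expected;
    private int received = 0;

    RacyResponseTracker(int expected) {
        this.expected = expected;
    }

    // Two threads can read the same old value of `received`, so one
    // increment is lost and the count may never reach `expected`.
    boolean recordAndCheckLast() {
        received++;
        return received == expected;
    }
}

// Hypothetical fix: incrementAndGet() hands every caller a distinct count,
// so exactly one caller observes count == expected.
final class AtomicResponseTracker {
    private final int expected;
    private final AtomicInteger received = new AtomicInteger();

    AtomicResponseTracker(int expected) {
        this.expected = expected;
    }

    boolean recordAndCheckLast() {
        return received.incrementAndGet() == expected;
    }
}

final class ResponseTrackerDemo {
    // Fires `responses` concurrent calls at the atomic tracker and returns
    // how many of them were told "you are the last response".
    static int countLastSignals(int responses, int threads) {
        AtomicResponseTracker tracker = new AtomicResponseTracker(responses);
        AtomicInteger lastSignals = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int i = 0; i < responses; i++) {
                pool.execute(() -> {
                    if (tracker.recordAndCheckLast()) {
                        lastSignals.incrementAndGet();
                    }
                });
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for responses");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return lastSignals.get();
    }

    public static void main(String[] args) {
        System.out.println(countLastSignals(1000, 8)); // prints 1
    }
}
```

With the atomic version, exactly one response is told that it is the last, no matter how the threads interleave. The racy version offers no such guarantee, which is the kind of behavior the logs hinted at.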
Once you have a candidate cause, reproducing and verifying it becomes
pretty straightforward (not necessarily easy or simple, but
straightforward), and the rest is just mundane software engineering.
See, unit tests are more than a one trick pony! :)
I am not talking about writing bug-free software here. Sure, good
unit tests help you discover/avoid large classes of bugs, but that's not
the point. Unit tests also help you enforce good design patterns and
modularity in your software.
Unit tests help you with software design in two complementary ways.
First, they help you establish optimal boundaries of modularity in terms
of methods, and classes, and second, they help you understand your
dependencies better and almost force you to use good dependency
injection hygiene.
Modularity
How do you know that your class or method does 'too much', or that it
has undesirable side-effects? A pretty good way to discover it is to
start writing unit tests for it. If you find yourself having to test for
too many different types of inputs, then your methods are doing too
much. If you find yourself having to test for too many orderings of
operations, then your methods have too many side effects. It really is
just as simple as that!
For instance, some time ago, I came across a piece of code that
essentially took a large query, split it up into multiple subqueries,
sent them off in parallel, collected their responses back, munged them
and returned the munged response to the caller via a callback. The code
looked something like this:
    public class QueryManager {
      void sendRequest(Query query, Callback queryCallback) {
        List<Subquery> subqueries = splitQuery(query);
        for (Subquery subquery : subqueries) {
          sendQuery(subquery, new Subquery.Callback() {
            @Override
            public void onSuccess(Response response) {
              // Do some processing.
              ...
              if (allResponses()) {
                queryCallback.finalResult();
              } else {
                queryCallback.incrementalResult();
              }
            }

            @Override
            public void onFailure(Error e) {
              // Do some error handling.
              ...
              if (allResponses()) {
                queryCallback.finalResult();
              }
            }
          });
        }
      }

      // Other methods.
      ...
    }
This code makes for an interesting case study on multiple fronts, and I will come back to it in a later post. For now, it is sufficient to state that I wanted to make changes to this code, but it was pretty thorny because (you guessed it!) it had no unit tests.
Naturally, the first step is to write unit tests for this class, and
that was when I realized why there were no unit tests here. This class
is incredibly tricky to unit test. Getting the subqueries to respond
under various conditions and ordering resulted in a combinatorial
explosion of test cases making the task intractable. This was a very
strong code smell.
As you have probably figured out already, this was a classic case of a
single class doing too much. The primary culprit was the anonymous
class that implemented the Subquery.Callback interface. It should really
have been its own class with its own unit tests.
After I pulled out that anonymous class into its own class, it became
a lot easier to unit test both the Subquery.Callback and the
QueryManager individually, and with that, the code became much more
modular, easier to read, and much easier to maintain.
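Here is a rough sketch of what the extracted class and one of its tests might look like. The class name SubqueryResponseHandler, the QueryCallback interface, the empty Response type and the atomic counter are all assumptions made for illustration; the processing and error handling are elided here just as they are in the listing above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical minimal versions of the types QueryManager depends on.
final class Response {}

interface QueryCallback {
    void incrementalResult();
    void finalResult();
}

final class Subquery {
    interface Callback {
        void onSuccess(Response response);
        void onFailure(Error e);
    }
}

// The former anonymous class, now a named class that can be tested alone.
// One instance is shared by all subqueries of a single query (an assumption).
final class SubqueryResponseHandler implements Subquery.Callback {
    private final QueryCallback queryCallback;
    private final int expectedResponses;
    private final AtomicInteger received = new AtomicInteger();

    SubqueryResponseHandler(QueryCallback queryCallback, int expectedResponses) {
        this.queryCallback = queryCallback;
        this.expectedResponses = expectedResponses;
    }

    @Override
    public void onSuccess(Response response) {
        // Processing of the response is elided.
        if (isLastResponse()) {
            queryCallback.finalResult();
        } else {
            queryCallback.incrementalResult();
        }
    }

    @Override
    public void onFailure(Error e) {
        // Error handling is elided.
        if (isLastResponse()) {
            queryCallback.finalResult();
        }
    }

    private boolean isLastResponse() {
        return received.incrementAndGet() == expectedResponses;
    }
}

// A fake caller callback that records what it was told, for use in tests.
final class RecordingCallback implements QueryCallback {
    final List<String> events = new ArrayList<>();

    @Override
    public void incrementalResult() {
        events.add("incremental");
    }

    @Override
    public void finalResult() {
        events.add("final");
    }
}

final class SubqueryResponseHandlerDemo {
    public static void main(String[] args) {
        RecordingCallback recorder = new RecordingCallback();
        SubqueryResponseHandler handler = new SubqueryResponseHandler(recorder, 3);
        handler.onSuccess(new Response());           // 1 of 3: incremental
        handler.onFailure(new Error("subquery failed")); // 2 of 3: silent
        handler.onSuccess(new Response());           // 3 of 3: final
        System.out.println(recorder.events);         // prints [incremental, final]
    }
}
```

Each ordering of successes and failures becomes a few lines of test against the handler alone, with no need to split a real query or send anything over the wire.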
Dependency
If your code does not do a decent job of injecting its dependencies
from outside, you are gonna have a bad time! Having good unit tests
will actually keep you from getting into this pitfall pretty
effectively. Consider the following contrived example. You have a piece
of code that writes to an external service, and your code throttles the
rate of writes because going over your approved rate/quota can be pretty
expensive. So, your code could look something like this:
    class RateLimiter {
     public:
      void writeToExternalService(const std::vector<Entries>& stuff) {
        // The dependency is constructed right here, inside the method.
        ExternalService service(foo);  // foo: the ConnectionParameters
        for (const auto& entry : stuff) {
          waitUntilQuotaAvailable();
          service.write(entry);
        }
      }
    };
Remember how I said that going over the rate/quota is bad? How do you
verify that it will not happen? Well, you could set up an elaborate
testbed that has an ExternalService simulator, and you can run your code
through all sorts of inputs and verify that the simulator says that the
rate limiting is working. But that's expensive, and if you choose to go
with a different external service, then well, good luck with that!
Instead, you could try to unit test it. But how? You need to have
access to the ExternalService to do that, which we have already
established is expensive! Well, this is where unit testing it will force
you into healthy dependency injection. For this contrived example, you
can inject the dependency as follows.
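A minimal sketch of that injection, written in Java like the QueryManager example rather than the C++ above. The constructor parameters, the Quota interface and the use of String entries are assumptions for illustration; the point is only that the service is handed in from outside instead of being constructed inside the method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface for the expensive external service.
interface ExternalService {
    void write(String entry);
}

// Hypothetical interface wrapping the "wait until quota is available" step.
interface Quota {
    void waitUntilAvailable();
}

final class RateLimiter {
    private final ExternalService service;
    private final Quota quota;

    // Both dependencies are injected, so a test can pass in fakes.
    RateLimiter(ExternalService service, Quota quota) {
        this.service = service;
        this.quota = quota;
    }

    void writeToExternalService(List<String> stuff) {
        for (String entry : stuff) {
            quota.waitUntilAvailable();
            service.write(entry);
        }
    }
}

final class RateLimiterDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Fakes that only record the order of calls.
        RateLimiter limiter = new RateLimiter(
            entry -> log.add("write:" + entry),
            () -> log.add("wait"));
        limiter.writeToExternalService(Arrays.asList("a", "b"));
        System.out.println(log); // prints [wait, write:a, wait, write:b]
    }
}
```

In a unit test, both dependencies can be replaced by fakes that record what happened, so you can check that every write was preceded by a quota wait without touching the real service. Swapping to a different external service then only means providing a different implementation of the interface.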
In the previous post, we saw how unit tests can serve as a reliable source of documentation for your code. There is a lot more that unit tests can do for you. In this post I'll talk about a fairly obvious, but often ignored, benefit to unit testing: Refactoring.
I posted this article on LinkedIn and am reposting it here cuz’ this is the authoritative source for it :)
Refactoring
Refactoring is a less-than-pleasant but nevertheless essential part of
developing and maintaining high quality software. No matter how well
designed your software is, over time a few things happen to it.
1. The assumptions you made about the environment change, and that triggers unanticipated changes to your code.
You started with the assumption that every account will have a
unique username, but with the introduction of shared family-accounts,
this may not be true anymore.
2. New/junior engineers make changes to the code that run antithetical to the original design.
While you were out on vacation, your colleague's intern made a
less-than-well-thought-out change to add a session expiry feature to
your code with a bunch of if-checks that see if the session has not
already expired. By the time you returned from your vacation, this
code had been in production for two weeks, and you had more important
things to do.
3. The software is co-opted for something that it was never intended for in the first place.
That awesome geo-spatial indexing service that you wrote for
indexing cities became so popular that it is now being used to index
stores in malls across the country, and to accommodate that, the schema
now includes fields such as 'store name', 'floor number', etc. which
make no sense in your original use case; it has just been shoehorned
here.
Every time any of those things happens, you accrue some tech debt, and
eventually the tech debt gets so high that it starts impeding your
ability to make future changes. This is the point at which you have to
redesign your software to reflect the new world it is in.
In my experience, one of the most delicate things about refactoring
such code is that you often have to rewrite the software but keep all of
its existing behavior intact. Any failure to do so will start
triggering failures in the system-at-large. Having good unit tests can
be an indispensable asset. If, throughout the evolution of your
software, there has been a diligent effort to keep the unit tests up to
date and with sufficient code/path coverage, the task of refactoring
becomes a lot easier.
The rule of thumb is simply “As long as the unit tests pass, every iteration of your refactor is (most likely) correct.”
Good unit tests can save you multiple days or even weeks of making
incremental changes and gingerly testing them out in production to
ensure that nothing breaks.
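To sketch what that looks like in practice, here is a toy version of the session expiry example in Java. Everything in it (the 30 minute timeout, the method names, the specific test cases) is hypothetical; the point is that the same tests run against both the legacy code and the refactored code.

```java
import java.util.function.BiPredicate;

final class RefactorSafetyNet {
    // Hypothetical expiry window of 30 minutes, in milliseconds.
    static final long TIMEOUT_MILLIS = 30 * 60 * 1000L;

    // Legacy version: a pile of nested if-checks.
    static boolean legacyIsActive(long lastSeenMillis, long nowMillis) {
        if (lastSeenMillis >= 0) {
            if (nowMillis >= lastSeenMillis) {
                if (nowMillis - lastSeenMillis < TIMEOUT_MILLIS) {
                    return true;
                }
            }
        }
        return false;
    }

    // Refactored version: same behavior, expressed as one condition.
    static boolean refactoredIsActive(long lastSeenMillis, long nowMillis) {
        long idleMillis = nowMillis - lastSeenMillis;
        return lastSeenMillis >= 0 && idleMillis >= 0 && idleMillis < TIMEOUT_MILLIS;
    }

    // The unit tests, written once and applied to whichever
    // implementation is passed in.
    static boolean passesAllTests(BiPredicate<Long, Long> isActive) {
        return isActive.test(0L, 1_000L)            // recently seen: active
            && !isActive.test(0L, TIMEOUT_MILLIS)   // exactly at the timeout: expired
            && !isActive.test(-1L, 1_000L)          // never seen: not active
            && !isActive.test(5_000L, 1_000L);      // clock went backwards: not active
    }

    public static void main(String[] args) {
        System.out.println(passesAllTests(RefactorSafetyNet::legacyIsActive));     // prints true
        System.out.println(passesAllTests(RefactorSafetyNet::refactoredIsActive)); // prints true
    }
}
```

If a later iteration of the refactor changed, say, the boundary check from < to <=, the second test case would fail immediately instead of surfacing in production.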
There are multiple reasons to write unit tests. Verification is only one of them, and the least interesting. This is part 1 of a five part series on why you should write unit tests (apart from the obvious): Documentation!
We have all heard this
repeatedly: “You have to have unit tests. Unit tests are how you find
issues in your code and fix them before they hit the trunk.” Ok, so that
argument sounded a little weak. Here is a much stronger version of the
same argument.
You have to write unit tests because unit tests are a scientific
mechanism to verify that your implementation satisfies the
specification (that is, your code does what you say it does). To
elaborate, your claim of what your code does is a falsifiable
hypothesis in that it is possible to conceive of an observation that
could negate your claim. Unit tests are the experiments that can be
performed to test your hypothesis. If all your unit tests pass, then it
is increasingly likely that your claim is correct. (Note that we cannot
claim to have proved correctness; that is an impossible task in the
general case).
This is still an unsatisfying argument at best. What if I were a perfect
developer who writes perfect code, and everyone around me agrees that I
write perfect code? Do I still need to write unit tests? Or what about
the cases where visual inspection of the code makes correctness obvious?
Do I still need to spend precious time writing unit tests?
While verification is an important reason to have unit tests, IMHO,
it is also the least interesting. There are many more reasons to be
rigorous about unit tests.
Let's start with my favorite.
Documentation
Unit tests make the best documentation for your code. We were all
hired as software engineers because we can write good software. That
means a lot of us are not very good at technical writing, and for many
of us, English isn't our native tongue. This creates barriers for us in
communicating what our software does so that others can use it well.
We could write wiki/docs describing what our code does, but that has three major issues.
1. The documentation doesn't live anywhere near the code, and so discoverability is difficult.
2. English is not the primary language for a lot of us, and technical writing is not our strong suit, and so the quality of the writing can be suspect.
3. As the code evolves, the documentation becomes obsolete. The only thing worse than no documentation is wrong documentation!
We could write comments in the code itself, so discoverability is not
a problem. However, the other two issues still remain. (Raise your hand
if you have seen comments that are out of sync with the code that they
are supposed to clarify.)
What we could really use is a mechanism that leverages our strength
(writing software) to create documentation, and automation to ensure
that documentation is never obsolete. Unit tests turn out to be the
perfect tool for this!
Think of each unit test as a how-to example of how to use your code.
So, if anyone wants to use the code that you wrote, all they need to do
is look at your unit tests, and the job is done!
EXAMPLE
A great example is the folly library. Let's take folly Futures for instance. The primary header file Future.h tells you the API, but figuring out how to use it is not straightforward from there. However, go over to the unit tests at FutureTest.cpp,
and you will come away knowing how to use Futures for your use case in a
matter of minutes. Each minute that a folly developer spent writing
these unit tests has saved thousands of developers countless hours.
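folly is a C++ library, but the idea carries over to any language. As a small sketch, here are two tests written in plain Java (no test framework, to keep it self-contained) against the standard CompletableFuture class. The class and method names are made up for illustration; each test reads like a short usage recipe.

```java
import java.util.concurrent.CompletableFuture;

final class CompletableFutureHowTo {
    // How to transform a result once it becomes available.
    static int thenApplyTransformsTheValue() {
        CompletableFuture<Integer> future =
            CompletableFuture.supplyAsync(() -> 20).thenApply(x -> x + 1);
        return future.join();
    }

    // How to fall back to a default value when the computation fails.
    static String exceptionallyProvidesAFallback() {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("boom"));
        return failed.exceptionally(e -> "fallback").join();
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        check(thenApplyTransformsTheValue() == 21, "thenApply should add one");
        check("fallback".equals(exceptionallyProvidesAFallback()),
              "exceptionally should supply the fallback");
        System.out.println("all how-to tests passed");
    }
}
```

Someone who has never used CompletableFuture can read the two test bodies and copy the pattern they need, and because the tests run on every build, the examples cannot silently go stale.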
But wait, there's more!
There are many other things that work better when you have unit tests.
I had a really good time visiting Mira's family in Sevlievo, Bulgaria. I am intrigued by the culture and sensibilities in small town Bulgaria.
We started with visiting the old St. Prophet Eliah Church. The last place I was allowed to take photos was the entrance. It was pretty quaint on the inside, but we were not allowed to take any pictures.
St Prophet Eliah Church
Church entrance
There are a few more churches around the corner, and none are like the churches I am used to seeing in India and in the US.
Another orthodox church
Yet another orthodox church
Interestingly, the insignia on one of the church doors, often associated with the Byzantine empire, looked a lot like Gandaberunda, the official emblem of the state of Karnataka.
Notice the double headed eagle on the church door
Karnataka state insignia
As I read up more about it, I discovered that two headed birds are ubiquitous in state insignias and mythologies.
As it turns out, there is a mosque in town as well, but only Muslims are allowed in. So this is pretty much all I got to see.
The only mosque in town, and we are not allowed in.
Next, we head to the city center. The old town. Not the most happening place, but nice nevertheless. Mostly clean streets with plenty of space to relax and lounge. Here is a sample of small-town Bulgarian main streets.
The main street continues on with businesses on either side
Start of a main street in town center largely for pedestrians
The other side of the same main street.
Notice the wide sidewalk and the benches to sit on.
The city center also has an impressive statue on a pillar. Apparently, the pillar it stands on is an original Roman pillar from the Nicopolis ad Istrum, and the statue was made in Vienna.
The pillar is from Nicopolis ad Istrum
Close up of the statue from Vienna
In true European fashion, Sevlievo is also home to some old European houses that once belonged to the rich and are now run down, though still maintained. I remember seeing similar houses in Newcastle upon Tyne, Madrid, and other old cities in Europe.
Medieval aristocrat's dwelling? At least, it looks like it.
Looks like an old time house whose owner was a rich and influential person.
Moving on, the city also has a "palace of culture", which ironically looks rather drab.
Palace of culture
Speaking of culture, the town also has plenty of monuments and installations from the communist days. I was surprised by the monotony of these installations. Here is a sample to give you an idea of what I mean.
In fact, it appears that everything was pretty monotonous during the communist days. Sevlievo retains some communist-era buildings that served as stores and supermarkets. Here are a couple of them.
For contrast, compare them to the new bookstore that is right down the street.
The bookstore looks a lot more like in the west.
Among all the things uniquely Bulgarian, the most intriguing is the death notice. We are all familiar with the obituary section in the newspapers. In Bulgaria, there are such obituary notices all over town to announce a death, or a death anniversary. For instance, here is a tree that is used as an obituary notice board.
A tree trunk used as an obituary notice board.
People post similar notices on the doors of their homes as well, and at times they are accompanied by a black ribbon tie.