
Tests should be isolated from each other; not coupled

Almost by definition, a unit test should be isolated from its (external, shared) dependencies. But, equally importantly, unit tests should also be isolated from each other. When one test starts to affect another, the two tests are said to be coupled. Equivalently, if changes to one test can negatively impact the correctness of another test, then the two tests are coupled.

Coupled tests are problematic in two ways.

  1. Tests become less readable. Reading the code of a single unit test does not necessarily communicate what the test does. We also need to understand the ‘coupling’ between that test and other tests to grok what the test does. This coupling can be subtle and hard to follow.
  2. Tests become less accurate. When one test affects another, it becomes difficult to make changes to a single test in isolation. For instance, if a diff changes some production and test code, and a test then fails, it is not always clear why. The failure could be due to a bug, or an artifact of the coupled tests. Thus, your tests are no longer trustworthy, and therefore, less accurate.

Coupling can happen in many ways. The obvious ones include (1) using the same shared dependency (like when you use the same temp file name in all tests), and (2) relying on the post-condition of one test as a precondition of another test. Such cases are also easy to detect, and to fix (case (1) is sketched in code below). There are two other ways in which tests can be coupled that are subtler, and more prevalent:

  1. Precondition setting in test fixtures
  2. Parameterized tests for heterogeneous tests

The rest of this note is focused on the above two anti-patterns of test coupling.
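
Before turning to those, here is what the first of the obvious cases, coupling through a shared dependency, looks like in code. This is a minimal, hypothetical sketch (the test names and file path are made up): both tests touch the same temp file, so the outcome of one depends on whether the other has run.

import os
import tempfile
import unittest

# Both tests share this path, so they are coupled through the file system.
SHARED_PATH = os.path.join(tempfile.gettempdir(), "sut_scratch.txt")

class CoupledThroughTempFile(unittest.TestCase):
    def test_writes_record(self) -> None:
        with open(SHARED_PATH, "w") as f:
            f.write("record-1")
        self.assertTrue(os.path.exists(SHARED_PATH))

    def test_store_is_initially_empty(self) -> None:
        # Passes or fails depending on whether test_writes_record ran first
        # (or whether a previous run left the file behind).
        self.assertFalse(os.path.exists(SHARED_PATH))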

Coupling through test fixtures

Say your SUT has a dependency called Helper, and initially, both tests in your unit test suite for the SUT need the Helper stub initialized with contents valueA and valueB. Since both tests share the same initial state, you include the initialization code in the setUp method of the test case.

class SUTTestCase(unittest.TestCase):
    def setUp(self):
        self.helper = StubHelper()
        self.helper.add_contents([valueA, valueB])
        self.sut = SUT(self.helper)
        
    def test_behavior1(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB]
    
    def test_behavior2(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB]

Next, you modify SUT to add features to it. In order to test those features, the Helper stub needs to include controllerA. But controllerA is useful only in the new tests being added. However, looking at the unit tests you already have, it is easiest to simply add controllerA to self.helper. So, your unit tests look as follows:

class SUTTestCase(unittest.TestCase):
    def setUp(self):
        self.helper = StubHelper()
        self.helper.add_contents([valueA, valueB])
        self.helper.add_controller(controllerA)
        self.sut = SUT(self.helper)
        
    def test_behavior1(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB]
             # But this test assumes nothing about self.helper's controller

    def test_behavior2(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB]
             # But this test assumes nothing about self.helper's controller

    def test_behavior3(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB], and controller=controllerA

    def test_behavior4(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB], and controller=controllerA

Then you discover a gap in testing that requires the initial state of the Helper stub to have just the content valueA and include controllerA. Now, when adding this new unit test to the suite, the simplest way to do this would be to remove valueB from self.helper at the start of the new test. So, now, your test suite looks as follows:

class SUTTestCase(unittest.TestCase):
    def setUp(self):
        self.helper = StubHelper()
        self.helper.add_contents([valueA, valueB])
        self.helper.add_controller(controllerA)
        self.sut = SUT(self.helper)
        
    def test_behavior1(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB]
             # But this test assumes nothing about self.helper's controller

    def test_behavior2(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB]
             # But this test assumes nothing about self.helper's controller

    def test_behavior3(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB], and controller=controllerA

    def test_behavior4(self) -> None:
        ...  # Assumes self.helper set with contents=[valueA, valueB], and controller=controllerA

    def test_behavior5(self) -> None:
        # Assumes self.helper set with contents=[valueA, valueB] (because of other tests' setup)
        self.helper.remove_content(valueB)
        # Now assumes self.helper set with contents=[valueA]
        ...  

Let's pause here and inspect the state of the test suite. The tests are coupled. Why? Because modifying one test ends up affecting other tests. In the example above, if we replace self.helper.add_contents([valueA, valueB]) with self.helper.add_contents([valueA]) because test_behavior1 and test_behavior2 no longer need valueB, it will result in a failure in test_behavior5, because self.helper.remove_content(valueB) will end up throwing an error!

Furthermore, for anyone reading these tests, it is not entirely clear that test_behavior1 and test_behavior2 need self.helper to be initialized with contents [valueA, valueB], but have no need for controllerA in self.helper. The preconditions for test_behavior1 and test_behavior2 are coupled with the preconditions for the other tests.

It also results in test incompleteness: if we introduce a bug that causes behavior1 to fail when the helper has no controller set, we might not catch that bug, because the test for behavior1 always runs with self.helper.add_controller(controllerA) applied.

How to decouple such tests?

Use the setUp method for basic plumbing only, and do not enforce any preconditions in it. Instead, make setting preconditions part of the arrange step of each unit test. You can encapsulate the precondition setting in a helper method (with the right parameters) so that the arrange section does not get too bloated, and yet the test code stays readable. Consider the following refactoring of the tests:

class SUTTestCase(unittest.TestCase):
    def setUp(self):
        # setUp only declares the dependencies; it enforces no preconditions.
        self.helper: Optional[StubHelper] = None
        self.sut: Optional[SUT] = None

    def prepare_helper(self, contents: List[Value], controller: Optional[Controller] = None) -> None:
        self.helper = StubHelper()
        self.helper.add_contents(contents)
        if controller:
            self.helper.add_controller(controller)
        # Construct the SUT only after its preconditions are fully specified.
        self.sut = SUT(self.helper)
        
    def test_behavior1(self) -> None:
        # Arrange: set exactly the preconditions this test needs.
        self.prepare_helper(contents=[valueA, valueB])
        ...

    def test_behavior2(self) -> None:
        # Arrange: set exactly the preconditions this test needs.
        self.prepare_helper(contents=[valueA, valueB])
        ...

    def test_behavior3(self) -> None:
        # Arrange: set exactly the preconditions this test needs.
        self.prepare_helper(contents=[valueA, valueB], controller=controllerA)
        ...

    def test_behavior4(self) -> None:
        # Arrange: set exactly the preconditions this test needs.
        self.prepare_helper(contents=[valueA, valueB], controller=controllerA)
        ...

    def test_behavior5(self) -> None:
        # Arrange: set exactly the preconditions this test needs.
        self.prepare_helper(contents=[valueA], controller=controllerA)
        ...
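
With this refactoring, the SUT is constructed only after the helper's state is fully specified, and each test's arrange step spells out exactly the preconditions it relies on. Changing one test's preconditions can no longer silently break another test, and a reader no longer needs to consult setUp to know what a test assumes.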

Coupling in parameterized tests

Parameterized tests are a collection of tests that run the same verification, but with different inputs. While this is a very useful feature (available in almost all unit test frameworks), it is also very easy to abuse. A common way I have seen it abused is in conjunction with DRYing and the use of ‘if’ checks, which often results in coupling all the tests denoted by the parameterized list. Consider the following illustration:

class TestOutput(typing.NamedTuple):
    status: StatusEnum
    return_value: typing.Optional[int]
    exception: typing.Optional[Exception]
    ...

class TestSequence(unittest.TestCase):
  
    @parameterized.expand([
        [test_input1, expected_output1],
        [test_input2, expected_output2],
        ...
    ])
    def test_something(self, test_input: str, expected_output: TestOutput) -> None:
        self._run_test(test_input, expected_output)
    
    def _run_test(self, test_input: str, expected_output: TestOutput) -> None:
        sut = SUT(...)
        prepare_sut_for_tests(sut, test_input)
        output = sut.do_something(test_input)
        test_output = make_test_output(output, sut)
        self.assertEqual(expected_output, test_output)

The above illustration tests the method do_something for various possible inputs. However, note that the outputs (as illustrated by the class TestOutput) can have a status, a return_value, or an exception. This means that every instantiation (for each parameter) has to contend with the possibility of different types of outputs, even though any single test should only have to verify against a single type of output. This couples all the tests verifying do_something, thus making them difficult to read and understand. Adding a new test case here becomes tricky because any change to either prepare_sut_for_tests or make_test_output now affects all the tests!
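
To make that concrete, here is a hypothetical sketch of where this design tends to lead, reusing the made-up names from the illustration above: the shared runner sprouts ‘if’ checks and a try/except so that every kind of expectation can squeeze through it. (StatusEnum.ERROR is assumed here for illustration.)

    def _run_test(self, test_input: str, expected_output: TestOutput) -> None:
        # Hypothetical sketch: one runner contending with every output shape.
        sut = SUT(...)
        prepare_sut_for_tests(sut, test_input)
        try:
            output = sut.do_something(test_input)
        except Exception as e:
            # Negative tests pass through as a captured exception.
            test_output = TestOutput(status=StatusEnum.ERROR, return_value=None, exception=e)
        else:
            if output.status == StatusEnum.SUCCESS:
                test_output = TestOutput(status=output.status, return_value=output.value, exception=None)
            else:
                test_output = TestOutput(status=output.status, return_value=None, exception=None)
        # Every test, positive or negative, funnels into one equality check.
        self.assertEqual(expected_output, test_output)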

How to decouple parameterized tests?

There are some fairly straightforward ways to decouple such tests. First, we should be deliberate about how we organize these tests. For example, we can group all positive tests separately from all negative tests; similarly, we can further subgroup the tests based on the type of assertions on the output. In the above example, we can have three subgroups: positive tests that verify only the output status, positive tests that verify the return value, and negative tests that verify the exception. Thus you now have one test class with three parameterized test methods that look something like this:

class TestDoSomething(unittest.TestCase):
  
    @parameterized.expand([
        [test_status_input1, expected_status_output1],
        [test_status_input2, expected_status_output2],
        ...
    ])
    def test_something_status_only(
        self, 
        test_input: str, 
        expected_output: StatusEnum
    ) -> None:
        # Arrange
        sut = SUT(...)
        ...  # More 'arrange' code
        
        # Act
        output = sut.do_something(test_input)
        output_status = output.status
        
        # Assert
        self.assertEqual(expected_output, output_status)
        
    @parameterized.expand([
        [test_return_value_input1, expected_return_value_output1],
        [test_return_value_input2, expected_return_value_output2],
        ...
    ])
    def test_something_return_value_only(
        self, 
        test_input: str, 
        expected_output: int
    ) -> None:
        # Arrange
        sut = SUT(...)
        ...  # More 'arrange' code
        
        # Act
        output = sut.do_something(test_input)
        output_status = output.status
        output_value = output.value
        
        # Assert
        self.assertEqual(StatusEnum.SUCCESS, output_status)
        self.assertEqual(expected_output, output_value)

    @parameterized.expand([
        [test_exception_input1, expected_error_code_output1],
        [test_exception_input2, expected_error_code_output2],
        ...
    ])
    def test_something_throws_exception(
        self,
        test_input: str,
        expected_error_code: int
    ) -> None:
        # Arrange
        sut = SUT(...)
        ...  # More 'arrange' code
        
        # Act
        with self.assertRaises(SomeSUTException) as exception_context:
            sut.do_something(test_input)
        exception = exception_context.exception
        
        # Assert
        self.assertEqual(expected_error_code, exception.error_code)
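
Now each parameterized group makes exactly one kind of assertion, so the groups can evolve independently: a new test case slots into the group whose assertion it shares, without touching the other groups' arrange or assert code.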