Brittle Tests or Brittle Code?

We spend a lot of time fixing up integration and end-to-end tests when we change the behaviour of our API and UX. That process can be quite frustrating, especially when asynchronous interactions and caching are involved throughout all the different layers of the architecture, and it requires a lot of discipline and tenacity to keep moving forward. Sometimes, I definitely dream about giving up on fixing tests (especially the time-dependent ones) and moving on with a red light - shame on me.

I am in another one of those chases right now, and there seems to be no end to the red lights that go off in various places, sometimes frustratingly intermittently. You fix a bunch, then another bunch go red somewhere else. I also very much dislike the idea of adding time-based compensations (like waits) to tests just to make them synchronize correctly. That's a smell to me.

So, I was venting the other day: "Why are my UI tests so 'brittle' that they fail at the slightest little change somewhere deep in the API?" In some cases, those changes are miles away from the area of the product we are changing. This is so frustrating sometimes.

Yes, of course I've been tempted, at some of the more frustrating moments, just to disable or eliminate the newly failing tests to get my green lights and confidence back. And then I remind myself why we have them in the first place. And I know I don't want to go to that hell again.

 

So, should tests be brittle?

And if not, what is the practical alternative?

At this point, I'm not sure it's a valid question. Not all tests are created equal. And nowadays I am suspicious of absolute rules that don't pass the "it depends" test.

For example, in our current product we make a firm distinction between unit tests, integration tests and end-to-end tests (e2e tests, strictly speaking, are integration tests that start either at the UI or at the front end of an application - whatever that might be). Since we live in a world of full-stack web apps backed by APIs, it's a convenient distinction for us right now.

Unit tests: By design, unit tests ARE brittle, and absolutely should be! Many people don't know this, so it's worth saying out loud. There is (in practice) a 'finite' (i.e. bounded) number of unit tests you can, and should, write, and they should be highly cohesive with the code you are creating that satisfies those tests. This is especially true if you are designing your code test-first. That's because every line of code exists because you wrote a test to require it. Which usually, but not always, means that there is [at least] one test for every code branch in your CUT (code under test).

You absolutely want a unit test to fail if you change even just one line of that code. So being brittle is an essential quality of a unit test for maintaining a good design and refactoring code. If you can change a line of that code and it does not break a test, you should take immediate pause and perhaps remedy that (if you can).
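To make that concrete, here is a minimal sketch in Python. The `shipping_cost` function and its pricing rules are made up purely for illustration (they are not from our product); the point is only that each branch has a test pinning it down.

```python
def shipping_cost(order_total: float, is_member: bool) -> float:
    """Hypothetical code under test with three branches."""
    if order_total < 0:
        raise ValueError("order_total must not be negative")
    if is_member or order_total >= 50:
        return 0.0
    return 4.99


# One test (at least) per branch of the code under test.
def test_negative_total_is_rejected():
    try:
        shipping_cost(-1, is_member=False)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative total")


def test_members_ship_free():
    assert shipping_cost(10, is_member=True) == 0.0


def test_order_at_threshold_ships_free():
    assert shipping_cost(50, is_member=False) == 0.0


def test_small_non_member_order_pays_flat_fee():
    assert shipping_cost(49.99, is_member=False) == 4.99
```

If someone changes `>= 50` to `> 50`, `test_order_at_threshold_ships_free` goes red straight away. That is exactly the brittleness I am after.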

Integration tests: unlike unit tests, you can write [if you try hard enough] an infinite number of integration tests to test your system. There are so many permutations [close to infinite for all practical purposes] of context and variability that you could test at this level. The number of these kinds of tests is unbounded, and because of that, it's hard to learn when enough is enough and stop writing more (the returns diminish). At the same time, you absolutely should write some integration tests, and in practice the number you should write is usually small for each feature (i.e. per API). I like to teach developers to write an integration test for every common case that they know the code (end-to-end) has been designed for (and at least one for any bug they subsequently uncover, and one for any issue they think is a potential problem in that code path). You also ought to know what the minimal set of integration tests is going to be before you design the code. If you can write these tests first, and they run fast enough, they can be a powerful guide to let you know when you are done writing the feature.
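As a rough illustration of that "one per common case, plus one per bug" rule, here is a small Python sketch. Everything in it is hypothetical: `InMemoryStore` stands in for a real database and `CachedProfiles` for a service layer with a cache in front of it. The second test is the kind you add after uncovering a stale-cache bug.

```python
class InMemoryStore:
    """Hypothetical stand-in for a real persistence layer."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class CachedProfiles:
    """Hypothetical service layer: reads go through a cache in front of the store."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, user_id):
        if user_id not in self._cache:
            self._cache[user_id] = self._store.load(user_id)
        return self._cache[user_id]

    def rename(self, user_id, name):
        self._store.save(user_id, name)
        # Invalidate the cached entry so the next read comes from the store.
        self._cache.pop(user_id, None)


# Common case the feature was designed for: a saved profile can be read back.
def test_saved_profile_can_be_read_through_the_service():
    store = InMemoryStore()
    store.save("u1", "Ada")
    assert CachedProfiles(store).get("u1") == "Ada"


# Regression test for a (hypothetical) bug: a rename after a cached read
# must not return the stale name.
def test_rename_is_visible_after_a_cached_read():
    profiles = CachedProfiles(InMemoryStore())
    profiles.rename("u1", "Ada")
    assert profiles.get("u1") == "Ada"  # this read warms the cache
    profiles.rename("u1", "Grace")
    assert profiles.get("u1") == "Grace"
```

Remove the `self._cache.pop(...)` line and the second test fails, even though the change is nowhere near the code that reads profiles. In a real system these tests would run against the real layers rather than in-memory fakes; the fakes are only here to keep the sketch self-contained.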

Unit tests and integration tests are brittle by design for good reason and for good design!

I've noticed that lots of coders/developers disagree with this idea. Some even warn others off writing large numbers of integration tests. But my experience has taught me that there "be many dragons" when you assume your system works as you designed it without many of these tests to verify it. The fact that integration tests fail when you change something in your API/system, especially miles away from the affected code, is solid testament to that. So, it seems that having brittle integration tests is a huge benefit too.

Incidentally, I also teach developers that integration and unit tests tell you your code does what you thought you needed it to do. End-to-end tests tell you what the system actually needs to do.

[Let's not even get into UI (e2e) tests needing to be brittle. That is somewhat unavoidable even with great tooling these days. If you change the UX, why wouldn't you want the existing tests to fail to let you know that? It's so important.]

Anyone not for Brittle Tests?

So, you don't like the sound of brittle tests? Then what is the alternative?

What if you don't want your tests to be brittle, because that sounds bad? I know most developers think that they should avoid anything that is 'brittle' because they heard it was bad for various reasons, and so having brittle tests around sounds wrong too, but it is just not true. Brittle tests good, brittle code bad!

I am going to assert that if your tests are not brittle they are for the most part worthless for regression and refactoring. That 'brittality' is a vital quality of effective tests at any level.
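Here is a tiny Python sketch of what I mean. All of the names are invented for illustration: `apply_discount` is the code as designed, `regressed_apply_discount` is the same code after a careless edit, and the two checks stand in for a loose test and a brittle one.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function: the price after a percentage discount, in cents."""
    return price_cents - (price_cents * percent) // 100


def regressed_apply_discount(price_cents: int, percent: int) -> int:
    """The same function after a careless edit: it now returns the discount itself."""
    return (price_cents * percent) // 100


def loose_check(fn) -> bool:
    """Non-brittle check: only confirms that some integer comes back."""
    return isinstance(fn(1000, 10), int)


def brittle_check(fn) -> bool:
    """Brittle check: pins exact outputs, so any change in behaviour is noticed."""
    return fn(1000, 10) == 900 and fn(999, 10) == 900 and fn(1000, 0) == 1000


print("loose check on regressed code:  ", loose_check(regressed_apply_discount))
print("brittle check on regressed code:", brittle_check(regressed_apply_discount))
```

The loose check stays green on the regressed code, so it tells you nothing. The brittle check goes red, which is the whole point of having it.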

If you are a practitioner and you would rather not maintain brittle tests, then you probably don't have brittle tests. Which means you likely have few effective tests, and in all likelihood [I strongly suspect] few to no tests at all.

Perhaps the tests you do have are just 'smoke tests'. That's good, you should have some of those, but if you claim that you have avoided brittleness and now only have smoke tests as your regression mechanism, then you have just avoided the effort and frustration of maintaining rigorous brittle tests to achieve a state of un-brittleness.

Is that such a good thing?

Might be, if your goal is to code up a spike or throwaway project, get done with it as quickly as possible, and never have to maintain it over time. A pet peeve of mine is when that prototype becomes the basis for a real product, which is very common in the project-based solution development that so many coders are involved in. Projects end, people move on. No durability, no robustness, no longevity, but great experience, learned heaps thanks, projects are good for that! Not great for delivering robust products though.

If you avoid brittle tests because you think that brittle is bad (in general), then presumably you don't have brittle tests that are maintained, and so you don't have an early warning system in place to detect regression problems. And if you don't have that, you definitely have areas of your code that don't work the way you expect them to, and you spend most of your time in a debugger (oh boy!).

Perhaps you are cool with that for now, and are happy to wait to hear from your users when your app or service is used in anger in production and breaks for them?

[BTW: They are not going to tell you when that happens. You know this, so I assume you accept the frustration and enjoyment of debugging production code while your customers quietly slip away through your fingers?]

In a way, it could be said that you have designed your code to fail, and fail it will, because you are a good designer, and it will do that when you least expect it of course. Hasn't happened to you yet, right?

I guess most developers who believe this is a good way to build robust software products also see themselves as smart, conscientious, cost-efficient, under-rated, and under-paid, and in all likelihood they are working alone, not having to worry about others dealing with their creation long term.

The absence of brittle tests leads to brittle code.

So I am going to assert that without brittle tests, you are more than likely going to end up with brittle code instead. Especially if you are embarking on a sizable product with a long life.

It is well known now that code without tests is usually poorly designed, usually unstable, and often too brittle to refactor, and 'bit rot' sets in very quickly. That's really the scary stuff that drives developers away from maintaining or investing in their code bases long term. So it is no wonder that creating and maintaining brittle tests for such code seems to be in vain.

Get the brittle tests going before the code gets brittle!