If your test case is causing more harm than good, is it really useful? In the days of legacy software delivery, with long lead times and great difficulty changing the product once it shipped, almost all test cases (automated or not) were good test cases.
In the era of continuous delivery, however, that calculation has changed. It’s remarkably easy to end up with a test that inadvertently makes your software less stable, whether by creating false trust in the code, eroding trust in the tests themselves, or taking so long that the tests don’t run frequently enough.
Whether you are performing automated or manual testing, any software check you use to validate your assumptions about the code must follow three key principles to remain fully compatible with a continuous integration and delivery system.
Principle 1: Test cases must be reliable
Any test case that you run with any frequency must be reliable; that is, it cannot be flaky. Consider automated verification: in a continuous integration environment, a single test case may run dozens or hundreds of times per day for a single team.
If a test is only 99% reliable (one false report per 100 runs) and you run it 200 times a day, your team will investigate false failures at least twice a day. Multiply that across a unit test suite that can contain tens of thousands of test cases, and the math becomes clear.
Any test case that is not at least 99.9% reliable should be deleted until it can be brought back above that confidence level.
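The reliability arithmetic above is easy to check directly. A minimal Python sketch using the figures from the text (99% and 99.9% reliability, 200 runs per day); the function name is illustrative:

```python
def expected_false_failures(reliability: float, runs_per_day: int) -> float:
    """Expected number of spurious failures per day for one test case."""
    return (1 - reliability) * runs_per_day

# A 99%-reliable test run 200 times a day produces about 2 false failures daily.
daily_noise = expected_false_failures(0.99, 200)

# At the 99.9% bar, the same cadence yields only about 0.2 per day.
daily_noise_at_bar = expected_false_failures(0.999, 200)
```

Multiplied across an entire suite, the difference between those two figures is the difference between a trusted signal and constant noise.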
But what does reliability look like? A test case must take every precaution to avoid both false negatives and false positives. It must be reproducible without external human intervention, and it must clean up after itself.
In a fully automated system, no human has time to, for example, drop tables from the SQL database after every few tests. Even a manually executed test case has to clean up after itself, because an ever-changing starting state is an unmanageable mental load for the test runner.
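One common way to guarantee cleanup is to wrap each test’s resources in a construct that tears them down even when the test fails. A minimal sketch using Python’s standard sqlite3 module and a context manager; the table and schema are purely illustrative:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def fresh_users_table():
    """Give each test its own table and guarantee cleanup, pass or fail."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    try:
        yield conn
    finally:
        # Runs even if the test body raises, so the next test
        # always starts from a known-empty state.
        conn.execute("DROP TABLE users")
        conn.close()

with fresh_users_table() as db:
    db.execute("INSERT INTO users (name) VALUES ('alice')")
    count = db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Test frameworks offer the same pattern natively (for example, pytest fixtures with `yield`); the point is that teardown is automatic, not a manual chore.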
Why is reliability so important? When developers regularly have to waste time looking for false positives or negatives, they quickly lose faith in the automation solution and are likely to ignore real failures alongside false ones.
Principle 2: Test cases must deliver results fast
In a continuous integration system, the most valuable resource you spend is engineers’ time. Engineers have learned to expect results quickly, and they are unwilling to wait for something they perceive to be a waste of time. So make sure you deliver relevant results as quickly as possible.
For example, there’s no point running unit tests on code that doesn’t compile. And there’s no point running an API-level integration test suite if the unit tests on an underlying package don’t pass. The code under test is guaranteed to change, so why waste time on a test run whose results are guaranteed to be thrown away?
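That gating logic can be sketched as a pipeline that stops at the first failing stage. A minimal sketch in Python; the stage names and the lambda checks are hypothetical stand-ins for a real build system:

```python
def run_pipeline(stages):
    """Run (name, check) stages in order, stopping at the first failure.

    There is no point running integration tests against code whose
    unit tests already failed: that code must change regardless.
    """
    executed = []
    for name, check in stages:
        executed.append(name)
        if not check():
            return executed, False
    return executed, True

# Hypothetical stage checks standing in for a real build system:
outcome = run_pipeline([
    ("compile", lambda: True),
    ("unit", lambda: False),        # unit tests fail here...
    ("integration", lambda: True),  # ...so this stage never runs
])
```

Real CI systems express the same idea declaratively through stage dependencies; the sketch just makes the ordering explicit.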
Figure 1. In this modified testing pyramid, unit testing forms the foundation of your testing strategy, integration testing validates across boundaries, and specialty testing at the top captures any slow or complex testing. Source: Melissa Benua.
Always run your most important test cases as quickly as possible, and always run your fastest tests first. These are almost always your unit tests: a typical unit test runs in microseconds and can usually be run in parallel. In my continuous integration systems, I can typically process tens of thousands of unit tests in about 90 seconds.
An integration test is a test that crosses a boundary, typically at least one HTTP boundary or other machine-to-machine boundary. By definition, these test cases run in milliseconds and are many times slower than unit tests.
Finally, a specialty test is anything that is significantly slower than an integration test (such as an end-to-end automated UI test) or that requires human intervention or interpretation, which slows down the overall reporting of results.
While tests slower than a unit test certainly have value and absolutely have a place in a continuous integration system, that place is after running the fastest and most reliable tests.
Principle 3: Test cases must be atomic
Individually, a good test case should do as little as possible to produce a pass/fail result as quickly as possible. If you had infinite time to run tests, overlapping coverage and redundant tests wouldn’t matter much.
But if you have a budget of only five minutes for a full integration run, for example, and each integration test case takes 10 milliseconds, then you only have time to run 30,000 test cases. And if you run UI-based tests that take closer to a second each, you only have time for 300.
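The budget arithmetic generalizes to any suite. A small sketch, using integer milliseconds to keep the division exact:

```python
def max_test_cases(budget_minutes: int, ms_per_test: int) -> int:
    """How many sequential test cases fit in a fixed CI time budget."""
    budget_ms = budget_minutes * 60 * 1000
    return budget_ms // ms_per_test

integration_capacity = max_test_cases(5, 10)    # 10 ms per integration test
ui_capacity = max_test_cases(5, 1000)           # ~1 s per UI-based test
```

Parallelism raises the ceiling but never removes it, so the budgeting question remains the same.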
Once you recognize the reality of an upper limit on the number of test cases you can run, you can figure out exactly how to spend those resources.
Figure 2. Always name test cases clearly and descriptively. As you can see above, this makes troubleshooting easier. Source: Melissa Benua.
Each test case should be a clear answer to a clear question, and together they should add up to a test suite that gives a clear answer about the full set of features tested. Clearly named, atomic test cases make it easy to locate the likely cause of a failed test and to understand at a glance what has and has not been tested.
When test cases are clear and atomic, it becomes easy to find overlapping coverage, and therefore candidates for deletion.
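As a sketch of what clear, atomic naming buys you, compare a vaguely named test with descriptively named ones. `price_after_discount` is a hypothetical function, stubbed here only so the example runs:

```python
# Hypothetical function under test, stubbed so the example is self-contained.
def price_after_discount(price: float, discount: float) -> float:
    return price * (1 - discount)

# Vague: a report saying "test_discount FAILED" localizes nothing.
def test_discount():
    assert price_after_discount(100.0, 0.25) == 75.0

# Clear and atomic: the failure report alone says what broke
# and what behavior was expected.
def test_25_percent_discount_reduces_100_to_75():
    assert price_after_discount(100.0, 0.25) == 75.0

def test_zero_discount_leaves_price_unchanged():
    assert price_after_discount(100.0, 0.0) == 100.0
```

Each test asks exactly one question, so a red mark in the report maps directly to one behavior of one feature.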
Now put these principles into practice
Test cases with many complicated steps and validations are prone to flakiness (violating Principle 1) and long run times (violating Principle 2). Consider the following test case:
Make sure Safe Search = Strict works
- Create a new user
- Log in as this user through the UI
- Go to the user settings page
- Change the Safe Search setting to Strict
- Go to the search page
- Search for adult content
- Verify that no results are returned
By walking through everything from start to finish, you inadvertently test many features just to answer the question the test case actually asks. Rather than going through the entire experience exactly as an end user would, this test case would be better served by breaking it into several cases across several suites:
- User creation suite (A)
- User settings suite (B)
- Search suite (C); our test case goes here
Most likely, suites A, B, and C will each have some combination of unit tests (on the order of 100) and integration tests (on the order of 10), depending on their particular system architecture constraints. Although C may require a logged-in user, its purpose is not to verify that you can create a user or update that user’s settings.
Creating users and modifying settings are ancillary features that will likely be invoked while setting up the test suite; if they fail, do not try to test further features. Given this known order of precedence, you also want to make sure the test suites run in the order A, B, C, because you know C depends on the functionality of A and B. There is no point trying to run C if A or B is known not to work.
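Putting this together, suite C’s test case can assume the user already exists (suites A and B proved creation and settings work) and ask only its own question. A minimal sketch; `SearchApi` and its methods are hypothetical stand-ins for the real product:

```python
# Hypothetical API client standing in for the real product. The user is
# provisioned by suite-level setup, not re-exercised step-by-step in the UI.
class SearchApi:
    def __init__(self):
        self._safe_search = {}

    def set_safe_search(self, user_id: int, level: str) -> None:
        self._safe_search[user_id] = level

    def search(self, user_id: int, query: str) -> list:
        is_adult = query == "adult content"  # toy content rating for the sketch
        if self._safe_search.get(user_id) == "strict" and is_adult:
            return []
        return [query]

def test_strict_safe_search_returns_no_adult_results():
    api = SearchApi()
    user_id = 42  # supplied by suite setup in a real system
    api.set_safe_search(user_id, "strict")
    assert api.search(user_id, "adult content") == []
```

Because the test exercises only the search behavior, a failure here points at Safe Search itself rather than at login, navigation, or user creation.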
Now apply the principles to improve stability
If, when you switched to continuous delivery, your automated or manual testing made your software less stable, the above steps are for you. Follow these three key principles and your testing will always be compatible with your organization’s continuous delivery efforts.
Melissa Benua will present at the STAREAST Online virtual conference, taking place May 4-7, 2020. Her tutorials include continuous testing using containers and advanced test design for CI/CD, and she will also give a presentation on fuzz testing.