Test Automation Best Practices

2023-10-07

Here are some tips based on what I've found to be good practices when working on test automation.

Introduction

Test automation is an integral part of any modern software development project. It gives you confidence in software quality and enables the team to move fast when delivering new features, while at the same time making sure there are no regressions in the code.

Getting to the right level of testing in a reliable manner can be challenging, though. Following these eleven practices will make things go more smoothly in your company.

You can read a previous article here for a more generic overview of test automation.

1. Follow the Test Pyramid

Test automation is not just one thing but consists of tests on different levels, for example:

  • Unit tests
  • Integration tests
  • UI component tests
  • End-to-end (E2E) tests / API tests

The test pyramid is a way to categorise tests. Tests at the bottom level, i.e. unit tests, are the fastest and cheapest to write, while UI tests are the slowest and most challenging to write and to keep running successfully.

All of these test levels add value; the key is balancing the different levels correctly. Unit tests don’t prove that a component can be integrated with another one successfully or that a feature works correctly, but they are critical when building a robust application.

Much of today’s front-end development uses JavaScript frameworks, which divide the UI into components that can be tested independently. This brings a new level of testing that wasn’t often required in server-rendered applications. Component tests are a great way to ensure components work correctly in different states, as a component’s state can be passed in instead of relying on external factors. This makes it easy to see and test components in specific states, for example, with valid data or with an error message. While this doesn’t prove that the component will always receive valid data from the outside, these are very useful tests. When components are tested separately, there’s less need for end-to-end tests covering the same ground.
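As an illustration, here is a minimal sketch of such a component test using React Testing Library with Jest. The `UserList` component and its props are hypothetical, and the `toBeInTheDocument` matcher assumes `@testing-library/jest-dom` is set up:

```tsx
// UserList.test.tsx — a minimal component test sketch.
// The component under test receives its state via props, so the error
// state can be rendered directly without any network calls.
import { render, screen } from "@testing-library/react";
import { UserList } from "./UserList"; // hypothetical component

test("shows an error message when loading fails", () => {
  render(<UserList users={[]} error="Could not load users" />);
  expect(screen.getByText("Could not load users")).toBeInTheDocument();
});
```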

Unit tests must be combined with integration tests, which verify the integration with other components and services, and with acceptance/E2E tests, which ensure the whole feature works as defined in the requirements.

Another consideration is whether more things can be tested through the API instead of the UI. For example, an API might return a specific error code (400) for multiple reasons. Still, from the integration point of view, it’s enough to cover this with one test case, assuming the different error messages have been tested in the component tests.

2. Treat Test Automation Code as Any Other Code

Code quality is one of the most critical factors in determining the success of a software development project and, thus, of a test automation project. Test automation code has to be treated like any other code, including linting (checking code style) and code reviews. The code should be DRY, not repeating the same statements even when similar tests are implemented.

Several tools can be used for linting. For TypeScript/JavaScript, the most commonly used is ESLint, while for Ruby, RuboCop is often used. These tools come with ready-made rule sets, and changes can and should be made at the team/organisation level. Since the whole team should contribute to the test code, it should follow the same rules as the rest of the codebase.
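As a sketch, a flat ESLint config along these lines keeps the test files under the same rule set as the application code. It uses the `typescript-eslint` helper package; the file globs and the rule choice are illustrative:

```js
// eslint.config.js — a minimal sketch using typescript-eslint's flat config.
// The point: test files are linted with the same rules as application code.
import tseslint from "typescript-eslint";

export default tseslint.config(...tseslint.configs.recommended, {
  // Application and test code share one configuration; nothing under
  // tests/ gets a separate, looser rule set.
  files: ["src/**/*.ts", "tests/**/*.test.ts"],
  rules: {
    "no-console": "warn", // illustrative rule choice
  },
});
```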

Test automation code should, in most cases, be located in the same repo as the application code. This way, when the site changes, the tests will also have to be updated to get the build to pass and to be able to merge to the main/develop branch.

3. Use the Right Tools

In the same way as there are many different areas in a test strategy, multiple tools can be used to accomplish the required test coverage. There are unit tests, integration tests and acceptance tests. There are tests to verify that the functionality works, and there are tests to cover non-functional requirements. There are performance tests and load tests.

Each category of tests will likely use different tools and possibly even a different language. Unit test tool selection comes from the development team and is often tied to the tools and frameworks used to implement the project. One might use Jest for unit tests and a framework intended more for functional tests, such as Playwright, Cypress, WebdriverIO, or Selenium. Playwright and Selenium have bindings in multiple languages, while Cypress and WebdriverIO are JavaScript/TypeScript only.

Several tools can be used for API testing. I usually lean towards using a unit test framework and an HTTP library for the language to make API requests and parse responses, for example, Jest and Axios. There are also UI tools, such as Postman, where a collection of requests can be created, which can later be run on the CI as an API regression test pack.
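For example, a minimal Jest + Axios API test might look like the following sketch; the endpoint and the response shape are hypothetical:

```ts
// A minimal API test sketch with Jest and Axios.
import axios from "axios";

test("login with an incorrect password returns 400", async () => {
  const response = await axios.post(
    "https://api.example.com/login",
    { username: "test-user", password: "wrong-password" },
    // Axios throws on non-2xx responses by default; accept all statuses
    // so the test can assert on the status code directly.
    { validateStatus: () => true }
  );

  expect(response.status).toBe(400);
  expect(response.data.error).toBe("Invalid credentials");
});
```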

Gatling and JMeter are the most commonly used performance/load testing tools. One shouldn’t be afraid to select a different tool when it suits the problem best.

4. Use Page Objects or Screenplay or a Similar Pattern

Page objects model the pages and components under test by defining, in a single place, the elements required in the tests and the selectors to find them. When a CSS class affecting an element changes, the code only needs to be updated in one place instead of in every place the element is accessed. As a side note, using CSS classes as selectors is prone to breakage and nearly impossible with JavaScript single-page applications, where class names are often generated. A better approach is to use a data-test-id or a similar attribute dedicated to testing purposes. Locating elements by their visible text is another popular approach, but it comes with the risk of copy changes breaking tests and poses extra challenges on multilingual websites.

One of the points mentioned earlier was that the code shouldn’t repeat itself. The page object and the Screenplay pattern are two patterns that help in this regard. Page objects define how to find elements and sometimes have functions for taking specific actions on a page. The Screenplay pattern is based on defining user actions that can be used in the tests when needed, thus reducing repetition.
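Here is a minimal page object sketch using Playwright; the test IDs and the page structure are hypothetical:

```ts
// A minimal Playwright page object sketch.
import { Page, Locator } from "@playwright/test";

export class LoginPage {
  readonly usernameInput: Locator;
  readonly passwordInput: Locator;
  readonly submitButton: Locator;
  readonly errorMessage: Locator;

  constructor(private readonly page: Page) {
    // Elements are located once, in one place, via test-dedicated attributes.
    this.usernameInput = page.getByTestId("login-username");
    this.passwordInput = page.getByTestId("login-password");
    this.submitButton = page.getByTestId("login-submit");
    this.errorMessage = page.getByTestId("login-error");
  }

  // A higher-level action that the tests can reuse.
  async logIn(username: string, password: string): Promise<void> {
    await this.usernameInput.fill(username);
    await this.passwordInput.fill(password);
    await this.submitButton.click();
  }
}
```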

Page objects and similar abstractions for everyday actions on the site reduce the maintenance burden when the page markup or a CSS selector changes, and they improve the test suite’s stability and performance. These improvements are not automatic but a result of implementing a specific action in one place, allowing time to make the implementation as good as possible. When an action’s code lives in one place, it’s easier to ensure the tests stay stable. Regarding performance, some code might get called hundreds of times during a test suite, and optimising such actions can bring significant time savings.

It’s possible to use both page object and Screenplay patterns simultaneously. In this case, the page objects only define where to find the elements, and the Screenplay pattern defines the actions.

Regarding API testing, similar patterns can make the client code easier to manage, so that each test doesn’t build requests from scratch.
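A minimal sketch of that idea: a small client class wraps the request details in one place. The base URL and endpoint are hypothetical:

```ts
// A small API client sketch for tests, built on Axios.
import axios, { AxiosResponse } from "axios";

export class UsersApiClient {
  constructor(private readonly baseUrl: string) {}

  // Headers, paths and status handling are defined in one place.
  async createUser(user: { name: string; email: string }): Promise<AxiosResponse> {
    return axios.post(`${this.baseUrl}/users`, user, {
      headers: { "Content-Type": "application/json" },
      validateStatus: () => true,
    });
  }
}
```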

To summarise, the reason for using these patterns is to make the maintenance of the tests more manageable. Another significant benefit is reducing friction in writing new tests when the supporting code is already in place.

5. Run Tests on CI on Every Commit

The code is only as good as the last build. The tests need to be run on each commit for two reasons:

  1. To find application code problems
  2. To find problems with the test code

The first one is obvious. The whole reason for having the tests is to find problems with the application code. If the tests are not run on each commit, the issues are only discovered later, making fixing them more difficult.

The second reason, test code problems, covers cases where the application changes and works perfectly well from a functionality perspective, but the tests need to be updated to match. This is also why tests should be kept in the same repo as the code; any other arrangement makes it more challenging to keep the tests up to date. Running the tests and discovering that something breaks gives the quick feedback required to keep things working smoothly. It also allows the developer who changed the page/application to fix the tests as part of the pull request.

6. Use Mock Data

Tests running on the CI should not rely on external endpoints or production data. Instead, the data should be controlled, known, and fulfil the requirements of the test. The data should, of course, also reflect real data, and there needs to be a process to ensure it does. When using mock data or mocked backend APIs, we can be sure that if there is a problem with the application, it is due to the application code, not the data.

The following list contains reasons why mock data should be used:

  • To make sure tests are not flaky, as the data is static
  • To make tests run as fast as possible (assuming mocks are faster than real endpoints)
  • To control the data required for the tests

The ability to control the data from the tests makes writing tests much more straightforward. The data can be configured just before running the test, and it will always match the test’s requirements. There is no need to figure out all possible data requirements beforehand; it can be done while writing the tests.
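As a minimal sketch, Playwright’s request interception can serve a controlled payload right inside the test; the route, page URL and payload are hypothetical:

```ts
// Mocking a backend API in a Playwright test.
import { test, expect } from "@playwright/test";

test("shows the users returned by the API", async ({ page }) => {
  // The test defines exactly the data it needs, just before running.
  await page.route("**/api/users", (route) =>
    route.fulfill({
      status: 200,
      contentType: "application/json",
      body: JSON.stringify([{ id: 1, name: "Ada Lovelace" }]),
    })
  );

  await page.goto("https://example.com/users");
  await expect(page.getByText("Ada Lovelace")).toBeVisible();
});
```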

While using mock data fixes several problems, it creates a new one: the mock data and its format can get outdated. Some tests still need to run against real endpoints, but their number can be reduced. The project must define a suitable strategy for keeping the mock data current.

7. Make Sure Tests Run as Fast as Possible

Another critical factor for a successful test suite and a happy development team is how quickly the tests are executed. Some of the critical questions are:

  1. Is a specific test needed?
  2. Is the test on the right level in the test pyramid?
  3. Is this specific test running as fast as it can?
  4. Can the tests be executed in parallel?

The first and second questions are closely related. A test might be necessary but sit on the wrong level of the test pyramid. Testing through the UI is slow and flaky (although getting better with modern tools), and if the same piece of functionality can be tested on a lower level, it should be. For example, testing a UI component thoroughly with unit tests and integration tests can make it unnecessary to test the same component through the UI in an end-to-end test. Only a limited set of E2E tests is then needed to ensure the pieces work as part of the whole system.

For example, suppose a test needs to check that a specific error message is shown when a user tries to log in with an incorrect password. That check can be done on the UI level, but it can also be done on the API level by sending a request with an incorrect password and checking the response. The UI only renders the application’s state and can easily be tested in a unit test or using Storybook, which is quicker than doing the same in an E2E test.

When working on stories to build new features in an agile project, the initial MVP is focused on core functionality. As more features are added, new tests are required as well. At some point, it’s easy to end up with too many tests, making the test suite too slow. One must be vigilant and remove tests that no longer bring value or are covered elsewhere.

For the third question, whether a specific test is running as fast as possible, there are several things to consider. One of them is the test framework; newer frameworks such as Playwright tend to be faster. The second is the test code itself. For example, pause/sleep commands that wait for a fixed number of seconds will always delay the tests by a constant amount, no matter how quickly the element is available, and those waits can add up to a significant amount of time. The application speed also matters; the faster the pages render, the quicker the tests run.

Waiting should always use the built-in waitFor (or similar) helpers in the test framework, minimising the wait times. These helpers typically retry at a configurable interval (often 100 ms) and wait only as long as necessary before continuing or timing out.
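A minimal sketch contrasting a fixed sleep with built-in waiting, here using Playwright’s auto-retrying assertions; the URL and test ID are hypothetical:

```ts
import { test, expect } from "@playwright/test";

test("waits only as long as necessary", async ({ page }) => {
  await page.goto("https://example.com/dashboard");

  // Avoid: always waits the full 5 seconds, even if the element is
  // available immediately.
  // await page.waitForTimeout(5000);

  // Prefer: retries until the element is visible and continues right away,
  // failing only if the timeout is exceeded.
  await expect(page.getByTestId("dashboard-title")).toBeVisible({ timeout: 5000 });
});
```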

Parallelisation is an often-used way to reduce the time required to run the tests, depending on the number of parallel jobs. Having four parallel nodes, each running a subset of the tests, does not quite cut the time to 25%, but it can come close, depending on how the tests are split and how much time is spent setting up the system under test on the CI.

8. Name Things Accurately

Naming things accurately is essential in any coding. For test automation, it often comes up in page objects, test step definitions or Cucumber steps. This is especially important in Cucumber tests, which specify the system behaviour and work as documentation for the features.

Naming things accurately is important from two points of view. First, the project stakeholders should be able to read the feature files and understand what the system does. The second target group is the test automation developers themselves. If the step wording doesn’t accurately describe what the step does, another developer might rely on it to do something it doesn’t, based on the misleading wording. The same goes for function names.
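A minimal sketch of an accurately named step definition using `@cucumber/cucumber`; the world object carrying a Playwright `page` is an assumption about the project setup:

```ts
import { When } from "@cucumber/cucumber";
import type { Page } from "@playwright/test";

// The step text promises exactly what the step does: it submits the login
// form. It does not also create test data or navigate elsewhere, so other
// developers can rely on it based on the name alone.
When("the user submits the login form", async function (this: { page: Page }) {
  await this.page.getByTestId("login-submit").click();
});
```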

9. Make Sure Tests Are Stable

Most people who have written test automation for web apps have faced problems with the stability of the tests. Sometimes, an element is not found, or it is stale (already removed from the DOM), and other times, the data has not yet loaded.

What causes tests to fail? Most of the time, it is the asynchronous loading of a web page: either an element or some required data is not yet present. This can happen on your local machine, but the probability of these errors is often higher on a continuous integration system, whose performance characteristics differ from those of the machine the tests were implemented on. Usually, the following differences exist:

  • Operating system version/configuration
  • Performance
  • Screen resolution

Occasional failures when controlling a browser are part of automated testing and are especially likely in new tests. Once such a test is identified, time should be spent ensuring the problem doesn’t happen again. Screenshots and videos are vital for determining what went wrong on the CI. The problem should also be reproduced locally, where debugging is easier.

Another important point, at the risk of sounding obvious, is the effort of debugging failing tests. Each failure should be investigated to figure out what went wrong. Ask yourself at least the following questions:

  • Is it a new test?
  • Did it work before?
  • When did it start failing?
  • What changed in the system under test?
  • Does it work locally?

10. Remove Tests When They Are No Longer Useful

Removing tests that are no longer useful should be a natural part of the organisation’s development culture. The test suite should be kept lean, optimised when the work yields stability and performance improvements, and fixed when there are problems.

It’s easy to get attached to tests and keep them even when they don’t provide much value. It has certainly happened to me when I’ve wanted to keep a test case as proof that specific acceptance criteria were tested. Removing tests that are covered at a lower level, for example when a component-level test provides enough coverage, is good practice.

11. Monitor and Debug the Tests

The stability and performance of the tests need to be monitored continuously in the team with the tools available and improved once problems are found. Insight into which tests are failing, and how often, gives the team an understanding of where to spend their time to get the most improvement.

To find out which tests are having problems, the CI jobs need to be configured to publish test results in a way that makes intermittently failing tests easy to spot; check your CI system’s documentation on how to enable this.

Once a flaky test is identified, time should be spent ensuring the problem doesn’t happen again. As noted above, screenshots and videos are vital tools in figuring out what went wrong, and reproducing the failure locally is the first step.

It’s worth noting that a test can sometimes seem flaky when the actual problem is that the application occasionally behaves incorrectly.

Test performance should also be tracked. CIs usually show the total build time, which is good for detecting run-time jumps for the whole test suite. These are typically caused by one of the following:

  • Application under test getting slower
  • More tests being added
  • Problem with the system under test (no response, and tests keep waiting for a page)
  • A problem with the test implementation

Using test-framework-specific tools to log test run times helps find slow tests.

Summary

As you can see from this post, there are many things to consider when tackling test automation in a product team. Following these guidelines will help you get started and avoid common pitfalls. Tests evolve along with the application code; everything doesn’t need to be perfect from the start.