Efficiently managing your test data is essential to maximizing the return on investment of your testing effort. While there are plenty of resources on what to do with your data, there isn't much available about what NOT to do. That's sometimes even more important than being told what to do right, because how can you fix mistakes you don't know you're making?

Proper test data management can make or break the testing process. Without properly-managed data, it’s almost impossible to guarantee efficient, cost-effective, and expansive testing.

What is test data?

According to Wikipedia, test data is information that has been earmarked for use in testing. It comes from two main sources: anonymized customer data, which is based on actual customer information stripped of anything that could identify the individuals it pertains to, and fake data fabricated by testers specifically for a particular test case. Test data may be used in a confirmatory way, such as verifying that a given function produces an expected result for a given set of inputs. Other data is used to challenge the program's ability to respond to unusual, extreme, exceptional, or unexpected input.
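The confirmatory-versus-challenging distinction can be sketched in a few lines of Python. The function under test and its data sets below are purely illustrative, not taken from any real project:

```python
# Hypothetical function under test, for illustration only.
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent; reject out-of-range input."""
    if not (0 <= percent <= 100):
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Confirmatory test data: known inputs paired with expected results.
confirmatory_cases = [
    (100.0, 10.0, 90.0),
    (50.0, 0.0, 50.0),
    (80.0, 100.0, 0.0),
]
for price, percent, expected in confirmatory_cases:
    assert apply_discount(price, percent) == expected

# Challenging test data: invalid or extreme input the program must reject cleanly.
for price, percent in [(100.0, -5.0), (100.0, 150.0)]:
    try:
        apply_discount(price, percent)
        assert False, "expected a ValueError"
    except ValueError:
        pass
```

Both kinds of data belong in the suite: the confirmatory cases pin down expected behavior, while the challenging cases probe how the program fails.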

For domain testing, test data is typically produced in a focused, systematic way, though less organized approaches also have their place, particularly in high-volume randomized automated tests. The data may be created by the testers themselves or generated by a program designed for that purpose. Sometimes test data is recorded for reuse in later tests; other times it is used once or twice and then discarded.
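The "recorded for reuse" point can be achieved cheaply in randomized testing: seeding the random generator makes a high-volume data set reproducible without storing it. The record shape below is a made-up example:

```python
import random

def generate_test_records(seed: int, count: int) -> list:
    """Generate fake records reproducibly: the same seed yields the same data."""
    rng = random.Random(seed)  # dedicated generator, so global state is untouched
    return [
        {
            "id": i,
            "name": f"user{rng.randint(1000, 9999)}",
            "balance": round(rng.uniform(0.0, 10_000.0), 2),
        }
        for i in range(count)
    ]

# The seed acts as the "recording": rerunning with seed 42 reproduces the set,
# so a failure found in a randomized run can be replayed in a later test cycle.
batch_a = generate_test_records(seed=42, count=1000)
batch_b = generate_test_records(seed=42, count=1000)
assert batch_a == batch_b
```

Logging the seed alongside each test run is usually enough to make even "throwaway" randomized data recoverable.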

Generating and using test data correctly is a huge part of testing software accurately, and inaccurate testing leads to unreliable conclusions about the product itself, which decreases ROI for the entire project. That's why it's so important to pay attention to how your test data is being managed: it has a huge impact on whether the testing process is reliable enough for its findings to be credible.

Test Data Management Mistakes

If the data is hard to use and adapt, consumes excessive resources, or poorly represents the sampled source, it will quickly begin to hurt the testing process, slowing it down and degrading the quality of the test results. A great way to mitigate that is to become familiar with the test data management process, on which we recently published a white paper.

Are you doing any of these five things? If so, you probably aren’t managing your test data properly.

  1. Reusing the same data for multiple test cases without explicitly listing the earlier test cases as prerequisites. Because test cases can modify the data, the results can differ depending on the order of execution (for example, if you are testing a calculator and the two test cases are x = x + 1 and x = x * 2, the final value depends on which runs first).
  2. Lackluster coverage. What’s the point in using test data that doesn’t cover the entire testing scope?
  3. Using live data, which can change as the tests progress. It's better to use static data captured from previous test runs: the expected results are already known, and there are no privacy concerns.
  4. Not resetting the data between releases. See mistake 1.
  5. Letting manual testers rely on known "workarounds" for defunct data without actually filing it. This can be confusing if the project is later handed off to a different test team, and it leads to expected outcomes mismatching actual results even when no bug is present.

We talk a lot about ways to make your testing process more efficient. While things like assessing your testability and accurately selecting testing tools are great ways to do so, the importance of proper test data management really can’t be stressed enough. For a more in-depth exploration of the challenges and concerns of test data management, as well as solutions to address them, please take a look at our recent white paper on the topic here.