Since its launch in November last year, ChatGPT has captured the world’s attention, showing millions of users the capabilities of artificial intelligence. It answers requests that range from the surreal to the practical, and the answers often astound us with how human they sound.

We decided to ask ChatGPT a simple question, “Is ChatGPT the software tester’s new best friend?”, to see what it came back with.

The exact copy generated by ChatGPT is included in this blog marked in bold, along with our commentary.

ChatGPT is an advanced conversational AI model created by OpenAI that has revolutionized the way we interact with technology. With its ability to understand natural language, generate human-like responses, and learn from vast amounts of data, ChatGPT has become a powerful tool for various applications, including testing.

Testing is an essential aspect of software development that involves checking for bugs, errors, and other issues that could affect the performance and functionality of a program. Testing can be time-consuming, complex, and tedious, especially for large and complex software systems. That’s where ChatGPT comes in.


[OUR COMMENTARY – at Qualitest, our audience is well versed in software testing and what it entails, so we would not include a basic explanation like this. There are also subjective elements in the response, for instance ChatGPT’s statement that testing is tedious. Parts of testing can be tedious, but not necessarily the entire discipline or every aspect of testing on a project. In addition, ChatGPT says it has revolutionized our interactions with technology. We don’t know whether that is true, but it has certainly lit a fire under the industry’s interest in AI’s ability to assist us in all manner of tasks.]

A conversational AI model that brings great benefits to software testing

As a conversational AI model, ChatGPT can be used to create chatbots that can interact with users and simulate real-world scenarios to test software applications. By creating a chatbot that mimics the behavior of a typical user, developers can test their software in a more realistic environment and identify potential issues more efficiently.

[OUR COMMENTARY – we think that ChatGPT is missing something here, as it is not just about chatbots. You can also ask it to assist with ideation, for example to design the test cases and data combinations for X, or to prepare sample code for Y. Whilst these outputs do of course need review, it can be used in multiple scenarios.]

Some of the benefits of using ChatGPT for testing include:

  • The use of natural language responses

One of the biggest advantages of using ChatGPT for testing is its ability to generate natural language responses. Unlike traditional testing methods that rely on pre-defined inputs and outputs, ChatGPT can generate responses that are similar to those of real users, making testing more accurate and realistic.

[OUR COMMENTARY – to have ChatGPT interact with software via language interfaces, developers and automation engineers would need to include an interface and extend their automation harnesses so that the software can interact directly with ChatGPT, take its inputs, and allow analysis of its results. For many teams this may not be practical.]

  • The ability to learn from previous interactions

Another benefit of using ChatGPT for testing is its ability to learn from previous interactions. As ChatGPT interacts with users, it can learn from their responses and adapt its behavior accordingly. This means that developers can use ChatGPT to test their software in a more dynamic and evolving environment, simulating real-world scenarios where user behavior can change over time.

[OUR COMMENTARY – whilst ChatGPT does learn from interaction and will continue to grow in capability, there’s another risk here. It can also learn from bad interactions or out-of-date approaches to testing, which may then lead to poor-quality testing.]

  • The ability to automate repetitive testing tasks

ChatGPT can also be used to automate repetitive testing tasks, such as regression testing, which involves retesting software after changes have been made to ensure that existing functionality has not been affected. By using ChatGPT to automate these tasks, developers can save time and focus on more critical aspects of software development.

[OUR COMMENTARY – we do not agree with the statement that ChatGPT can automate without a further caveat. Whilst it can certainly help automate and do some heavy lifting, for instance by generating code stubs and blocks, it cannot do it all. The code would need to be reviewed for design, integrity, sanity, and reusability, much like this content does. Using ChatGPT and other generative AI technologies in testing in this way also raises questions about IP ownership of generated code, cases, etc.]
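As a rough illustration of what a first review pass over generated code might look like, here is a minimal Python sketch. It assumes the generated tests are Python functions whose names start with test_, and the function name review_generated_test and the sample snippet are hypothetical, made up for this example. It only checks that the code parses and that each test contains an assert, so it supplements a human review of design and reusability rather than replacing it.

```python
import ast


def review_generated_test(source: str) -> list[str]:
    """Return basic review findings for a generated test snippet.

    An empty list only means the snippet passed these shallow checks;
    it still needs a human review for design and reusability.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"does not parse: {exc.msg}"]

    findings = []
    tests = [
        node for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_")
    ]
    if not tests:
        findings.append("no test_ functions found")
    for func in tests:
        # A test without any assert cannot fail on a wrong result.
        if not any(isinstance(n, ast.Assert) for n in ast.walk(func)):
            findings.append(f"{func.name} has no assert statement")
    return findings


# Hypothetical snippet, standing in for text returned by a generative tool.
generated = '''
def test_login_rejects_empty_password():
    result = login("alice", "")
    assert result is False

def test_login_accepts_valid_user():
    login("alice", "s3cret")
'''

findings = review_generated_test(generated)
print(findings)
```

A check like this could run before generated code is even shown to a reviewer, so that obviously incomplete stubs are filtered out early.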

  • Greater versatility

ChatGPT is a general-purpose language model that can understand and generate natural language text across a wide range of topics and domains. This makes it highly versatile and applicable to a variety of testing scenarios.

  • Enhanced speed and efficiency

ChatGPT can process large volumes of text data quickly and accurately, which can save time and improve efficiency in testing. It can generate responses almost instantly, allowing for rapid iteration and testing cycles.

[OUR COMMENTARY – ChatGPT does not make clear what those responses would be, or where they fit in the testing cycle.]

  • More scalability

ChatGPT can be scaled up or down depending on the size and complexity of the testing task. It can handle a variety of data sources and types, making it useful for testing a wide range of applications and systems.

  • Greater consistency

ChatGPT provides consistent responses based on its training data and modeling process, which can improve the reliability and repeatability of testing results. This can be especially valuable in large-scale or complex testing scenarios.

[OUR COMMENTARY – this is not entirely true in our experience, as ChatGPT has been shown to be inconsistent over time.]
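One simple way a team could put a number on this kind of inconsistency is to send the same prompt several times and measure how often the answers agree. The Python sketch below assumes the model is wrapped in a plain function that takes a prompt and returns text. The names agreement_rate and make_fake_model are hypothetical, and the fake model with canned replies stands in for a real service so the example runs on its own.

```python
from collections import Counter
from typing import Callable


def agreement_rate(ask: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Ask the same prompt several times and return the share of runs
    that produced the most common (normalized) answer."""
    answers = [ask(prompt).strip().lower() for _ in range(runs)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / runs


def make_fake_model(replies: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: cycles through canned replies."""
    state = {"calls": 0}

    def ask(prompt: str) -> str:
        reply = replies[state["calls"] % len(replies)]
        state["calls"] += 1
        return reply

    return ask


fake = make_fake_model(["Pass", "pass", "Fail", "pass "])
rate = agreement_rate(fake, "Does input X meet rule Y?", runs=4)
print(rate)  # 0.75: three of the four normalized answers were "pass"
```

Exact-match comparison only suits short, constrained answers such as pass or fail. Free-text answers would need a looser comparison, which is outside the scope of this sketch.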

  • Additional accessibility

ChatGPT is available as a cloud-based service or as a software library, making it accessible to a wide range of developers, testers, and researchers. It can be easily integrated into existing testing frameworks and workflows.

As well as all these great benefits, ChatGPT can be integrated with other testing tools to provide a more comprehensive testing solution. For example, ChatGPT can be used to generate test cases that can be fed into other testing tools to identify issues more efficiently. ChatGPT can also be used to analyze testing results and provide insights into potential areas for improvement.
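Before generated test cases are fed into another tool, they need checking, in keeping with our point that these outputs require review. The Python sketch below assumes the model was asked to return its cases as a JSON list of objects with name, input and expected fields. That format, the function load_generated_cases, and the count_words function under test are all assumptions made for illustration, not something ChatGPT described.

```python
import json

# Assumed case format for this example; not a standard.
REQUIRED_FIELDS = {"name", "input", "expected"}


def load_generated_cases(raw: str) -> tuple[list[dict], list[str]]:
    """Split generated test cases into usable cases and rejection reasons."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [], [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return [], ["top-level value is not a list"]

    accepted, rejected = [], []
    for index, case in enumerate(data):
        if not isinstance(case, dict):
            rejected.append(f"case {index}: not an object")
            continue
        missing = REQUIRED_FIELDS - set(case)
        if missing:
            rejected.append(f"case {index}: missing {sorted(missing)}")
        else:
            accepted.append(case)
    return accepted, rejected


# Hypothetical response text, standing in for model output.
raw_response = '''[
  {"name": "empty string", "input": "", "expected": 0},
  {"name": "single word", "input": "hello", "expected": 1},
  {"name": "no expectation", "input": "a b"}
]'''

cases, problems = load_generated_cases(raw_response)


def count_words(text: str) -> int:
    """Hypothetical function under test."""
    return len(text.split())


# Only the accepted cases are run; rejected ones go back for review.
for case in cases:
    assert count_words(case["input"]) == case["expected"], case["name"]
print(problems)
```

The accepted cases could just as well be passed to a test runner's parameterization feature. The key design choice is that malformed cases are reported rather than silently dropped.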

The limitations of using ChatGPT for testing

Like everything, there are also some limitations of using ChatGPT for testing. Some of these include:

  • Limited domain knowledge

ChatGPT is a general-purpose language model, which means it has a broad understanding of many different topics, but it may not have specific knowledge of the subject matter being tested. This can lead to inaccurate or incomplete responses in certain contexts.

  • Lack of context awareness

While ChatGPT is capable of understanding the meaning of individual words and phrases, it may struggle to understand the overall context of a conversation or question. This can lead to irrelevant or confusing responses.

  • Bias and accuracy issues

Like all AI models, ChatGPT may have inherent biases in its training data, which can affect the accuracy and fairness of its responses. It has also shown signs of being “woke” with an anti-conservative bias.

[OUR COMMENTARY – In one instance, ChatGPT refused to write a poem about Donald Trump’s “positive attributes,” saying it was not programmed to produce content that is “partisan, biased or political in nature.” But when asked to describe Joe Biden, it waxed poetic about him as “a leader with a heart so true.”]

  • Limited interactivity

While ChatGPT can engage in a conversation, it is not a true interactive agent that can respond to user feedback or adjust its responses based on user input. This can limit its usefulness in certain testing scenarios, such as those that require dynamic or adaptive responses.

  • Technical limitations

ChatGPT may have technical limitations related to its computational resources, speed, or memory capacity, which can affect its ability to process large volumes of data or respond to complex queries in a timely manner.

[OUR COMMENTARY – this contradicts the earlier paragraph output by ChatGPT, where it stated “It can generate responses almost instantly, allowing for rapid iteration and testing cycles.” It also reinforces our earlier commentary point that actually connecting ChatGPT to software and harnesses for meaningful use of its inputs and outputs may not be practical. This section also highlights that ChatGPT itself can be inconsistent and self-contradicting, which brings its overall reliability without careful supervision into question.]

What we think about ChatGPT

ChatGPT’s ability to generate natural language responses, learn from previous interactions, and automate repetitive testing tasks makes it a valuable tool for software development teams. However, it is essential to validate the testing results and consider the cost of implementation when using ChatGPT for testing. We think it can help with generating test assets, but it has inaccuracies, and its assets need review. It can block in content, but this still needs tweaking. It also carries other risks and considerations that were not touched on in the responses provided but that are apparent to us, such as:

  • IP and security, for example, are known challenges for this generation of AI technologies, and as an industry we still have many unanswered questions about ownership of content and access to data in the models.
  • There are also significant complexities in installing or accessing ChatGPT and other LLMs in restricted environments, the kind of environments commonly found in software development teams.
  • Adopting tools like ChatGPT and generative AI will require new processes, skills and techniques to use them reliably, and these need to be factored into the total cost of test ownership for the business.

ChatGPT has the potential to become a useful tool in the tester’s arsenal, generating large amounts of content and collateral for quality, and many testers are already using it. However, as we have seen time and again, there are no silver bullets for software quality. This technology, like so many before it, needs to be used in controlled, cautious processes with appropriate oversight and focus to keep quality high and cost and time low.
