Testinnsikt 2026

Welcome to SOCO’s platform for in-depth analyses of software-testing topics in Norway!

This year’s theme is AI and how it is used in software testing today. We look at both the use of AI for testing and the testing of AI-generated solutions.

243 responded to the survey

13% never use AI [1]

25% have high trust in AI tools

Change since 2024

Hypothesis: The use of AI has increased in all areas since the previous survey in 2024.

Some of the questions in this year’s survey were also asked in the survey two years ago. Unsurprisingly, AI use has increased across every area.

Two years ago, the use of AI as a sparring partner, chatbot and search engine had already reached 50%. This year a full 86% use AI for such purposes – helped, of course, by the fact that AI results are now part of standard Google searches.

Two years ago, about a quarter of respondents had experience integrating AI into the solutions they developed. This has risen to just over 50% this year.

The largest relative increase in AI use has come within the testing field, where 57% of respondents now use AI assistance for various test work, from writing unit tests to drafting test plans and strategies. Given how quickly AI has developed, it is reasonable to assume that newer models and tools are more useful in practice than those available two years ago.

Are there differences in the spread of AI between the public and private sectors?

Yes, there are, especially when it comes to adopting AI in the products being built. The share that has implemented solutions with AI integrated has increased significantly in both sectors, although the private sector is still well ahead.

When it comes to using AI in daily work, however, larger changes appear to have happened in the public sector. Two years ago, use in the private sector was considerably higher than in the public sector. The latest survey shows the difference has almost completely evened out, and nearly 9 of 10 respondents get help from AI to solve everyday problems large and small – regardless of sector.

Effect of AI-generated code

In total, about 50% of respondents have helped test solutions where AI generated part of the code. What effect does developers’ use of AI tools have on testers’ everyday work?

In total, 30% perceive the quality of AI-generated code as poorer than that of manually written code. Just as many feel it makes no difference to quality, while just under one in four feel that the AI-generated code is of higher quality than they are used to.

How is the test process affected by AI-generated code? One third feel it makes no difference: the job is the same regardless. One in four feel the test process changes, for example by freeing up time for more exploratory testing.

All in all, more respondents appear to experience positive effects than negative ones.

Testing AI solutions

A quarter of respondents have helped test products where AI is an integrated part of the solution. What challenges does that present?

The vast majority of solutions use purchased off-the-shelf products as the AI component, for example APIs from OpenAI, Google or Microsoft. Even so, a full 15% are entirely self-developed, so we may yet avoid 100% American hegemony in the field.

Testing AI is still new for most, and the models are developing rapidly, which unsurprisingly brings a number of challenges. The figures suggest that AI-specific challenges cause more problems than testing-related ones. Some find it demanding to create test cases and test data, but many more have problems with the output of the AI models when the tests are actually run.

Use of AI for test work

Six of ten have used AI to plan and/or carry out testing.

Most are relatively neutral toward what AI produces – a healthy skepticism, perhaps? Only 12% have low or very low trust, while nearly 30% have high or very high trust.

Interestingly, the share that finds AI highly useful in testing is larger than the share that has high trust in it. This may suggest that people use AI, take the output with a pinch of salt, and adapt it to their own purpose.

The survey shows that AI tools are used across a broad spectrum of areas. The graph shows what share of those who use AI for test work have used it in the respective areas. We find the lowest use at the implementation level in the form of unit and integration tests, and the highest as a sparring partner and for suggesting test cases. This seems natural, as the strength of today’s language models lies in generating text and answering questions thoroughly. In addition, experience shows that a significant share of respondents have roles that are not purely technical.

See also the in-depth interviews section for more discussion of how AI can be used in test work.

In-depth interviews

This year, in addition to the survey, we conducted five in-depth interviews with people who use AI in various ways, from pure test work, through ordinary development, to more industrial use. Because the use is so varied, it is not easy to draw clear conclusions across them, so the insights give a more illustrative picture of how AI is used today, what challenges exist, and how they are handled. Each section below gathers insights on one theme, drawn from across the interviews.

Trust and security

The interviews show that trust in AI is largely conditioned by how security and privacy are handled. Even in organisations that use AI actively in daily operations, there is little willingness to leave decisions entirely to the models.

A clear example comes from a product where generative AI is used to analyse enquiries from the public and suggest possible answers. Here it is an absolute requirement that all answers are quality-assured by a customer-service agent before they are sent out. AI may suggest wording, but it is the human who holds final responsibility. This is a deliberate choice to reduce the risk of errors, unfortunate wording, or breaches of regulations.

Several interviewees also describe how GDPR and privacy have steered technical choices. Local models are often preferred over cloud services where practically possible, and data is deliberately anonymised, masked and cleaned. There is also caution about using AI-generated data directly in production; instead, AI is used as a support system in the background.
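As an illustration of the masking step mentioned above, the following is a minimal sketch in Python of how text might be cleaned before it is passed to a model. It is not taken from any of the interviewed projects: the function name `mask_text`, the placeholders and both patterns are assumptions made for the example, and the phone pattern assumes Norwegian-style 8-digit numbers. Real anonymisation needs rules adapted to the actual data and a proper privacy assessment.

```python
import re

# Hypothetical patterns for illustration only; real data needs its own rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumes 8-digit phone numbers, optionally prefixed with +47.
PHONE = re.compile(r"(?<!\d)(?:\+47 ?)?\d{8}(?!\d)")


def mask_text(text: str) -> str:
    """Replace e-mail addresses and phone numbers with fixed placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(mask_text("Contact ola@example.no or 91234567 about case 42."))
```

Fixed placeholders keep the sentence structure intact, so a model can still interpret the enquiry without seeing the personal details.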

These examples show that low trust in AI does not necessarily lead to low use, but to stricter frameworks, clearer responsibility, and increased focus on control.

How is AI used for test work?

The interviews show that AI is primarily used where it supports human understanding and analysis, rather than full automation of test work.

Several describe how AI is used as a sparring partner in test design. One interviewee described using AI to analyse API structures and automatically generate suggestions for test scenarios, which the tester then assesses and adjusts before putting them to use.

Another example is using AI to analyse large amounts of documentation and code to quickly understand the system’s structure and potential risk areas. This makes it possible for testers to get going faster in complex projects.

A couple of the projects had also adopted agentic AI in the development process, for automatic code review, integration testing and security testing. These, too, kept manual approval steps in place.

Common to these examples is that AI functions as a support tool that reinforces test-related work, not as a replacement for it.

It is a recurring feature that many of the projects bear the mark of being at an early stage, with prototyping and exploration. There are few formalised test processes in place, and the teams are small – often without dedicated test resources.

Testing AI

When AI is an integrated part of the solution, test work changes considerably. The interviews confirm that traditional test methods are not sufficient on their own.

A clear example comes from an environment working with advanced models for real-time analysis. Here it is described how the same input can give different results depending on context and data, and how this makes fixed test cases less relevant. Instead, data from production is used for verification testing.

Another strategy is to keep known results from earlier runs, compare new results against them, and assess how large a deviation to expect from a new model. Strict parameter control makes the models less creative, which gives more predictable results.
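The comparison against earlier runs can be sketched roughly as follows. This is an illustrative example only, not a method described by the interviewees: the function name `compare_to_baseline`, the case identifiers, the scores and the tolerance of 0.05 are all assumptions. The idea is that each test case has a numeric result from a previous model version, and that a deviation is only flagged when it exceeds what the team has decided to accept.

```python
def compare_to_baseline(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float,
) -> dict[str, float]:
    """Return the cases whose result moved more than `tolerance` from the baseline."""
    deviations: dict[str, float] = {}
    for case_id, expected in baseline.items():
        actual = candidate.get(case_id)
        if actual is None:
            # A missing result is treated as an unbounded deviation.
            deviations[case_id] = float("inf")
            continue
        diff = abs(actual - expected)
        if diff > tolerance:
            deviations[case_id] = diff
    return deviations


if __name__ == "__main__":
    # Hypothetical scores from an earlier run and from a new model version.
    old_run = {"case-1": 0.90, "case-2": 0.80, "case-3": 0.75}
    new_run = {"case-1": 0.92, "case-2": 0.60, "case-3": 0.74}
    flagged = compare_to_baseline(old_run, new_run, tolerance=0.05)
    print(sorted(flagged))  # only case-2 moved by more than the tolerance
```

Choosing the tolerance is the real test-design decision here: it expresses how much variation is acceptable when the same input no longer gives an identical answer.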

In some cases it is also possible to ask the model to explain step by step how it arrived at the result. That can make it easier to trace the process and build trust, or uncover logical flaws in the reasoning.

The projects’ experimental nature and lightly formalised processes also mean that testing is not a concluding phase. The systems are monitored closely in production, and deviations are caught through operational signals and manual assessment. Test work thus continues as part of operations.

These examples illustrate how AI changes the test role. The tester becomes more responsible for understanding the system’s behaviour over time, and for assessing risk and consequences, rather than only verifying predefined requirements.

Test data more important than test cases

The interviews indicate that the quality of test data is decisive for the quality of AI solutions, and that traditional test cases alone are not sufficient. This applies especially where one develops one’s own AI models with machine learning.

One interviewee stated clearly that synthetic test data can be outright misleading and “dangerous” in certain domains. Instead, large amounts of production-like data are used to ensure the models are tested against real patterns and anomalies. Testing thus focuses more on how representative the dataset is than on full coverage of predefined test scenarios.

One project has also spent considerable resources developing domain-specific physics simulators used to verify the models. For industrial processes that can be a good approach.

Another project makes extensive use of synthetic test data, so needs clearly differ. For fairly predictable domains with transactions, personal data and other structured data, synthetic test data can still be a good choice. For other applications involving more unstructured signals, interpretation of audio and video streams, and general machine learning, it will be very demanding to create synthetic data that covers reality well enough.
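For the structured case, synthetic data can be as simple as the sketch below. It is a hypothetical example in Python, not a description of what any interviewed project does: the `Transaction` fields, the account format and the amount range are invented for illustration. A fixed seed is used so that the same data set can be recreated when a test fails.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields, chosen only for the example.
    transaction_id: int
    account: str
    amount_nok: float


def make_transactions(count: int, seed: int = 0) -> list[Transaction]:
    """Generate `count` synthetic transactions, reproducible for a given seed."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    rows = []
    for i in range(count):
        account = f"ACC-{rng.randint(1000, 9999)}"
        amount = round(rng.uniform(-5000.0, 5000.0), 2)
        rows.append(Transaction(i + 1, account, amount))
    return rows


if __name__ == "__main__":
    for row in make_transactions(3, seed=42):
        print(row)
```

Uniformly random values like these say nothing about real patterns and anomalies, which is exactly the limitation the interviews point to for less predictable domains.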

Speed and mess

AI-assisted coding can increase the pace of development, which some of the interviews also indicate. At the same time, clear challenges related to quality and maintenance are described.

One interview highlights how AI can generate large amounts of code quickly, but with significant duplication and lack of structure. What is meant to be a small adjustment leads to changes across large parts of the codebase, making it hard to tell real changes from merely cosmetic ones.

Other examples mentioned are that the generated code is too generic, for instance not adapting well enough to domain-specific challenges. Generated code can also work well in typical cases but fail on special cases. One respondent describes how AI tends to fill the gaps in a poor specification with assumptions, which in some cases led to code that only appeared to do something sensible.

To uncover this type of error, good domain knowledge, good requirements and precise contracts are still very important. One interviewee also reports better results from writing good specifications from which AI generates code than from asking for code directly. Iterating on the specification can give better results over time. At the same time, this means the code is regenerated every time the specification changes. That makes a traditional flow with version control of the code challenging, since the changes can be large, especially if the language model has been upgraded. This makes good tests very important.

This has led several environments to tighten their requirements for code review and regression testing. AI-generated code is treated as a draft that must pass the same – or stricter – quality controls as human-written code. Responsibility for quality still lies with the developer and test team, not with the tool.

Summary

Unsurprisingly, the use of AI has increased significantly since the previous survey two years ago. This applies to general use, integration in developed solutions, and use as a testing tool.

There is a fair amount of trial and error; many experience challenges and are perhaps not sure how best to approach the problem. The in-depth interviews give the impression that there is a lot of prototyping and exploration under way, both regarding product development and testing. That is entirely natural, as the AI tools are also still developing rapidly. What is true today does not necessarily hold in three months.

At the same time, we also see examples of concrete use cases where AI provides opportunities that were not there before.

There is some skepticism toward what comes out of the AI models, both regarding correctness and regarding the use of data given to the models. This is handled with reinforced control mechanisms, human validation, and caution about exposing AI data directly in production.

For now, AI tools work best as support for analysis, test-case design and strategy, and as a sparring partner. They are also used for implementing and running tests, but to a lesser degree. This seems natural, as most tools are good at generating text, but not necessarily at getting technical and domain-specific details right.

Critical thinking, testing and development competence, and domain understanding appear to be more important than ever.

[1] The average experience of those who never use AI is 19.6 years.