Combinatorial Test

When a software application accept several inputs, each of which can assume different values, it is impossible - in general - to test all combinations of values of input variables, simply because they are too many. Let's make an example, consider a software feature that accepts as input three possible values A, B and C. These values can be chosen arbitrarily from the following table:

 

A

B

C

A1

B1

C1

A2

B2

C2

A3

B3

 

A4

   

# Values

4

3

2

Table 1 – Variables and values

 

The total number of possible combinations of variables (A, B, C) is equal to ; in practice, in order to ensure trying at least once all possible combinations of the values of the variables (A, B, C) 24 test cases must be carried out. Such combinations are the following:

 

1-4

5-8

9-12

13-16

17-20

21-24

A1;B1;C1

A1;B3;C1

A2;B2;C1

A3;B1;C1

A3;B3;C1

A4;B2;C1

A1;B1;C2

A1;B3;C2

A2;B2;C2

A3;B1;C2

A3;B3;C2

A4;B2;C2

A1;B2;C1

A2;B1;C1

A2;B3;C1

A3;B2;C1

A4;B1;C1

A4;B3;C1

A1;B2;C2

A2;B1;C2

A2;B3;C2

A3;B2;C2

A4;B1;C2

A4;B3;C2

Table 2 – A, B and C values’ variables combinations

 

Now, in this particular case, such number of tests can still be affordable. However, if we consider the general case of N variables X1, X2, …Xk, the first accepting n1 possible values, the second n2 possible values, the k-th that assumes nk possible values, the total number of combinations is equal to: 3*4*2*2*3=144 . Such a value, even for low values of n1, n2 ,…, nk is an high number. For example, if k = 5 and (n1=3; n2=4; n3=2; n4=2; n5=3) we get a combinations’ number equal to that is quite a large number of tests to perform if you want to ensure complete coverage of all combinations.

 

In real software applications, the number of values that can assume ni variables is high and it's easy to reach the hundreds of thousands (or millions) of combinations, which makes it impossible to perform comprehensive tests on all combinations.

How can we carry out an effective test when the number of variables and values is so high to make it impossible to exhaustively test all combinations? What reduction techniques apply?

wise-1 testing

When combinations’ number is high, it is possible at least verify that - at least once - each individual value of the variables is given as input to the program to be tested. In other words, if the variable A can take the values A1, A2, A3, it must be executed at least a first test in which the variable A = A1, a second test in which A = A2, and a third test in which the variable A = A3; the same goes for the other variables. This type of test provides a so-called wise-1 cover, and we will see shortly the meaning. In practice, we have the following table:

 

# TEST

A

B

C

1

A1

*

*

2

A2

*

*

3

A3

*

*

4

A4

*

*

5

*

B1

*

6

*

B2

*

7

*

B3

*

8

*

*

C1

9

*

*

C2

Table 3 – Max wise-1 Test Set

 

A first reduction that is possible to do is to set a value for the first variable and assign a random (but permitted) value to the other variables (stated with * in Table 3) and proceeding in this way for all the variables and values. In this way we reduce the test cases from 24 to just 9. It’s still possible to further reduce the number of test cases considering that instead of * you can put a value of the variable which can then be excluded from the subsequent test cases.

Put into practice, for test case # 1 in place of B = * put B = B1, instead of C = * put C = C1 and remove test case # 5 and test case # 8, which –now - are both covered by the test case # 1;

Test case # 2: in place of B = * put B = B2 and n place of C = * put C = C2 and erase test cases # 6 and # 9 both of which are now covered by the test case # 2 .

Test case # 3: instead of B = * put B = B3 and in place of C = * insert any value C1 or C2, considering that the values of the variable C equal to C1 and C2 are already been covered by test cases # 1 and # 2; we can let C=* and postpone the choice of whether to enter C1 or C2. Now, remove test case # 7, since B = B3 is now covered by the test case # 3.

Having understood the mechanism, there remains only the test case # 4, which covers A = A4; we can let B=* and C=* postponing the choice of what to actually select when we will really perform the test.

The symbol * represents the "don’t care"; any value we put in it, the coverage of the test set does not change and all variables values will be used at least once. Those with "*" value ​​should be covered more than once.

The final minimized Test Set for wise-1 coverage is the following:

 

# TEST

A

B

C

1

A1

B1

C1

2

A2

B2

C2

3

A3

B3

*

4

A4

*

*

Table 4 – Minimized wise-1 Test Set

 

Tabella 4 is drawn from Tabella 3 moving up the columns of the variable B and C to fill the values ​​with * with specific values; the * values “stagnate” in the lines that cannot be covered from specific values ​​(row 3 variable C and row 4 variable B), because the number of values of the variables are different (for example, variable B has just 3 values, while variable A has 4 values; the missing B value – with respect to A – is replaced by *).

Therefore saying that a test set, such as that reported in Table 4 , provides wise-1coverage is equivalent to say that each individual value of each variable is covered at least once.

The general wise-1 rule we can deduce is the following:

 

N variables X1, X2, …Xk, the first assuming n1 possible values, the second n2 possible values, the k-th nk possible values, the maximum number of tests that provide coverage wise-1 is equal to , while the minimum number of tests is equal to the maximum value among {n1 , n2 ,…, nk }.”

 

In real cases, what is of interest is always the Test Set with the minimum number of test cases ensuring the chosen coverage (and this for obvious reasons).

2-wise testing or pairwise testing

If the wise-1 test guarantees coverage of every single value for each variable, it is easy to see that a wise-2 test set ensures that all pairs of values of the variables are covered at least once. In the case of the variables listed in Tabella 1, all pairs of variables are as follows: {(A, B), (A, C), (B, C)}. In fact, the combinatorial calculation shows that the number of combinations of N values taken K to K (with N ≥ K) is equal to:

Combination

In our example we have three variables (N=3) taken two to two (K=2), an applying the combinatorial formula above we get 

 

Pairs

 

The three pairs are {(A, B), (A, C), (B, C)}.

 

Wanting to compute all possible pairs of variable values, we need to consider the following table:

 

PAIR

# VARIABLES VALUE

TOTAL

A

B

C

(A,B)

4

3

   4*3=12

(A,C)

4

 

2

 4*2=8

(B,C)

 

3

2

 3*2=6

GRAND TOTAL

 12+8+6=26

Tabella 5Counting of the pairs of values of the variables A, B and C

 

Hence, the total of all the pairs of values of the variables A, B and C whose values are reported in Tabella 1is equal to 26 and are all printed in the following table:

#

# COPPIE VALORI

A, B

A, C

B, C

1

A1,B1

A1,C1

B1, C1

2

A1,B2

A1,C2

B1, C2

3

A1,B3

A2,C1

B2, C1

4

A2,B1

A2,C2

B2, C2

5

A2,B2

A3,C1

B3, C1

6

A2,B3

A3,C2

B3, C2

7

A3,B1

A4,C1

 

8

A3,B2

A4,C2

 

9

A3,B3

   

10

A4,B1

   

11

A4,B2

   

12

A4,B3

   

# COPPIE

12

8

6

TOTALE

12+8+6=26

Tabella 6Pairs of valuesof the variablesA, B andC

 

Why you should consider a Test Set to cover wise-2? Not be enough to consider a Test Set with 1-wise coverage? Here we enter into a thorny issue, in which the opinions are varied, concordant and discordant.

Belowis the “incipit”from the site http://www.pairwise.org/ :

 

“Pairwise (a.k.a. all-pairs) testing is an effective test case generation technique that is based on the observation that most faults are caused by interactions of at most two factors. Pairwise-generated test suites cover all combinations of two therefore are much smaller than exhaustive ones yet still very effective in finding defects.”

 

We mention also the opinion of James Bach and Patrick J. Schroeder about the Pair Wise method: "Pairwise Testing: A Best Practice That Is not" from James Bach, Patrick J. Schroeder available from http://www.testingeducation.org/wtst5/PairwisePNSQC2004.pdf ):

 

“What do we know about the defect removal efficiency of pairwise testing? Not a great deal. Jones states that in the U.S., on average, the defect removal efficiency of our software processes is 85% [26]. This means that the combinations of all fault detection techniques, including reviews, inspections, walkthroughs, and various forms of testing remove 85% of the faults in software before it is released.

In a study performed by Wallace and Kuhn [27], 15 years of failure data from recalled medical devices is analyzed. They conclude that 98% of the failures could have been detected in testing if all pairs of parameters had been tested (they didn’t execute pairwise testing, they analyzed failure data and speculate about the type of testing that would have detected the defects). In this case, it appears as if adding pairwise testing to the current medical device testing processes could improve its defect removal efficiency to a "best in class" status, as determined by Jones [26].

On the other hand, Smith, et al. [28] present their experience with pairwise testing of the Remote Agent Experiment (RAX) software used to control NASA spacecraft. Their analysis indicates that pairwise testing detected 88% of the faults classified as correctness and convergence faults, but only 50% of the interface and engine faults. In this study, pairwise testing apparently needs to be augmented with other types of testing to improve the defect remove al efficiency, especially in the project context of a NASA spacecraft. Detecting only 50% of the interface and engine faults is well below the 85% U.S. average and presumably intolerable under NASA standards. The lesson here seems to be that one cannot blindly apply pairwise testing and expect high defect removal efficiency. Defect removal efficiency depends not only on the testing technique, but also on the characteristics of the software under test. As Mandl [4] has shown us, analyzing the software under test is an important step in determining if pairwise testing is appropriate; it is also an important step in determining what addition al testing technique should be used in a specific testing situation.”

[4] R. Mandl, "Orthogonal Latin Squares: An Application of Experiment Design to Compiler Testing," Communication of the ACM, vol. 28, no. 10, pp. 1054-1058, 1985.

[26] Jones, Software Assessments, Benchmarks, and Best Practices. Boston, MA: Addison Wesley Longman, 2000.

[27] D. R. Wallace and D. R. Kuhn, "Failure Modes in Medical Device Software: An Analysis of 15 Years of Recall Data," Int'l Jour. of Reliability, Quality and Safety Engineering, vol. 8, no. 4, pp. 351-371, 2001.

[28] B. Smith, M. S. Feather, and N. Muscettola, "Challenges and Methods in Testing the Remote Agent Planner," in Proc. 5th Int'l Conf. on Artificial Intelligence Planning and Scheduling (AIPS 2000), 2000, pp. 254-263

To clarify,we can say that Pairwise or 2-wise test method ensures that all combinations of pairs of values ​​of the variables are tested, "theoretically ensuring" the maximization of the anomalies found, with percentages ranging from 50% to 98% according to the studies. In fact, no test can ever guarantee a defined percentage removal of defects (which can only be calculated ex post for the specific project); let's say trying to be realistic - that the Pairwise achieve a valid agreement between the number of tests to be performed and the anomalies detected, when the number of variables and their values is so high that it cannot be tested all the combinations (the so called all-wise testing or N-wise testing, where N is the number of variables we are playing with).

In the case of Test Set covering wise-2 level is very simple to know the maximum number of tests that provide coverage for all pairs of values of the variables. This value is equal to the number of pairs of values of the variables themselves. In our example, three variables A, B and C in Tabella 1, this number is 26 (calculated as described in Tabella 6). The real problem – still unsolved - is to discover the minimum number of tests that guarantees wise-2 coverage, although there are a variety of methods and algorithms that approximate this value for a problem with an arbitrary number of variables and values. Examples of tools that use these algorithms are:

  1. Microsoft Pairwise Independent Combinatorial Testing tool (PICT), downloadable from http://download.microsoft.com/download/f/5/5/f55484df-8494-48fa-8dbd-8c6f76cc014b/pict33.msi;
  2. NIST Tools: http://csrc.nist.gov/groups/SNS/acts/documents/comparison-report.html
  3. James Bach AllPairs downloadable from http://www.satisfice.com/tools.shtml ;
  4. Other tools here: http://www.pairwise.org/tools.asp .

n-wise testing with n>2

It’s now easy to extend the concept of pairwise or 2-wise test to the generic case of n-wise, with n> 2. The generic Test Set provides n-wise coverage if it is able to cover all the n-tuples (3-tuples if n = 3, 4-tuples if n = 4 and so on). As in the Pairwise, is always possible to know the size of the maximum Test Set, equal to the number n-tuples of values of the variables, but there is no way to know - in the general case - the size of the Test Set Minimum guaranteeing coverage n -wise.

Using NIST Tools (ACTS) or PICT is possible to extract a Test Set that approximates as much as possible the minimum Test Set. It 'clear that, given a set of N variables, the maximum level of wise which you can have is equal to the number of variables. So, if we have 4 variables, a 4-wise test set coincides with all possible combinations, while a 5-wise Test Set or higher makes no sense.

Purpose

The combinatorial testing techniques presented here, are intended to solve the following cominatorial test basic problem:

 

DIRECT COMBINATORIAL TEST PROBLEM: "A software system accepting N input variables, each of which can take on different values, find the Test Set with the smaller number of test cases that guarantee (at least) the coverage of all combinations (2-tuples) of all the values of the variables involved.
To solve this problem has been developed Pairwise technique and a large number of support tools. Once such a test set (the smallest possible) has been generated, you should run the test cases and detect (if present) all the defects comes up in the software under test.

 

There is also a second problem, maybe less "popular" compared to the previous, that is the following:

 

REVERSE COMBINATORIAL TEST PROBLEM:"Given a Test Set for which you do not know the method of generation (if any), calculate what percentage of nwise-coverage the Test Set ensures, with nwise between 1 and the number variables in the Test Set ".

 

An example is that in which tests are generated by automated tools for which you have a low or almost zero process control, or when the "test cases" are generated by the automatic flows that feed interfaces between different systems (think of a system that transmit accounting data from a system A to B); test data are - in general - excerpts from historical series on which you have no control.

 

For test scenarios in some way related to a combinatorial inverse problem, it’s not easy found support tools or such tools are not readily available. The only tool I found is NIST CCM, still in alpha-phase at the moment I’m writing this notes.

 

In the following we describe a set of tools called "Combinatorial Testing Tools" (executable under Windows, but -if needed–  not difficult to port under Unix / Linux) allowing to extract the (quasi)minimal test set and calculate the coverage of a generic test set , using systematic algorithms calculation coverage, starting from all n- tuples of the variables’ values; such algorithms should be categorized "brute force algorithms" and should be used (on a normal supermarket-buy-PC) if the number of variables and values is not too high.

 

Tools in Combinatorial Testing Tools product (CTT in the following) try to provide a support to the solution of both problems of combinatorial test previously stated.

 

CTT do not intend to compete with the existing tools aimed to solve the direct problem of combinatorial testing, such as Microsoft PICT, AllPair J. Bach, NIST ACTS or other several commercial and non-commercial tools already present on the market. Those tools implement algorithms definitely more effective than CTT and therefore should be favorite – I repeat – to solve the direct problem.
Regarding the reverse problem of combinatorial tests do not exist, to my knowledge, tools on the market (except NIST CCM in alpha-test) and the CTT then attempt to provide a very first solution to the reverse problem, to be surely improved over time, when we will be better understood the logic and the rules that are behind combinatorial test and the minimal determination of test sets related to it.


We would like to remember that:

  1. Solving the direct problem means determining the as small as possible test set with a level of WISE coverage agreed (usually WISE = 2) from the set of variables and values.
  2. Solve the reverse problem means determining the level of coverage of a given test set with respect to a reference WISE level (also here usually is WISE = 2)

 

The tools should be categorized as follow:

  1. First level Tools: batch DOS scripts providing an immediate response to standard scenarios that typically occur in test projects requiring combinatorial techniques.
  2. Second level Tools: executable (C++/Perl development language) more versatile giving response to problems that may be less common, but sometimes occurs in test projects requiring combinatorial techniques.

 

First level scripts were thought as the "wrapper" around the second level executables, in order to "simplify end user life" with a set of simple commands that allow to quickly get a number of “standard information”.

Web Traffic

Today4
Yesterday34
This week35822
This month579
Total35822

Who Is Online

1
Online

Saturday, 16 December 2017 01:13
Powered by CoalaWeb