Open Testware Reviews

jenny

Copyright 2004 by Tejas Software Consulting and Christopher J. Meisenzahl - All rights reserved.

Contents

Overview -- Observations -- Maturity -- Project activity -- Platforms -- Support -- Documentation -- Installation -- Implementation -- Performance -- Similar tools -- Limitations -- Appendix A: Additional examples -- Appendix B: Dealing with impossible pairs -- Appendix C: Processing jenny's output

Overview

Reviewer: Christopher J. Meisenzahl
Date reviewed: 2004-03-17
Version reviewed: 2003-09-14
Maintainer: Bob Jenkins
URL: http://burtleburtle.net/bob/math/jenny.html
Testingfaqs.org category: Test Design Tools
License: Public Domain
User interface: Command Line

Jenny is a tool used primarily to generate test case combinations based on a specified set of inputs. Let's say that you have a set of variables representing configurations that need to be tested, such as combinations of browsers (IE, Netscape, Mozilla, Opera, Safari), operating systems (Win95, Win98, Win2k, Mac OS X, Linux), JavaScript configurations (on, off), and SSL options (enabled, disabled). To exhaustively test every combination of these values, you would need to test 100 configuration combinations: 5 x 5 x 2 x 2 = 100.

You could instead decide to test only the unique pairs of variable states (known as pair-wise testing). This has been shown empirically to be an effective way to reach satisfactory test coverage; the effectiveness of pair-wise testing is treated elsewhere (see the papers listed in the Documentation section). By feeding the “5 x 5 x 2 x 2” example into a tool like jenny, you can whittle the 100 combinations down to only 25 and still have full pair-wise coverage. The bottom line is that tools that generate pair-wise test cases have the potential to greatly reduce the time and cost associated with testing.
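To make the arithmetic concrete, here is a minimal Python sketch (not part of jenny; the variable names are my own) that enumerates the exhaustive combinations for the browser example and computes the smallest size any pair-wise set could have. That floor is the product of the two largest variables, because every pairing of their values must appear and each test case can hold only one of them.

```python
from itertools import product

# Illustrative values taken from the example above.
variables = {
    "browser": ["IE", "Netscape", "Mozilla", "Opera", "Safari"],
    "os": ["Win95", "Win98", "Win2k", "Mac OS X", "Linux"],
    "javascript": ["on", "off"],
    "ssl": ["enabled", "disabled"],
}

# Exhaustive testing: every combination of every value.
exhaustive = list(product(*variables.values()))

# Each test case covers exactly one pairing of the two largest variables,
# so the product of their sizes is a lower bound for any pair-wise set.
sizes = sorted((len(values) for values in variables.values()), reverse=True)
pairwise_floor = sizes[0] * sizes[1]

print(len(exhaustive))  # 100
print(pairwise_floor)   # 25
```

The floor is only a bound; it says nothing about how a tool actually reaches it.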

Here's an example of jenny in action on a smaller problem.

[Figure: jenny screen shot]

Observations

I like this tool quite a bit and find it both useful and powerful. But it's not for non-technical users or those who don't understand pair-wise testing. It's important when using a tool like this that you understand what it is that you've asked the tool to generate, and its limitations. Pair-wise testing is not a substitute for risk analysis, Human-Computer Interaction testing, performance testing, etc.

Also, the fact that it must be compiled and has only a command line interface will be daunting and possibly overwhelming for many non-technical users. But this facet of jenny is also one of its strongest assets: being a command line application gives it broad appeal. It can be used natively on Win32, Linux, most Unix variants, and even Mac OS X.

By default, jenny will generate all pair-wise combinations of inputs. But it is flexible and permits n-tuple combinations to be generated (e.g. 3-tuple, 4-tuple, etc.). I don't see higher-order tuples being needed very often, but it's good to know it's there.
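To get a feel for how much more work higher-order tuples imply, the following sketch counts how many distinct n-tuples of values exist for a given set of variable sizes. The helper name tuples_to_cover is hypothetical and this is not how jenny works internally; it only counts what would have to be covered.

```python
from itertools import combinations
from math import prod

def tuples_to_cover(sizes, n):
    """Count the distinct n-tuples of values across every choice of n variables."""
    return sum(prod(group) for group in combinations(sizes, n))

sizes = [5, 5, 2, 2]  # the browser / OS / JavaScript / SSL example

print(tuples_to_cover(sizes, 2))  # 69 pairs
print(tuples_to_cover(sizes, 3))  # 140 triples
print(tuples_to_cover(sizes, 4))  # 100, the same as exhaustive testing
```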

In his notes, the tool's author goes into a good amount of detail about some relatively advanced combinatorics that is frankly over my head. Most users will only want all-pairs combinations, but it's comforting to know that the tool can generate triples, quadruples, and more in case I need them.

I find the user interface fairly straightforward, given that it's a command line application. While it certainly doesn't hold your hand, one or two simple examples make it clear how the tool functions in its basic use. Compiling the tool, and anything beyond the most basic use of it, requires some familiarity with a command line interface; Unix familiarity is a plus. Some of the tools I reference in this review (specifically wc and sed) are not natively available in Microsoft operating systems. I found Cygwin to be an excellent compromise as it provides a Unix-like environment under Windows.

The big question will come down to which tool is preferred, jenny or ALLPAIRS (discussed in the Similar tools section). Both are free and distributed with the source code. Jenny beats ALLPAIRS both in performance and optimization, though ALLPAIRS is easier to use.

The author released an updated version of jenny on 03/02/2004. The version fixes some known problems with the "-o" option and adds some internal checking to verify that all requested tuples have been covered. I have not rerun any of the performance tests to determine if they have been affected.

Maturity

4 - Beta (on a scale of 1-5)

I would consider this tool to be at the beta maturity level, with a couple of caveats. The developer created this utility and placed it in the public domain. As of this writing, the developer has created four releases since August 2003 but there is no formal schedule for future releases.

I encountered no crashes or serious errors when using the tool. Key features have been implemented, but it is not particularly polished. However, there is ample documentation, including detailed information available by running "jenny -h".

Project activity

3 - Stable (on a scale of 1-5)

I would rate the level of project activity as stable. Jenny is not two years old yet, but it has seen a couple of well documented revisions. There is no formal schedule for future updates.

Platforms

The program consists of a single C file. It appears to be ANSI C compliant. The author's web site includes a link to a pre-compiled Windows binary. I was able to compile the program with no problems using gcc on Windows 2000 with Cygwin, and on Red Hat Linux 9. It is likely to work on any system with a C compiler and a shell interface.

Support

I found no mention of formal support. However, the author's web site does include his email address and he was quite helpful to me, graciously answering several of my questions.

No publicly accessible version control system exists, though old versions of the code are available. The code is well-commented but does not include a version number. A change log is available on the web page. The tool does not have an explicit diagnostic mode.

Documentation

I think the documentation for this product is well done. It consists of a single web page organized into several categories.

The author does a good job describing the point of a tool like jenny, and follows up with basic syntax and a treatment of each of the command line options. Also covered are some tips for making jenny more useful (including using tools like wc and sed to filter and manipulate output), and a description of possible future enhancements. The author even lists a few competitive tools, both freeware and commercial.

For more information about all-pairs, see these papers: "Pairwise Testing" by Michael Bolton and "Efficient Testing Using the Pairwise Approach" by Bernie Berger.

Installation

There is no installer. The application consists of one C file, and after compilation there is just one executable file.

Implementation

The tool is implemented in C. I built it very easily using Cygwin/gcc on a Windows 2000 PC. The compiler reported no errors or warnings by default, although adding the -Wall option produced several warnings. There are about 1300 non-comment source lines and 320 lines of comments. The code is very well-formed and commented.

Performance

As with most any command-line application built with C using modern hardware, it's quite fast. I gave it some pretty large problem sets, most of them unreasonably large. Most were completed in under a second, with all finishing in under one minute. The bottom line is that I don't see performance being a limiting factor for anyone using the tool. Memory use by the tool is negligible in the basic examples.

I executed a few tests to see what kind of performance can be expected. All tests were done with the -n2 option (generate pair-wise tests) and run on an IBM ThinkPad with a 1.5 GHz Centrino and 768 MB RAM.

Jenny, run under Windows 2000 with Cygwin, compiled locally with gcc. The test runs used an equal number of dimensions and features within each dimension.

Problem size   Time         Generated Tests
3 x 3          0m0.016s     9
6 x 6          0m0.021s     50
9 x 9          0m0.086s     122
12 x 12        0m0.452s     232
15 x 15        0m1.865s     380
18 x 18        0m6.201s     563
21 x 21        0m17.217s    791
24 x 24        0m42.423s    1058

I think that in practical use, 9 x 9 or 12 x 12 would be considered large problem sets; they result in 122 and 232 test cases respectively, and both completed in less than one second. If I ran into anything larger than that I would probably work to logically break the problem down into more manageable components.

Similar tools

I think the closest and most sensible comparison is with James Bach’s ALLPAIRS tool, a similar command line program written in Perl (see the ALLPAIRS Test Case Generation Tool review from January 2003). There are a few commercial tools that generate pair-wise test cases, but Bach’s tool will draw the most comparisons. One example I ran had ALLPAIRS generating 44 test cases and jenny generating 43 test cases.

I was curious to see how the performance of jenny compared with ALLPAIRS, so I executed the same tests with that tool. Note how much quicker jenny is, and also how it generates a smaller and more efficient set of tests in every case. I am confident that the better performance is due to the fact that jenny is a compiled C program while ALLPAIRS is interpreted Perl. But again, for practical use, ALLPAIRS is fast enough for even moderately sized problems.

ALLPAIRS, executed on Windows 2000, using the precompiled allpairs.exe:

Problem size   Time           Generated Tests
3 x 3          0.039 s        10
6 x 6          0.379 s        54
9 x 9          3.657 s        139
12 x 12        19.963 s       272
15 x 15        1m13.291 s     447
18 x 18        3m39.952 s     676
21 x 21        9m15.540 s     949
24 x 24        21m21.560 s    1296
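For readers who want the two sets of measurements side by side, here is a small Python sketch that converts the reported times to seconds and works out the speed ratio and the reduction in test count for each problem size. The figures are copied from the two tables in this review; the helper to_seconds is my own and only handles the time formats that appear here.

```python
import re

# (problem size, jenny time, jenny tests, ALLPAIRS time, ALLPAIRS tests)
results = [
    ("3 x 3", "0m0.016s", 9, "0.039 s", 10),
    ("6 x 6", "0m0.021s", 50, "0.379 s", 54),
    ("9 x 9", "0m0.086s", 122, "3.657 s", 139),
    ("12 x 12", "0m0.452s", 232, "19.963 s", 272),
    ("15 x 15", "0m1.865s", 380, "1m13.291 s", 447),
    ("18 x 18", "0m6.201s", 563, "3m39.952 s", 676),
    ("21 x 21", "0m17.217s", 791, "9m15.540 s", 949),
    ("24 x 24", "0m42.423s", 1058, "21m21.560 s", 1296),
]

def to_seconds(text):
    """Convert strings such as '0m0.452s' or '1m13.291 s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?([\d.]+)\s*s", text.strip())
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))

for size, j_time, j_tests, a_time, a_tests in results:
    ratio = to_seconds(a_time) / to_seconds(j_time)
    fewer = 100 * (a_tests - j_tests) / a_tests
    print(f"{size:>7}: jenny {ratio:6.1f}x faster, {fewer:4.1f}% fewer tests")
```

Keep in mind that these are single timings on one machine, so the ratios are indicative rather than precise.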


ALLPAIRS offers an advantage to the non-technical tester. It allows, actually all but demands, that variables and their possible states be entered into a spreadsheet. ALLPAIRS is then run from the command line, dumping its output to the standard output, which is easily redirected to a text file and imported back into a spreadsheet. Since the tester's specific data was provided to the tool to begin with, the results are already in a readable format, with no need to translate data or map any cryptic output. In addition to the raw test cases, ALLPAIRS results include diagnostic information showing how the results were chosen and brief metrics on the pairings.

A glaring shortcoming of ALLPAIRS is the inability to specify impossible or nonsensical pairings. Users must either split their problem up into multiple problems, or rework the resultant test cases by hand. Jenny is slightly more daunting for a new user, and in some cases requires a more technical audience, but it does provide more flexibility. Jenny offers a few features that ALLPAIRS does not, but by far the most valuable is the ability to specify impossible or nonsensical pairings. Most of the time when I generate pair-wise test cases, I need to rule out some pairing that my application won't logically permit. While I prefer the self-verifying output of ALLPAIRS, I will most often give the nod to jenny based on the ability to rule out impossible pairings. Another factor in jenny's favor is that its solutions are notably better optimized than those of ALLPAIRS.

Limitations

Some of the command lines needed to use the tool can get rather verbose. It would be nice if the tool could use a configuration file to store common command-line options. The author mentioned this as a possible future enhancement in his notes.

Jenny's output is just a list of combinations of integers (representing different variables) and alpha characters (possible states for each variable); each row represents one test case.

Let's say your problem domain looks like this:

OS         Browser    JavaScript
Mac OS X   Mozilla    Enabled
Win 2000   Netscape   Disabled
Win XP

You would feed this info into jenny using a command like:

$ jenny 3 2 2

The output of jenny would look like this:

1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a

This is not the most useful output. The user is left to translate the output into their specific problem domain. In our example, “1a” maps to “Mac OS X”, “2b” maps to “Netscape” and so on.

See Appendix C for a more in-depth treatment of this issue.

Jenny does not generate any kind of debugging information to verify that it actually created all of the required pairings. In simple cases you can determine this manually, but this approach quickly becomes unreasonable in even moderately complicated scenarios. The adventurous can always walk through the code to verify the algorithm.
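Short of reading the C source, one way to gain confidence is to check the output independently. The following Python sketch is my own (it is not something jenny provides), and the function name missing_pairs is hypothetical. It rebuilds the full list of required pairs from the variable sizes and reports any that never appear in a block of jenny-style output. It assumes no variable has more than 26 states, so that lowercase letters suffice.

```python
from itertools import combinations, product
from string import ascii_lowercase

def missing_pairs(sizes, output):
    """Return the value pairs that no test case in jenny-style output covers."""
    # Every pair that should be covered, in jenny's notation, e.g. ("1a", "2b").
    labels = [
        [f"{dim}{ascii_lowercase[i]}" for i in range(size)]
        for dim, size in enumerate(sizes, start=1)
    ]
    required = set()
    for left, right in combinations(labels, 2):
        required.update(product(left, right))

    # Pairs actually present in each row of the output.
    covered = set()
    for line in output.strip().splitlines():
        covered.update(combinations(line.split(), 2))
    return required - covered

# Output of "jenny 3 2 2" as shown in Appendix B of this review.
output = """
1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a
"""

print(missing_pairs([3, 2, 2], output))  # set()
```

An empty set means every required pair appears in at least one test case. The check handles pairs only; higher-order tuples would need the same idea applied with larger combinations.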


The error messages that jenny displays are not always particularly helpful. For example, if I enter:

$ jenny -n33

This is displayed:

jenny: -n says all n-tuples should be covered.

A message explaining that I provided no dimensions would have been more appropriate. There are several similar examples of misdirected error messages, but I think that this limitation is somewhat mitigated by the good documentation and the ability to run “jenny -h” to get detailed usage instructions.


Appendix A: Additional examples


A simple example: if we have 3 variables, each with 2 possible states, the jenny command line will look like this:

$ jenny 2 2 2

The output will look like this, with 5 total test cases. ‘a’ is the first possible state of a variable, and ‘b’ the second possible state. If there were more states, the progression would continue through the alphabet.

1a 2b 3a
1b 2a 3b
1a 2a 3b
1b 2b 3b
1b 2a 3a

A slightly more complicated example - here we have 3 variables, 2 with 3 possible states, and 1 with 2 possible states:

$ jenny 3 3 2

The output will look like this, 9 total test cases:

1a 2c 3a
1b 2a 3b
1c 2b 3b
1b 2b 3a
1a 2a 3b
1c 2a 3a
1b 2c 3b
1a 2b 3b
1c 2c 3b

Appendix B: Dealing with impossible pairs


Jenny takes a significant step beyond ALLPAIRS: jenny permits users to specify nonsensical pairings, that is, pairings or states that the application under test is not capable of. For example, if we want to test various browsers and operating systems with a web site, my browsers to test with might be Safari, IE, and Netscape. My operating systems might be MacOS X and Windows 2000. My user types might be “returning user” and “purchaser”.

In this case my jenny command might look like this:

$ jenny 3 2 2

The output would look like this:

1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a

Now, we know that “1a” represents Safari, and “2b” represents Windows 2000. That combination appears in test cases 1 and 5. But this is a problem, as Safari is not available on that platform, so what can we do? Well, we can't just ignore those test cases because we would miss other valid pairings as well. We can manually dissect those test cases and reconstruct replacements by hand. However, that can be a bit tedious and error prone, especially in a non-trivial example. Fortunately jenny allows us to specify nonsensical pairings using the “-w” flag.
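To see why simply discarding the offending test cases is not enough, this Python sketch (my own illustration, not a jenny feature) finds the rows that contain the impossible pair and then lists the valid pairs that would be lost if those rows were dropped.

```python
from itertools import combinations

# Output of "jenny 3 2 2" from the example above.
output = """
1a 2b 3a
1b 2a 3b
1c 2a 3a
1b 2b 3a
1a 2b 3b
1c 2b 3b
1a 2a 3a
"""

forbidden = ("1a", "2b")  # Safari on Windows 2000

rows = [line.split() for line in output.strip().splitlines()]

def pairs(row):
    """All pairs of values that one test case covers."""
    return set(combinations(row, 2))

# Test cases (numbered from 1) that contain the impossible pairing.
bad = [n for n, row in enumerate(rows, start=1) if forbidden in pairs(row)]
kept = [row for row in rows if forbidden not in pairs(row)]

before = set().union(*(pairs(row) for row in rows))
after = set().union(*(pairs(row) for row in kept))
lost = before - after - {forbidden}

print(bad)   # [1, 5]
print(lost)  # {('1a', '3b')}
```

Dropping test cases 1 and 5 would leave the valid pairing of “1a” with “3b” untested, so a replacement test case would have to be built by hand.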

We can now try this - it instructs jenny to create all pair-wise test cases but to avoid pairing “1a” with “2b”:

$ jenny -w1a2b 3 2 2

The output would look like this. Note that we have covered all pair-wise test cases without ever pairing “1a” with “2b”:

1a 2a 3a
1b 2b 3b
1c 2b 3a
1b 2a 3a
1a 2a 3b
1c 2a 3b

Not only did jenny deal with our restriction, it even did so with one fewer test case than before. This is an important feature with plenty of real-world use potential. In my daily test case creation I run into scenarios like this regularly, and I suspect others do too.

Appendix C: Processing jenny's output


Earlier in this review I noted that regardless of what your problem domain looks like, the results generated by jenny will look like this:

1a 2b 3a
1b 2a 3b
1a 2a 3b
1b 2b 3b
1b 2a 3a

This is not particularly helpful when your problem domain looks like the following; you would be left to map the values by hand.

OS         Browser    JavaScript
Mac OS X   Mozilla    Enabled
Win 2000   Netscape   Disabled
Win XP

One of jenny's weaknesses is also a strength. Jenny works from the command line and writes to the standard output stream. This means that the command line parameters can be scripted and, more importantly, that results from jenny can be piped into something else. If you want a quick check of how many test cases jenny will generate for a given input, see the following two examples. Jenny's output is piped into the “wc” (word count) program; with the -l option, wc reports the number of lines it received, which in our case is the number of unique test cases.

In this case, jenny generated 7 unique test cases:

$ jenny 3 2 2 | wc -l
 7

In this case, jenny generated 34 unique test cases:

$ jenny 7 4 4 4 3 2 | wc -l
 34

Another powerful example of jenny's strength with respect to standard output is using another tool like “sed” to automatically translate the results into test cases we can use.

This command:

$ jenny 3 2 2 | sed -e 's/1a/MacOS_X/g' -e 's/1b/Win_2000/g' -e 's/1c/WinXP/g'

Will produce:

MacOS_X 2b 3a
Win_2000 2a 3b
WinXP 2a 3a
Win_2000 2b 3a
MacOS_X 2b 3b
WinXP 2b 3b
MacOS_X 2a 3a

The data can also be sorted in any manner you prefer, like this:

$ jenny 3 2 2 | sed -e 's/1a/MacOS_X/g' -e 's/1b/Win_2000/g' -e 's/1c/WinXP/g' | sort

MacOS_X 2a 3a
MacOS_X 2b 3a
MacOS_X 2b 3b
WinXP 2a 3a
WinXP 2b 3b
Win_2000 2a 3b
Win_2000 2b 3a

Note that for the sake of brevity and simplicity I have only translated the first column (the “OS” variable) of data. In practice you would translate all columns. As the sed commands get more verbose and wrap on the command line, it makes sense to place them all in a sed script and refer to the script. For example, if all of your sed commands are in a file named “sed_commands.sed”:

$ jenny 3 2 2 | sed -f sed_commands.sed | sort
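If you would rather not maintain one sed substitution per value, the same translation can be done for every column at once with a short script. The Python sketch below is an alternative of my own, not something shipped with jenny; the domain table and the translate function are illustrative, and it assumes each variable has at most 26 states so that a single lowercase letter identifies the state.

```python
from string import ascii_lowercase

# Values for each variable, in the order the sizes were passed to jenny.
domain = [
    ("OS", ["Mac OS X", "Win 2000", "Win XP"]),
    ("Browser", ["Mozilla", "Netscape"]),
    ("JavaScript", ["Enabled", "Disabled"]),
]

def translate(line):
    """Turn one row of jenny output, e.g. '1a 2b 3a', into named values."""
    values = []
    for token in line.split():
        dim = int(token[:-1]) - 1                   # '1a' -> variable index 0
        feature = ascii_lowercase.index(token[-1])  # 'a'  -> value index 0
        values.append(domain[dim][1][feature])
    return values

print(translate("1a 2b 3a"))  # ['Mac OS X', 'Netscape', 'Enabled']
```

In a pipeline you would call translate on each line read from standard input, in the same position where sed sits in the commands above.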