The Dirty Little Secrets of Real-World Math

July 13th, 2008

In my “Real-World Math” article that posted to StickyMinds on July 14, I didn’t have room to explore some of the dark secrets underlying the article. First, discussing mathematical topics with my peers and editors often gave me feelings of inadequacy. I mentioned probability theory, and one person started to talk about how he regularly uses combinatorics to analyze test inputs. The terms he used sounded vaguely familiar, but it would have taken a lot of study for me to really grok what he was doing. When he related a story about bivariate data, I knew I was really on shaky ground. When I posed a question on the Toolsmith Guild about using logarithms to speed up the search for the boundary between a pass and a failure in a test, people started talking about root-finding algorithms, slopes, and convergence, and I feared that I didn’t have the foundation for an intelligent conversation.

Still, I pressed on, determined to write about only things that I understand reasonably well. The finished result, though, doesn’t show how the sausage was made.

It’s all story problems

Remember how annoying it was to do “story problems” in math class, where you weren’t spoon-fed the numbers you needed to plug into a formula? Well, bad news - most real-world math challenges are story problems, and the relatively simple algebra example in my Real World Math article was one of them. I didn’t describe the difficulty I had in setting up the problem, starting with defining the variables. Is “D” the number of developers before or after some of them become testers? Do I need one variable for before and another for after? “T” is the number of testers, which is 0 right now, but may be non-zero later. I had a few false starts before I decided that it would work best for all variables to reflect the state of things after some of the developers had become testers.

I first set up the equations like this:

T = D / 4
T + D = 200

And solved for D like this:

T + D = 200
D / 4 + D = 200
D(1/4 + 1) = 200
D = 200 / 1.25
D = 160

This is somewhat more complex than the final version I settled on for the article, though it’s just as valid. It seemed a bit more straightforward to represent the first formula as “D = T * 4”, and I solved for T in the calculations instead. I was able to avoid the ugly “1.25” in the new version.
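The alternative setup, with the ratio written as D = 4T and T as the unknown, might work out as follows. This is a reconstruction from the description above, not a copy of the working in the published article:

```latex
\begin{align*}
D &= 4T \\
T + D &= 200 \\
T + 4T &= 200 \\
5T &= 200 \\
T &= 40, \qquad D = 4T = 160
\end{align*}
```

The result matches D = 160 from the first version, and no fractions appear along the way.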

I imagine that textbook authors go through this same process, and while it’s nice that the published version looks elegant and easy to understand, I think people can get very frustrated when their own calculations look uglier and more belabored than the polished examples in the books. Real-world math is uglier than the textbooks, and my own article, make it look.

Bad test code

While writing about the modulo function, I found a bug in my test code. I was keeping a count of the total number of virtual users that had been started, like this:

wlGlobals.userCount++
wlLocals.userNum = wlGlobals.userCount % wlGlobals.totalUsers

This code has a race condition, because it’s multi-threaded. Thread 1 could execute “wlGlobals.userCount++”, then get interrupted so thread 2 can run, which then also executes “wlGlobals.userCount++”, and when we eventually switch back to thread 1, it ends up using the same userCount value as thread 2. I explored ways to do locking to try to make the code thread-safe, but then found the built-in “ClientNum” variable, so I didn’t need to keep my own count after all.
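The interleaving described above can be replayed deterministically without any real threads. This is only a sketch: the two hashes are hypothetical stand-ins for wlGlobals and each thread's wlLocals, the total of 10 users is an assumed value, and the steps are simply written in the order the scheduler might have run them.

```perl
use strict;
use warnings;

# Replays the bad interleaving step by step, with plain hashes standing in
# (hypothetically) for wlGlobals and for each thread's wlLocals.
sub replay_interleaving {
    my %globals = (userCount => 0, totalUsers => 10);    # assumed values
    my (%thread1, %thread2);

    $globals{userCount}++;    # thread 1 increments, then is interrupted
    $globals{userCount}++;    # thread 2 increments
    $thread2{userNum} = $globals{userCount} % $globals{totalUsers};
    $thread1{userNum} = $globals{userCount} % $globals{totalUsers};    # resumes late

    return ($thread1{userNum}, $thread2{userNum});
}

my ($first, $second) = replay_interleaving();
print "thread 1 got $first, thread 2 got $second\n";
```

Both threads end up with user number 2, and number 1 is never handed out. The increment and the read would have to happen as one uninterruptible step to avoid this.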

I often find bugs in my test code when trying to explain it to other people. I should do that more often.

How does a logarithm work?

In the article, I write about my disk stress test that chooses a chunk size using this formula: “int exp(rand(log($filesize)))”. I admitted that I still don’t feel like I understand logarithms well enough to understand why this works, so I ran some experiments that showed that it does accomplish what I wanted. I knew I wanted a formula that used a logarithm function, and I used a good deal of intuition to cobble together the formula. Maybe my subconscious understands it, but I still have a nagging sense that I don’t understand what’s going on. I understand the basic concept. I can even draw the rough shape of it on a graph. But I still don’t feel I understand why generating a random number after taking the logarithm, then immediately reversing the logarithm, actually works. Maybe this is why we’re supposed to show our work when we do our math homework. In this case I guessed the answer without doing the work at all.
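An experiment along these lines might look like the sketch below. It is not the original experiment: the helper names, the file size of 1,000,000, and the sample count are all assumptions chosen for illustration. It counts how many generated chunk sizes fall into each power-of-ten range.

```perl
use strict;
use warnings;

# Draw one chunk size with the formula from the article.
sub chunk_size {
    my ($filesize) = @_;
    return int exp(rand(log($filesize)));
}

# Count how many samples land in each power-of-ten bucket.
sub decade_histogram {
    my ($filesize, $samples) = @_;
    my @counts;
    for (1 .. $samples) {
        my $size = chunk_size($filesize);
        $counts[ length($size) - 1 ]++;    # 1-9 -> bucket 0, 10-99 -> bucket 1, ...
    }
    return @counts;
}

my @counts = decade_histogram(1_000_000, 60_000);
for my $bucket (0 .. $#counts) {
    printf "sizes from %7d: %d samples\n", 10 ** $bucket, $counts[$bucket] // 0;
}
```

Each of the six ranges should receive roughly the same share of the samples, so sizes from 1 to 9 turn up about as often as sizes from 100,000 to 999,999. Small chunks are therefore far more common per individual size than large ones.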

While trying to study logarithms with Google’s help, I saw that the derivative of the logarithm function is the reciprocal function (1/x), and a graph of the reciprocal function looks just like I want the distribution of my test values to be. But I wasn’t about to try to bring calculus into the conversation. That would just ramp up my feelings of inadequacy even further.
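One way to see where the 1/x shape comes from is sketched below. It assumes rand returns a uniformly distributed value, writes N for the file size, and ignores the truncation done by int:

```latex
\begin{align*}
U &\sim \mathrm{Uniform}(0, \ln N), \qquad X = e^{U} \\
P(X \le x) &= P(U \le \ln x) = \frac{\ln x}{\ln N}, \qquad 1 \le x \le N \\
f(x) &= \frac{d}{dx}\,\frac{\ln x}{\ln N} = \frac{1}{x \ln N} \\
P(a \le X < 10a) &= \frac{\ln(10a) - \ln a}{\ln N} = \frac{\ln 10}{\ln N}
\end{align*}
```

The density f(x) is proportional to 1/x, and the last line says that every power-of-ten range is equally likely wherever it starts.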

I even resorted to slide rules. Having heard that slide rules use a logarithmic scale, I asked my father-in-law to get out his slide rule and show me how it works. He gingerly handed me his prized slide rule, which his own father had used. And I dropped it. Shattered the cursor beyond repair. Anyway, long story short, with eBay’s help I found a replacement cursor, and several slide rules I could keep for myself. But I still haven’t gotten that lesson, and the online tutorials I found have shown me that these things aren’t as simple as I had hoped. So while previous generations may have a deep understanding of what makes a logarithm tick, they have years of experience that my digital calculators have deprived me of.

Oh, one more thing. The first version of the disk stress algorithm that I used actually first calculated the position of the file to start writing from using a random number on a linear scale, then used the logarithmic scale to calculate the chunk size. The range of the chunk size in that case was from 1 to the end of the file from that starting position, making the average chunk size perhaps half as big as it is with the formula I described in the article. I decided it would be more straightforward to first calculate the chunk size, up to the full length of the file, then calculate a random starting position using a range of possible positions where the chunk wouldn’t run off the end of the fixed file size. So again, after having to explain my code, I found a way to improve it.
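The revised order of operations could be sketched like this. The function name pick_write is hypothetical, and the sketch assumes a file size of at least 2 bytes, because log(1) is 0 and Perl's rand treats an argument of 0 as 1.

```perl
use strict;
use warnings;

# Choose the chunk size first (log scale), then a start position that keeps
# the whole chunk inside the file. Assumes $filesize >= 2.
sub pick_write {
    my ($filesize) = @_;
    die "file size must be at least 2\n" if $filesize < 2;
    my $size  = int exp(rand(log($filesize)));      # at least 1, below $filesize
    my $start = int rand($filesize - $size + 1);    # 0 .. $filesize - $size
    return ($start, $size);
}

my ($start, $size) = pick_write(4096);
print "write $size bytes at offset $start\n";
```

Because the start position is drawn only from offsets that leave room for the chunk, no clipping at the end of the file is needed.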

Popularizing math

I think we need more authors to help popularize mathematical concepts, like Ivars Peterson did in Islands of Truth: A Mathematical Mystery Cruise. But Peterson talks about four-dimensional geometry and fractals, not more practical applications that people are likely to use regularly. I even had a hard time getting into George Polya’s classic book How to Solve It, which is often on recommended reading lists for computer professionals.

Is math just inherently too difficult to write about in a way that people understand? For my StickyMinds article, we even had to seek outside help to proofread the math. But I don’t think it’s impossible to make math approachable. What do you think?

Unit testing - meeting the Feathers standard

June 9th, 2008

In a posting about unit testing a few months ago (Real unit tests, and bugs that go THWACK), I talked about Michael Feathers’ rules for unit tests, from page 12 in Working Effectively with Legacy Code. He also has a 2005 blog posting about it (A Set of Unit Testing Rules), where he added one more rule, which I’ll come back to in a minute. Feathers says it’s not a unit test if…

  • It talks to the database
  • It communicates across the network
  • It touches the file system
  • You have to do special things to your environment (such as editing config files) to run it.

I don’t think these rules get nearly enough attention. But I’m proud to say that I’m one step closer to meeting the standard in my Perl code, because I’ve found a way to avoid the filesystem. I’ll use a trivial little class as an example. This class in the file “foo.pm” reads in a text file with lines formatted like this - “name=value”, and has a method for returning the value for a given name:

use strict;
use warnings FATAL => 'all';

package foo;

sub new {
  my $class = shift @_;
  my $file = shift @_;
  my %self;
  open FILE, '<', $file or die "open '$file': $!\n";
  while (<FILE>) {
    chomp;
    # Capture the whole name; (\w)+ would keep only its last character.
    /^(\w+)=(.*)$/ || die "syntax error in '$_'\n";
    $self{$1} = $2;
  }
  bless \%self, $class;
}

sub get {
  my $self = shift @_;
  my $name = shift @_;
  return $self->{$name};
}

1;

To avoid using the filesystem in the unit tests for this code, I need to mock out the filesystem operations. I imagine that there’s a way to override Perl’s builtin file operators like open and <>, but I found it easier to just change my code to use object-oriented file operations that will be easier to mock. So I changed the code like so:

use strict;
use warnings FATAL => 'all';
use IO::File;

package foo;

sub new {
  my $class = shift @_;
  my $fileName = shift @_;
  my %self;
  my $file = IO::File->new($fileName) or die "open '$fileName': $!\n";
  while (defined ($_ = $file->getline)) {
    chomp;
    # Capture the whole name; (\w)+ would keep only its last character.
    /^(\w+)=(.*)$/ || die "syntax error in '$_'\n";
    $self{$1} = $2;
  }
  $file->close;
  bless \%self, $class;
}

sub get {
  my $self = shift @_;
  my $name = shift @_;
  return $self->{$name};
}

1;

The behavior hasn’t changed, but now it’s easier to test. Here’s my fooTest.pm file with the beginning of a unit test suite -

package fooTest;

use strict;
use warnings FATAL => 'all';
use base 'Test::Unit::TestCase';
use Test::MockObject;

my $iomock;
BEGIN {
  $iomock = Test::MockObject->new;
  $iomock->fake_module("IO::File",
    getline => sub { },
  );
  $iomock->set_true("close");
  $iomock->fake_new("IO::File");
}

use foo;

sub new {
  my $self = shift()->SUPER::new(@_);
  return $self;
}

sub testEmpty {
  my $self = shift @_;
  $iomock->set_false("getline");
  my $foo = new foo("dummy");
  $self->assert_equals(undef, $foo->get("nothing"));
}

sub testTwoLines {
  my $self = shift @_;
  $iomock->set_series("getline", "a=1\n", "b=2");
  my $foo = new foo("dummy");
  $self->assert_equals("1", $foo->get("a"));
  $self->assert_equals("2", $foo->get("b"));
}

sub testBadSyntax {
  my $self = shift @_;
  $iomock->set_series("getline", "*=1\n");
  eval {
    new foo("dummy");
  };
  my $err = $@;
  $self->assert_matches(qr/syntax error/i, $err);
}

This test is based on the xUnit-style Test::Unit::TestCase, which requires just a bit more code to actually get it to do something -

>perl -e "use Test::Unit::TestRunner; Test::Unit::TestRunner->new->start(@ARGV)" fooTest.pm
...
Time:  0 wallclock secs ( 0.05 usr +  0.02 sys =  0.06 CPU)

OK (3 tests)

The tests use the Test::MockObject module to handle the dirty work of mocking the standard IO::File module. Instead of creating temporary files to use as test input, I simply tell the mock object what to return from the getline method. So I can stay out of the filesystem and not worry about creating and cleaning up temporary files. This also makes it easier to run the tests in parallel. In fact, here’s the extra rule that Feathers added in his blog post - it’s not a unit test if…

  • It can’t run at the same time as any of your other unit tests

I don’t think my test above meets this standard, because the $iomock object is shared among all the tests. I’m not sure yet how to overcome that limitation. Also, some of the MockObject library is still a mystery to me, like how to mock the constructor for a class. But I’m one step closer to the Feathers standard.
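A different technique that avoids shared mock state is Perl's in-memory filehandle: opening a handle on a reference to a scalar, which is core Perl and needs no extra module. The sketch below is not the Test::MockObject approach used above. It assumes the parsing logic is moved into a hypothetical parse_pairs function that accepts an already-open handle, which would be a change to the class's interface.

```perl
use strict;
use warnings;

# Hypothetical variant of the parser that reads from any filehandle.
sub parse_pairs {
    my ($fh) = @_;
    my %pairs;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        $line =~ /^(\w+)=(.*)$/ or die "syntax error in '$line'\n";
        $pairs{$1} = $2;
    }
    return \%pairs;
}

# Each test can build its own private in-memory "file".
my $text = "a=1\nb=2\n";
open my $fh, '<', \$text or die "in-memory open: $!\n";
my $pairs = parse_pairs($fh);
close $fh;
print "a=$pairs->{a} b=$pairs->{b}\n";
```

Since every test creates its own scalar and handle, nothing is shared between tests, and the filesystem is still never touched.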

Lower impact travel

May 22nd, 2008

After flying in to Boston’s Logan Airport and lining up in the taxi queue, I was surprised to see a sign saying that I could request a hybrid vehicle. And in fact, I saw a hybrid in the queue about to leave with someone. So when I got to the front of the line, I decided to find out how much longer I would have to wait if I requested a hybrid. No hybrids were waiting, so I stood to the side after making my request. After a few minutes, the staff started kicking cabs out of the line, saying they were making room for a hybrid. After a total wait of less than five minutes, a hybrid 2007 Toyota Camry pulled up, jumped to the front of the queue, and we were on our way. I speculated with the driver how long it would be before most cabs were more efficient, and the hybrid privileges would no longer be offered.

Another thing that lowered the impact of this trip was that for one segment of a multi-city itinerary, I rode on Amtrak. Five hours from Boston to Philadelphia, during almost all of which I was getting work done, plugged into a 120 volt outlet, fielding phone calls, and not being told when I couldn’t stand up or lean my seat back. This was much cheaper for my client than a plane ride, and because the only last-minute options I had for air travel took me far out of the way on a connection, the direct train trip was also the fastest option. Instead of flying over all the intermediate states, I felt somewhat more like I was actually visiting them, however briefly. Not having checked the details of the route, I was surprised when we made a stop at Penn Station, right underneath midtown Manhattan. I suppressed the urge to run upstairs to the street and buy a hot dog.

One more small thing I’ve started doing is taking the mostly new bars of soap I use at the hotel with me in a plastic bag when I leave. It’s amazing how long they last, and how many new bars I would have otherwise opened and left to be thrown out. Of course, I also hang up my towel and don’t request that the sheets be changed every day if I have that option, though I usually find the towels dry and neatly folded in the evening, so I have to wonder if the housekeeping staff isn’t replacing the towels anyway.

Did you catch the flaw in my hybrid story? I neglected to check whether the hotel had a shuttle service. If the shuttle would be making the trip anyway, my presence on it would have little impact, even if it had a high-emission engine. No matter what we do, there is always someone who is greener-than-thou and willing to sacrifice more convenience for the greater good.

Going back to the CAST conference

May 15th, 2008

I’ve just registered for CAST 2008, the third year for the Conference of the Association for Software Testing. This time it’s in Toronto, July 14-16. Let me know if you’re going too.

Trying to get home from my previous trip to Toronto, I was stranded for two days in a snowstorm. I’m hoping it’ll be a better trip in July.

Freelancers Are People Too, But Different Kinds of People

May 6th, 2008

Jason Gorman again inspires me to write, in response to his post Freelancers Are People Too. He says that most people who write code for a living are freelancers. That’s an interesting statistic, and if it’s true, I would be curious to see a source for it.

What really got me thinking was his story about a survey that one of his clients made of its employees. Freelancers were not included in the survey, and Jason claims that this was a mistake. I don’t know what the survey was about, but Jason makes a reasonable argument that freelancers have opinions that could be valuable to their clients. I’ve seen cases myself where organizations sought feedback from employees but not contractors.

But between the lines, Jason seems to be saying that he doesn’t like being treated like a second-class citizen. I don’t really share this concern. I never forget that I am a service provider for the organizations I’m working for. In fact, in the US the IRS demands that I not act too much like an employee, lest they judge that I actually am an employee and make the client pay additional taxes. There are many things going on between employee and employer that I’m not a part of, and I’m thankful to not be involved in some of them.

A contractor relationship is subtly but qualitatively different from an employee relationship. I want my clients to be successful, but I’m also concerned with making my own business successful. There’s a clear and contractually-defined line between them and me. I still get to feel the thrill of victory when a project is successful, and layered on top of that is the thrill of success in my business and the improved market perception of my services. I also know my engagement with them could vanish in the blink of a budget shortfall.

Another factor that Jason may have in mind is contractor retention. Employers take steps to encourage their employees to keep working for them, but don’t put as much effort into keeping their contractors happy. Given the prevalence of contract labor in the software industry, an investment in “contractor retention” may be worthwhile. In Jason’s case, the client he was frustrated with may have been risking his early departure.

Public courses offered for test management and performance testing

May 1st, 2008

I have added three public course dates to my training offerings, all via Rex Black Consulting Services. I will be teaching all of these -

Managing the Testing Process (brochure):

June 25-27, 2008 – Dallas, TX

Performance Testing Immersion Workshop (brochure):

September 22-24, 2008 – Denver, CO

October 21-23, 2008 – Austin, TX

If you sign up, please tell them I sent you!

What hockey sticks and vegetables have to do with software performance

April 21st, 2008

My StickyMinds column posted today - Peeling the Performance Onion, co-authored this time with Rex Black. When I taught the Performance Testing Immersion Workshop recently, the students were surprised to learn about hockey-stick shaped performance graphs. They also learned about how performance tuning is similar to peeling an onion, and in the article I extended the vegetable theme by applying Rudy’s Rutabaga Rule. (By the way, I’m amazed that my new column is already on the first page in a Google search for “Rudy’s Rutabaga Rule.” And now even this blog entry is also on the first page.)

So anyway, we decided to share these concepts on StickyMinds. How well do these ideas fit with your experience? Please post your comments here or on StickyMinds.

Real unit tests, and bugs that go THWACK

April 1st, 2008

Jason Gorman posted about “The Future Of TDD - Real-Time Feedback As You Type?” on his blog. He says that unit tests should give us feedback as fast as the feedback he got when dropping his security badge -

“On my way back from lunch today, I was walking down the stairs when I heard a loud “thwack” and turned round to see that my security pass had come loose from its clip and fallen to the floor.”

Wouldn’t it be great if bugs made a thwacking sound the moment they landed in our code? He goes on to imagine an IDE someday running unit tests as we’re typing code so we get real-time feedback. He suggests that if it takes five minutes to run unit tests, that’s far too long, so we’ll have to wait until computers get much faster before his idea is feasible.

But I would suggest that if we have unit tests that take five minutes to run, we may not be doing unit testing properly. I recently wrote a unit test for a Perl module using Perl’s xUnit - Test::Unit. There are 18 tests, achieving close to 100% coverage, and they run in about 200 milliseconds of CPU time and well under a second of elapsed time. I like Michael Feathers’ criteria for unit tests, which say that they don’t talk to a database, don’t communicate across a network, don’t touch the filesystem, don’t require configuration files, and if a test case takes longer than 1/10 of a second, it’s glacially slow (see Working Effectively with Legacy Code). I did violate one of these, because my tests do create and delete files several times; I haven’t found a good way to mock file access in Perl. But I’m happy with the subsecond run time.

It’s conceivable that a large system could have five minutes’ worth of well-written unit tests. Jason hinted that he’s thinking of running all of the system’s unit tests when he says that we could try to take a shortcut and only run the unit tests that are dependent on the code that changed. But that doesn’t sound like unit testing to me; it sounds more like integration testing. I do have suites of subsystem tests and system tests that take more than a minute to run, but I don’t call them unit tests. So we’re probably using different definitions of what a unit test is. I know that many people use the term “unit test” very loosely, which tends to mask the fact that people often aren’t really getting the benefits of good unit testing.

So I think that Jason’s desire for very quick feedback is closer to our grasp than he thinks. We need to test our code in extreme isolation from the rest of the system, so that as many bugs as possible will make a loud “thwack” in the unit test environment, before we pull out the slow integration tests.

Update: seeing the inelegance of “close to 100%”, I added a few tests and now have 100% branch and condition coverage for my class. But I still have much to learn about migrating my black box testing skills to unit testing.

Gray Matters podcast on the Boneyard and the test tools market

March 26th, 2008

SQE’s Joey McAllister interviewed me for the March 2008 Gray Matters podcast (click the link for the mp3 file), where we talked about the testingfaqs.org Boneyard and the strength of the worldwide test tools market.

International consultant

March 5th, 2008

What is an “international consultant?” Someone willing to travel to help clients in any country? If that’s the case, I’ve been an international consultant since I got my passport shortly after opening my consulting practice, just in case. But that doesn’t seem sufficient for the title “international consultant.” How about having clients in at least eight countries? But I hadn’t actually left home to do what I do for them, so that didn’t seem like good enough either.

Perhaps tonight I’m a real international consultant. I’m sitting in a hotel not far from Toronto, Canada, where I’m teaching a course. So though I’m barely across the US-Canada border, I’ve finally used my passport for something other than a pleasure trip. Sorry to dispel the image if you thought I was already a globe trotting road warrior. :-)

It doesn’t matter to me what you call me, as long as people keep calling me with interesting software testing challenges I can help with.