Randomness in Tests
Wouldn't it be cool if a few uses of the random module in your unit tests could discover bugs in your code? Meh.
The problem is that random tests are non-deterministic, meaning they lack the highly desirable property that if they pass in my local development environment, they'll also pass on the build server.
But... you might find bugs! Practically for free!
Martin Fowler went so far as to say that non-deterministic tests are: 1. "useless" and 2. "a virulent infection that can completely ruin your entire test suite"
Let's say you have a few tests that just fail every now and then. Maybe they're integration tests that are impacted by the load on some server. It would be great to better isolate the code under test so that doesn't happen, but that just hasn't hit the top of the priority list.
This trains the team to just mash the Rebuild button on the build server until the tests pass. What if there was a real bug in there? Curiosity is overcome by the need to get a build, so it would probably go unnoticed.
While accidentally non-deterministic tests are unfortunate, deliberate randomness is annoying. I saw a test recently that randomly picked one of a half-dozen enumeration values for any run. It failed once. Huh, that's weird. Run again, it passed. Run it three more times and I can't get it to fail. OK, whatever.
There was an actual bug there -- one of the enum values had the wrong numerical value associated with it, and the test caught it, but so rarely that it just seemed like some kind of fluke. It probably would have been better to run the test with all of the enum values.
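Running the test with all of the enum values might look roughly like the sketch below. The `Color` enumeration, the `EXPECTED_CODES` table, and the `color_code` function are hypothetical stand-ins, since the original code isn't shown here.

```python
import enum


class Color(enum.Enum):
    # Hypothetical enumeration standing in for the one in the anecdote.
    RED = 1
    GREEN = 2
    BLUE = 3


# Hypothetical expected values, written out by hand so the test does not
# simply echo the enum definition back at itself.
EXPECTED_CODES = {"RED": 1, "GREEN": 2, "BLUE": 3}


def color_code(color):
    # Hypothetical function under test: returns the numeric code for a member.
    return color.value


def test_every_color_has_expected_code():
    # Loop over every member instead of picking one with random.choice(),
    # so a wrong value fails on every run rather than one run in six.
    for color in Color:
        assert color_code(color) == EXPECTED_CODES[color.name], color
```

With a test runner such as pytest, the same idea can be expressed as a parametrized test so that each enum value is reported separately.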
So no randomness at all in tests?
Well, sometimes it is really convenient to use randomness. For example, if you have database models with UUIDs as primary keys, it gets awkward if you aren't allowed to generate random IDs for test objects.
Or maybe you're using something like Factory Boy. That library implements the "Object Mother" pattern, wherein a factory creates objects with as many default values as possible, which is convenient for testing.
For example, let's say you want to create a few User objects, and a user has a bunch of required fields, like first name, last name, email, phone number, address, birth date, etc. With Factory Boy, I can just say:
    user1 = UserFactory.create()
    user2 = UserFactory.create()
    user3 = UserFactory.create()
Each created user gets default values for each field, though I could pass in an explicit value for any of them, should that be relevant to the test. They're also unique if needed, like if it is invalid for two users to have the same email. This is really nice because it doesn't obscure the logic of the test with tons of irrelevant setup -- if I don't care what a specific field value is, it just gets filled in.
Even so, you can meet these requirements without randomness. If you don't care what a user's email address is, just that it is unique, a counter (firstname.lastname@example.org, etc.) works just as well as "realistic" data randomly generated by a fake-data library.
So here are some guidelines for randomness:
OK, fuzz testing is inextricably tied to randomness, and can be a valuable quality and security tool. But I think it is important to separate functional and fuzz testing. The functional tests that would make up the commit phase of your build pipeline should be fast and deterministic. A later fuzz phase can take longer and, until a very high level of maturity is achieved, be run with the expectation that some human analysis will be required. Even for fuzz testing the second two guidelines above still apply.
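A separate fuzz phase could be sketched along these lines. The function under test, `parse_percentage`, is invented for the example, and passing an explicit seed so that a failing run can be replayed is an assumption of this sketch rather than something the text above prescribes.

```python
import random


def parse_percentage(text):
    # Hypothetical function under test: accepts integers from 0 to 100.
    value = int(text)
    if not 0 <= value <= 100:
        raise ValueError(f"out of range: {value}")
    return value


def random_inputs(seed, count):
    # A private generator seeded explicitly: the same seed always yields
    # the same sequence of inputs, so a failure can be reproduced.
    rng = random.Random(seed)
    alphabet = "0123456789-+ x"
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 5)))
        for _ in range(count)
    ]


def fuzz_parse_percentage(seed, count=1000):
    for text in random_inputs(seed, count):
        try:
            result = parse_percentage(text)
        except ValueError:
            continue  # rejecting malformed input is the expected behaviour
        # Anything accepted must be in range; the message carries the seed
        # and the input so a human can replay the failing case.
        assert 0 <= result <= 100, f"seed={seed} input={text!r}"
```

Because the seed is an argument, the build server can pick a fresh one per run and print it, while a developer investigating a failure reruns with the printed value.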