There is a notion, attributed to Heraclitus, that you cannot step into the same river twice. The general idea, as I understand it, is that once the water in the river passes you, the river is necessarily different, and, over time, you are also different. The exact water has passed, some sediment has eroded, the fish that was there is now downriver, you are older or more experienced, and so on. The river has changed and so have you.

Let’s apply that notion to software testing. Can you run the same test twice? In general, no, you can’t, at least not easily. At the very least, the time of day is different. Oh, wait, we have VMs and containers where we can control the date and time; outstanding! But what about the underlying system that runs the VM or the container: what’s its time of day, and does it matter for our testing purposes? What about threading and concurrency? Did all the operations execute in the exact same order, for the exact same duration, with the exact same delay between each of the operations? What about the stimuli testers cause? Are they issued at the exact same relative time every time? What about the responses? Are they received at the exact same relative time every time? Does checking the responses take the exact same amount of time every time?
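
To make that concrete, here is a minimal sketch in Python (purely illustrative; the workload and run count are arbitrary stand-ins) showing that even a trivial test step never takes the same amount of time twice:

```python
import time

def run_once() -> float:
    """Time one execution of the 'same' test step."""
    start = time.perf_counter()
    _ = sum(i * i for i in range(100_000))  # stand-in for real test work
    return time.perf_counter() - start

durations = [run_once() for _ in range(5)]
print([f"{d:.6f}s" for d in durations])
# Five runs of identical code, five different durations -- the river moved.
```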

The answer to these questions is no, not at all. If we look at testing from this standpoint, it is not possible to run the exact same test twice. But does that matter?

In a previous blog post, I wrote about appreciable vs. non-appreciable differences in our applications and how those differences affect our testing and automation. Briefly: appreciable differences are those that present intolerable risks if our automated test scripts treat them as equivalent; all other differences are non-appreciable. I’ll use the same concepts here, but I’ll apply them to the transient nature of our computer systems.
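
As a hypothetical illustration of how that judgment shows up in automation code, the sketch below strips fields we’ve decided are non-appreciable before comparing two responses; the field names and response shapes are invented for illustration:

```python
# Fields whose differences we have judged tolerable for this application.
NON_APPRECIABLE_FIELDS = {"timestamp", "request_id"}

def normalized(response: dict) -> dict:
    """Drop fields whose differences we have decided are non-appreciable."""
    return {k: v for k, v in response.items() if k not in NON_APPRECIABLE_FIELDS}

def assert_equivalent(expected: dict, actual: dict) -> None:
    assert normalized(expected) == normalized(actual), (
        f"Appreciable difference: {normalized(expected)} != {normalized(actual)}"
    )

assert_equivalent(
    {"status": "created", "timestamp": "2021-06-01T00:00:01Z"},
    {"status": "created", "timestamp": "2021-06-01T00:00:02Z"},
)  # passes: only a non-appreciable field differs
```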

For most of the applications we test, differences in the time of day at which our tests or automation run are non-appreciable differences. Running a test script that adds a new customer to a database is not usually sensitive to time. This stance changes, however, if we are testing an application or feature that is necessarily sensitive to specific dates, days, or times. Think about an application that is supposed to start a new log file at midnight every day. Certainly, we need to test across the midnight boundary. Similarly, do we have operation timeouts that report errors or cause retries on timer expiry? If so, time also matters in those scenarios, and the ability to control time and date on the systems under test matters as well.

What if the timeout timer started just before a switch to or from Daylight Saving Time (DST) or British Summer Time (BST)? We need multiple test scenarios, i.e., test cases or test automation scripts, to check for these identified boundary conditions. There are many more of these kinds of boundary conditions when testing capabilities that are sensitive to times and dates. Check out Noah Sussman’s killer blog post, Falsehoods Programmers Believe About Time. I’ll wager there are many items in that blog for which we don’t routinely test; the question is, however, under which circumstances do any of those items become appreciable differences?
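
One way to get that control, sketched below with an invented rotate-at-midnight logger, is to inject the clock into the system under test so a test can step it across the boundary directly instead of waiting for real midnight:

```python
from datetime import datetime, date

class DailyLogFile:
    """Hypothetical logger that starts a new file when the day changes."""

    def __init__(self, now=datetime.now):
        self._now = now                          # injected clock; defaults to the real one
        self.current_day: date = self._now().date()
        self.rotations = 0

    def write(self, message: str) -> None:
        today = self._now().date()
        if today != self.current_day:            # crossed midnight since last write
            self.current_day = today
            self.rotations += 1
        # ... write message to the current file ...

# The test controls "now" directly, stepping across the midnight boundary.
fake_times = iter([
    datetime(2021, 6, 1, 23, 59, 58),  # consumed by the constructor
    datetime(2021, 6, 1, 23, 59, 59),  # first write, same day
    datetime(2021, 6, 2, 0, 0, 1),     # second write, after midnight
])
log = DailyLogFile(now=lambda: next(fake_times))
log.write("before midnight")
log.write("after midnight")
assert log.rotations == 1, "expected exactly one rotation at the boundary"
```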

Noah’s post also applies to automators, right? I keep saying that automation development is software development; automators need to have a gander at his blog post as well. What assumptions do we make about time and, by extension, dates? Are we timing roundtrip messages? Or GUI interactions? What about the timestamps in our automation logs? What happens if a run spans that DST/BST switchover?
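
Two habits help our automation code sidestep that trap, sketched below (the roundtrip function is a hypothetical stand-in): measure durations with a monotonic clock, which never jumps at a DST switchover, and log timestamps in UTC rather than local time:

```python
import time
from datetime import datetime, timezone

def do_roundtrip() -> None:
    """Stand-in for the real message roundtrip or GUI interaction."""
    time.sleep(0.05)

start = time.monotonic()   # monotonic clocks are immune to wall-clock jumps
do_roundtrip()
elapsed = time.monotonic() - start

# Log with an unambiguous, zone-aware timestamp.
print(f"{datetime.now(timezone.utc).isoformat()} roundtrip took {elapsed:.3f}s")
```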

Time is but one of the differences we encounter in software testing and automation that cause us to “step into a different river”. Machine load, network load, and third-party responsiveness are a small subset of the other differences in our application and test environments that change the nature of “the river” or of ourselves. We as testers and automators change as we grow and learn; things we thought were appropriate yesterday may not be appropriate today.
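
For load-related fluctuations in particular, one common coping pattern is to poll for a condition with a deadline rather than asserting after a fixed sleep. A minimal sketch, with illustrative timeouts:

```python
import time

def wait_until(condition, timeout: float = 10.0, interval: float = 0.25) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline

# Usage: tolerate a slow dependency without tolerating a broken one.
start = time.monotonic()
became_ready = wait_until(lambda: time.monotonic() - start > 1.0, timeout=5.0)
assert became_ready, "condition never became true within the deadline"
```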

For a specific application, e.g., your application, do these fluctuating factors matter? They might, but in general, they probably don’t. Sometimes. Maybe. Usually. We need to assess our own specific systems to understand what matters, what doesn’t, and what cost-versus-risk trade-offs we are accepting with our decisions. As always, it depends; it depends on our context (y’all tired of hearing that yet?).

Like this? Catch me at an upcoming event!