Hot on the heels of my previous blog post, “Is It Automated?”, I found myself pondering another question: “Do automation scripts always pass or fail?” As with my first StickyMinds article, the prompt was an email exchange with automation expert Greg Paskal; I blame and thank him for being this post’s catalyst.

Greg sent a question to our colleague Mark Bentsen and me asking about vocabulary for non-Boolean test results. His question pertained specifically to test scripts that were likely to fail eventually. Some of his results indicated that the scripts were taking longer and longer to execute. They were not yet reporting failures, but the behavior indicated a looming issue. Mark had the idea of adding another column to the execution report that depicts other aspects of the results, in addition to the original pass/fail. This idea is a quick, relatively inexpensive way to report the additional information, but it didn’t sit well with me…at first.

Let me explain.

I’ve long been a fan of pass/fail results for traditional “test-script-based” automation: if no assertions fail, the script has a result status of “pass”; otherwise, the status is “fail”. A status of “fail” is an alert that someone needs to evaluate and act upon the results.

At a previous company, we had result statuses of “pass”, “fail”, and “warn”; “warn” was required by the test team. When pressed for how “warn” would be used, the response was “to indicate that the results would need to be looked at to determine pass or fail”. Fair enough; today I’d call those instances something like indeterminate, but I was not so sophisticated back then. I was, however, sophisticated enough to notice that there was a bit of “metrics massaging” when reporting “percent pass” and, in most cases, those scripts with a result status of “warn” would get treated as “didn’t fail”, and therefore “pass”.

In retrospect, most teams I’ve worked with eventually build two buckets of these result statuses. These buckets generally take one of two formats:

  • Pass or “didn’t pass”, where all “didn’t pass” scripts are treated as a fail
  • Fail or “didn’t fail”, where all “didn’t fail” scripts are treated as a pass

These realizations shaped my opinion to be “if we aren’t going to treat scripts with different results differently, then why bother with different results.”
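To make the “metrics massaging” concrete, here is a minimal sketch of how the same run yields two different “percent pass” numbers depending on which bucket “warn” gets folded into. The function name `percent_pass`, the `treat_warn_as` parameter, and the sample run are all made up for illustration; they are not from any real reporting tool.

```python
def percent_pass(statuses, treat_warn_as):
    """Collapse 'warn' into 'pass' or 'fail', then compute percent pass.

    Hypothetical helper: 'statuses' is a list of 'pass', 'fail', or 'warn'.
    """
    if treat_warn_as not in ("pass", "fail"):
        raise ValueError("treat_warn_as must be 'pass' or 'fail'")
    if not statuses:
        raise ValueError("no results to report")
    # Fold every 'warn' into the chosen bucket; leave other statuses alone.
    collapsed = [treat_warn_as if s == "warn" else s for s in statuses]
    return 100.0 * collapsed.count("pass") / len(collapsed)


# Made-up run: 7 passes, 2 warns, 1 fail.
run = ["pass"] * 7 + ["warn"] * 2 + ["fail"]

print(percent_pass(run, treat_warn_as="fail"))  # pass / "didn't pass" -> 70.0
print(percent_pass(run, treat_warn_as="pass"))  # fail / "didn't fail" -> 90.0
```

Same ten scripts, a twenty-point swing, and in both cases the “warn” results have vanished from the report.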

Fast forwarding to today, Greg’s question and Mark’s response got me thinking: what are other legitimate states of automation results apart from pass and fail? I discovered that, for me, the problem isn’t really the non-Boolean result statuses; the problem is treating different result statuses the same. The key is presenting the statuses in a way that screams “Hey human, come look at this! There may be a problem!” so that they are not treated the same.

Great! We figured out that we need to present statuses in a way that is understandable by our team and also conveys the appropriate sense of urgency (added dimension on a dashboard, additional result status types, etc.). Now, we need the data to determine and present our new reporting facet(s). In Greg’s case, he had the data to make the “slowing down” determination, but many of us may not be logging at this level of detail. If not, we need to log additional data from our automation executions. We should look at our logs and determine if we have or could add information that would help us add one or more reporting facets. If we have access to them, cross-referencing automation logs with the logs from the program being tested may be immensely helpful. Imagine being able to mark a test script’s results as “indeterminate” because all the steps passed but our script detected a warning in the product’s execution logs!
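As a rough sketch of what such a facet could look like, the snippet below derives a third status from two extra pieces of data: a history of execution times and the product’s log lines. Everything here is an assumption for illustration: the names `is_slowing` and `classify`, the window of 3 runs, the 1.25x threshold, and the idea that product warnings contain the text `WARN`. It is not how Greg made his determination; it is just one simple way to do it.

```python
from statistics import mean


def is_slowing(durations, window=3, threshold=1.25):
    """Return True if the last `window` runs are notably slower than earlier runs.

    Hypothetical heuristic: compare the mean of the most recent runs to the
    mean of all earlier runs. The window and threshold are arbitrary defaults.
    """
    if len(durations) < 2 * window:
        return False  # not enough history to judge a trend
    baseline = mean(durations[:-window])
    recent = mean(durations[-window:])
    return recent > baseline * threshold


def classify(assertions_passed, durations, product_log_lines):
    """Map one script's data to 'pass', 'fail', or 'indeterminate'."""
    if not assertions_passed:
        return "fail"
    # Assumption: the product under test marks warnings with the text 'WARN'.
    product_warned = any("WARN" in line for line in product_log_lines)
    if is_slowing(durations) or product_warned:
        return "indeterminate"  # "Hey human, come look at this!"
    return "pass"


# Made-up execution times in seconds, oldest first.
history = [10, 10, 10, 10, 11, 14, 16, 18]
print(classify(True, history, []))
```

The point of keeping “indeterminate” as its own return value is that the report can then present it differently from both “pass” and “fail”, instead of folding it into one of the two buckets.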

I’ve focused on automation results in this post because that’s what Greg’s email was about. It’s possible that this extends to non-automation results as well, but, to paraphrase Alton Brown, that’s a topic for another show. Also, a big thanks to Greg and Mark for the inspiration.
