
This is a software testing blog I wrote sporadically for a few years between September 2013 and January 2017.


Oracles: for problems and planning

An introduction to how oracles are useful thinking tools for finding and justifying problems

An important part of testing is finding problems, or at least potential problems, with the product you are testing. As Michael Bolton says, as testers we ought constantly to be asking ourselves the question: “is there a problem here?”

So, how do we answer that question? When you see something that doesn’t look right, how do you know it’s a bug?

When asked that question, most testers will usually come up with variations on some or all of the following:

  • “I know it’s a bug because it doesn’t meet the acceptance criteria”
  • “I know it’s a bug because it wasn’t like that before”
  • “I know it’s a bug because it seems wrong”
  • “I know it’s a bug because there was an error”

These statements, though, as well as many others like them, are quite shallow attempts to communicate the real reason for the belief that something constitutes a problem. What’s happening underneath, usually subconsciously, is that the tester is applying an oracle.

An oracle is a heuristic principle or mechanism by which we recognize a problem.

Heuristics

Before we delve further into oracles, we need to take a little bit of time to talk about that word heuristic that I just used, otherwise things are going to get confusing.

A heuristic, in testing terms, is a fallible method for solving a problem or making a decision. So, it’s pretty much a fancy way of saying “rule of thumb”. In other words, you can use a heuristic and sometimes it will work, and sometimes it won’t. Because it’s not absolute, it needs to be used with skill and judgement.

James Bach explains this nicely with the idea that a hammer is a heuristic:

a hammer can help a carpenter solve a problem, but does not itself guarantee the solution … I like this example because it’s so easy to see that a hammer may be critical to a skilled carpenter while being of little use to an unskilled lout who doesn’t know what to pound or how hard to pound it or when to stop pounding

In this sense, we must apply the same skill and judgement when we employ heuristics to help us with our testing, which is to say we should strive to be conscious of the heuristics we deploy, lest we expose ourselves as unskilled louts.

Now, let’s distinguish between heuristics and oracles.

All oracles are heuristic, but not all heuristics are oracles. To put it another way, you could say that all oracles have the property of being heuristic – that they are fallible. However, not all heuristics can be applied as oracles – to help us recognize a problem.

Michael Bolton draws an analogy to clarify: all iPhones are smartphones, but not all smartphones are iPhones. In the same way, oracles are a subset of heuristics, but there exist heuristics that are not oracles.

Probably the simplest way to put it, for our purposes here, is that heuristics are fallible methods for solving a problem or making a decision; while oracles are a special kind of heuristic that helps us answer one specific question: is there a problem here?

So, back to oracles (which we know are heuristic)

As we established earlier, an oracle is a heuristic principle or mechanism for recognizing a problem.

So, how do they help us to do that?

Earlier, we saw some typical ways in which testers recognize problems with the software that they’re testing:

  • “I know it’s a bug because it doesn’t meet the acceptance criteria”
  • “I know it’s a bug because it wasn’t like that before”
  • “I know it’s a bug because it seems wrong”
  • “I know it’s a bug because there was an error”

However, what the tester really means here is:

  • “I suspect it’s a bug because it’s inconsistent with claims made about the product”
  • “I suspect it’s a bug because it’s inconsistent with the historical state of the product”
  • “I suspect it’s a bug because it’s inconsistent with the desires of a reasonable user”
  • “I suspect it’s a bug because it’s inconsistent with the conventions of the product itself”

First, let’s note that we no longer know, we now suspect there’s a problem. This is because we now know that oracles are heuristic. In other words, while we have applied an oracle, we’re aware that that doesn’t guarantee that what we’ve observed is objectively a problem.

As an example, imagine you’re testing a revamp of the UI for your website. The whole interface has been re-thought and re-designed to enhance its consistency with usability standards and modern web conventions. Initially it will feel alien and difficult to you and established users, and from a usability perspective it will feel almost entirely inconsistent with the historical state of the product.

However, the website is now more consistent with the desires of a reasonable user (who isn’t pre-conditioned to the previous interface) and more consistent in terms of the conventions it sets for itself (especially if the previous design had been arrived at piecemeal, over time).

In this instance, you’re unlikely to report bugs because of inconsistency with the historical state of the product, as consistency with user desires and product conventions is more important.

The second thing to note is that the structure of each statement is essentially the same:

  • I suspect it’s a bug because X is inconsistent with Y

Here, X is the behavior or property that the tester has observed, while Y is the oracle that they’re applying. Which is to say, when testers identify a problem, it is most often because of a perceived inconsistency between observed behavior or properties of the product and an oracle.

In other words, oracles are most often consistency heuristics, which is to say, we compare the product itself to an oracle and we look for consistency. When we see inconsistency, this is the trigger or alarm that alerts us to the possibility of a problem. There are exceptions though – some oracles require consistency or some other judgement to indicate a problem, as we’ll see soon.
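To make that structure concrete, here’s a minimal sketch in Python (the values, the function, and the oracle name are invented for illustration) of “X is inconsistent with Y” expressed as code. Note that an inconsistency yields a suspicion to investigate, not a verdict, precisely because the oracle is fallible:

```python
# A minimal sketch of an oracle applied as a consistency heuristic.
# The observed value, reference value, and oracle name are hypothetical.

def check_consistency(observed, reference, oracle_name):
    """Return a suspicion, not a verdict: oracles are fallible."""
    if observed != reference:
        return (f"Possible problem: observed {observed!r} is inconsistent "
                f"with the {oracle_name} oracle ({reference!r}).")
    return None  # Consistent with this oracle; another oracle may still disagree.

# Example: a 'Claims' oracle drawn from hypothetical acceptance criteria.
suspicion = check_consistency(
    observed="Total: $10.00",      # X: what the product displayed
    reference="Total: $10.50",     # Y: what the claim says it should display
    oracle_name="Claims (acceptance criteria)",
)
if suspicion:
    print(suspicion)
```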

This is hopefully feeling pretty natural to you, even if you’re still not on board with all that awfully verbose heuristics stuff. Most testers understand that, at a basic level, when they are testing they are comparing the product in front of them to a series of expectations or ideals which, if not met, means there might be trouble.

What most testers don’t necessarily know is how to consciously apply those oracles – i.e. the series of expectations or ideals – in order to expose a wider array of trouble more efficiently.

So let’s do that now.

FEW HICCUPPS

To consciously apply oracles when we are testing, we must first think carefully about the many and varying things which might reasonably constitute a problem for us, and give them handy labels. Fortunately, Michael Bolton has done this for us in Testing Without a Map, and later refined what he came up with in FEW HICCUPPS (you really ought to read these, but we’ll cover the contents thoroughly here too).

Essentially, he identified eleven main types of oracle that we apply when we are testing software (at least, that he and James Bach have thought of so far). These are:

  • Familiarity: We expect the product to be inconsistent with patterns of familiar problems
  • Explainability: We expect the product to be understandable to a degree that we can articulately explain its behavior to ourselves and others
  • World: We expect the product to be consistent with things that we know about or can observe in the world
  • History: We expect the present version of the system to be consistent with past versions of it
  • Image: We expect the system to be consistent with an image that the organization wants to project, with its brand, or with its reputation
  • Comparable Products: We expect the system to be consistent with systems that are in some way comparable. This includes other products in the same product line; competitive products, services, or systems; or products that are not in the same category but which process the same data; or alternative processes or algorithms
  • Claims: We consider that the system should be consistent with things important people say about it, whether in writing (reference specifications, design documents, manuals…) or in conversation (meetings, public announcements, lunchroom conversations…)
  • Users’ Desires: We believe that the system should be consistent with ideas about what reasonable users might want
  • Product: We expect each element of the system (or product) to be consistent with comparable elements in the same system
  • Purpose: We expect the system to be consistent with the explicit and implicit uses to which people might put it
  • Statutes: We expect a system to be consistent with laws or regulations that are relevant to the product or its use

Michael Bolton, FEW HICCUPPS, 2012

In other words, what Michael is saying here is that when you think you have found a bug in the product you’re testing, you probably feel that it’s a problem because it is inconsistent with one or more of the above oracles. The exceptions are the first two – with ‘familiarity’, you’ll sense a problem if you observe consistency between product and oracle; with ‘explainability’, a problem is signaled not by consistency or inconsistency with some external reference, but by your inability to coherently explain the product’s behavior. Thus, as touched on earlier, most oracles are inconsistency heuristics, but not all are.

On that point, do remember that we must apply heuristics with skill and judgement. It is possible that an observed behavior may be inconsistent with (or otherwise in contradiction of) multiple oracles. Or it may be inconsistent with one, but noticeably consistent with another. In other words, you may have to apply skill and judgement when you determine which oracles you should and shouldn’t apply in each case.

At this stage, I should highlight that FEW HICCUPPS, as a mnemonic tool and as a collection of useful test oracles, is itself heuristic. By which I mean, you may well find a legitimate problem that is not inconsistent with any of these oracles. That does not mean it is not a problem. FEW HICCUPPS is heuristic because, while it will often help you identify the oracle you’re using, it is not necessarily a complete set of test oracles.
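Returning to the list above, some of these oracles also lend themselves to automated checks. As an illustrative sketch (not a definitive recipe – the data and helper functions here are hypothetical), here is ‘History’ expressed as a comparison against a saved past version, and ‘Comparable Products’ as a differential check against an alternative implementation:

```python
import math

def history_oracle(current: dict, golden: dict) -> list:
    """History: flag fields where the present version diverges from a past one."""
    return [f"{key}: was {golden[key]!r}, now {current.get(key)!r}"
            for key in golden if golden[key] != current.get(key)]

def comparable_products_oracle(result, alternative_impl, *args):
    """Comparable Products: compare our answer to an alternative algorithm's."""
    expected = alternative_impl(*args)
    return None if result == expected else (
        f"inconsistent with comparable implementation: {result!r} != {expected!r}")

# Hypothetical usage. The second check surfaces a genuine discrepancy: Python's
# round() uses banker's rounding, while the alternative rounds halves upward.
print(history_oracle({"total": "10.00"}, {"total": "10.50"}))
print(comparable_products_oracle(round(2.5), lambda x: math.floor(x + 0.5), 2.5))
```

Either check might flag a “difference” that turns out to be an intentional improvement – which is exactly the fallible, heuristic nature of oracles we’ve been discussing.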

Conscious application of oracles

Now that we’re conscious of the oracles we most commonly use in testing, let’s revisit our previous statements:

  • “I know it’s a bug because it doesn’t meet the acceptance criteria” → “I suspect it’s a bug because it’s inconsistent with claims made about the product” (Claims)
  • “I know it’s a bug because it wasn’t like that before” → “I suspect it’s a bug because it’s inconsistent with the historical state of the product” (History)
  • “I know it’s a bug because it seems wrong” → “I suspect it’s a bug because it’s inconsistent with the desires of a reasonable user” (Users’ Desires)
  • “I know it’s a bug because there was an error” → “I suspect it’s a bug because it’s inconsistent with the conventions of the product itself” (Product)

As you can see, in each case, the original statement was indirectly applying one of the oracles that Michael identifies. In the first example, any “acceptance criteria” are a set of claims made about what the product should do or look like. The tester who recognized that bug for that reason was using that oracle unconsciously.

Sometimes this isn’t a problem, but the third example can show the real power of being conscious of the oracles we’re using. I’m sure all of you have, at some stage or another, gone to your developer and said “this doesn’t seem right”, or “it just feels wrong that it works this way”, and I’m sure at least most of you have at some stage received a disparaging response. And fair enough too… that’s a pretty vague and fuzzy statement you made.

However, if you had turned to your developer and said that this behavior is “inconsistent with the expectations of a reasonable user of our product” (and maybe gone on to describe that user and why they may reasonably behave in that way), you’ve immediately delivered a far more compelling justification for it being something worth fixing.

Finally, being conscious of these oracles is a powerful tool because when we’re unconscious of them, it’s really easy to forget about them. I’ve seen this in action. When coaching testers or running training courses, I’ve continually asked them how they recognize problems in their day to day work. The four statements I outlined above are a generous summary of the responses I typically receive.

In other words, at best, most testers I’ve questioned only identify bugs where the behavior is inconsistent with claims about the product (by far the most common, and usually hand in hand with a checking-heavy paradigm of testing), the historical behavior of the product, the expectations of a reasonable user, or conventions or patterns established within the product itself.

Which is to say, many testers are often not consciously looking for or finding bugs that are inconsistent with any of the other oracles that Michael identifies! I’m sure you can see how that might be problematic.

Oracles as test planning tools

As we’ve established, an oracle is a heuristic principle or mechanism for recognizing a problem (I’m repeating that on purpose, it’s worth knowing).

And this is indeed the typical way in which oracles are used. When you are performing or have performed testing, and you observe something unexpected, you can review your oracles to identify the inconsistency that is in play, and use that to more credibly and powerfully justify why the bug matters, and ought to be fixed (or perhaps, why it doesn’t, and shouldn’t).

However, I believe that oracles are equally potent tools to harness whilst you are learning about a product, or designing your tests, or planning your coverage.

Thinking about oracles at this stage can help us to identify the kinds of problems that will be most worrisome for us, and thus which types of problems we should be trying especially hard to identify, if they exist. Knowing this allows us to then tailor our style of testing, our coverage, and the techniques and heuristics we apply, to give ourselves a better chance of finding important problems quickly.

For instance, if you’re developing an administrative application that will be used internally within your organization, it’s likely that you wouldn’t be too concerned with problems that are inconsistent with the image your organization is looking to project. Far more likely is that problems that threaten the purpose or users’ desires will be your priority – to ensure your admin tool is actually useful, and doesn’t leave everyone inside the organization complaining to you about it.

Armed with that information, you’ll likely be better placed to consider the type of experiments you want to perform, and will spend less initial time concerned with the look and feel or general design of the application and more on function and data focused test activities, which are more likely to expose significant problems.
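To make that tangible, here’s a toy sketch (the weightings are invented judgements for this hypothetical admin tool, not measurements) of how you might jot down the relative importance of each oracle before fleshing out your charters:

```python
# Invented priorities for a hypothetical internal admin tool (1 = low, 5 = high).
oracle_priority = {
    "Purpose": 5, "Users' Desires": 5, "Claims": 4, "Product": 3, "History": 3,
    "Explainability": 3, "Familiarity": 2, "World": 2, "Comparable Products": 2,
    "Statutes": 2, "Image": 1,  # internal tool: brand image is low-stakes
}

# Surface the oracles to lean on first when designing test charters.
for oracle, weight in sorted(oracle_priority.items(), key=lambda kv: -kv[1]):
    print(f"{weight}  {oracle}")
```

A list this crude is still a planning aid: it makes your assumptions visible, and therefore open to challenge before testing begins.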

In this sense, being aware up front of the oracles that are most relevant or significant to our testing project can better equip us to find important problems quickly. Of course, we should never close our minds to other problems, as we can never fully anticipate the types of problems we might encounter and, as we go, we may re-evaluate the importance or priority of some aspects of the product.

Oracle-driven test modelling

This idea of using oracles throughout the testing process ties into an exploratory style of testing – where learning, design, performance and interpretation occur in parallel throughout a testing project. It also highlights the important notion that oracles should form part of the model we use to inform our testing.

All testing we perform is informed by the mental model (or models) that contain our understanding of the product and how it should function or perform, our intended test approach and coverage, and other information about the project environment or the quality criteria we’ll apply to help us to evaluate the product and achieve our testing mission.

I’ve written previously (see ‘Visual Modelling: Sharing the Magic of Testing’) that this model combines both explicit and tacit knowledge to comprise a completely unique perspective of the testing problem, and that as such visualising this model can be a powerful approach to minimizing the differing interpretations and assumptions that are inherent in the software development process.

It seems obvious to me that oracles are an important part of this model. I touched earlier on the idea that many testers apply oracles unconsciously, and we’ve explored how a heightened awareness of the oracles they’re applying can increase the power of the testing they perform.

As such, including this information as part of a visual model will serve only to reinforce these benefits, whilst also allowing other stakeholders to gain insight into the oracles that may be especially influential to the tester when they evaluate the product, and therefore to challenge or give feedback on that selection.

Indeed, I can imagine an “oracle-driven” approach to creating a visual testing model being an effective and inspiring way to approach a testing problem. Taking the FEW HICCUPPS set of oracles as a starting point, the tester could ask what kind of problems might be most important to discover, or in what ways the product under test might be inconsistent with these oracles, and then flesh out their test approach and coverage to focus on those areas.
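As a toy illustration of that idea (the product and the questions are invented), a first pass at such a model could be as simple as a tree with one branch per FEW HICCUPPS oracle, each holding questions to flesh out into test ideas:

```python
# A toy oracle-driven model for a hypothetical e-commerce checkout: one branch
# per oracle (only a few shown), each holding candidate test questions.
model = {
    "Claims": ["Does checkout match the acceptance criteria for discounts?"],
    "History": ["Do baskets saved in the previous release still load?"],
    "Comparable Products": ["How do competitors' checkouts handle expired cards?"],
    "Statutes": ["Does the tax calculation satisfy the relevant VAT regulations?"],
}

def render(tree, indent=0):
    """Print the model as an indented outline, one line per node."""
    for branch, questions in tree.items():
        print("  " * indent + branch)
        for question in questions:
            print("  " * (indent + 1) + "- " + question)

render(model)
```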

Conclusion

So, we’ve established that an oracle is a heuristic principle or mechanism for recognizing a problem.

But we’ve also hopefully shown that to the skillful tester, an oracle can be so much more than that.

By being conscious of the oracles that they may or do use in their testing, a tester can not only more credibly and effectively justify the problems they find, but also more powerfully focus and design their testing to expose more important problems, more quickly.
