I was having a discussion the other day where we were looking at some problems and discussing whether it was sufficient to treat symptoms... I remarked that by treating symptoms rather than root causes [notes n1], and not allowing time to find the real problems, we would in many cases be fooling ourselves [notes n2].
Why? (I was asked)
People like to solve problems that they can see - which means we have a tendency to "fix" the problem we see in front of us. This occurs more often if the problem appears to have a "straightforward" fix -> I think of this as a form of cognitive ease in action. Digging for root causes is a challenging activity, and we sometimes want to believe that the cause we identify is good enough to fix. For another example of cognitive ease at work, this time with best practices, see ref [2].
An illustration of this - that I have seen in one form or another - is that we settle for the first solution without understanding (or trying to understand) the root cause. There is no guarantee that fixing a symptom will make the problem better. Many times, the problem improves for a while, but then recurs in another form. Now there is a "new" problem to solve, which usually has the same (or similar) root cause, so from the system perspective it's ineffective [notes n3].
Ineffective, because problems in processes and organisations are often non-linear, yet we try to solve them as "linear" problems, and...
Linear Filter
I expanded: from a systems thinking perspective, I think of this as applying a "linear filter" to a "non-linear system".
What? (I was asked again) Linear vs Non-Linear?
Linear -> the output is directly proportional to the input, vs non-linear -> multiple interacting factors affect the output. The application here is that there are usually multiple causes for a problem -> when I perform an assessment after a root cause analysis (RCA) activity, I take the root causes one by one, in the order that we think will have the biggest impact on improvement.
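To make the distinction concrete, here is a minimal sketch (Python, with invented cause names and coefficients - purely illustrative assumptions, not taken from the example below): in the linear view the output tracks one input; in the non-linear view the causes interact, so removing only the visible cause leaves a residual problem.

```python
# A purely illustrative model - the cause names ("skipped_tests",
# "comms_gaps") and coefficients are invented for this sketch.

def linear_view(skipped_tests: float) -> float:
    # Linear filter: the problem's size is assumed proportional
    # to the one cause we can see.
    return 2.0 * skipped_tests

def nonlinear_view(skipped_tests: float, comms_gaps: float) -> float:
    # Non-linear view: two causes plus an interaction term,
    # so the combined effect exceeds the sum of the parts.
    return skipped_tests + comms_gaps + 3.0 * skipped_tests * comms_gaps

print(nonlinear_view(1.0, 1.0))  # 5.0 - the observed problem
print(nonlinear_view(0.0, 1.0))  # 1.0 - the visible symptom is "fixed",
                                 # but the communication gap remains and
                                 # the problem can recur in another form
```

The interaction term is the point: a linear filter predicts that removing the visible cause removes the problem, while the non-linear view shows the remaining cause still at work.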
Ok, time for a....
A Real-life example
A fault (bug) report was raised by a customer -> the initial RCA showed that a "test" that would (in theory) have caught the problem was not executed. This is a symptom, and a "linear" view of the problem. The "linear filter trap" is to treat this as the root (or most important) problem. [notes n6]
Digging deeper showed that the team had its priorities changed during execution, to make an early drop (resulting in some negative and alternative use-cases being delivered later). This, in itself, is not a problem, but the communication associated with the "early drop" didn't reflect the change.
In this case, the underlying problems were a set of communication issues:
- Partly within the team, connected with their story [notes n4] (especially their testing silent evidence [notes n5]),
- Partly connected with the stakeholder who changed the priorities, and who arguably had a duty to follow the change of expectations through the delivery chain, considering what it might mean at each stage, and
- Communication with the customer to ensure that they are aware of the staged delivery, and what it means to them.
Another example of a root cause analysis can be found in ref [3].
And finally
Tackling and fixing symptoms is a very natural activity - very human. But it is not always enough. Sometimes it is enough - it depends, of course, on the scale of the problem and the cost of investigating the underlying problems and tackling those. Sometimes, the underlying problems cannot be fixed and it is sufficient to ease the symptoms.
But I believe, as good testers, it is important to understand the difference between symptoms and root causes, especially where it affects either the testing we do or the perception of the testing we do. This is especially important where there is a perception that "testing or testers missed something"... So,
Be aware of the linear filter trap!
Notes
[n1] In philosophy and sociology, root causes and symptoms are usually referred to as distal and proximate causes, see ref [1] for more background.
[n2] Slightly naughty of me, playing on the fact that most people don't like to think that they are fooling themselves, but that's a different story...
[n3] The times when it might be effective are when we (or some stakeholder) are prepared to take the cost of fixing the symptom now. A problem here is that stakeholders who are project-driven have, by the nature of the task, a propensity to see only as far as the end of the project. A product owner may have a different perspective - bear this in mind when someone is deciding whether it's a project problem or a product problem - or even a line organisation problem.
[n4] Story here means the story about the product and the story about the testing of the product.
[n5] Here, testing silent evidence refers to the elements not tested and thus not reported - their significance is assumed to be unimportant. For further background see ref [4].
[n6] I should add that the problem with the trap in this example is that, in the past, I have seen it trigger one of two responses: (1) a perception that the testers are at fault, which becomes a myth/rumour with a life of its own; (2) a knee-jerk reaction to implement some extra oversight of the test discipline or the team as a whole -> in the worst case it becomes a desire to introduce an additional "quality gate". This is a good example of where reacting to the perceived symptom is both ineffective and counter-productive for the organisation.
References
[1] Wikipedia: Proximate and ultimate causation
[2] The Tester's Headache: Best Practices - Smoke and Mirrors or Cognitive Fluency?
[3] The Tester's Headache: Problem Analysis - Mind Maps and Thinking
[4] The Tester's Headache: Mind The Information Gap