Monday, 22 August 2011

Framing: Some Decision and Analysis Frames in Testing


" #softwaretesting #testing "

What is a Frame?
The following is from Tversky and Kahneman's description of a decision frame, ref [1]:
We use the term "decision frame" to refer to the decision-maker's conception of the acts, outcomes, and contingencies associated with a particular choice. 
The frame that a decision-maker adopts is controlled partly by the formulation of the problem and partly by the norms, habits, and personal characteristics of the decision-maker.
Using a decision frame to analyze a problem and come to a decision is what they call framing. So, when I refer to a frame, I mean a decision (or analysis) frame.

Factors at play
As mentioned, many different factors affect how we analyze problems, including:

Temperament/Emotions
  • Anger
  • Happiness
  • Optimism
  • Pessimism
  • Tiredness
  • Fear (e.g. of failure)
Experience
  • Lessons from past situations - own experience and feedback
  • What has been learnt recently
  • Complacency due to familiarity
Strategy
  • Your own vs someone else's
  • Aggressive
  • Military campaign - lots of detailed planning
  • Reactive
The factors and the weight given to them might be different for:
  • Stakeholder view ("Upgrade needs to be reliable", "Of the new feature set only x is live in the quarter")
  • Tester view ("Which risks are most important to look at now?")
  • Developer view ("I did a fix, can you test that first?")

The stakeholder, developer, tester and any other role in the project each has a set view on the priorities and aims of the project - agendas, maybe - and one challenge is in trying to tie these together, or at least understand the differences and how they affect our communication. They may all have similar product goals, but their interpretations of their work may differ - their influences and micro-decisions will be different, which makes transparency in communication important.

But, there's a catch - the way we present information can affect its interpretation - depending upon the frame that a stakeholder is adopting.

Think of a frame as a filter through which someone looks at a problem - they're taking in lots of data but only the data that gets through the filter gets attention (the rest may end up in the subconscious for later or isn't absorbed), "I have my show-stopper filter on today so I don't notice the progress the team has made…" 

So, being aware of the potential different types of frames that each project member might have as well as some traps associated with frame formulation is important.

Stakeholder Frames
Might include:
  • Emphasizing minimum time for product delivery
  • Emphasizing short iteration times and delivering quickly
  • Trying to minimize cost of product development (cost of testing?)
  • Emphasizing future re-use of the development environment (tendency to worship automation?)
  • Aiming for a reduced future maintenance cost
Tester Frames
Might include:
  • Emphasizing the favourite test approach
  • Emphasizing areas of greatest risk (to?)
  • Emphasizing the last successful heuristic that found a show-stopper bug
  • Emphasizing focus on the data configuration that found the most bugs the last time
  • Emphasizing conformance to a standard over a different aspect of the product
  • Emphasizing the backlog item that seems the most interesting
  • Emphasizing widespread regression as "fear of failure / breaking legacy" affects analysis
  • Emphasizing feature richness over stability
Note, in isolation, some of these frames may be good, but they might not necessarily be good enough.

Framing Problems in Testing

Functional Blindness or Selective Perception

Russo and Schoemaker called this Functional Blindness, ref [2]: the tendency to frame problems from your own area of work or study. Dearborn and Simon called it Selective Perception, ref [3], noting that managers often focus their attention on areas they are familiar with - sales executives treating sales as the top priority and production executives focussing on production.

In testing this may translate into:
  • Testers with mainly performance test experience focussing on those areas
  • Recent customer support experience leading to a preference to operational and configuration aspects
  • A generalist spreading effort evenly in many areas
Sunk-Cost Fallacy

This is the tendency to factor previous investments into the framing of the problem. A good example is James Bach's Golden Elephant Syndrome, ref [4].

In testing this may translate into:
  • The latest favourite tool or framework of the execs must be used as there has been so much investment in it.
Over-Confidence

As we've seen above, there can be many different ways of framing the problem, and it's important to be aware of this. There is a trap where testers think they've done everything they need - that their model or models were the most adequate for the situation.

Here the warning is against complacency - re-evaluate periodically and tell the story against that assessment. It may be that an issue you find during testing affects some of your initial assumptions - the approach might be good, but maybe it could be better. 
(It might be that you can't change course/approach even if you wanted to, but that's good information for reporting to the stakeholder - areas for further investigation.)
Whatever your model, it's one model. Is it good enough for now? What does it lack - what product blind spots does it have?

Measurements and Numbers

Decision frames and framing sometimes use a way of measuring whether the frame is good or useful - or whether alternatives are equal. There is a danger here when numbers and measurements get involved.

In business and everyday life there are occasions when figures and measurements are presented as absolutes and other times when they're presented as relative figures. They can be misleading in both cases, especially when not used consistently.

Project stakeholders are probably very familiar with looking at project cost and overrun in absolute and relative terms - depending on how they want the information to shine.

So it's very easy for testers to be drawn into the numbers game - and even play it in terms of absolute or relative figures.
  • "This week we have covered 50% of the product"
  • "Our bug reports have increased 400% compared to the previous project"
  • "The number of tests to run is about 60% of the last project"
  • "5 bug reports have been implemented in this drop"
  • "All pre-defined tests have been run"
As you can (hopefully) see, this is just data - not necessarily information that can be interpreted. So, beware of number-usage traps in problem analysis and formulation - both in figures given to you and in those you send out.
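As a toy illustration (the bug counts here are hypothetical, not from any real project), the same underlying data can be framed as an absolute or a relative figure, and each framing invites a different reaction:

```python
# Hypothetical bug-report counts for two projects.
previous_project_bugs = 20
current_project_bugs = 80

# Absolute framing: the raw difference.
absolute_increase = current_project_bugs - previous_project_bugs

# Relative framing: percentage change versus the previous project.
relative_increase = (current_project_bugs / previous_project_bugs - 1) * 100

print(f"{absolute_increase} more bug reports than the previous project")
print(f"Bug reports up {relative_increase:.0f}% on the previous project")
```

"60 more bugs" and "up 300%" describe exactly the same data, yet neither figure on its own says anything about product size, coverage or reporting habits - that context is what turns the data into information.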

Another aspect of problems with numbers and decision framing can be thought of as the certainty effect, ref [1]. This can affect how we frame problems - and even how we should communicate.

Frames and Framing Need Maintenance

Analyze and periodically check that your assumptions still hold. Sometimes the emphasis of the project changes - the problem to solve changes. Is the frame still right? Are the parameters of the problem still the same? Are the reference points and the ways of measuring or judging the frame still the same? If not, it's time to re-evaluate.

Working with Frames
  • What frames do you and your project / organization start with? (Subconscious default)
  • Are there alternative frames to consider? How many were investigated?
  • Look at what each frame includes and excludes
  • What is the best frame fit for the situation / project? (Do all involved agree on the 'good enough' frame?)
References
[1] The Framing of Decisions and the Psychology of Choice (Tversky & Kahneman, Science Vol 211, No. 4481)

[2] Decision Traps: The Ten Barriers to Brilliant Decision-Making and How to Overcome Them (Russo, Schoemaker, Fireside,1990)

[3] Selective Perception: A Note on the Departmental Identifications of Executives (Dearborn, Simon, Sociometry Vol 21, No 2, June 1958)

[4] James Bach "Golden Elephant" Syndrome (Weinberg, Perfect Software: And Other Illusions About Testing, Dorset House, 2008, p. 101)

[5] Calculated Risks: How to Know When Numbers Deceive You (Gigerenzer, Simon and Schuster, 2002)

Sunday, 14 August 2011

Taylorism and Testing

" #softwaretesting #testing "

I've had this in draft for a while, but was triggered to "finish" it by reading Martin Jansson's recent posts on working with test technical debt and nemesis of testers.

Taylorism
When I think about Taylorism I'm referring to the application of scientific principles to management that drive the division of labour, "process over person" and generally anything that results from a time-and-motion study (however it might be dressed up).

The result is categorising "work" into divisible units, estimating the time for each unit and the skills required to do it. Once you have these, plugging them into a Gantt chart is the logical next management step.

Estimating
Estimating work using some estimation guide is not the problem here. The problem is when that guide becomes the truth - when it becomes some type of test-related "constant", used as such and, more importantly, interpreted as such.

Problems with constants might occur if you discuss with your stakeholder items such as:
  • Time to write a test plan
  • Time to analyse a new feature
  • Time to review a requirement
  • Time to write/develop a test case
  • Time to execute a test case
  • Time to re-test a test case
  • Time to write a test report
Traps?
Stakeholders don't usually have time to go into all the details of the testing activity, so it's important that testers don't let the items affecting an estimate be treated as constants. That means highlighting to the stakeholder that the answer depends on the specific details of the project, organisation and problem at hand.

So, re-examining the items above suggests some additional questions, below. (This is just a quick brain-dump and could be expanded a lot.)
  • First questions:
    • How will the answers be used?
    • How much flexibility or rigidity is involved in their usage?
  • Time to write a test plan
    • Do we need to estimate this time?
    • What's the purpose of the plan?
    • Who is it for?
    • What level of detail needs to be there now?
    • What level of detail needs to be there in total?
    • Am I able to do this? What do I need to learn?
  • Time to analyse a new feature
    • Do we need to estimate this time?
    • How much do we know about this feature?
      • Can we test it in our current lab?
      • New equipment needed?
      • New test harnesses needed?
    • Am I able to do this? What do I need to learn?
  • Time to review a requirement
    • Do we need to estimate this time?
    • Are the requirements of some constant level of detail?
    • How high-level is the requirement?
    • Are the requirements an absolute or a guide of the customer's wishes?
    • How often will/can the requirements be modified?
    • What approach is the project taking - waterfall or something else?
  • Time to write/develop a test case
    • Do we need to estimate this time?
    • Do we all have the same idea about what a test case means?
    • Do we mean test ideas in preparation for both scripted and ET aspects of the testing?
    • Do we need to define everything upfront?
    • Do we have an ET element in the project?
    • Even if the project is 'scripted' can I add new tests later?
    • What new technology do we have to learn?
  • Time to execute a test case
    • Do we need to estimate this time?
    • In what circumstances will the testing be done?
      • Which tests will be done in early and later stages of development?
      • Test ideas for first mock-up? Keep or use as a base for later development?
    • What is the test environment and set-up like?
      • New aspects for this project / feature?
      • Interactions between new features?
      • Do we have a way of iterating through test environment evolution to avoid a big-bang problem?
    • What are the expectations on the test "case"?
    • Do we have support for test debugging and localisation in the product?
    • Can I work with the developers easily (support, pairing)?
    • What new ideas, terms, activities and skills do we have to learn?
  • Time to re-test a test case
    • Do we need to estimate this time?
    • See test execution questions above.
    • What has changed in the feature?
    • What assessment has been done on changes in the system?
  • Time to write a test report
    • Do we need to estimate this time?
    • What level of detail is needed?
    • Who are the receivers?
      • Are they statistics oriented? Ie, will there be problems with number counters?
    • Verbal, formal report, email, other? Combination of them all?
Not all of these questions would be directed at the stakeholder.

The answers to these questions will raise more questions and may take the approach down a different route. So, constants can be dangerous.
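A small sketch of why a "constant" misleads (all figures here are made up for illustration). Treating a fixed minutes-per-test-case figure as truth produces a precise-looking total, while even a rough sample of past execution times shows how wide the plausible range really is:

```python
# Hypothetical figures: estimating test execution with a "constant".
test_cases = 40
constant_minutes_per_case = 30

# The Taylorist point estimate - looks precise, hides all variation.
point_estimate = test_cases * constant_minutes_per_case  # 1200 minutes

# A (made-up) sample of past per-case times: setup, environment and
# learning needs make the real durations vary widely.
observed_minutes = [10, 15, 30, 30, 45, 60, 90, 120]
low = test_cases * min(observed_minutes)
high = test_cases * max(observed_minutes)

print(f"Point estimate: {point_estimate} min")
print(f"Plausible range: {low}-{high} min")
```

The single number and the range answer the same question, but only the range prompts the follow-up discussion with the stakeholder about which project details push an estimate towards one end or the other.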

Stakeholders and Constants
When stakeholders think in terms of constants it's very easy for them to think in Taylorist terms - to regard testing as a non-intellectually-challenging activity and ultimately to see testing as a problem rather than an opportunity for the project.

Some questions that might arise from Taylorism:
  • Why is testing taking so long?
  • Why did that fault not get found in testing?
  • Why can't it be fully tested?
Working against Taylorism in Stakeholders

The tester needs to think and ask questions, both about the product and about what they're doing. Passive testers contribute to misunderstanding in stakeholders - the tester is there to help improve understanding of the product.

The relationship of a tester to a stakeholder changes as the tester adds value to the project, product and organisation. So, ask yourself whether and how you're adding value. It's partly about building your brand, but it's also about understanding the stakeholder's problems.

The stakeholder frames a problem and presents that to you in a way which might be different from how the customer intended. A good starting point with some questioning is to think in terms of context-free questioning (see Michael Bolton's transcript from Gause and Weinberg, here).

Build your brand, add value to the organisation and product and ultimately the problem with Taylorism will recede.


References
  1. Wikipedia intro on Scientific Management 
  2. Wikipedia intro on Time and Motion Studies
  3. Building your Test Brand
  4. Transcription of Gause and Weinberg's Context-Free Questions

Sunday, 7 August 2011

Reflections on "Done": Regression & Interaction Testing




" #softwaretesting #testing "

If you haven't read it then go and read Michael Bolton's post and comments regarding the usage of "done", here.

From some of the comments, discussions and opinions elsewhere there is a notion that "done" is the complete package. This was the point of Michael's post. But with each sprint, delivery or increment the product evolves - so any fixed definition of "done" will run into problems.

It's very easy to have a mission that includes "existing features work as before" - but there is also a risk of missing something here….

The product evolves - interactions between features and within the system change. Not all of this can be anticipated in advance.

Interactions
So, however you might define "done" for your delivery there will (inevitably) be an area of unknowns that may* be investigated.

It's not as simple as saying this is covered by the exploratory testing (ET) part of the sprint. The ET may be for the new feature, but there is also an element that could* be devoted to learning about interactions between the feature and existing product features or components, and even understanding how the whole behaves (again**).

Of course, a good first step here is to separate out the ET missions for the new feature and the ET missions for interactions and existing features.***

Regression
Some of this might be covered in some "regression" and characteristics measurement and monitoring. But the problem with "regression" is that it doesn't necessarily have an element of learning about how some existing test suite (especially applicable to automated suites) works with the "new" (evolved) system.

An automated regression suite usually has a notion of "this is a fixed reference point" - this script should**** usually work. But the testing that is missing is usually the evaluation of "if and how" it works in the new system. This evaluation is commonly limited to looking at failures and how they should be fixed.

Some shops wrap this up as a "regression" element into the increment's mission (or even definition of done) - but wrapping things up (implying) in some other activity is exactly the original problem with "Done" - it doesn't necessarily reflect the focus of the problem in front of you at a given time.

Now, being able to reflect "exactly" the problem in front of us is something that can't be done "exactly" - that's why we need testing to help evaluate. So, anyone dimensioning (or estimating) an activity should be wary of this.

Ultimately, a lot of this comes down to good testing - separating the assumptions (and implications) from the problem in front of you and illuminating these assumptions and implications with the stakeholder as early as possible.

Good testing starts early!

* even "should" or "must" depending on the context
** understanding how the system behaves is an iterative task.
*** Note, this is all depending on the specific context - it's not automatic to say "do this for every iteration" - each delivery must be assessed on its needs and usage.
**** "should" is one of the most over-and-misused words within software development!


Friday, 5 August 2011

How is your Map Usage?


" #softwaretesting #testing #models "

Preamble
On a recent trip abroad we hired a car which had satnav. The party that we joined also had a range of different satnav products - different makers and no doubt different map versions. 

On one occasion we had a convoy heading to a destination some 35km away, 3 cars (A, B & C) using their respective satnav.  The cars departed for the destination in order ABC but arrived in order BAC each having taken different routes.

First Observation
Now there were a range of different parameters that could affect the route calculation:
  • Shortest route
  • Fastest route
  • Most economical route
  • Route avoiding certain points
  • Accuracy of map
  • Plus other factors that are not to do with the map itself - wrong turnings, not driving according to the assumed parameters (eg speed), traffic accidents, etc, etc
After more usage of the satnav I realised some other points...

Second Observation
We tended to notice less en route to different places when using (relying on) the satnav. If we hadn't had it we would probably have noticed landmarks and features in more detail. But now we were in a 'different world' - following someone else's view of the world.

Third Observation
Reliance on the map guidance lessened as familiarity with the surroundings increased. We were in the area for a week and leaned on the map less and less over time.


These observations are directly comparable to my software testing experience. The map is just a model - a certain level of detail used for a certain purpose. Change the detail or the purpose and it is a different model.

Also, the use of a map (model) can change over time. On one occasion it might be used as the main guide to a destination; after a period of familiarity it becomes a waypoint en route - something to help orient along the way. I've touched on some of these re-use ideas before when walking in the woods.

Notice the parallels with use and purpose between maps and model usage in software testing.

Some maps are old, some new, some for sightseeing, some for walking & hiking. Some use maps as a history - tracking where they have been - and they can even use someone else's map and make notes about features they have seen.


Points to note with your map (model) usage:

Purpose
  • Is it a normative or informative reference? Does it describe the contours of the landscape (main claims of the product) in a lot of detail, or is it a hand-sketch of the area in much less detail (key features to look at)?
  • Is it a record of progress (as in coverage of features)?
  • Is it a partial plan (X marks the spot - lets look for treasure and other interesting things along the way)?
  • Is it a mission objective?
  • Is it a sightseeing guide (testing tour)?
  • Is it a record of danger (places to avoid or see, marshes and bogs not suitable for walking - bug clusters in the product)?

Representation
  • One thing to note - maps (models) get old. Sometimes that's ok - some features do not change (ancient monuments, town locations, product features). 
  • Sometimes it's not - you want the latest sightseeing information (tourist attraction) or danger areas (buggy areas of the product).
  • Ultimately a map (model) is a view of a landscape (or product). There might be a lot of personal views on the map (model) - what is included, to what detail, what is left out or even omitted.
  • Whether new or old, the use of it should be able to be dynamic - that is, your use of it is the aspect that adds value to the journey (testing session/s).


Caution for Map (Model) use in Software Testing 

From the first observation, and the points above, the models (of product or problem space) should always have a known purpose - fitting the right model to the right objective - and should (ideally) be used in a dynamic way. 

From the second observation this is a caution to question the assumptions connected with the model. If you're in unfamiliar territory then you may need to rely on the model for guidance, but use it dynamically. Is the information correct, what information can I add? Don't just follow - question (even internally) as you go. Even question about how the terrain would look if the map/model was wrong - to give yourself clues about what to look for.

If you rely too much on the map - whether it's someone else's or your own - then there is a potential to lose touch with your testing objectivity/eye/mindset - something that is needed for good testing.


Still to come

From the third observation - there are aspects of time perception and the amount of information processing we make (or shortcut) and aspects of focused and stimulated attention - this is an area I'm still researching with potential implications for scripted and exploratory testing (more to come). 


So, how is your map reading and usage? How is your model usage?

Wednesday, 3 August 2011

Carnival of Testers #24


" #softwaretesting #testing "

July was a month for a wide variety of blog posts, with many tidbits….


Ambiguous Words
  • Michael Bolton looked at problems with using the word "Done", here.
  • Pete Houghton on his 2 minutes using Bing Maps and thoughts about consistency and convention.
  • Good points from Eric Jacobson about what is testable, here.


Experience
  • There has been some good blogging on the STC. Jeff Lucas gives some views on scripted automation, here.
  • Joel Montvelisky with some experiences on finding the right testing effort for the product and the organization. Also, here, with some points about thinking before measuring.
  • Still on metrics, Rob Lambert gave a good example of how using measurements as absolutes will give the wrong conclusion.
  • Markus Gärtner's thoughts on a Pomodoro approach to testing.
  • The problem of spam and moderating discussions was highlighted by Rosie Sherry, here.
  • Are you "doing agile"? Some thought-provoking points from Elisabeth Hendrickson, here, on checking how your agile implementation looks. (This was re-published in July)
  • Good learning experiences from Pradeep Soundararajan about rapid software testing in action, here.
  • Part 3 of Martin Jansson's look at testing debt is worth checking out.
  • Catherine Powell nearly rants, but then reflects on learning, here.


Other
  • Lisa Crispin's review of the Clean Coder sounds like an interesting read.
  • CloudTest Lite got some encouraging words from Scott Barber, here, and Fred Beringer, here.


And finally….



Until the next time...