Exhaustive Testing


The following is a response I sent to Kit, who commented on my blog post on ‘Insufficient Testing’…

Thanks for your comment. It’s almost a Catch-22 situation. One of the principles of testing (according to ISTQB) is that exhaustive testing is impossible. I agree, but the question is: how much do you test, and how do you know when enough is enough?

For a complex system, my starting point would be risk and priorities. The approach or method used would ultimately rest on what level of auditability you must provide to the Business (they ultimately make the go/no-go decision). Personally, I would still use Exploratory Testing (if I were ‘allowed’ to) because, in my experience, I am more likely to find something of value that way than through scripts.
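As a rough illustration of using risk as the starting point, the sketch below ranks areas of a system by a simple likelihood × impact score so that the riskiest areas get tested first. The area names, the 1 to 5 scales and the scoring formula are all assumptions made up for this example; they are not taken from ISTQB or any other standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskArea:
    """One area of the system under test (hypothetical example data)."""
    name: str
    likelihood: int  # assumed 1-5 scale: how likely a failure is
    impact: int      # assumed 1-5 scale: how costly a failure would be

    @property
    def risk_score(self) -> int:
        # Simple product of the two ratings; real projects may weight these differently.
        return self.likelihood * self.impact


def prioritise(areas):
    """Return the areas ordered from highest to lowest risk score."""
    return sorted(areas, key=lambda area: area.risk_score, reverse=True)


areas = [
    RiskArea("payments", likelihood=3, impact=5),
    RiskArea("report layout", likelihood=4, impact=1),
    RiskArea("login", likelihood=2, impact=4),
]

for area in prioritise(areas):
    print(f"{area.name}: {area.risk_score}")
```

The ratings themselves are a judgement call, which is why the Business needs to be involved in agreeing them.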

Having said that, if the test team is involved right from the beginning of the project through walkthroughs, inspections or any other type of review, then the team’s understanding of the system will no doubt increase.

I did a Wikipedia search on Dr. Deming, and one passage about his ideas is quite applicable to software testing: “Acceptable Defects: Rather than waste efforts on zero-defect goals, Dr. Deming stressed the importance of establishing a level of variation, or anomalies, acceptable to the recipient (or customer) in the next phase of a process. Often, some defects are quite acceptable, and efforts to remove all defects would be an excessive waste of time and money.” Major commercial software often ships with known (and unknown) defects – MS Windows, Firefox v2.0 etc. It is reasonable, then, for the business to decide how much of the ‘risk’ it wishes to carry. Testers should provide the information the business needs to make that decision (good or bad).

At one New Zealand bank that I worked in, the test team I became involved with tried hard to exhaustively test everything in a very complex application. The upshot was that one release took almost 12 months to ‘complete’ testing (there were other factors involved: personnel, political and management), BUT I guarantee that they could not say that the application was bug free. So I guess that leads to the second question – how much is enough?

James Bach says “When I exhausted the concerns of my internal critic (and external critics I asked to review my work), I decided it was good enough” (refer http://www.satisfice.com/articles/how_much.shtml).

NASA’s software safety standard (http://satc.gsfc.nasa.gov/assure/nss8719_13.html), NASA-STD-8719.13A, September 15, 1997, says in Section 3.4.5: “The test results shall be analyzed to verify that all safety requirements have been satisfied. The analysis shall also verify that all identified hazards have been eliminated or controlled to an acceptable level of risk. The results of the test safety analysis shall be provided to the ongoing system safety analysis activity.” What then is an acceptable level of risk, and acceptable to whom?

The document defines risk as “…As it applies to safety, exposure to the chance of injury or loss. It is a function of the possible frequency of occurrence of the undesired event, of the potential severity of resulting consequences, and of the uncertainties associated with the frequency and severity.” Under section 1.4, Tailoring, it also says “…The tailoring effort shall include definition of the acceptable level of risk, which software is to be considered safety-critical, and whether the level of safety risk associated with the software requires formal safety certification.”

So, at the end of the day, it’s a business decision taken within the context of the business. As testers, we can test complexity within the context of the project and report back our findings – it is then up to those charged with making the ‘big’ decisions to make them – or not!

Insufficient Testing

Is a test team ‘liable’ if the product/software fails in some way? A recent post to the Software Testing Yahoo groups forum brought this to light and got me thinking.

Jared Quinert, a proponent of Exploratory Testing (ET) from Australia, said “…a lack of testing – that insufficient testing requires some co-conspirator to cause a project to fail?
Sadly, nothing stops people trying. Googling ‘”insufficient testing” project failure’ goes some way to demonstrating this.”

So I did… try googling “insufficient testing” and see what comes up. There are, according to Google, 493,000 references to insufficient testing. This raises the question – what is insufficient testing?

I worked recently within a test group that was fixated on exhaustive testing – they literally wanted to test everything and anything. They had good reason, I might add: the situation (i.e. the context) surrounding them was NOT conducive to a co-operative approach, and the harder the test group tried, the more they got blamed. It was hard to change that mindset because they had been burnt in the past. What this meant was a huge overhead in terms of time. This group was the opposite of insufficient testing because they wanted to do everything.

However, it is a fact of life (and well documented in a number of articles, blogs etc.) that software testers cannot find everything. Software is complex (ask NASA), software can be daunting, and despite testing, things do go wrong – just ask the US Air Force:

(http://en.wikipedia.org/wiki/F-22_Raptor#Recent_developments)

“While attempting its first overseas deployment to the Kadena Air Base in Okinawa, Japan, on 11 February 2007, a group of six Raptors flying from Hickam AFB experienced multiple computer crashes coincident with their crossing of the 180th meridian of longitude (the International Date Line). The computer failures included at least navigation (completely lost) and communication. The fighters were able to return to Hawaii by following their tankers in good weather. The error was fixed within 48 hours and the F-22s continued their journey to Kadena”

Was this fault because of insufficient testing, or was it the result of other factors? In my experience of failed projects, insufficient testing usually isn’t the cause. More often it is a lack of cohesion between the PM, vendor, BAs, developers and testers, where each group assumes a territorial stance and lets its ego get in the way.

As Gen. Colin Powell (ret.) says, “Never let your ego get so close to your position that when your position falls, your ego goes with it.”

Often there was some sort of conflict or barrier (whether declared or otherwise) that the leadership group was unable to break through. A project team in disharmony will definitely achieve less with more.

So then is insufficient testing clearly a fault of the test team?

 Sometimes it is.

If the team was not aligned with the project goals and was off pursuing its own agenda, then yes. However, if there are external influences involved, then insufficient testing may be a symptom of a bigger problem.