Test Details

We are often asked how we test our own products. Following a rigorous code inspection of all changes (the code base is around 120 KLOC at the moment), the Safer C toolset is subjected to the following dynamic tests, which together contain just over 279,000 test cases:

Test	Requirement
FIPS160 regression test	The same suite as NIST use for validating C compilers. It contains well over 1,000 files and several hundred thousand lines of code. Our parsing engine must correctly parse the entire suite for syntax and constraints. Apart from two issues where we disagree on the interpretation of the standard, it does.
Internal regression test	A series of nearly 300 files containing an eclectic mixture of strange C fault modes that we have collected from around the world over the years. Correct parsing and a correct warning are required for each fault mode.
Oakwood Computing MISRA-C test suite regression test	As high a detection rate as possible for the required and advisory rules. Currently running at around 95% for the required rules on our internal suite, although the new official conformance suite will affect this.
Numerical C regression test	A regression suite of several hundred files based on the well-known Numerical Recipes C suite by Press et al. Correct parsing and a correct warning are required for each fault mode.
Stress testing	Amongst other things, no core dumps when given crazy inputs (for example, an object file). We also give it to Dr. Tim ("Wrecker") Hopkins of the University of Kent Computing Laboratory, the best stress tester we have ever found. This includes the customary 'sitting-on-the-keyboard' test and many other novel and highly sophisticated test techniques.
GUI testing	Identical scripts for each supported platform are exercised and each option checked. This is manual only.
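
The "no core dumps when given crazy inputs" requirement in the stress testing row can be illustrated with a small harness. The following is only a minimal sketch, not the actual Safer C test harness: the function names `crashed` and `stress` and the command passed in are hypothetical, and the crash check assumes a POSIX system, where Python reports a process killed by a signal (such as a segmentation fault) as a negative return code.

```python
import random
import subprocess
import sys
import tempfile
from pathlib import Path


def crashed(returncode: int) -> bool:
    """Assumes POSIX: a negative return code means the process died from a signal."""
    return returncode < 0


def stress(command, rounds=20, size=4096, seed=0):
    """Run `command <file>` on files of random bytes; return the inputs that crashed it."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    failures = []
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "garbage.c"
        for _ in range(rounds):
            data = bytes(rng.getrandbits(8) for _ in range(size))
            path.write_bytes(data)
            result = subprocess.run(
                list(command) + [str(path)],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=60,
            )
            # A non-zero exit status is fine (the tool may reject the input);
            # only death by signal counts as a failure here.
            if crashed(result.returncode):
                failures.append(data)
    return failures


if __name__ == "__main__":
    # Stand-in for the tool under test: a Python one-liner that just reads the file.
    stand_in = [sys.executable, "-c", "import sys; open(sys.argv[1], 'rb').read()"]
    print(len(stress(stand_in, rounds=3, size=64)), "crashing inputs found")
```

Random bytes are only the crudest form of crazy input. Feeding the tool real object files, truncated sources, or very long lines follows the same pattern with a different generator.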

The source code is internally annotated against the requirements and the parser must also pass itself with no statically detectable code defects.

Finally, when all else fails, we also use several beta testers who can generally break the best products within a very short time (it's a calling), and we install the product several times on as many platforms as we can get our hands on. The depressing thing is that it's a complex product and, after all this, there will still be bugs in it, but hopefully not very many.

Current post-release defect density:

Parsing engine: 0.26 per KLOC (1000 lines of source code)

Interface: 0.93 per KLOC
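
For readers unfamiliar with the measure, defect density is simply the number of defects divided by the size of the code in thousands of lines. The sketch below shows the arithmetic only. The function name `defects_per_kloc` is hypothetical, and the figures in the example call are made up for illustration; they are not the defect counts or component sizes behind the numbers above.

```python
def defects_per_kloc(defects: int, source_lines: int) -> float:
    """Defect density: defects per 1000 lines of source code."""
    if source_lines <= 0:
        raise ValueError("source_lines must be positive")
    return defects / (source_lines / 1000)


# Hypothetical figures, chosen only to illustrate the calculation:
# 5 defects found after release in 20,000 lines of source.
print(defects_per_kloc(5, 20_000))  # 0.25
```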