Does automated enforcement in static analysis provide all you need?
September 08, 2015
Software development activities should include source code reviews to enhance software quality and prevent or remove software defects, and static anal...
Software development activities should include source code reviews to enhance software quality and prevent or remove software defects, and static analysis tools can automate a significant part of this activity while reducing its costs. Code reviews are generally carried out based on coding standards and/or checklists that define which violations or defects should be identified and corrected.
Looking at the C language in particular, popular examples of coding standards are MISRA C and CERT C, which provide guidelines to enhance safety and security respectively (although there is some overlap between the two scopes). MISRA C guidelines have been developed with a particular attention to their enforceability with static analysis and this is reflected in a significant amount of enforcement that can be automatically achieved.
However, two unavoidable limitations stand in the way of fully automated enforcement:
1. In some cases, it’s impractical or impossible to formalize all the information that a static analyser would need to fully enforce a guideline.
2. For certain guidelines, it’s theoretically impossible to construct a general algorithm that may detect all violations (no false negatives) and only violations (no false positives), even if all the information is available to the algorithm, and even if the algorithm can be extended to clear any specific false positive or false negative.
In the latest version of MISRA C (2012), these limitations are reflected in the classification of guidelines. When sufficient information can be provided, a guideline is classified as a rule; otherwise, it’s classified as a directive. When a general algorithm can be constructed, a rule is classified as decidable; otherwise, it’s classified as undecidable.
Guidelines have different priorities and different scopes, but just to get an initial feeling of the potential extent of automated enforcement, the 159 guidelines are partitioned into 16 directives, 27 undecidable rules, and 116 decidable rules.
An example of a directive is all code shall be traceable to documented requirements. In this case, it’s not enough to provide a static analyzer with the whole source code and the configuration of the compiler used to build the application. And it would be impractical or impossible in the first place to formalize any non-trivial requirements.
An example of decidable rule is #undef should not be used. In this case, an algorithm can be constructed to scan any source code and report all occurrences and only occurrences of the #undef pre-processing directive.
An example of undecidable rule is a project shall not contain unreachable code. Can you imagine an algorithm that may precisely identify all instances of unreachable code in any projects?
Undecidability can be a rather unintuitive concept. Software developers are usually presented with a continuous stream of problems to solve, ranging from trivial to impossible, where the limit of what can be achieved is usually determined by familiar factors such as lack of information, excessive problem complexity, steep increase in resource consumption with the domain scope, etc.
Besides all these factors, the automated enforcement of coding standards (or any other informal way of automatically detecting software defects) involves constructing algorithms that may in principle analyze themselves, which introduces a circularity that would result in a paradox, if an additional fundamental limitation – undecidability – didn’t stand in the way of constructing a sound and complete analyser.
If you’d like to know more about undecidability, I’ll be presenting the webinar “Undecidability in Static Code Analysis” on September 30.
Fulvio Baccaglini is a senior software engineer working in the R&D department of Programming Research (PRQA). He holds a master’s degree in Electronic Engineering speciality Computer Science from the Politecnico di Milano and is a member of the MISRA C Working Group.