Just how effective is static analysis, what does it protect you from?
There is a lot of attention given to static analysis tools, especially from the software security community - and some serious venture capital money being thrown at static analysis tool providers such as Coverity.
The emphasis on using static analysis tools started with Cigital's CTO Gary McGraw in his definitive book on Software Security: Building Security In. In a recent interview with Jim Manico for OWASP (Jan 2009), Dr. McGraw went so far as to say that
“My belief is that everybody should be using static analysis tools today. And if you are not using them, then basically you are negligent, and you should prepare to be sued by the army of lawyers that have already hit the beach”.
Statements like this, from a thought leader in the software security community, certainly encourage companies to spend more money on static analysis tools, and of course should help Cigital’s ownership position in leading static analysis tool provider Fortify Software, which Cigital helped to found.
You can learn more about the important role that static analysis plays in building secure systems from Brian Chess, CTO of Fortify, in his book Secure Programming with Static Analysis.
Secure software development maturity models like SAMM and BSIMM emphasize the importance of code reviews to find bugs and vulnerabilities, but especially the use of static analysis tools, and OWASP has a number of free tools and projects in this area.
Now even Gartner has become interested in this the emerging emerging static analysis marketand its players – evidence that the hype is reaching, or has reached, a critical point. In Gartner’s study of what they call Static Application Security Testing (SAST) suppliers (available from Fortify), they state that
“...enterprises must adopt SAST technology and processes because the need is strategic. Enterprises should use a short-term, tactical approach to vendor selection and contract negotiation due to the relative immaturity of the market.” Well there you have it: whether the products are ready or not, you need to buy them.
Gartner’s analysis puts an emphasis on full-service offerings and suites: principally, I suppose, because CIOs at larger companies, who are Gartner’s customers, don’t want to spend a lot of time finding the best technology and prefer to work with broad solutions from strategic technology partners, like IBM or HP (neither of which has strong static analysis technology yet, so watch for acquisitions of the independents to fill out their security tool portfolios, as they did in the dynamic analysis space). Unfortunately, this has led vendors like Coverity to spend their time and money on filling out a larger ALM portfolio, building and buying in technology for build verification, software readiness (I still don’t understand who would use this) and architecture analysis rather than investing in their core static analysis technology. On their site, Coverity proudly references a recent story "Coverity: a new Mercury Interactive in the making?", which should make their investors happy and their customers nervous – as a former Mercury Interactive, now HP customer, I can attest that while the acquisition of Mercury by HP may have been good for HP and good for Mercury’s investors, it was not good for Mercury’s customers, at least the smaller ones.
The driver behind static analysis is its efficiency: you buy a tool, you run it, it scans thousands and thousands of lines of code and finds problems, you fix the problems, now you’re secure. Sounds good, right?
But how effective is static analysis? Do these tools find real security problems?
We have had success with static analysis tools, but it hasn’t been easy. Starting in 2006, some of our senior developers started working with FindBugs (we’re a Java shop) because it was free, it was easy to get started with, and it found some interesting, and real, problems right away. After getting to understand the tool and how it worked, some cleanup and a fair amount of time invested by a smart, diligent and senior engineer to investigate false positives and setup filters on some of the checkers, we added FindBugs checking to our automated build process, and it continues to be our first line of defense in static analysis. The developers are used to checking the results of the FindBugs analysis daily, and we take all of the warnings that it reports seriously.
Later in 2006, as part of our work with Cigital to build a software security roadmap, we conducted a bake-off of static analysis tool vendors including Fortify, Coverity, Klocwork (who would probably get more business if they didn't have a cutesy name that is so hard to remember), and a leading Java development tools provider whose pre-sales support was so irredeemably pathetic that we could not get their product installed, never mind working. We did not include Ounce Labs at the time because of pricing, and because we ran out of gas, although I understand that they have a strong product.
As the NIST SAMATE study confirms, working with different tool vendors is a confusing and challenging and time-consuming process: the engines work differently, which is good since they catch different types of problems, but there is no consistency in the way that warnings are reported or rated, and different terms are used by different vendors to describe the same problem. And there is the significant problem of dealing with noise: handling the large number of false positives that get reported by all of the tools (some are better than others), understanding what to take seriously.
At the time of our initial evaluation, some of the tools were immature, especially the C/C++ tools that were being extended into the Java code checking space (Coverity and Klocwork). Fortify was the most professional and prepared of the suppliers. However, we were not able to take advantage of Fortify’s data flow and control flow analysis (one of the tool’s most powerful analysis capabilities) because of some characteristics of our software architecture. We verified with Fortify and Cigital consultants that it was not possible to take advantage of the tool’s flow analysis, even with custom rules extensions, without fundamentally changing our code. This left us with relying on the tool’s simpler security pattern analysis checkers, which did not uncover any material vulnerabilities. We decided that with these limitations, the investment in the tool was not justified.
Coverity’s Java checkers were also limited at that time. However, by mid 2007 they had improved the coverage and accuracy of their checkers, especially for security issues and race conditions checking, and generally improved the quality of their Java analysis product. We purchased a license for Coverity Prevent, and over a few months worked our way through the same process of learning the tool, reviewing and suppressing false positives, and integrating it into our build process. We also evaluated an early release of Coverity’s Dynamic Thread Analysis tool for Java: unfortunately the trial failed, as the product was not stable – however, it has potential, and we will consider looking at it again in the future when it matures.
Some of the developers use Klocwork Developer for Java, now re-branded as Klocwork Solo, on a more ad hoc basis: for small teams, the price is attractive, and it comes integrated into Eclipse.
In our build process we have other static analysis checks, including code complexity checking and other metric trend analysis using an open source tool JavaNCSS to help identify complex (and therefore high high-risk) sections of code, and we have built proprietary package dependency analysis checks into our build to prevent violation of dependency rules. One of our senior developers has now started working with Structure101 to help us get a better understanding of our code and package structure and how it is changing over time. And other developers use PMD to help cleanup code, and the static analysis checkers included in IntelliJ.
Each tool takes different approaches and has different strengths, and we have seen some benefits in using more than one tool as part of a defense-in-depth approach. While we find by far the most significant issues in manual code reviews or exploratory testing or through our automated regression safety net, the static analysis tools have been helpful in finding real problems.
While FindBugs does only “simple, shallow analysis of network security vulnerabilities", and analysis of malicious code vulnerabilities as security checks, it is good at finding small, stupid coding mistakes that escape other checks, and the engine continues to improve over time. This open source project deserves more credit: it offers incredible value to the Java development community, and anyone building code in Java that who does not take advantage of it is a fool.
Coverity reports generally few false positives, and is especially good for finding potential thread safety problems and null pointer (null return and forward null) conditions. It also comes with a good management infrastructure for trend analysis and review of findings. Klocwork is the most excitable and noisiest of all of our tools, but it includes some interesting checkers that are not available in the other tools – although after manual code reviews and checks by the other static analysis tools, there is rarely anything of significance left for it to consider.
But more than the problems that the tools find directly, the tools help to identify areas where we may need to look deeper: where the code is complex, or too smarty pants fancy, or otherwise smelly, and requires followup review. In our experience, if a mature analysis tool like FindBugs reports warnings that don’t make sense, it is often because it is confused by the code, which in turn is a sign that the code needs to be cleaned up. We have also seen the number of warnings reported decline over time as developers react to the “nanny effect” of the tools’ warnings, and change and improve their coding practices to avoid being nagged. And the final benefit of using these tools is that this frees up the developers to concentrate on higher-value work in their code reviews: they don’t have to spend so much time looking out for fussy, low-level coding mistakes, because the tools have found them already, so the developers can concentrate on more important and more fundamental issues like correctness, proper input validation and error handling, optimization, simplicity and maintainability.
While we are happy with the toolset we have in place today, I sometimes wonder whether we should beef up our tool-based code checking. But is it worth it?
In a presentation at this year’s JavaOne conference, Prof. Bill Pugh, the Father of FindBugs says that
“static analysis, at best, might catch 5-10% of your software quality problems.”
He goes on to say, however, that static analysis is 80+% effective at finding specific defects and cheaper than other techniques for catching these same defects – silent, nasty bugs and programming mistakes.
Prof. Pugh emphasizes that static analysis tools have value in a defense-in-depth strategy for quality and security, combined with other techniques; that “each technique is more efficient at finding some mistakes than others”; and that “each technique is subject to diminishing returns”.
In his opinion, “testing is far more valuable than static analysis”, and “FindBugs might be more useful as an untested code detector than a bug detector”. If FindBugs finds a bug, you have to ask: “Did anyone test that code”?”. In our experience, Prof Pugh’s FindBugs findings can be applied to the other static analysis tools as well.
As technologists, we are susceptible to the belief that technology can solve our problems – the classic “silver bullet” problem. When it comes to static analysis tools, you’d be foolish not to use a tool at all, but at the same time you’d be foolish to expect too much from them – or pay too much for them.