All software projects are guaranteed to have one artifact in common—source code. Together with architectural risk analysis[1], code review for security ranks very high on the list of software security best practices (see Figure 1[2]). Here, we’ll look at how to automate source code security analysis with static analysis tools.
Since ITS4’s release in early 2000, the idea of detecting security problems through source code has come of age. ITS4 is extremely simple—the tool basically scans through a file looking for syntactic matches based on several simple “rules” that might indicate possible security vulnerabilities (for example, use of strcpy() should be avoided). Much better approaches exist.
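Before looking at those, here is a minimal illustration of what an ITS4-style rule matches (the function and buffer names are invented for this sketch): an unbounded strcpy() into a fixed-size buffer.

#include <string.h>

/* Hypothetical fragment: a fixed-size buffer and an unbounded copy.
   A purely syntactic rule fires on the strcpy() call itself, whether or
   not the surrounding code guarantees that src fits. */
void copy_name(const char *src)
{
    char name[32];
    strcpy(name, src);   /* flagged: unbounded copy into a 32-byte buffer */
}

The rule knows nothing about src; it simply recognizes the call by name, which is exactly why richer approaches are needed.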
Programmers make little mistakes all the time—a missing semicolon here, an extra parenthesis there. Most of the time, these gaffes are inconsequential; the compiler notes the error, the programmer fixes the code, and the development process continues. This quick cycle of feedback and response stands in sharp contrast to what happens with most security vulnerabilities, which can lie dormant (sometimes for years) before discovery. The longer a vulnerability lies dormant, the more expensive it can be to fix, and adding insult to injury, the programming community has a long history of repeating the same security-related mistakes. The promise of static analysis is to identify many common coding problems automatically before a program is released.
Static analysis tools examine the text of a program statically, without attempting to execute it. Theoretically, they can examine either a program’s source code or a compiled form of the program to equal benefit, although the problem of decoding the latter can be difficult. We’ll focus on source code analysis here because that’s where the most mature technology exists.
Manual auditing, a form of static analysis, is very time-consuming, and to do it effectively, human code auditors must first know what security vulnerabilities look like before they can rigorously examine the code. Static analysis tools compare favorably to manual audits because they’re faster, which means they can evaluate programs much more frequently, and they encapsulate security knowledge in a way that doesn’t require the tool operator to have the same level of security expertise as a human auditor. Just as a programmer can rely on a compiler to consistently enforce the finer points of language syntax, the operator of a good static analysis tool can successfully apply that tool without being aware of the finer points of security bugs.
Testing for security vulnerabilities is complicated by the fact that they often exist in hard-to-reach states or crop up in unusual circumstances. Static analysis tools can peer into more of a program’s dark corners with less fuss than dynamic analysis, which requires actually running the code. Static analysis also has the potential to be applied before a program reaches a level of completion at which testing can be meaningfully performed.
Static analysis can’t solve all your security problems. For starters, static analysis tools look for a fixed set of patterns, or rules, in the code. Although more advanced tools allow new rules to be added over time, if a rule hasn’t been written yet to find a particular problem, the tool will never find that problem. When it comes to security, what you don’t know is likely to hurt you, so beware of any tool that says something like, “zero defects found, your program is now secure.” The appropriate output is, “sorry, couldn’t find any more bugs.”
A static analysis tool’s output still requires human evaluation. There’s no way for a tool to know exactly which problems are more or less important to you automatically, so there’s no way to avoid trawling through the output and making a judgment call about which issues should be fixed and which ones represent an acceptable level of risk. Knowledgeable people still need to get a program’s design right to avoid any flaws—although static analysis tools can find bugs in the nitty-gritty details, they can’t critique design. Don’t expect any tool to tell you, “I see you’re implementing a funds transfer application. You should tighten up the user password requirements.”
Finally, there’s Rice’s theorem, which says (in essence) that any non-trivial question you care to ask about a program can be reduced to the halting problem. In other words, static analysis problems are undecidable in the worst case. The practical ramification of Rice’s theorem is that all static analysis tools are forced to make approximations, and these approximations lead to less-than-perfect output: false negatives (the program contains bugs that the tool doesn’t report) or false positives (the tool reports bugs that the program doesn’t contain).
False positives cause immediate grief to any analyst who has to sift through them, but false negatives are much more dangerous because they lead to a false sense of security. A tool is sound if, for a given set of assumptions, it produces no false negatives, but the down side to always erring on the side of caution is a potentially debilitating number of false positives. The static analysis crowd jokes that too high a percentage of false positives leads to 100 percent false negatives because that’s what you get when people stop using a tool. A tool is unsound if it tries to reduce false positives at the cost of sometimes letting a false negative slip by.
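To make the trade-off concrete, consider the hypothetical fragments below (the function names are invented): in f1() the copy can never overflow, yet a tool that doesn’t connect it to the preceding length check reports it anyway, a false positive; in f2() an off-by-one in the check leaves a real one-byte overflow that a tool satisfied by the mere presence of a length check will miss, a false negative.

#include <string.h>

/* Hypothetical fragments illustrating false positives and false negatives. */
void f1(const char *src)
{
    char buf[16];
    if (strlen(src) >= sizeof(buf))
        return;                 /* long inputs rejected here */
    strcpy(buf, src);           /* safe: at most 15 characters plus the NUL,
                                   but reported by a tool that ignores the check */
}

void f2(const char *src)
{
    char buf[16];
    if (strlen(src) > sizeof(buf))
        return;                 /* off by one: a 16-character input slips through */
    strcpy(buf, src);           /* overflows by one byte when strlen(src) == 16 */
}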
Probably the simplest and most straightforward approach to static analysis is the Unix utility grep. Armed with a list of good search strings, grep can reveal quite a lot about a code base. The down side is that grep is rather lo-fi because it doesn’t understand anything about the files it scans. Comments, string literals, declarations, and function calls are all just part of a stream of characters to be matched against. Better fidelity requires taking into account the lexical rules that govern the programming language being analyzed. By doing this, a tool can distinguish between a vulnerable function call

    gets(&buf);

a comment

    /* never ever call gets */

and an innocent and unrelated identifier

    int begetsNextChild = 0;

Basic lexical analysis is the approach taken by early static analysis tools, including ITS4, Flawfinder, and RATS, all of which preprocess and tokenize source files (the same first steps a compiler would take) and then match the resulting token stream against a library of vulnerable constructs. Earlier, Matt Bishop and Mike Dilger built a special-purpose lexical analysis tool specifically to identify time-of-check to time-of-use (TOCTOU) flaws[3].
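Put in context (the file layout and the function name read_line are invented here), a single source file containing all three of those occurrences makes the difference concrete: a grep for gets reports three hits, while a tool that tokenizes the file first can narrow the report to the actual call.

#include <stdio.h>

/* never ever call gets */

int begetsNextChild = 0;        /* innocent and unrelated identifier */

void read_line(char *buf)
{
    gets(buf);                  /* the one genuinely dangerous line */
}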
While lexical analysis tools are certainly a step up from grep, they produce a hefty number of false positives because they make no effort to account for the target code’s semantics. A stream of tokens is better than a stream of characters, but it’s still a long way from understanding how a program will behave when it executes. Although some security defect signatures are so strong that they don’t require semantic interpretation to be identified accurately, most are not so straightforward.
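As a hypothetical illustration (the function and macro names are invented), a token-level rule for strcpy() flags both calls below, even though only the second can ever overflow; telling them apart requires reasoning about the values being copied, not just the tokens.

#include <string.h>

#define GREETING "hello"

/* Hypothetical fragment: both calls match a token-level strcpy() rule,
   but only the second is a real problem. */
void greet(const char *user_supplied)
{
    char buf[64];

    strcpy(buf, GREETING);        /* a 6-byte constant into a 64-byte buffer:
                                     cannot overflow, yet the rule still fires */
    strcpy(buf, user_supplied);   /* no bound on attacker-controlled input */
}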
To increase precision, a static analysis tool must leverage more compiler technology. By building an abstract syntax tree (AST) from source code, such a tool can take into account the basic semantics of the program being evaluated.
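As a sketch of what that buys (the function and buffer names are invented), consider the fragment below: with an AST, each strncpy() call is a structured node whose length argument can be compared against the declared size of its destination, something a flat token stream can’t support.

#include <string.h>

/* Hypothetical fragment.  With an AST, the strncpy() calls are structured
   nodes whose length arguments can be checked against the declared size
   of the destination buffer in the same function. */
void copy_ids(const char *src)
{
    char id_a[16];
    char id_b[16];

    strncpy(id_a, src, sizeof(id_a));       /* bound equals the buffer size */
    strncpy(id_b, src, sizeof(id_b) + 1);   /* bound is one byte too large:
                                               can write past the end of id_b */
}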
With ASTs in hand, the next decision is the scope of the analysis. Local analysis examines the program one function at a time and doesn’t consider relationships between functions. Module-level analysis considers one class or compilation unit at a time, so it takes into account relationships between functions in the same module and considers properties that apply to classes, but it doesn’t analyze calls between modules. Global analysis involves analyzing the entire program, so it takes into account all relationships between functions.
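A hypothetical example of the difference (function names invented): analyzed locally, fill() below looks fine, because the copy is bounded by a caller-supplied length; only a global, inter-procedural analysis that follows the call from main() can see that the bound exceeds the destination buffer.

#include <string.h>

/* Hypothetical example.  Looking at fill() alone, nothing is obviously wrong:
   the write is bounded by the caller-supplied length.  Only by following the
   call from main() does an analysis learn that len exceeds the destination. */
static void fill(char *dst, size_t len)
{
    memset(dst, 'A', len);
}

int main(void)
{
    char buf[16];
    fill(buf, 32);      /* overflow: 32 bytes written into a 16-byte buffer */
    return 0;
}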
The analysis’s scope also determines how much context the tool considers. More context is better when it comes to reducing false positives, but it also means much more computation. Researchers have explored many methods for making sense of program semantics. Some are sound, some aren’t; some are built to detect specific classes of bugs, while others are flexible enough to read definitions for what they’re supposed to detect. Let’s review some of the more recent tools and approaches.
Many static analysis approaches hold promise, but have yet to be directly applied to security. Some of the more noteworthy ones include ESP (a large-scale property verification approach[10]), model checkers such as SLAM and BLAST (which use predicate abstraction to examine program safety properties[11,12]), and FindBugs (a lightweight checker with a good reputation for unearthing common errors in Java programs[13]). Several commercial tool vendors are starting to address the need for static analysis, moving some of the approaches touched on here into the mainstream.
Good static analysis tools must be easy to use, even for non-security people. This means that their results must be understandable to normal developers who might not know much about security, and that the tools educate their users about good programming practice. Another critical feature is the kind of knowledge (the rule set) the tool enforces. The importance of a good rule set can’t be overestimated.
In the end, good static checkers can help spot and eradicate common security bugs. This is especially important for languages such as C, for which a very large corpus of rules already exists. Static analysis for security should be applied regularly as part of any modern development process.
References
Co-authored by Brian Chess, chief scientist at Fortify Software. His technical interests include static analysis, defect modeling, and Boolean satisfiability. He received a Ph.D. in computer engineering from the University of California, Santa Cruz. Contact him at brian@fortifysoftware.com.