We have developed a new framework, based on the concept of deductive
databases, for context-sensitive program analysis. In this framework,
all program information is stored as relations; data access and
analyses are written as Datalog queries. To handle the large number
of contexts in a program, the database represents relations with
binary decision diagrams (BDDs). The system we have developed, called
bddbddb, automatically translates database queries into highly
optimized BDD programs.
Our preliminary experiences suggest that a large class of
analyses involving heap objects can be described succinctly in Datalog
and implemented efficiently with BDDs. To make developing
application-specific analyses easy for programmers, we have also
created a language called PQL that makes a subset of Datalog queries
more intuitive to define. We have used the language to find many
security holes in Web applications.
We designed a program representation, called IPSSA, for capturing
intraprocedural and interprocedural definition-use relationships of
directly and indirectly accessed memory locations. This
representation makes it easy to create demand-driven
path-sensitive and context-sensitive analyses.
We demonstrated how a program checker based on IPSSA can be used to
find security violations. Our checker, when applied to 10
programs, found 6 new violations and 8 previously reported
ones. The checker generated only one false warning, suggesting that
our approach is effective in creating practical and easy-to-use bug
detection tools.
Our algorithm is based on Heintze and Tardieu's Andersen-style
points-to analysis designed originally for C programs. We have
improved the precision of their algorithm by tracking the fields of
individual objects separately and by analyzing the local variables in
a method in a flow-sensitive manner. Our algorithm represents the
semantics of each procedure concisely using a sparse summary graph
representation based on access paths; it iterates over this sparse
representation until it reaches a fixed point solution. By utilizing
the access path and field information present in the summary
graphs, along with minimizing redundant operations and memory
management overheads, we are able to quickly and effectively analyze
very large programs. Our experimental results demonstrate that this
technique can be used to compute precise static call graphs for very
large Java programs.
Memory leaks and double deletes in large C++ programs are hard to
find; detecting them require not just local analysis, but whole
program analysis. We have made formal an object ownership model that
is commonly used in practice. Objects in this model are owned at all
times. The ownership may be held by a parent object, or by a variable
in a method, and can be transferred during the lifetime of the object.
We have developed an algorithm that can automatically identify objects
that follow the model and locate potential memory leaks and deletions
of dangling pointers in C++ programs, without requiring user
intervention. The algorithm uses a fully flow-sensitive and
context-sensitive analysis to derive the likely object invariants and
to check that the objects are used consistently throughout the
program. The algorithm has been implemented in a tool called Clouseau.
Previous attempts to model memory management with linear types were
too restrictive to be useful on real-life programs. To create a
practical model, we allow object ownership be transferred optionally
in assignments and parameter passing. We use an automatic procedure
to determine if there are any inconsistencies in transfers of
ownership. We make ownership inferencing feasible by introducing an
object invariant that is often found in good object-oriented
programming practice. Namely, members of a class are classified as
either "owned" or "not owned"; i.e. a member is either "always owned"
or "never owned" by the parent object at all public method boundaries.
This approach results in an efficient whole-program flow-sensitive and
context-sensitive analysis, which takes less than 8 minutes on a PC to
analyze a program with over 100,000 lines of code.
Clouseau successfully identifies tens of errors in our SUIF2 research
compiler, and several bugs each in the open-source Mozilla web browser
and the OpenOffice productivity suite. In addition, it found numerous
classes in most of the systems where system extenders could easily
misuse the class implementations because of undocumented interface
restriction on the object. Besides producing a useful tool, this
research contributes to our understanding of deep program analysis by
producing one of the fully flow-sensitive and context-sensitive
analysis tools that works on large real-life C++ programs.
System software often obey rules such as "accesses to variable A must
be guarded by lock B". We have developed an approach called
metacompilation where programmers can add simple system-specific
compiler extensions that check their code. We have written roughly
fifty checkers to find errors in operating systems such as Linux and
OpenBSD. We have also developed techniques to automatically extract
such rules from the program.
Metacompilation checkers find many serious errors in complex, real
systems code. We have written roughly 50 checkers that in
aggregate have found thousands of errors in widely-used systems
such as Linux and OpenBSD. Many errors were the worst type of
systems bugs: those that crash the system, but only after it
has been running continuously for days. We have validated the
results by reporting the errors to the main system developers,
spurring hundreds of fixes (not all errors have been fixed ---
there were so many that the backlog could easily take over a
year to process).
Many checkers are less than 100
lines of code; some of the most effective are less than 20.
MC checkers are easily written. Most of our checkers
were written by programmers who had only a passing
familiarity with the systems that they checked.
Extension writers also need little background knowledge of compilers
in order to write checkers. A good example of this can be
seen in that an undergraduate with no real knowledge of Linux and
OpenBSD and no compiler experience implemented an MC checker that
found roughly two hundred security holes in these systems in a
few months (which is likely a world record for security holes
per unit of time, if not close to the record for sheer
numbers).
(More papers by Dawson Engler and his group can be found
here).
CRED is implemented as an extension of the GNU C compiler
version 3.3.1. The simplicity of our design makes possible a
robust implementation that has been tested on over 20 open-source
programs, comprising over 1.2 million lines of C code.
CRED proved effective in detecting buffer overrun attacks
on programs with known vulnerabilities, and is the only tool found
to guard against a testbed of 20 different buffer overflow
attacks. Finding overruns only on strings
impose an overhead of less than 26% for 14 of the
programs, and an overhead of up to 130% for the remaining six,
while the previous state-of-the-art bounds checker by Jones and Kelly
breaks 60% of the programs and is 12 times slower. Incorporating
well-known techniques for optimizing bounds checking into
CRED could lead to further performance improvements.
Automatically extracted models are useful as documentation or
constraints to be enforced by a static or dynamic checker; developers
can check the models to ensure that they match the original design
intents and testers can use them to evaluate the completeness of a
test suite. Techniques developed include static analyses to deduce
illegal call sequences in a program, dynamic instrumentation
techniques to extract models from the execution runs, and finally a
dynamic model checker that ensures that the code conforms to the
model.
We have tested our model extraction tools over 1.2 millions of lines
of code. The model automatically extracted for the socket
implementation in java.net captures the essence of the class
effectively. The models extracted for the standard Java libraries are
useful as documentation and can be used by static tools to look for
errors in programs. The scalability of the dynamic analysis tool is
demonstrated by running it on the million-line J2EE enterprise edition
platform. The extracted models convey interesting information about
the program being tested and the test suite itself. Finally, when
applied to the joeq virtual machine, the tools found discrepancies
betwen the intended and implemented API. This information is helpful
in the evolution of programs.
Limited hardware support has been provided in the past for performance
monitoring. Here, we are interested in monitoring the execution for
the sake of software reliability, as such, the support must be more
flexible. The program-specific monitor functions are executed in
parallel with the continuation of the normal execution. Because the
hardware buffers up all the side effects of the speculative execution
of the normal thread, the monitoring functions can simply assume they
are operating on the state in which the function is invoked. Because
the side effects of the monitoring functions seldom affect the normal
threads, it is often possible to execute the normal threads in
parallel. In the rare occasion where the monitoring threads wish to
raise an exception, the hardware can simply discard the speculative
state. For error recovery, we advocate the use of fine-grain
transactions in our programming. We execute the code within the
transaction in a speculative thread. The side effects of the
transactions are committed if the thread completes normally, and
discarded otherwise.
We have measured the effectiveness of our proposal by experimenting
with a number of case studies including the implementation of DIDUCE
and Purify, and code to recover from buffer overruns. Our preliminary
results suggest that instrumented code indeed has more parallelism.
For example, our proposed architectural feature manages to execute
80% more instructions with no performance degradation in one of our
test cases. Also, we showed that the speculative thread support
allows computations to recover quickly and thus fends off
denial-of-service attacks that try to crash the processes of an
application.
This work suggests a new direction in architectural research. There
has been so much energy devoted to squeezing the last drop of
performance out of existing applications, we are seeing diminishing
returns. Instead of tuning the micro-architecture to speed up
existing applications, we propose that we look to the future to
promote emerging programming paradigms that enhance reliability and
productivity, and which may be too expensive on today's architecture.
This research is supported in
part by the National Science Foundation under grants No. 0326227 and
No. 0086160, NSF student fellowships and Stanford Graduate
Fellowships. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the author(s)
and do not necessarily reflect the views of the National Science
Foundation.
Integrated Static and Dynamic Analyses for User-Defined Error and
Security Flaw Detectors
Program analysis has been increasingly used in software engineering
tasks such as auditing programs for security vulnerabilities and
finding errors in general. Such tools often require analyses much
more sophisticated than those traditionally used in compiler
optimizations. In particular, context-sensitive pointer alias
information is a prerequisite for any sound and precise analysis that
reasons about uses of heap objects in a program. Context-sensitive
analysis is challenging because there are over 1014contexts in a
typical large program, even after recursive cycles are collapsed. Moreover,
pointers cannot be resolved in general without analyzing the entire
program.
Michael Martin, V. Benjamin Livshits, and Monica S. Lam
In Proceedings of the Conference on Object-Oriented Programming,
Systems, Languages and Applications (OOPSLA '05),
October 2005.
V. Benjamin Livshits and Thomas Zimmermann
In Proceedings of the ACM SIGSOFT Symposium on the Foundations of
Software Engineering (FSE 2005), September 2005.
In Proceedings of the 14th USENIX Security Symposium,
August 2005.
Monica S. Lam, John Whaley, V. Benjamin Livshits, Michael C. Martin,
Dzintars Avots, Michael Carbin and Christopher Unkel.
In Proceedings of the 24th SIGMOD-SIGACT-SIGART Symposium on
Principles of Database Systems,
June 2005. (Invited Tutorial).
Dzintars Avots, Michael Dalton, Benjamin Livshits, Monica S. Lam
In Proceedings of the 27th International Conference on Software
Engineering,
May 2005.
John Whaley and Monica S. Lam
In Proceedings of the ACM SIGPLAN 2004 Conference on Programming
Language Design and Implementation, pages 131-144,
June 2004.
Best paper award.
IPSSA: An Unsound Path- and Context-Sensitive Pointer Analysis for
Bug Detection
We have developed a pointer alias analysis expressly for automatic error
detection. State-of-the-art pointer alias analyses are either too slow
or too imprecise for finding errors in real-life programs. We propose
a hybrid pointer analysis that tracks actively manipulated pointers
held in local variables and parameters accurately with path and
context sensitivity and handles pointers stored in recursive data
structures less precisely but efficiently. We make the unsound
assumption that pointers passed into a procedure, in parameters,
global variables, and locations reached by applying simple access
paths to parameters and global variables, are all distinct from each
other and from any other locations. This assumption matches the
semantics of many functions, reduces spurious aliases and speeds up
the analysis.
V. B. Livshits and M. S. Lam
In Proceedings of the 11th ACM SIGSOFT International Symposium on the
Foundations of Software Engineering (FSE-11), pages 317-326, September 2003.
Efficient Inclusion-Based Points-To Analysis
We have designed and implemented an efficient
inclusion-based points-to analysis for strictly-typed object-oriented
languages. Our implementation easily scales to millions of lines of
Java code, and it supports language features such as inheritance,
object fields, exceptional control flow, type casting, dynamic
dispatch, and reflection.
J. Whaley and M. S. Lam
In Proceedings of the 9th International Static Analysis Symposium,
pages 180-195,
September 2002.
Clouseau:
Detecting memory leaks in C++ programs automatically by static
analysis
David L. Heine and Monica S. Lam
In Proceedings of the Conference on Programming
Language Design and Implementation, pages 168-181,
June 2003.
Metacompilation:
Detecting critical errors in system software by static analysis
Seth Hallem, Benjamin Chelf, Yichen Xie, and Dawson Engler
In Proceedings of the
ACM SIGPLAN Conference on Programming Language Design and
Implementation., June 2002.
The best description of our MC system.
Focuses on interprocedural analysis, ranking, and simple path sensitivity
to suppress false paths.
Ken Ashcraft and Dawson Engler
In IEEE Security and Privacy, pages 131-147, May 2002.
Uses metacompilation
extensions to find over 100 security holes in Linux and BSD.
Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin
Chelf
In Proceedings of the Symposium on Operating System Principles,
pages 57-72,
October 2001.
Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson
Engler
In Proceedings of the Symposium on Operating System Principles,
pages 73-88,
October 2001.
David Lie, Andy Chou, Dawson Engler, and David Dill
In Proceedings of the Conference on Computer Architecture,
June 2001
Shows how to check deeper properties than the OSDI paper by
using
MC to automatically extract specifications (models) from actual C code
and then running these models through a formal verifier.
Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem.
In Proceedings of the Conference on the Design and Implementation
of Operating Systems, October 2000.
Best paper award.
It discusses a set of small extensions that found roughly 500
bugs in Linux, OpenBSD, and the Xok exokernel. The extensions were
usually less than 100 lines. It also used extensions to find
hundreds
of optimization opportunities in heavily-tuned software.
Using Redundancy To Find Errors
Programmers generally attempt to perform useful work. If they
performed an action, it was because they believed it served some
purpose. Redundant operations violate this belief. However, in
the past redundant operations have been typically regarded as minor
cosmetic problems rather than serious errors. This paper demonstrates
that in fact many redundancies are as serious as traditional hard
errors (such as race conditions or null pointer dereferences).
We experimentally test this idea by writing and applying five
redundancy checkers to a number of large open source projects, finding
many errors. We then show that even when redundancies are harmless
they strongly correlate with the presence of traditional hard errors.
Finally we show how flagging redundant operations gives a way to detect
mistakes and omissions in specifications. For example, a locking
specification that binds shared variables to their protecting locks
can use redundancies to detect missing bindings by flagging critical
sections that include no shared state.
Yichen Xie and Dawson Engler
Transactions on Software Engineering, pages 915-928, October 2003.
(Award paper forwarded by FSE 2002.)
RacerX: static detection of race conditions and deadlocks
RacerX is a static tool that uses flow-sensitive,
interprocedural analysis to detect both race conditions and deadlocks.
It is explicitly designed to find errors in large, complex multithreaded
systems. It aggressively infers checking information such as which locks
protect which operations, which code contexts are multithreaded, and
which shared accesses are dangerous. It tracks a set of code features
which it uses to sort errors both from most to least severe. It uses
novel techniques to counter the impact of analysis mistakes. The tool is
fast, requiring between 2-14 minutes to analyze a 1.8 million line system.
We have applied it to Linux, FreeBSD, and a large commercial code base,
finding serious errors in all of them.
Dawson Engler and Ken Ashcraft, In
Proceedings of the Symposium on Operating Systems Principles,
pages 237-253, October
2003.
DIDUCE:
Tracking down software errors using dynamic anomaly detection.
DIDUCE is a practical and effective tool that aids
programmers in detecting complex program errors and identifying their
root causes. By instrumenting a program and observing its behavior as
it runs, DIDUCE dynamically formulates hypotheses of invariants obeyed
by the program. DIDUCE hypothesizes the strictest invariants at the
beginning, and gradually relaxes the hypothesis as violations are
detected to allow for new behavior. The violations reported help users
to catch software bugs as soon as they occur. They also give
programmers new visibility into the behavior of the programs such as
identifying rare corner cases in the program logic or even locating
hidden errors that corrupt the program's results.
We implemented the DIDUCE system for Java programs and applied it to
four programs of significant size and complexity. DIDUCE succeeded in
identifying the root causes of programming errors in each of the
programs quickly and automatically. In particular, DIDUCE is effective
in isolating a timing-dependent bug in a released JSSE (Java Secure
Socket Extension) library, which would have taken an experienced
programmer days to find. Our experience suggests that detecting and
checking program invariants dynamically is a simple and effective
methodology for debugging many different kinds of program errors
across a wide variety of application domains.
S. Hangal and M. S. Lam
In Proceedings of the 24th International Conference on Software
Engineering,
pages 291-301,
May 2002.(gzip'ed
postscript).
CRED: A Practical Dynamic Overflow Detector.
Despite previous efforts in auditing software manually and
automatically, buffer overruns are still being discovered in
programs in use. A dynamic bounds checker detects buffer overruns
in erroneous software before it occurs and thereby prevents
attacks from corrupting the integrity of the system.
Dynamic buffer overrun detectors have not been adopted widely because
they either (1) cannot guard against all buffer overrun attacks, (2)
break existing code, or (3) incur too high an overhead. This paper
presents a practical detector called CRED (C Range Error
Detector) that avoids each of these deficiencies. CRED finds
all buffer overrun attacks as it directly checks for the bounds of
memory accesses. Unlike the original referent-object based
bounds-checking technique, CRED does not break existing code because
it uses a novel solution to support program manipulation of
out-of-bounds addresses. Finally, by restricting the bounds checks to
strings in a program, CRED's overhead is greatly reduced without
sacrificing protection in the experiments we performed.
O. Ruwase and M. S. Lam
In Proceedings of the 11th Annual Network and
Distributed System Security Symposium, pages 159-169, February 2004.
Automatic Extraction of Object-Oriented Component Interfaces
Today's large programs are often made out of components, whose methods
must be invoked according to some predefined but often undocumented
sequencing order. Component interfaces in our system are modeled as a
product of finite state machines, where each state-modifying method is
represented as a state in the FSM, and transitions of the FSMs
represent allowable pairs of consecutive methods.
John Whaley, Michael C. Martin and Monica S. Lam
In Proceedings of the International Symposium on Software Testing and
Analysis, pages 218-228,
July 2002.
ACM SIGSOFT Distinguished Paper Award, 2002.
Speculative Threads:
Architectural Support to Improve Software Reliability
A particularly attractive approach to improving software reliability
is to have the program monitor its execution and recover from errors
when they are detected. Unfortunately, the approach is expensive not
just because it slows down a program's execution, but it also places
additional burden on the programmers. We believe that future computer
architectures should provide features that make it easy and efficient
to monitor a program execution and recover for errors. Our research
shows that the mechanisms originally invented to support speculative
thread-level parallelism can be used for this purpose. We can speed
up the monitoring code by executing it in parallel with the original
execution, and use the rollback mechanism to allow for error recovery.
Ph.D. Thesis. Stanford University, Computer Systems Laboratory, August 2004.
J. Oplinger and M. S. Lam,
In Proceedings of the Conference on Architectural Support for
Programming Languages and Operating Systems, pages 218-228,
October 2002.