# COMS W4117

Compilers and Translators:

Software Verification Tools

Lecture 14: Datalog

October 23, 2007

## Lecture Outline

- Review
- Introduction to Datalog
- Datalog rules
- Execution of Datalog programs

## 1. Review

- Context-sensitive vs. context-insensitive analysis
- Call strings
- Cloning-based context-sensitive analysis
- Summary-based context-sensitive analysis
- Pointer alias analysis
- SQL injections

## 2. Introduction to Datalog

- Datalog is a simplified Prolog-like language initially designed for
database query evaluation.
- Datalog allows us to succinctly express rules about inferring program facts.
- It also allows us to efficiently implement these rules.
- The elements of Datalog are atoms of the form p(X
_{1}, . . . , X_{n}) where
- p is a predicate
- X
_{1}, . . . , X_{n} are terms (variables or constants)

- A ground atom is a predicate with only constants as arguments.
- A literal is an atom or a negated atom.

## 3. Datalog Rules

- Rules are a way of expressing logical inferences.
- A Datalog rule is of the form
- H :- B
_{1} & B _{2} & … & B _{n}

- where H and the B's are literals; H is called the head, and the B's the body of the rule.
- A Datalog program is a collection of rules. The program is applied to a
set of ground atoms for some of the predicates.
- Example:

```
path(X, Y) :- edge(X, Y)
path(X, Y) :- path(X, Z) & path(Z, Y)
```

is a Datalog program that computes the paths in a graph given its set of edges.

An extensional database (EDB) is a predicate whose true facts are defined a priori;
e.g., like edge(X, Y)
An intensional database (IDB) is a predicate defined by a rule; like path(X, Y).
## 4. Execution of Datalog Programs

- Given a Datalog program and a set of facts for each EDB predicate,
we can iteratively find the set of facts defined for each IDB predicate.
We start with the assumption that each IDB is initially empty, and then
apply the rules inferring new facts whenever the rules require us to do so.
We keep applying this process until no more changes occur to any IDB
predicate. The process is similar to the iterative algorithm for
data-flow analysis.
- See example 12.13, p. 924, for computing reaching definitions using Datalog.

## 5. Reading

aho@cs.columbia.edu