Compilers and Translators:

Software Verification Tools

Lecture 13: Applications of Interprocedural Analysis

October 18, 2007

- Review
- Context-sensitive vs. context-insensitive analysis
- Call strings
- Cloning-based context-sensitive analysis
- Summary-based context-sensitive analysis
- Pointer alias analysis
- SQL injections
- Buffer overflows

- Intraprocedural vs. interprocedural analysis
- Call graphs

- Context-insensitive interprocedural analysis treats each call and return statement as goto operations ignoring the order of the calls.
- We can model this kind of analysis with a super control-flow graph (like the PDG) which is a control-flow graph with additional edges connecting each call site to the beginning of the procedure it calls and each return statement to the instruction following the call site.
- A context-sensitive analysis keeps track of the context in which each procedure was called.

```
for (i = 0; i < n; i++) {
c1: t1 = f(0);
c2: t2 = f(243);
c3: t3 = f(243);
X[i] = t1 + t2 + t3;
}
int f(int v) {
return (v+1);
}
```

- A calling context is defined by the contents of the entire call stack.
We refer to the string of call sites on the stack as the
*call string*. - Consider a modification of the above program where the calls to f are now calls to g, which then calls f:

```
for (i = 0; i < n; i++) {
c1: t1 = g(0);
c2: t2 = g(243);
c3: t3 = g(243);
X[i] = t1 + t2 + t3;
}
int g(int v) {
c4: return f(v);
}
int f(int v) {
return (v+1);
}
```

- In cloning-based context-sensitive analysis we create a conceptual clone of the procedure for each context of interest and then apply a context-insensitive analysis to the cloned call graph.
- In practice we do not clone the procedure; we use an internal data structure to keep track of the analysis results on each clone.
- A cloned version of the above program is shown here:

```
for (i = 0; i < n; i++) {
c1: t1 = g(0);
c2: t2 = g(243);
c3: t3 = g(243);
X[i] = t1 + t2 + t3;
}
int g(int v) {
c4.1: return f1(v);
}
int g(int v) {
c4.2: return f2(v);
}
int g(int v) {
c4.3: return f3(v);
}
int f1(int v) {
return (v+1);
}
int f2(int v) {
return (v+1);
}
int f3(int v) {
return (v+1);
}
```

- In summary-based context-sensitive analysis we create a concise description ("summary") of the observable behavior of each procedure.
- The purpose of the summary is to avoid reanalyzing the procedure body at each invoking call site.
- Each procedure is represented by a region with a single entry point.
- A procedure region can be nested within several different outer regions.
- The analysis has two parts:
- A bottom-up phase that computes a transfer function to summarize the effect of a procedure.
- A top-down phase that propagates caller information to compute callee results.
- The following program shows the result of propagating all possible constant arguments to the function f in the program in section (4) above:

```
for (i = 0; i < n; i++) {
c1: t1 = f0(0);
c2: t2 = f243(243);
c3: t3 = f243(243);
X[i] = t1 + t2 + t3;
}
int f0(int v) {
return (1);
}
int f243(int v) {
return (244);
}
```

- We can improve the accuracy of program analyses if we can determine whether two pointers can be aliases of one another.
- If we can determine that p and q cannot point to the same location in the following code fragment, then we can conclude that x is 1 at the end of the third statement.

```
*p = 1;
*q = 2;
x = *p;
```

- Suppose a banking application uses a relation:

```
AcctData(name, password, balance)
```

```
SELECT balance FROM AcctData
WHERE name = ':n' and password = ':p'
```

```
n = Charles Dickens' --
p = who cares
```

```
SELECT balance FROM AcctData
WHERE name = 'Charles Dickens' --' and password = 'who cares'
```

```
SELECT balance FROM AcctData
WHERE name = 'Charles Dickens'
```

- ALSU, Sect. 12.2.

aho@cs.columbia.edu