For most hardware instruction schedulers, all speculatively executed instructions must be discarded when a conditional branch is mispredicted. This constraint is, however, unnecessarily strict. Consider the following simple example:
if (a < 0) b = 1; c = 2;While the assignment b = 1 is executed only if a < 0, the assignment c = 2 is always executed regardless of the value of a. We say that b = 1 is control dependent on the condition a < 0 and that c = 2 is control independent. We refer to the branch on which an instruction is control dependent as its control dependence branch.
A hardware scheduler generally cannot determine which instructions are control independent. In this example, suppose the scheduler speculatively executes c = 2 before the condition is resolved. If the branch is mispredicted, the hardware must discard the assignment to c only so that it can repeat the identical assignment later. A compiler can compute the control dependence and eliminate this inefficiency. More generally, control dependence analysis allows instructions to be moved across many branches. By expanding the range of code from which parallelism can be extracted, control dependence analysis increases the available parallelism.
Control dependence is an intuitively simple concept. For block-structured programs, the immediate control dependence for an instruction is simply the condition in the closest enclosing control construct. For example, in the following code,
for (i = 0; i < 100; i++) if (A[i] > 0) foo(); bar();the call of the foo function is control dependent on the condition A[i] > 0, which in turn is control dependent on the loop exit condition i < 100. The bar function is not control dependent on any part of the loop and can execute in parallel with the loop, assuming there is no data dependence between foo and bar.
Control dependences in programs with arbitrary control flow can easily be computed in a compiler using the reverse dominance frontier algorithm . Hardware techniques for analyzing control dependences have also been considered , but they can only detect a small subset of the control independent instructions and require complex hardware.