Previous Up Next

Chapter 9  Optimization of the BOL representation

Most optimizations in the Moby compiler are performed on the BOL representation described in Chapter 6. Currently, the optimizer performs the following phases:

9.1  Variable substitutions

9.2  Census

The Census module provides code to compute information about the number of uses of variables. This information is used by the various transformation phases. Specifically, the census computes a use count for every BOL variable, which reflects the number of non-binding occurances of the variable, and an application count for every variable that is bound as a function or continuation.

The Census module also provides functions for adjusting the census data when code is eliminated or replicated.

9.3  Denesting

We often find it useful to introduce extra levels of block (or let) nesting when transforming BOL. Unfortunately, such extra structure hides data dependencies from simple analysis, such as that used by the contraction phase (see Section 9.4). For example, consider the following BOL fragment:
{ let a0 = { let b1 = "blah" let x3 = { let y4 = { let z5 = True in (z5) } in y4 } in x3 } in a0 }
Without data-flow analysis, the fact that a0 is True is obscured by the nesting of blocks. The denesting transformation flattens unnecessary block-nesting.1 For example, the above fragment is reduced to the following by the denesting phase:
{ let b1 = "blah" let z5 = True let y4 = z5 let x3 = y4 let a0 = x3 in a0 }
The contraction phase can easily reduce this fragment to
{ let z5 = True in z5 }
in one pass.

The denesting transformation is implemented in the BOLUtil module (file bol-util.sml) and takes time linear in the size of the expression. It can be either applied to an entire expression or just to the top of an expression.

For the Mandelbrot program (about 50 lines of Moby code), applying the denesting transformation prior to contraction eliminated 80 let-float transformations and reduced the number of contraction iterations over two phases from 9+3 to 2+2.

9.4  Contraction

The Contract module implements a collection of transformations that are guaranteed to shrink the size of the program.

9.5  Cross-module inlining

9.6  Eta splitting

The EtaSplit module applies h-expansion to escaping functions that have known call sites. This transformation is useful, because it turns known calls to escaping functions into known calls to known functions, which allows useless variable elimination and specializing of calling conventions. The h-expansion transformation is only applied to those escaping functions that have known call sites. Recall that escaping functions are those that either are marked as exported or have a use count that is greater than their application count. If a function has a non-zero application count and is escaping, then we split it into two functions.

9.7  Useless varible elimination

The Useless structure is responsible for removing variables that do not contribute to the computation (even though they may be used). This phase proceeds in two steps. First, we compute the set of useful variables of the program. A variable is defined to be useful if it satisfies one of the following conditions: The last three of these conditions propagate usefulness backwards through the dataflow and require that we iterate the analysis until a fixed point is reached.

Once the set of useful variables has been computed, we can transform the program to eliminate useless variable (i.e., those not in the set). This transformation is implemented by walking the tree looking for binding occurances of useless variables and eliminating them.

9.8  Small constant copying

Because the BOL representation does requires that all operands be variables, normal optimizations may result in sharing of small constants (i.e., binding a BOL variable to a small constant that is used in multiple contexts). This situation can result in poor code, since shared small constants will not be used as immediate operands in the generated machine instructions. A related problem is when a small constant appears as a free variable in a function. To address these problems, the constant copying phase replicates variables that are bound to small constants at the site of their use. For example, consider the following BOL code fragment:
let x : int = 1 fun f (y : int) { let t = y + x ret t } let z = f(x) ...
The constant copying phase will rewrite this fragment as follows:
fun f (y : int) { let x1 : int = 1 let t = y + x1 ret t } let x2 : int = 1 let z = f(x2) ...

9.9  Cluster conversion

The cluster phase takes care of both closure conversion and cluster merging, as well as the LCPS transformation []. It works by first performing a series of analyses to determine where CPS splits are needed, how to map functions to clusters, closure representation, etc.. It then uses the information collected during these analyses to drive a single transformation phase that produces the BOL cluster representation.
1
It is essentially a restricted form of the let-floating transformation described by Peyton Jones, Paqrtain and Santos [PPS96].

Previous Up Next