Translating C to C — Easy expressions

Expression evaluation is the process of translating expressions, such as the following, into machine instructions.

y = a*x*x + b*x + c ;

The classic, though rare, method

In some second-semester programming and data structures courses, expression evaluation is often presented as an example of programming with stacks. For example, an expression such as a*x*x + b*x + c is parsed into the equivalent postfix, or Reverse Polish, notation, in this case a x x * * b x * + c +. In many computer organization textbooks, this is the first step toward generating stack-based assembly code similar to the following:

    PUSH   a
    PUSH   x
    PUSH   x
    MULT
    MULT
    PUSH   b
    PUSH   x
    MULT
    PLUS
    PUSH   c
    PLUS

In a stack-based computer like the Burroughs B5000 from the 1960s, an instruction like MULT removes the top two elements of the stack and replaces them with their product. Stack-based scientific calculators, such as the early HP-35, operate similarly. The popular PDF file format is also largely built on the stack-based PostScript programming language.
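
To make the stack discipline concrete, the listing above can be simulated in a few lines of C. This is only a sketch: the names push, pop, mult, plus, and eval_quadratic, the fixed stack depth, and the use of int operands are assumptions made for illustration, not a model of any real machine.

```c
#include <assert.h>

enum { STACK_MAX = 16 };   /* arbitrary depth, enough for this sketch */
static int stack[STACK_MAX];
static int sp;             /* number of values currently on the stack */

static void push(int v) { assert(sp < STACK_MAX); stack[sp++] = v; }
static int  pop(void)   { assert(sp > 0); return stack[--sp]; }

/* MULT and PLUS remove the top two values and push the result. */
static void mult(void) { int r = pop(); int l = pop(); push(l * r); }
static void plus(void) { int r = pop(); int l = pop(); push(l + r); }

/* Mirrors the PUSH/MULT/PLUS listing for a*x*x + b*x + c. */
int eval_quadratic(int a, int b, int c, int x) {
    sp = 0;
    push(a);
    push(x);
    push(x);
    mult();
    mult();
    push(b);
    push(x);
    mult();
    plus();
    push(c);
    plus();
    return pop();          /* the single remaining value is the result */
}
```

Calling eval_quadratic(2, 3, 4, 5) walks through the same eleven steps as the listing and leaves 2*5*5 + 3*5 + 4 = 69 as the only value on the stack before it is popped and returned.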

Expression evaluation with stacks can be used on even the most modern computer architectures, such as the x86-64, which has push and pop instructions. However, push and pop are usually used only for passing arguments to functions. When evaluating expressions, temporary “variables” stored in high-speed registers are used to hold intermediate values. Furthermore, C and Java have some expressions that are difficult to evaluate on a stack. For example, in evaluating something like i < 0 || A[i] == 0, you can’t put i < 0 and A[i] == 0 on the stack and then perform the logical OR operation, because you shouldn’t even attempt the evaluation of A[i] == 0 when i < 0 is true.
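
One way to see why a plain stack is not enough here is to write out the control flow that || requires. The sketch below, with a hypothetical function name and a temporary r, tests i < 0 first and jumps past the array access when the answer is already known.

```c
/* Hypothetical name; r plays the role of a compiler temporary. */
int negative_or_zero_at(const int *A, int i) {
    int r;
    r = i < 0;            /* left operand of || */
    if (r) goto done;     /* already true: A[i] must not be touched */
    r = A[i] == 0;        /* evaluated only when i >= 0 */
done:
    return r;             /* always 0 or 1 */
}
```

Because of the early jump, the function is safe to call with a negative index: the array is never read in that case.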

The C-to-C solution

Instead of searching for an automatic solution to expression evaluation, we’ll try an ad hoc approach in which you translate most complex expressions into a sequence of simple assignments where only one operator appears on the right-hand side. To do this we’ll need made-up variable names, which we’ll call τi. Most of the τi will be stored in registers.
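
As a sketch of what this looks like, the opening statement y = a*x*x + b*x + c could be rewritten as below. The names t1 through t4 are ASCII stand-ins for the τi, and the int types and the wrapper function quadratic are assumptions added so the fragment compiles on its own.

```c
/* Each assignment applies exactly one operator. */
int quadratic(int a, int b, int c, int x) {
    int t1, t2, t3, t4, y;
    t1 = a * x;      /* a*x                            */
    t2 = t1 * x;     /* (a*x)*x, C groups * left to right */
    t3 = b * x;      /* b*x                            */
    t4 = t2 + t3;    /* a*x*x + b*x                    */
    y  = t4 + c;     /* a*x*x + b*x + c                */
    return y;
}
```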

For a while, we’re going to ignore most of those complex C expressions that involve lvalues (locations). This means you are not going to see pointers, the & operator, or structures here. Those appear in Chapter 6.

Parsing

You will need to parse your code. This means you must pay attention to your programming language’s rules of precedence and associativity to know the order in which operators are applied. In a real compiler, this part of the task is usually done with code generated by a compiler-compiler such as yacc or bison.
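
To illustrate how precedence rules drive the order of operations, here is a minimal recursive-descent sketch that converts infix to postfix. It assumes single-character operands, only the operators + - * / and parentheses, no whitespace, and well-formed input. The function names are made up for this example, and a parser generated by yacc or bison would look quite different.

```c
#include <string.h>

static const char *src;   /* next unread character of the infix input */
static char out[128];     /* postfix result, NUL-terminated */
static size_t n;          /* number of characters written to out */

static void emit(char ch) {
    if (n + 1 < sizeof out) { out[n++] = ch; out[n] = '\0'; }
}

/* True if ch is one of the operator characters in ops. */
static int is_op(char ch, const char *ops) {
    return ch != '\0' && strchr(ops, ch) != NULL;
}

static void expr(void);

/* factor: an operand or a parenthesized expression */
static void factor(void) {
    if (*src == '(') {
        src++;
        expr();
        if (*src == ')') src++;
    } else if (*src != '\0') {
        emit(*src++);
    }
}

/* term: factors joined by * or /, which bind tighter than + and - */
static void term(void) {
    factor();
    while (is_op(*src, "*/")) {
        char op = *src++;
        factor();
        emit(op);         /* the operator follows both of its operands */
    }
}

/* expr: terms joined by + or -, applied left to right */
static void expr(void) {
    term();
    while (is_op(*src, "+-")) {
        char op = *src++;
        term();
        emit(op);
    }
}

/* Returns a pointer to a static buffer that the next call overwrites. */
const char *to_postfix(const char *infix) {
    src = infix;
    n = 0;
    out[0] = '\0';
    expr();
    return out;
}
```

With these rules, to_postfix("a+b*c") yields abc*+, while to_postfix("(a+b)*c") yields ab+c*: the multiplication is emitted first in one case and last in the other.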

The simple operators

The really simple operators are the arithmetic and bit-wise operators. We’ll also mention function calls in passing, but the function stack will be presented much later.

For example, a statement such as “x = z*sin(f*d) + k” would be translated to a sequence of C statements similar to the following:

τ1 = f * d ;
τ2 = sin(τ1) ;
τ3 = z * τ2 ;
x = τ3 + k;

Notice that there is only one operator on the right-hand side of each statement.

Very simple statements, such as “x = τ3 + k”, can be implemented with a couple of instructions from your target machine’s instruction set. Here’s an example implementation of this statement for the MIPS architecture.

      lw    $t3,τ3
      lw    $t4,k
      add   $t3,$t3,$t4
      sw    $t3,x

In low-cost microprocessors, some operators, such as multiplication or division, may need to be implemented with calls to specialized functions written for your machine architecture. For example, f*d may need to be replaced with something like _MultiplyDouble(f, d) if our computer, like the PIC24, does not support a floating-point multiply operation. Other operators will need to be translated into short sequences of instructions. For instance, a 32-bit addition might be performed as two 16-bit additions, with the carry from the low half added into the high half.
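
The two-halves idea can be sketched in C. The helper below, whose name is hypothetical, adds 32-bit values using only 16-bit additions and propagates the carry by hand, roughly what an add followed by an add-with-carry would do on a 16-bit machine.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit addition built from 16-bit pieces. */
uint32_t add32_via_16(uint32_t a, uint32_t b) {
    uint16_t a_lo = (uint16_t)(a & 0xFFFFu), a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)(b & 0xFFFFu), b_hi = (uint16_t)(b >> 16);

    uint16_t lo    = (uint16_t)(a_lo + b_lo);         /* wraps modulo 2^16 */
    uint16_t carry = (uint16_t)(lo < a_lo);           /* 1 if the low add wrapped */
    uint16_t hi    = (uint16_t)(a_hi + b_hi + carry); /* high halves plus carry */

    return ((uint32_t)hi << 16) | lo;
}
```

For example, adding 1 to 0x0000FFFF wraps the low half to zero and carries into the high half, giving 0x00010000.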

When you implement the relational operators, such as > and ==, you must make sure that these operators return either 0, for false, or 1, for true, as required by the C standard. For example, here is a faithful PIC24 implementation of the C statement “r = x > y ;”.

         CLR         r                     ;; r    <- 0
         MOV         x,WREG                ;; WREG <- x
         SUBR        y,WREG                ;; WREG <- x - y
         BRA         LE,1f                 ;; if x <= y, go to the next 1:
         INC         r                     ;; ++r only if x > y
1:
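
The same structure can be written in C with an explicit branch, which makes it easy to check that the result is always exactly 0 or 1. The function name and label are assumptions for this sketch, and the comparison x <= y stands in for the subtract-and-test-flags pair.

```c
/* Hypothetical name; mirrors the CLR / compare / branch / INC sequence. */
int greater_than(int x, int y) {
    int r = 0;                /* r <- 0 */
    if (x <= y) goto skip;    /* skip the increment when x <= y */
    ++r;                      /* reached only if x > y */
skip:
    return r;                 /* exactly 0 or 1 */
}
```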