Translating C to C: Control Structures

Here we consider how C statements are implemented on a lower-level architecture such as the LC-3 or PIC. For now, we are going to ignore the problem of evaluating C expressions. We are just going to describe how a C program with loops and conditionals can be transformed into a C program with goto’s and one simple form of C’s if statement.

The if statement

The if statement has the following form where statement represents the body of the if statement.

if (expression)
  statement

This statement can be transformed into a sequence of C statements similar to the following:

  int δ = expression ;
  if (δ == 0) goto λ ;
  statement
λ :

In this code segment, δ represents a special C variable that is generated for the purpose of this translation. The variable name must be chosen so that it is unique and cannot conflict with any legal variable name. Greek letters, sometimes suffixed with unique integers, will do well for this purpose.

Similarly, λ represents a uniquely generated name for a location.
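
As a concrete instance of this pattern, the sketch below applies it to a one-line if that negates a negative number. The function names are made up for this illustration, and the ASCII identifiers delta and lambda stand in for δ and λ, on the assumption that not every compiler accepts Greek letters in identifiers.

```c
/* Original form: uses a structured if statement. */
int abs_structured(int x)
{
    if (x < 0)
        x = -x;
    return x;
}

/* Translated form: only "if (delta == 0) goto lambda;" remains. */
int abs_translated(int x)
{
    int delta = (x < 0);          /* delta plays the role of δ */
    if (delta == 0) goto lambda;  /* lambda plays the role of λ */
    x = -x;
lambda:
    return x;                     /* older C standards want a statement after a label */
}
```

Both functions should return the same value for any x whose negation fits in an int.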

The simple one-line if statement used in our example would be familiar to FORTRAN IV programmers.

if (δ == 0) goto λ ;

All processors can perform a simple branch like this if in one or two instructions. Usually, δ must be loaded to set the processor’s condition codes. In a simple machine language like that of the LC-3, the branch is then performed in a single instruction, such as “BRZ  λ”. On an Intel computer, it would be something like “je  λ”. On recent PIC processors, “BRA  Z, λ” should do the trick.

Processors with “skip” instructions, such as smaller PICs, will need a two-instruction sequence. The second instruction performs the branch and the first instruction decides if the second should be skipped.

     btsc   SR,Z        ;; skip the next instruction if Z is clear
     goto   λ

A nested example

These rules must be applied recursively to translate a program. Suppose we have been asked to translate the following more complex C if statement.

if (n % 4 == 0) {
  ++julianLeap ;
  if (n % 400 == 0 || n % 100 != 0) {
    ++gregorianLeap ;
  }
}

In this case, two unique data variables, δ1 and δ2, along with two unique labels, λ1 and λ2, would be needed.

The translated code would look something like the following.

  δ1 = ( n % 4 == 0 ) ;
  if (δ1 == 0) goto λ1 ;
  ++julianLeap ;
  δ2 = ( n % 400 == 0 || n % 100 != 0 ) ;
  if (δ2 == 0) goto λ2 ;
  ++gregorianLeap ;
λ2 :
λ1 :

Notice how the inner if is translated inside the outer if.
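
To check the nested translation by running it, the fragment can be wrapped in a function, as in the sketch below. The function name count_leap_translated is an assumption of this example, delta1, delta2, lambda1 and lambda2 stand in for δ1, δ2, λ1 and λ2, and the two counters are declared as globals only to keep the sketch short.

```c
/* Counters named in the example above. */
int julianLeap = 0;
int gregorianLeap = 0;

/* Translated form of the nested if statement. */
void count_leap_translated(int n)
{
    int delta1, delta2;

    delta1 = (n % 4 == 0);
    if (delta1 == 0) goto lambda1;   /* skip the whole outer body */
    ++julianLeap;
    delta2 = (n % 400 == 0 || n % 100 != 0);
    if (delta2 == 0) goto lambda2;   /* skip only the inner body */
    ++gregorianLeap;
lambda2:
lambda1:
    return;
}
```

Calling the function with 1900 should bump only julianLeap, while 2000 should bump both counters.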

Anything else?

The if-else can be translated by adding two labels in the code. For example, consider the following rather abstract C code.

if (expression)
  statement1
else
  statement2

It could be transformed as follows:

  int δ = expression ;
  if (δ == 0) goto λ1 ;
  statement1
  goto λ2 ;
λ1 :
  statement2
λ2 :

So the following example

if (a > b)
  m = a ;
else
  m = b ;

would be changed to

  int δ = (a > b) ;
  if (δ == 0) goto λ1 ;
  m = a ;
  goto λ2 ;
λ1 :
  m = b ;
λ2 :
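
Wrapped in a function, the translated if-else becomes something that can be compiled and tried. This is only a sketch: the name max_translated is hypothetical, and delta, lambda1 and lambda2 stand in for δ, λ1 and λ2.

```c
/* Translated form of: if (a > b) m = a; else m = b; */
int max_translated(int a, int b)
{
    int m;
    int delta = (a > b);
    if (delta == 0) goto lambda1;  /* condition false: go to the else part */
    m = a;
    goto lambda2;                  /* jump over the else part */
lambda1:
    m = b;
lambda2:
    return m;
}
```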

The switch

The switch statement isn’t pretty, so you can’t expect its transformation to be easily explained. However, the switch can be viewed as a series of if choices which select the target code for each choice. The break statements are replaced with goto’s to the end of the switch.

Let’s do an example using the following silly C code.

switch (sizeNum) {
case '0':
case '1':
  sizeChar = 's' ;
  break ;
case '2':
  sizeChar = 'm' ;
  break ;
default:
  sizeChar = 'l' ;
}

This can be expressed switch-less as:

  if (sizeNum == '0' || sizeNum == '1')
    goto λ1 ;
  else if (sizeNum == '2')
    goto λ2 ;
  else
    goto λ3 ;

λ1:
  sizeChar = 's' ;
  goto λ4 ;

λ2:
  sizeChar = 'm' ;
  goto λ4 ;

λ3:
  sizeChar = 'l' ;

λ4:
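
One way to gain confidence in a translation like this is to run the switch and its goto form side by side. In the sketch below the function names are hypothetical, lambda1 through lambda4 stand in for λ1 through λ4, and the comparisons use the same character constants as the case labels of the switch.

```c
/* Original form, using switch. */
char size_switch(char sizeNum)
{
    char sizeChar;
    switch (sizeNum) {
    case '0':
    case '1':
        sizeChar = 's';
        break;
    case '2':
        sizeChar = 'm';
        break;
    default:
        sizeChar = 'l';
    }
    return sizeChar;
}

/* Switch-less form: a chain of tests selects the target label. */
char size_translated(char sizeNum)
{
    char sizeChar;

    if (sizeNum == '0' || sizeNum == '1')
        goto lambda1;
    else if (sizeNum == '2')
        goto lambda2;
    else
        goto lambda3;

lambda1:
    sizeChar = 's';
    goto lambda4;      /* replaces break */
lambda2:
    sizeChar = 'm';
    goto lambda4;      /* replaces break */
lambda3:
    sizeChar = 'l';    /* default falls out of the switch */
lambda4:
    return sizeChar;
}
```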

Iterative statements

Translating the while isn’t hard. The code begins with a test that evaluates the continuation condition and exits the loop if it is false. At the end of the loop is a goto back to the beginning.

Consider the following abstract loop

while (expression)
  statement

It can be translated into if-controlled code as:

λ1:
  if (! expression) goto λ2;
  statement
  goto λ1 ;
λ2:

However, many compilers generate the following instead. It executes only one branch per iteration rather than two, which makes the loop a tad faster.

  goto λ2 ;
λ1:
  statement
λ2:
  if (expression) goto λ1 ;
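
Both layouts can be tried on a small concrete loop. The sketch below translates a loop of the form while (n != 0) { ++count ; n = n / 10 ; }, which counts decimal digits. The loop, the function names, and the ASCII labels lambda1 and lambda2 (standing in for λ1 and λ2) are all assumptions chosen for this illustration.

```c
/* Test at the top: one conditional branch plus one goto per iteration. */
int count_digits_test_first(int n)
{
    int count = 0;
lambda1:
    if (!(n != 0)) goto lambda2;
    ++count;
    n = n / 10;
    goto lambda1;
lambda2:
    return count;
}

/* Test at the bottom: one initial goto, then one branch per iteration. */
int count_digits_test_last(int n)
{
    int count = 0;
    goto lambda2;
lambda1:
    ++count;
    n = n / 10;
lambda2:
    if (n != 0) goto lambda1;
    return count;
}
```

The two functions should agree for every input, including 0, where the body never runs.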

Fortunately for us, the for is often described using the while. Thus a for statement like the following:

for(init ; condition ; increment)
  statement

can be translated into the following while statement. Be sure to put the increment after the body of the loop.

init ;
while(condition) {
  statement
  increment ;
}

And thus the following two sections of C code do the same thing.

for(i=0 ; i<10 ; ++i)
  sum = sum + i ;

i=0 ;
while(i<10) {
  sum = sum + i ;
  ++i ;
}
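
Combining the for-to-while rule with the while translation gives a loop built only from goto’s and the simple if. The sketch below does this for the summing loop. The function names are hypothetical, sum is initialized to 0 so that the result is well defined, and lambda1 and lambda2 stand in for λ1 and λ2.

```c
/* Original form, using for. */
int sum_for(void)
{
    int i, sum = 0;
    for (i = 0; i < 10; ++i)
        sum = sum + i;
    return sum;
}

/* The same loop after both translation steps. */
int sum_goto(void)
{
    int i, sum = 0;
    i = 0;                        /* init */
lambda1:
    if (!(i < 10)) goto lambda2;  /* condition */
    sum = sum + i;                /* statement */
    ++i;                          /* increment, placed after the body */
    goto lambda1;
lambda2:
    return sum;
}
```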

We’re going to leave the do-while as an exercise for the reader. It’s really not hard. Just make the test at the end of the loop rather than the beginning.

Taking a break or continuing

If a break statement is used inside a loop statement, it needs to be replaced by a goto that leaves the loop, without performing any of the tests for continuing the loop.

If a continue statement is used inside a while statement, it needs to be replaced by a goto that branches to the beginning of the loop. In this case, the tests for continuing the loop must be performed. A continue within a for statement should be replaced by a goto to the code where the for loop increment statement is performed.
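
These rules can be seen together in a for loop that contains both a break and a continue. The sketch below is an illustration under assumptions: the function names, the array argument, and the labels loop_test, loop_incr and loop_exit are invented for this example, and the inner tests are left as plain if ... goto statements rather than being expanded through a δ variable.

```c
/* Original form: sums the odd values up to the first negative value. */
int sum_odd_structured(const int *v, int n)
{
    int i, sum = 0;
    for (i = 0; i < n; ++i) {
        if (v[i] < 0) break;          /* stop at the first negative value */
        if (v[i] % 2 == 0) continue;  /* skip even values */
        sum = sum + v[i];
    }
    return sum;
}

/* Translated form: break and continue become goto's. */
int sum_odd_translated(const int *v, int n)
{
    int i, sum = 0;
    i = 0;
loop_test:
    if (!(i < n)) goto loop_exit;
    if (v[i] < 0) goto loop_exit;       /* break: leave without retesting */
    if (v[i] % 2 == 0) goto loop_incr;  /* continue: go to the increment */
    sum = sum + v[i];
loop_incr:
    ++i;
    goto loop_test;
loop_exit:
    return sum;
}
```

Note that the continue jumps to loop_incr rather than loop_test. Jumping straight to the test would skip the increment and the loop would never advance.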