Translating C to LC-3: Variables

Variable Allocation

The textbook’s sketchy rules for variable allocation on the LC-3 are found in Sections 12.5 to 12.6.3 (pp. 326-336).

Variables are stored according to their duration.

Variable size

The number of addresses allocated to a variable will depend on the computer architecture.

C type     LC-3 16-bit   Intel 32-bit   Intel 64-bit
char       1             1              1
int        1             4              4
type *     1             4              8
char[10]   10            10             10
int[10]    10            40             40
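As a rough check of the table on whatever machine you have at hand, here is a minimal sketch that collects sizeof for the same five types. The struct and function names are made up for illustration. Keep in mind that sizeof counts bytes, while the LC-3 column counts 16-bit words, so the LC-3 numbers cannot be reproduced this way.

```c
#include <stddef.h>

/* Hypothetical record: sizes, in bytes, of the types from the table
   on the machine that compiles this file. */
struct type_sizes {
    size_t char_size;
    size_t int_size;
    size_t pointer_size;
    size_t char_array_size;   /* char[10] */
    size_t int_array_size;    /* int[10]  */
};

struct type_sizes measure_type_sizes(void)
{
    struct type_sizes s;
    s.char_size       = sizeof(char);      /* always 1 by definition */
    s.int_size        = sizeof(int);       /* typically 4 on Intel   */
    s.pointer_size    = sizeof(int *);     /* typically 4 or 8       */
    s.char_array_size = sizeof(char[10]);  /* 10 * sizeof(char)      */
    s.int_array_size  = sizeof(int[10]);   /* 10 * sizeof(int)       */
    return s;
}
```

Calling measure_type_sizes() from main and printing the fields with printf("%zu") should reproduce one of the Intel columns on a typical PC.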

Symbol table

Sometimes the same name will be given to variables with different scopes. For example, the name of a global variable may also be used as the name of a local variable. In these cases, the two variables must be considered distinct. This means that the symbol table must identify the variable by both its name and its scope. However, for simplicity, we will omit the scope in our examples.
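Here is a small sketch of the situation described above: one name, two scopes, two distinct variables. The names count, shadow_demo and global_count are invented for this example.

```c
int count = 10;                 /* global: static duration */

int shadow_demo(void)
{
    int count = 3;              /* local: a different variable, same name */
    count += 1;                 /* changes only the local count */
    return count;
}

int global_count(void)
{
    return count;               /* no local in scope, so this is the global */
}
```

A compiler needs two symbol table entries for count here, one per scope, because the two variables live in different places.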

Pointless function

int A[100] ;

int addEm(int startHere, int endHere) {
  int i, sum ;
  sum = 0 ;
  for (i=startHere; i<endHere; ++i) {
    sum += A[i]++ ;
  }
  return sum ;
}

Pointless function symbol table

variable    duration    offset   size
A           static      15       100
endHere     automatic   4        1
i           automatic   0        1
startHere   automatic   5        1
sum         automatic   -1       1
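One way to picture this table inside a compiler is as an array of records searched by name. The sketch below is only an illustration: the type and function names are assumptions, the scope field is omitted as in the examples, and the base registers follow the code later in these notes (R4 for static variables, R5 for automatic ones).

```c
#include <stddef.h>
#include <string.h>

enum duration { STATIC_DURATION, AUTOMATIC_DURATION };

struct symbol {
    const char   *name;
    enum duration duration;
    int           offset;   /* words from R4 (static) or R5 (automatic) */
    int           size;     /* words */
};

/* The symbol table for the pointless function. */
static const struct symbol symbol_table[] = {
    { "A",         STATIC_DURATION,    15, 100 },
    { "endHere",   AUTOMATIC_DURATION,  4,   1 },
    { "i",         AUTOMATIC_DURATION,  0,   1 },
    { "startHere", AUTOMATIC_DURATION,  5,   1 },
    { "sum",       AUTOMATIC_DURATION, -1,   1 },
};

/* Linear search by name; returns NULL if the name is not in the table. */
const struct symbol *find_symbol(const char *name)
{
    size_t n = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t k = 0; k < n; ++k) {
        if (strcmp(symbol_table[k].name, name) == 0)
            return &symbol_table[k];
    }
    return NULL;
}

/* Register number used as the base for LDR/STR: 4 for R4, 5 for R5. */
int base_register(const struct symbol *s)
{
    return s->duration == STATIC_DURATION ? 4 : 5;
}
```

For example, find_symbol("sum") yields offset -1 and base_register gives 5, which matches an access through R5 at offset -1.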

The real world

The textbook’s way of handling global variables isn’t really feasible because global variables may be accessed from several programs (or compilation units). In the real world, the best the compiler can do is generate a table (or a list) with an entry for each accessed global variable. If the global variable is allocated within the compilation unit, i.e., is not declared with extern, the compiler may be able to allocate memory for the variable.

The linker is given the task of completing these global references. The linker determines where the global variables are really located and performs relocations on the global variable references within the compiled code. One way to do this is to maintain a global offset table (GOT) for each function that contains the addresses of the function’s global variables. In this case, global variable access involves a hidden pointer access.
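The “hidden pointer access” can be imitated in plain C. The sketch below is not how any particular toolchain lays out its GOT; it is a toy stand-in in which an array of pointers plays the role of the table, and every name in it (total, limit, got, GOT_TOTAL, GOT_LIMIT) is invented for illustration.

```c
int total = 7;                      /* hypothetical global variables */
int limit = 42;

/* Toy stand-in for a global offset table: one address slot per global.
   In a real system the linker or loader fills in these slots. */
static int *got[] = { &total, &limit };

enum { GOT_TOTAL = 0, GOT_LIMIT = 1 };

/* Reading a global: fetch its address from the table, then dereference. */
int read_limit(void)
{
    return *got[GOT_LIMIT];
}

/* Writing a global goes through the same extra pointer. */
void bump_total(void)
{
    *got[GOT_TOTAL] += 1;
}
```

The compiled code only needs to know a slot number, so the real address of each variable can be settled later without touching the instructions themselves.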

However, when dynamic linking is used, the addresses of global variables may not be known until the application is loaded into memory. In this case, the loader must perform relocations before it branches to the compiled code.


Variable access in LC-3

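Variables are read with LDR and written with STR, using a base register chosen by the variable’s duration (R4 for static variables, R5 for automatic variables) and the offset found in the symbol table. Because an array name evaluates to the address of its first element rather than to a stored value, arrays require special handling.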

Reading the value (rvalue) of a variable

sum        LDR  R0,R5,#-1
endHere    LDR  R0,R5,#4
A[3]       LDR  R0,R4,#18
A          ADD  R0,R4,#15

Note that the “value” of A is really the address of its first element, so it is computed with ADD rather than loaded with LDR.

If a variable’s offset from its data pointer (R4 or R5) does not fit in the 6-bit offset field of LDR and STR, that is, it lies outside the range -32 to 31, three instructions are required to access its value. In the following example, the automatic variable x has an offset of -50.

        LD   R0,xOFFSET           ;; R0  :=  -50
        ADD  R0,R0,R5             ;; R0  :=  &x
        LDR  R0,R0,#0             ;; R0  :=  x
; ........
xOFFSET .FILL   #-50
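The limit comes from the instruction format: LDR and STR hold their offset in a 6-bit two's-complement field. The sketch below, with function names made up for this example, checks whether an offset fits and shows how a single LDR would be packed into a 16-bit word.

```c
#include <stdint.h>

/* A 6-bit two's-complement field holds -32..31. */
int fits_offset6(int offset)
{
    return offset >= -32 && offset <= 31;
}

/* Encode LDR DR,BaseR,#offset as a 16-bit LC-3 instruction word.
   Layout: opcode 0110 | DR (3 bits) | BaseR (3 bits) | offset6.
   The caller is expected to check fits_offset6(offset) first. */
uint16_t encode_ldr(unsigned dr, unsigned base, int offset)
{
    return (uint16_t)((0x6u << 12)
                      | ((dr & 0x7u) << 9)
                      | ((base & 0x7u) << 6)
                      | ((unsigned)offset & 0x3Fu));
}
```

With these, fits_offset6(-1) is true and encode_ldr(0, 5, -1) gives 0x617F, while fits_offset6(-50) is false, which is why x needs the three-instruction sequence.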

Array indexes also require multiple instructions.

        ADD  R0,R4,#15            ;; R0  :=  A
        LDR  R1,R5,#0             ;; R1  :=  i
        ADD  R0,R0,R1             ;; R0  :=  A+i or &A[i]
        LDR  R0,R0,#0             ;; R0  :=  A[i]
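To see what the four instructions compute, here is a toy simulation in C. It assumes a word-addressed memory array named mem and takes the contents of R4 and R5 as parameters; the names are illustrative, and any particular R4 and R5 values used with it are arbitrary, not real LC-3 layout.

```c
#include <stdint.h>

#define MEM_WORDS 65536u

int16_t mem[MEM_WORDS];                 /* toy word-addressed memory */

/* Mirrors the four-instruction sequence for reading A[i]:
   A lives at offset 15 from R4, i at offset 0 from R5. */
int16_t read_A_i(uint16_t r4, uint16_t r5)
{
    uint16_t r0 = (uint16_t)(r4 + 15);  /* ADD R0,R4,#15 : R0 = A     */
    uint16_t r1 = (uint16_t)mem[r5];    /* LDR R1,R5,#0  : R1 = i     */
    r0 = (uint16_t)(r0 + r1);           /* ADD R0,R0,R1  : R0 = &A[i] */
    return mem[r0];                     /* LDR R0,R0,#0  : R0 = A[i]  */
}
```

Because each int occupies one word on the LC-3, the index is added directly to the array address with no scaling.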

Obtaining the address (lvalue) of a variable

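&sum       ADD  R0,R5,#-1
&endHere   ADD  R0,R5,#4
&A[3]      ADD  R0,R4,#15  then  ADD  R0,R0,#3

The immediate field of ADD is only 5 bits (-16 to 15), so the offset 18 of A[3] cannot be added in one instruction and is split into 15 + 3.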

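There is no separate code for &A: in C, &A has a different type from A (pointer to the whole array) but the same address, so the ADD shown earlier for A serves.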

To obtain the address of a variable that is too far from its data pointer for a single instruction, or of an array element such as A[i], just omit the final LDR from the two code sequences shown above.
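On the C side, the address of an array element is just the array address plus the index, which is why the LC-3 code is nothing more than additions. A minimal sketch, with helper names invented for this example:

```c
#include <stddef.h>

int A[100];

/* Number of elements between the start of A and &A[k]. */
ptrdiff_t elements_from_start(size_t k)
{
    return &A[k] - A;
}

/* &A[k] and A + k name the same address. */
int same_address(size_t k)
{
    return &A[k] == A + k;
}
```

C pointer arithmetic counts in elements, so elements_from_start(3) is 3 on any machine, even though the byte distance depends on sizeof(int).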

Dereferencing a variable

If we want to execute a C statement such as “*p = v ;”, we need to first get p and v into registers. Let’s say p is in R2 and v is in R3. Then the C statement can be accomplished with the instruction “STR  R3,R2,#0”. If you wanted to go in the other direction, i.e., “v = *p ;”, then use “LDR  R3,R2,#0”.
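For reference, here are the two C statements wrapped in tiny functions, with the corresponding LC-3 instruction noted beside each. The function names are made up, and the register assignment (p in R2, v in R3) is the one assumed in the paragraph above.

```c
/* *p = v ;   with p in R2 and v in R3:  STR  R3,R2,#0 */
void store_through(int *p, int v)
{
    *p = v;
}

/* v = *p ;   with p in R2, result in R3:  LDR  R3,R2,#0 */
int load_through(const int *p)
{
    return *p;
}
```

After store_through(&x, 5) for some int x, load_through(&x) returns 5.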