The textbook's sketchy rules for variable storage on the LC-3 are found in Sections 12.5 to 12.6.3 (pp. 326-336).
Variables are stored according to their duration.
The amount of space allocated to a variable will depend on the computer architecture.
C type | LC-3 (16-bit words) | Intel (bytes) |
---|---|---|
char | 1 | 1 |
int | 1 | 4 |
type * | 1 | 4 (8 on 64-bit) |
char[10] | 10 | 10 |
int[10] | 10 | 40 |
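You can check the Intel column on your own machine. The following is only a sketch, assuming a hosted C compiler. The function names are made up for this example, and the sizes you see for int and for pointers depend on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: storage sizes, in bytes, on the host compiler. */
size_t size_of_char(void)       { return sizeof(char); }
size_t size_of_int(void)        { return sizeof(int); }
size_t size_of_int_ptr(void)    { return sizeof(int *); }
size_t size_of_char_array(void) { return sizeof(char[10]); }
size_t size_of_int_array(void)  { return sizeof(int[10]); }

/* An array always occupies (element count) x (element size). */
void check_array_sizes(void) {
    assert(sizeof(char[10]) == 10 * sizeof(char));
    assert(sizeof(int[10]) == 10 * sizeof(int));
}
```

On the LC-3 the same rule holds, except that the unit is the 16-bit word rather than the byte.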
Automatic variables are addressed relative to R5; static variables are addressed relative to R4.
    int A[100] ;

    int addEm(int startHere, int endHere)
    {
        int i, sum ;
        sum = 0 ;
        for (i=startHere; i<endHere; ++i)
            sum += A[i]++ ;
        return sum ;
    }
variable | duration | offset | size |
---|---|---|---|
A | static | 15 | 100 |
endHere | automatic | 4 | 1 |
i | automatic | 0 | 1 |
startHere | automatic | 5 | 1 |
sum | automatic | -1 | 1 |
The textbook's way of handling global variables isn't really feasible. In the real world, every function (or compilation unit) has a table (or a list) with an entry for each global variable it accesses. The compiler just allocates space for these entries and provides a way for the linker to associate named global variables with them. The linker then places the address of each global variable into its entry.
Effectively, every global variable reference is similar to a pointer reference.
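The table-of-addresses idea can be imitated in C itself. This is a minimal sketch, not how any particular compiler lays things out: the names global_table, SLOT_A and SLOT_COUNT and the global count are made up, and the C initializer stands in for the linker by filling in the addresses.

```c
#include <assert.h>

int A[100];   /* a global array, as in the addEm example */
int count;    /* a second, made-up global */

/* One entry per global the code touches. A real linker would fill
   these slots in; here the initializer does the same job. */
enum { SLOT_A, SLOT_COUNT, NUM_SLOTS };
static void *const global_table[NUM_SLOTS] = { A, &count };

/* Every access to a global goes through its table entry, so it
   looks just like a pointer dereference. */
int read_A(int i) {
    int *a = global_table[SLOT_A];
    assert(i >= 0 && i < 100);
    return a[i];
}

void bump_count(void) {
    int *c = global_table[SLOT_COUNT];
    *c = *c + 1;
}
```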
sum | LDR R0,R5,#-1 |
endHere | LDR R0,R5,#4 |
A[3] | LDR R0,R4,#18 |
A | ADD R0,R4,#15 |
Note that the "value" of A is really the address of its first element: in a C expression, an array name evaluates to the address of the array's first element, so no load is needed.
If a variable is stored too far from its data pointer (R4 or R5) to be reached by the 6-bit signed offset of LDR (-32 to 31 words), three instructions are required to access its value. In the following example, the automatic variable x has an offset of -50.
            LD   R0,xOFFSET     ;; R0 := -50
            ADD  R0,R0,R5       ;; R0 := &x
            LDR  R0,R0,#0       ;; R0 := x
    ;       ........
    xOFFSET .FILL #-50
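Whether the short or the long form is needed comes down to a range check. Here is a small sketch of that check; the function names are made up, but the ranges are those of the LC-3 instruction formats (a 6-bit signed offset for LDR and STR, a 5-bit signed immediate for ADD).

```c
#include <assert.h>

/* LDR and STR carry a 6-bit signed offset: -32 .. 31. */
int fits_offset6(int offset) { return offset >= -32 && offset <= 31; }

/* ADD carries a 5-bit signed immediate: -16 .. 15. */
int fits_imm5(int value) { return value >= -16 && value <= 15; }

/* Hypothetical helper: instructions needed to load a variable
   stored at the given offset from R4 or R5. */
int load_instruction_count(int offset) {
    return fits_offset6(offset) ? 1 : 3;
}
```

Note that the ADD range is the tighter of the two, so an offset that works in an LDR may still be too large for a single ADD.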
Array indexes also require multiple instructions.
    ADD  R0,R4,#15      ;; R0 := A
    LDR  R1,R5,#0       ;; R1 := i
    ADD  R0,R0,R1       ;; R0 := A+i or &A[i]
    LDR  R0,R0,#0       ;; R0 := A[i]
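The same address arithmetic can be written out against a toy memory. This is only a sketch: the array mem, the value chosen for r4, and the function names are assumptions made for the example; only the offset 15 comes from the symbol table.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 256, A_OFFSET = 15, A_LENGTH = 100 };

static int16_t mem[MEM_WORDS];   /* toy word-addressed memory */
static const int r4 = 0;         /* assumed base of the static data area */

/* &A[i]: ADD R0,R4,#15 followed by ADD R0,R0,R1 */
int address_of_A(int i) {
    assert(i >= 0 && i < A_LENGTH);
    return r4 + A_OFFSET + i;
}

/* A[i]: the address computation plus a final LDR R0,R0,#0 */
int16_t load_A(int i) { return mem[address_of_A(i)]; }

/* A[i] = v: the same address, with STR in place of LDR */
void store_A(int i, int16_t v) { mem[address_of_A(i)] = v; }
```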
&sum | ADD R0,R5,#-1 |
&endHere | ADD R0,R5,#4 |
&A[3] | ADD R0,R4,#15 then ADD R0,R0,#3 (18 does not fit ADD's 5-bit immediate, -16 to 15) |
There is no separate rule for &A: it yields the same address as A.
To obtain the address of a variable that is out of range of its data pointer, or the address of an array element, just omit the last instruction (the final LDR) from the corresponding code sequence shown above.
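The address rules can be checked in C directly. This is a sketch with made-up function names. It also shows that, in C, A, &A[0] and &A all refer to the same location even though their types differ.

```c
#include <assert.h>
#include <stddef.h>

int A[100];

/* &A[i] is the array's address plus i elements (an ADD, no LDR). */
int *address_of_element(int i) {
    assert(i >= 0 && i < 100);
    return A + i;
}

/* Distance, in elements, between &A[i] and the start of A. */
ptrdiff_t element_offset(int i) {
    return address_of_element(i) - A;
}

/* A, &A[0], and &A all name the same location. */
int same_start_address(void) {
    return (void *)A == (void *)&A[0] && (void *)&A == (void *)A;
}
```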
If we want to execute a C statement such as "*p = v ;", we need to first get p and v into registers. Let's say p is in R2 and v is in R3. Then the C statement can be accomplished with the instruction "STR R3,R2,#0".
If you wanted to go in the other direction, i.e., "v = *p ;", then use "LDR R3,R2,#0".
Expression evaluation is the process of translating expressions, such as the following, into LC-3 instructions.
sum += A[i]++ ;
eolist = ( p == NULL || *p == '\0' ) ;
Good expression evaluation is hard to do. The textbook's approach is very simple and very incomplete. In Section 10.3 (pp. 264-272), expression evaluation is done with a push-down stack, so that a C statement like "x = y + x*y" would be implemented with something like the following:
    LD   R0,y
    JSR  PUSH
    LD   R0,x
    JSR  PUSH
    LD   R0,y
    JSR  PUSH       ;; stack has [y][x][y]
    JSR  OpMult     ;; stack has [y][x*y]
    JSR  OpAdd      ;; stack has [y+x*y]
    JSR  POP        ;; R0 is [y+x*y]
    ST   R0,x       ;; x = y + x*y
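To see the stack discipline at work, here is the same evaluation as a C sketch. The names push, pop, op_mult and op_add are stand-ins for the PUSH, POP, OpMult and OpAdd routines, and the fixed stack size is an assumption of this example.

```c
#include <assert.h>

enum { STACK_MAX = 16 };
static int stack[STACK_MAX];
static int depth = 0;

void push(int v) { assert(depth < STACK_MAX); stack[depth++] = v; }
int pop(void)    { assert(depth > 0); return stack[--depth]; }

/* Each operator pops two operands and pushes one result. */
void op_mult(void) { int b = pop(); int a = pop(); push(a * b); }
void op_add(void)  { int b = pop(); int a = pop(); push(a + b); }

/* y + x*y, evaluated in the same order as the LC-3 sequence. */
int eval_y_plus_x_times_y(int x, int y) {
    push(y);
    push(x);
    push(y);       /* stack has [y][x][y] */
    op_mult();     /* stack has [y][x*y]  */
    op_add();      /* stack has [y+x*y]   */
    return pop();
}
```

Every operator here works on values only, which is exactly why the scheme runs into trouble with operators that need an address or that must skip an operand.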
This works fine for simple statements, but it just won't do for C. For example, in evaluating A[i]++, you can't put the value of A[i] on the stack and then call something like OpPlusPlus, because ++ needs the address of A[i], not the value of A[i]. Similarly, for something like p == NULL || *p == '\0', you can't put p == NULL and *p == '\0' on the stack and then call OpLogicalOR, because you shouldn't even attempt the evaluation of *p == '\0' when p == NULL is true.
Instead of searching for an automatic solution to expression evaluation, try an ad hoc approach where you translate a complex expression into a sequence of simple assignments where only one operator appears on the right-hand side. You'll need to use made-up variable names to do this. You'll also sometimes need the & operator when an lvalue is expected.
Here are examples of suitable sequences for the earlier C statements. The made-up variable names use Greek letters.
    /* sum += A[i]++ ; */
    int *α ;
    int β ;
    α = &A[i] ;
    β = *α ;
    *α = β + 1 ;
    sum = sum + β ;
    /* eolist = ( p == NULL || *p == '\0' ) ; */
    int α ;
    int β ;
    α = p == NULL ;
    if (α != 0) goto ω ;
    β = *p ;
    α = β == '\0' ;
    ω : if (α != 0) α = 1 ;
    eolist = α ;
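If you want to convince yourself that such a sequence means the same thing as the original statement, compile both. The sketch below does that with ASCII names (alpha, beta, omega) in place of the Greek letters, since not every compiler accepts those in identifiers; the function names are made up. Note that it is the array element, not the pointer, that gets incremented.

```c
#include <assert.h>
#include <stddef.h>

int A[100];

/* sum += A[i]++ ; written directly */
int add_direct(int sum, int i) {
    sum += A[i]++;
    return sum;
}

/* The same statement as a sequence of one-operator assignments. */
int add_lowered(int sum, int i) {
    int *alpha;
    int beta;
    assert(i >= 0 && i < 100);
    alpha = &A[i];
    beta = *alpha;
    *alpha = beta + 1;
    sum = sum + beta;
    return sum;
}

/* eolist = ( p == NULL || *p == '\0' ) ; with explicit gotos */
int eolist_lowered(const char *p) {
    int alpha;
    int beta;
    alpha = p == NULL;
    if (alpha != 0) goto omega;   /* skip *p when p is NULL */
    beta = *p;
    alpha = beta == '\0';
omega:
    return alpha;
}
```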
Better yet, hope your instructor gives simpler problems.
This table gives some suggestions for translating the C code on the left. You need to apply these rules until you get nothing but assignments and gotos.
C code | translation |
---|---|
if (expression) statement | α = expression ; if (α == 0) goto ω ; statement ω : |
if (expression) statement1 else statement2 | α = expression ; if (α == 0) goto ψ ; statement1 goto ω ; ψ : statement2 ω : |
while (expression) statement | goto ω ; ψ : statement ω : α = expression ; if (α != 0) goto ψ ; |
do statement while (expression) ; | ψ : statement α = expression ; if (α != 0) goto ψ ; |
for(init ; condition ; increment) statement | init goto ω ; ψ : statement increment ω : α = condition ; if (α != 0) goto ψ ; |
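As a worked instance of the for rule, here is a small loop before and after translation. It is only a sketch: the functions are made up for the example, and the labels psi and omega stand in for ψ and ω.

```c
#include <assert.h>

/* Ordinary for loop: sum of the integers in [start, end). */
int sum_range(int start, int end) {
    int i, sum = 0;
    for (i = start; i < end; ++i)
        sum += i;
    return sum;
}

/* The same loop after applying the for rule from the table. */
int sum_range_lowered(int start, int end) {
    int i, sum, alpha;
    sum = 0;
    i = start;                 /* init */
    goto omega;
psi:
    sum = sum + i;             /* statement */
    i = i + 1;                 /* increment */
omega:
    alpha = i < end;           /* condition */
    if (alpha != 0) goto psi;
    return sum;
}

/* Both versions must agree for any bounds. */
void check_lowering(int start, int end) {
    assert(sum_range(start, end) == sum_range_lowered(start, end));
}
```

Putting the test at the bottom means each trip around the loop costs one conditional branch instead of a conditional branch plus an unconditional one.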
On entry to the function, all of its arguments have been placed on the top of the stack. The i'th argument is at offset i-1 from register R6. R5 points to the activation record of the calling procedure. In common terminology, R5 is called the dynamic link and R6 is called the stack pointer. R7 will contain the address of the instruction that follows the call. This is the return address.
In the first five instructions of the function, the return address and dynamic link are stored on the stack. One stack slot is also set aside to hold the return value. Then the dynamic link is updated to point to the first local variable. Finally, the stack pointer is adjusted to allocate space for local variables.
If l is the number of local variables used by the function, the first five instructions are as follows:
    FPROLOG ADD  R6,R6,#-3      ;; Allocate stack space for "bookkeeping"
            STR  R7,R6,#1       ;; Store return address of caller
            STR  R5,R6,#0       ;; Store dynamic link of caller
            ADD  R5,R6,#-1      ;; Point dynamic link to first local
            ADD  R6,R6,#(-l)    ;; Point stack pointer to last local
All variables, including function parameters, can be accessed using the offsets of the function symbol table.
When the function exits, the return value must be placed on the stack on top of its arguments. The return value slot is located at offset 3 from R5. The dynamic link and return address must also be restored from the stack. R6, the stack pointer, must also be set to point to the return value. Assuming that R0 contains the return value, here are five instructions that will return from a function.
    FEPILOG STR  R0,R5,#3       ;; Store return value
            ADD  R6,R5,#3       ;; Point stack pointer to return value
            LDR  R7,R5,#2       ;; Restore return address
            LDR  R5,R5,#1       ;; Restore dynamic link
            RET                 ;; Return
Before the function is called, all its arguments must be evaluated. Let's assume this has happened and that the arguments are stored in made-up variables α₁ to αₙ.
The function invocation must then place the n arguments on the stack. The arguments are placed on the stack from last to first. Letting i go from n down to 1, push the arguments using code similar to the following:
    ADD  R6,R6,#-1              ;; Make room on the stack (it grows downward)
    LDR  R0,R5,offset for αᵢ
    STR  R0,R6,#0               ;; Push αᵢ
Control can now be transferred to the called function with a JSR or, more likely, a JSRR instruction.
Function return is pretty simple. The calling function will transfer the return value into a register and then "remove" the return value and the n arguments from the stack by adding n+1 to the stack pointer.
    LDR  R0,R6,#0       ;; Place return value in R0
    ADD  R6,R6,#(n+1)   ;; Remove return value and n arguments
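The caller's side of the protocol can be acted out on a toy stack. This is a sketch under several assumptions: the array mem, the variable r6, and the callee callee_digits are all made up, and the callee models only the net effect of its prologue and epilogue (on return, R6 points at the return value, one word on top of the arguments). The callee is deliberately sensitive to argument order.

```c
#include <assert.h>

enum { STACK_WORDS = 64 };
static int mem[STACK_WORDS];   /* toy stack memory */
static int r6 = STACK_WORDS;   /* stack pointer; the stack grows downward */

/* ADD R6,R6,#-1 followed by STR R0,R6,#0 */
static void push(int v) {
    assert(r6 > 0);
    mem[--r6] = v;
}

/* Hypothetical callee: reads its three arguments as decimal digits.
   The i'th argument is found at offset i-1 from R6 on entry. */
static void callee_digits(void) {
    int value = 0;
    for (int i = 1; i <= 3; ++i)
        value = value * 10 + mem[r6 + (i - 1)];
    push(value);               /* return value lands on top of the arguments */
}

/* Caller side of the protocol for n = 3 arguments. */
int call_digits(int a1, int a2, int a3) {
    int result;
    push(a3);                  /* arguments go on last to first */
    push(a2);
    push(a1);
    callee_digits();           /* JSR */
    result = mem[r6];          /* LDR R0,R6,#0 */
    r6 += 3 + 1;               /* ADD R6,R6,#(n+1) */
    return result;
}
```

Calling call_digits(1, 2, 3) yields 123 and leaves r6 where it started, which is the property the n+1 adjustment is there to guarantee.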
Let's look at an example, a recursive implementation of Euclid's GCD algorithm.
    int GCD(int n, int m)
    {
        if (n==m)
            return n ;
        else if (n<m)
            return GCD(m, n) ;
        else
            return GCD(n-m, m) ;
    }
    int GCD(int n, int m)
    {
        int r, t ;
        t = n-m ;
        if (t==0)
            r = n ;
        else {
            int a1, a2 ;
            if (t<0) {
                a1 = m ;
                a2 = n ;
            } else {
                a1 = t ;
                a2 = m ;
            }
            r = GCD(a1, a2) ;
        }
        return r ;
    }
use | offset |
---|---|
a2 | -3 |
a1 | -2 |
t | -1 |
r | 0 |
dynamic link | 1 |
return address | 2 |
return value | 3 |
n | 4 |
m | 5 |
            .ORIG x4B00
    ;; Prologue -- CREATE THE ACTIVATION RECORD
    GCD     ADD  R6,R6,#-3      ;; Allocate stack space for "bookkeeping"
            STR  R7,R6,#1       ;; Store caller return address
            STR  R5,R6,#0       ;; Store caller dynamic link
            ADD  R5,R6,#-1      ;; Set R5 to first local
            ADD  R6,R6,#-4      ;; Set R6 to last local (#locals is 4)
    ;; t = n-m ;
            LDR  R0,R5,#4       ;; R0 = n
            LDR  R1,R5,#5       ;; R1 = m
            NOT  R1,R1
            ADD  R1,R1,#1
            ADD  R0,R0,R1       ;; R0 = n-m
            STR  R0,R5,#-1      ;; t = R0
    ;; if (t==0)
            LDR  R0,R5,#-1      ;; R0 = t
            BRnp ELSE1
    ;;   r = n ;
            LDR  R0,R5,#4       ;; R0 = n
            STR  R0,R5,#0       ;; r = R0
            BRnzp JOIN1
    ;; else {
    ;;   if (t<0) {
    ELSE1   LDR  R0,R5,#-1      ;; R0 = t
            BRzp ELSE2
    ;;     a1 = m ;
            LDR  R0,R5,#5       ;; R0 = m
            STR  R0,R5,#-2      ;; a1 = R0
    ;;     a2 = n ;
            LDR  R0,R5,#4       ;; R0 = n
            STR  R0,R5,#-3      ;; a2 = R0
            BRnzp JOIN2
    ;;   } else {
    ;;     a1 = t ;
    ELSE2   LDR  R0,R5,#-1      ;; R0 = t
            STR  R0,R5,#-2      ;; a1 = R0
    ;;     a2 = m ;
            LDR  R0,R5,#5       ;; R0 = m
            STR  R0,R5,#-3      ;; a2 = R0
    ;;   }
    ;;   r = GCD(a1, a2) ;
    JOIN2   ADD  R6,R6,#-1      ;; Put a2 on call stack
            LDR  R0,R5,#-3
            STR  R0,R6,#0
            ADD  R6,R6,#-1      ;; Put a1 on call stack
            LDR  R0,R5,#-2
            STR  R0,R6,#0
            JSR  GCD            ;; Recurse!
            LDR  R0,R6,#0       ;; R0 = return value
            ADD  R6,R6,#3       ;; Remove parameters and return value
                                ;; #parameters is 2
            STR  R0,R5,#0       ;; r = R0
    ;; }
    ;; return r ;
    JOIN1   LDR  R0,R5,#0       ;; R0 = r
            STR  R0,R5,#3       ;; return value = R0
    ;; Epilogue -- REMOVE THE ACTIVATION RECORD
            ADD  R6,R5,#3       ;; Point stack pointer to return value
            LDR  R7,R5,#2       ;; Restore return address
            LDR  R5,R5,#1       ;; Restore dynamic link
            RET                 ;; Return
            .END
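If you don't have an LC-3 simulator handy, you can trace the activation records with a C model of the stack. The sketch below follows the assembly step for step, using the offsets from the table. The array mem, the variables r5 and r6, and the function names are assumptions of this example, and the return address slot holds a placeholder because C handles the actual return.

```c
#include <assert.h>

enum { STACK_WORDS = 1024 };
static int mem[STACK_WORDS];      /* toy stack memory */
static int r5 = STACK_WORDS;      /* frame pointer (index into mem) */
static int r6 = STACK_WORDS;      /* stack pointer; stack grows downward */

/* Offsets from R5, copied from the activation record table. */
enum { A2 = -3, A1 = -2, T = -1, R = 0,
       DYN_LINK = 1, RET_ADDR = 2, RET_VAL = 3, N = 4, M = 5 };

static void push(int v) { assert(r6 > 0); mem[--r6] = v; }

/* Callee: on entry n is at mem[r6] and m is at mem[r6 + 1]. */
static void gcd_body(void) {
    assert(r6 >= 7);                  /* room for bookkeeping and locals */
    /* Prologue */
    r6 -= 3;                          /* ADD R6,R6,#-3 */
    mem[r6 + 1] = 0;                  /* placeholder for STR R7,R6,#1 */
    mem[r6] = r5;                     /* STR R5,R6,#0 */
    r5 = r6 - 1;                      /* ADD R5,R6,#-1 */
    r6 -= 4;                          /* ADD R6,R6,#-4 */

    mem[r5 + T] = mem[r5 + N] - mem[r5 + M];
    if (mem[r5 + T] == 0) {
        mem[r5 + R] = mem[r5 + N];
    } else {
        if (mem[r5 + T] < 0) {
            mem[r5 + A1] = mem[r5 + M];
            mem[r5 + A2] = mem[r5 + N];
        } else {
            mem[r5 + A1] = mem[r5 + T];
            mem[r5 + A2] = mem[r5 + M];
        }
        push(mem[r5 + A2]);           /* last argument first */
        push(mem[r5 + A1]);
        gcd_body();                   /* JSR GCD */
        mem[r5 + R] = mem[r6];        /* LDR R0,R6,#0 ; STR R0,R5,#0 */
        r6 += 3;                      /* drop return value and 2 arguments */
    }

    /* Epilogue */
    mem[r5 + RET_VAL] = mem[r5 + R];  /* STR R0,R5,#3 */
    r6 = r5 + 3;                      /* ADD R6,R5,#3 */
    r5 = mem[r5 + DYN_LINK];          /* LDR R5,R5,#1 */
}

/* Hypothetical top-level caller; both arguments must be positive. */
int gcd_on_stack(int n, int m) {
    int result;
    assert(n > 0 && m > 0);
    push(m);
    push(n);
    gcd_body();
    result = mem[r6];
    r6 += 3;
    return result;
}
```

Each level of recursion costs nine words here (two arguments, three bookkeeping slots, four locals), so badly mismatched arguments such as 1 and 1000 will run this small toy stack out of room.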