Local variables
Local variables have a lifetime of a single function call or a single program block. Local variables may be stored either in the function’s stack frame or in a register. Function arguments are a special case of local variables. They are usually passed in registers; however, sometimes there are too many arguments to pass in registers. In this case, some arguments must be passed on the stack.
Registers and stack locations are allocated to local variables by the compiler or the assembly-language programmer, following the conventions of the ABI (Application Binary Interface).
Suppose N
is a local variable. Let’s look at how the statement
N += 7
can be implemented in the MIPS32 architecture. If register $t5
has been allocated to N
, the statement can be implemented as a single
instruction.
addi $t5,$t5,7
However, if N
is stored on the stack, say at offset 40 from $sp, up to three
instructions are required.
lw $t0,40($sp)
addi $t0,$t0,7
sw $t0,40($sp)
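The difference between the two cases can be sketched in C. This is only a model, not real code generation: the stack frame is imitated by a small array, and the names FRAME_WORDS, N_OFFSET, add7_in_register and add7_on_stack are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical model: a 16-word "stack frame" with N at byte offset 40. */
enum { FRAME_WORDS = 16, N_OFFSET = 40 };

/* N lives in a register: a single add, mirroring addi $t5,$t5,7. */
int32_t add7_in_register(int32_t t5) {
    return t5 + 7;
}

/* N lives on the stack: load, modify, store. */
void add7_on_stack(int32_t frame[FRAME_WORDS]) {
    int32_t t0 = frame[N_OFFSET / 4];  /* lw   $t0,40($sp) */
    t0 = t0 + 7;                       /* addi $t0,$t0,7   */
    frame[N_OFFSET / 4] = t0;          /* sw   $t0,40($sp) */
}
```

Both functions compute the same result; the stack version simply pays for one extra load and one extra store.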
Caller and callee saved
What happens if N is allocated to $t5 and then a call is made to a subfunction? How can the programmer be sure that the subfunction doesn’t use $t5 and trash the value of N?
Following the common terminology, we’ll call the
calling function the caller and the called function the callee.
The MIPS32 ABI states
that the temporary registers $t0
to $t9
are caller saved.
This means that the callee is free to overwrite these registers, so a
caller that still needs their values must save them before the call and reload them after the call returns.
The argument registers ($a0
to $a3
) and value
registers
($v0
to $v1
) are also considered caller saved.
On the other hand, the saved registers $s0
to
$s7
are callee saved.
This means that the callee can use these registers only if their
original values are saved before use and restored before return.
The caller is guaranteed that the saved registers are unchanged by the call.
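The two conventions can be imitated in C with a toy register file. This is a sketch under stated assumptions: the Regs struct and the functions callee and caller are hypothetical, and a real callee would save $s0 in its stack frame rather than in a C local.

```c
#include <stdint.h>

/* Hypothetical register file holding only the two registers we need. */
typedef struct {
    int32_t t5;  /* caller-saved temporary */
    int32_t s0;  /* callee-saved register  */
} Regs;

/* The callee may clobber $t5 freely but must preserve $s0. */
void callee(Regs *r) {
    int32_t saved_s0 = r->s0;  /* save $s0 before using it       */
    r->t5 = -1;                /* scratch use of $t5, no save    */
    r->s0 = 1234;              /* scratch use of $s0             */
    r->s0 = saved_s0;          /* restore $s0 before returning   */
}

/* The caller keeps N in $t5, so it must protect N around the call. */
int32_t caller(Regs *r, int32_t n) {
    r->t5 = n;
    int32_t spill = r->t5;  /* save $t5 in the caller's frame */
    callee(r);
    r->t5 = spill;          /* reload $t5 after the call      */
    r->t5 += 7;             /* N += 7                         */
    return r->t5;
}
```

Without the spill and reload in caller, N would come back as -1 + 7; the value in $s0, by contrast, survives the call with no effort from the caller.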
Global variables
Global, or external, variables require special handling. The address of a global
variable cannot be determined by a compiler or assembler. This is the
job of the linker (called ld
in the Unix world).
The best the assembler or compiler can do is generate a table containing the names
of all external variables and the many references, both instructions
and data, to these variables in the object code that it
produces. The linker then tries to resolve these references.
This requires the use of the global pointer ($gp
) register.
If x
is an external variable,
an
assembly language program may write a statement similar to
lw $t0,x
If x
has been allocated within 2^15 bytes of $gp
(more precisely
within the range $gp
-32768 to $gp
+32767), the
lw
can be implemented with code similar to
lw $t0,OFFSET_x($gp)
where OFFSET_x
is computed by the linker.
However, sometimes it is simply impossible to put all global variables in
2^16 bytes. In this case, it is necessary to place the
addresses of global variables in the global offset table (GOT).
Effectively, the global offset table is a vector containing the addresses
of external variables
and
global variable access is implemented by a two-instruction sequence
similar to
lw $t0,OFFSET_GOT_n($gp)
lw $t0,0($t0)
By the way, shared libraries (DLLs) add yet more complexity.
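The two-instruction GOT sequence can be modeled in C as a double indirection. This is a minimal sketch, assuming a hand-built table: the names global_x, global_y, got, GOT_x, GOT_y and load_via_got are hypothetical, and a real GOT is filled in by the linker or loader, not by the program.

```c
#include <stdint.h>

/* Hypothetical global variables. */
int32_t global_x = 42;
int32_t global_y = 7;

/* Hypothetical GOT: a vector holding the addresses of the globals. */
int32_t *const got[] = { &global_x, &global_y };
enum { GOT_x = 0, GOT_y = 1 };

/* Access a global through its GOT slot. */
int32_t load_via_got(int slot) {
    int32_t *addr = got[slot];  /* lw $t0,OFFSET_GOT_n($gp) */
    return *addr;               /* lw $t0,0($t0)            */
}
```

The first load fetches the address of the variable, and only the second load fetches its value, which is why the GOT path costs one more memory access than a direct $gp-relative load.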
Addresses (Pointers)
If N
is a variable stored in memory, how do you get its address?
If N
is stored at offset 40 from the stack pointer,
you use something like the following:
addi $t0,$sp,40
If N
is a global, it will be referenced using the global pointer, $gp
.
Fortunately, the assembler has a little hack allowing you to
use an idiom such as
la $t0,N
which is magically translated to something like
addi $t0,$gp,OFFSET_N
or, when a global offset table is used, to
lw $t0,OFFSET_GOT_N($gp)
Don’t ask questions, just use the idiom la
to load an address.
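In C, the address computations above are what the & operator asks for. The following sketch uses made-up names (N_global, add7_through, demo_local, demo_global); an optimizing compiler is free to inline the calls and never materialize the addresses at all.

```c
/* Hypothetical global variable. */
int N_global = 10;

/* Adds 7 to whatever variable p points at. */
void add7_through(int *p) {
    *p += 7;
}

/* &N on a local typically becomes a stack-pointer-relative address. */
int demo_local(void) {
    int N = 5;
    add7_through(&N);
    return N;
}

/* &N_global typically becomes a la (or a GOT load). */
int demo_global(void) {
    add7_through(&N_global);
    return N_global;
}
```

Note that taking the address of a local generally forces it to live in memory, since a register has no address.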
Arrays
Remember that A[i]
is really just *(A+i)
but keep in mind
that the addition must be implemented as
A+i*sizeof(*A)
. So, if A
is an array of 4-byte integers, then something
like the following is required on a MIPS32 computer to load A[i]
into
a register.
la $t0,A
lw $t1,i
sll $t1,$t1,2
add $t0,$t0,$t1
lw $t0,0($t0)
The Intel architecture can do this in a couple
of instructions.
mov i, %rax
movl A(,%rax,4), %eax
However, it’s not at all clear that the two Intel instructions will be faster than the five MIPS instructions.
Ultimately, both require the same number of adds and shifts.
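The shift and add hidden inside A[i] can be spelled out in C by doing the byte arithmetic by hand. This is a sketch assuming 4-byte elements; load_element is a hypothetical helper, and the comments map each step to the MIPS sequence shown earlier.

```c
#include <stdint.h>
#include <string.h>

/* Loads A[i] using explicit byte-address arithmetic. */
int32_t load_element(const int32_t *A, uint32_t i) {
    const unsigned char *base = (const unsigned char *)A;  /* la  $t0,A       */
    uint32_t byte_offset = i << 2;                         /* sll $t1,$t1,2   */
    const unsigned char *addr = base + byte_offset;        /* add $t0,$t0,$t1 */
    int32_t value;
    memcpy(&value, addr, sizeof value);                    /* lw  $t0,0($t0)  */
    return value;
}
```

The shift by 2 is the multiplication by sizeof(*A); for elements of another size the shift amount (or a multiply) would change accordingly.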
The faster solution, for both MIPS and Intel, is to use a compiler that transforms a loop like
for (int i=0; i<n; ++i) {
aSum = aSum + A[i] ;
}
into
for (int *iP=&A[0]; iP<&A[n]; ++iP) {
aSum = aSum + *iP ;
}
This eliminates all the adding and shifting required to access
the array elements.
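Both forms of the loop can be written as complete C functions and checked against each other. This is a minimal sketch; sum_indexed and sum_pointer are hypothetical names, and whether a given compiler performs this transformation on its own depends on the compiler and its optimization settings.

```c
#include <stddef.h>

/* Indexed form: each access conceptually computes A + i*sizeof(*A). */
int sum_indexed(const int *A, size_t n) {
    int aSum = 0;
    for (size_t i = 0; i < n; ++i) {
        aSum = aSum + A[i];
    }
    return aSum;
}

/* Pointer form: the pointer itself steps by one element per iteration. */
int sum_pointer(const int *A, size_t n) {
    int aSum = 0;
    for (const int *iP = A; iP < A + n; ++iP) {
        aSum = aSum + *iP;
    }
    return aSum;
}
```

In the pointer form the only per-iteration address work is a single increment of iP by the element size.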