Translating C to PIC: Functions

Properties of the call conventions

The function call conventions for the PIC32 are described in Chapter 12 of the MPLAB® XC32 C/C++ Compiler User’s Guide.

Microsoft Developer Network also has a pretty good page for the MIPS Calling Sequence Specification for Visual Studio 2005.

The Spring 2006 offering of ECE 314 at Cornell also has a very detailed tutorial on the MIPS Calling Convention.

Register usage

Caller vs callee saved

Caller saved registers are not saved across function calls. When a register is caller saved, the calling function must save the register if it wishes to reuse its value after the call returns.

Callee saved registers are saved across function calls. When a register is callee saved, the called function must save the register if it wishes to use it and restore the register value before the call returns.

Registers $a0 to $a3 and $v0 to $v1 are considered caller saved. If $fp is not being used as a frame pointer, it can be used as a callee saved register. In that case, it should be referenced by its alternate name, $s8.

Parameter passing

If possible, function arguments are passed in registers $a0 to $a3. If the arguments cannot fit in these four registers, they are placed on the stack.

Space is allocated on the stack for all arguments, including those passed in registers. If an argument is passed in a register, its space on the stack will be uninitialized.
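
As a rough illustration of this placement rule, the C sketch below reports where a given argument is found when the called function is entered. It assumes every argument is a single 32-bit word (no doubles or structs). The names arg_loc and arg_location are made up for this example and are not part of XC32.

```c
#include <assert.h>

/* Hypothetical description of where one word-sized argument lives
   at the moment the called function is entered. */
struct arg_loc {
    int in_register;   /* 1 if passed in $a0..$a3, 0 if passed on the stack */
    int reg_number;    /* 0..3 for $a0..$a3, or -1 when passed on the stack */
    int stack_offset;  /* byte offset of the argument's slot from $sp at entry */
};

/* index is 0-based: index 0 is the first argument. */
struct arg_loc arg_location(int index)
{
    struct arg_loc loc;

    assert(index >= 0);
    loc.in_register = index < 4;
    loc.reg_number = loc.in_register ? index : -1;
    /* Every argument has a slot, even one that travels in a register. */
    loc.stack_offset = 4 * index;
    return loc;
}
```

For example, arg_location(4) describes the fifth argument: it is not in a register, and its slot is at 16($sp) on entry, just above the four slots reserved for $a0 to $a3.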

“Small” arguments will not share a register: A char gets an entire 32-bit register. If the returned value is a large struct, the calling function allocates space for this structure and passes its address in $a0. This is a messy feature of C.
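
The struct-return rule is easier to see as a source-level rewrite. The sketch below shows a function as the programmer writes it and, next to it, roughly what the convention turns it into: the caller owns the result buffer and passes its address as a hidden first argument. The struct big type and both function names are invented for this illustration; the code the compiler actually emits may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

struct big { int data[8]; };   /* hypothetical "large" struct */

/* What the programmer writes: the struct is returned by value. */
struct big make_big(int seed)
{
    struct big b;
    for (int i = 0; i < 8; i++)
        b.data[i] = seed + i;
    return b;
}

/* Roughly what happens underneath: the caller allocates the struct and
   passes its address (in $a0), so the visible argument moves to the
   next argument position. */
void make_big_lowered(struct big *result, int seed)
{
    assert(result != NULL);
    for (int i = 0; i < 8; i++)
        result->data[i] = seed + i;
}
```

A caller of the lowered form declares a struct big in its own frame and passes its address, which is the step the compiler performs silently for the by-value form.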

Stack frame

When programs are compiled under the XC32 compiler, the default action for function invocation is to allocate a stack frame, even though this may not be needed for leaf functions. The frame pointer, the address of the base of the stack frame, is contained in $fp. Many routines written in MIPS32 assembly or generated by an optimizing compiler will not use the frame pointer, but will rely solely on the stack pointer to address data within the stack frame.

Stacks grow downward in memory, toward lower addresses. The stack pointer always points to the lowest allocated slot of the stack. This means that memory should never be addressed at a negative offset from the stack pointer.
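
A small simulation can make the direction of growth concrete. In the sketch below the stack is an array, the stack pointer is a byte offset into it, and offsets from the stack pointer are unsigned so that a negative offset cannot even be written. Everything here (the 256-byte size and all the names) is an assumption made for the example, not PIC32 behavior.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256

static uint32_t stack_mem[STACK_BYTES / 4];  /* simulated stack region */
static uint32_t sp = STACK_BYTES;            /* byte offset; an empty stack starts at the top */

uint32_t stack_pointer(void) { return sp; }

/* Allocate nbytes by moving the stack pointer toward lower addresses. */
void stack_alloc(uint32_t nbytes)
{
    assert(nbytes % 4 == 0 && nbytes <= sp);
    sp -= nbytes;
}

/* Release nbytes by moving the stack pointer back toward higher addresses. */
void stack_release(uint32_t nbytes)
{
    assert(nbytes % 4 == 0 && nbytes <= STACK_BYTES - sp);
    sp += nbytes;
}

/* Like "sw value,offset($sp)": only allocated slots at or above sp are legal. */
void store_word(uint32_t offset, uint32_t value)
{
    assert(offset % 4 == 0 && offset < STACK_BYTES - sp);
    stack_mem[(sp + offset) / 4] = value;
}

/* Like "lw ...,offset($sp)". */
uint32_t load_word(uint32_t offset)
{
    assert(offset % 4 == 0 && offset < STACK_BYTES - sp);
    return stack_mem[(sp + offset) / 4];
}
```

After stack_alloc(16) the stack pointer has dropped by 16, and the four new words are reachable at offsets 0, 4, 8 and 12 from it.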

Usually, the stack pointer and frame pointer are the same, but the stack pointer may grow temporarily to allocate space for “temporaries.”

The PIC32 stack frame is relatively simple. Here is the order for storing information on the stack. See Figure 9-1 of the MPLAB® XC32 C/C++ Compiler User’s Guide for a better picture.

Arguments received in the call
Local variables or saved registers
Saved registers or local variables
Space for arguments passed in calls

The reason for the “or” is that there is an inconsistency between the XC32 documentation and XC32 generated code. However, either way should be fine. Also, the space for arguments appears twice because it is technically part of the frame for both the caller and the callee.

By the way, the slot where the frame pointer is stored is sometimes called the dynamic link because it is part of a linked list of stack frames.

Local variables, saved registers, and parameters stored on the stack are accessed at fixed offsets from the frame pointer. In unoptimized code generated by the XC32 compiler, all local variable references will be of the form offset($fp). However, in many other calling conventions, local variables are accessed at fixed offsets from the stack pointer.

Actions of the called function

Function prologue

We’re going to present the prologue as it must be performed for stem functions, functions which call other functions. Life may be easier for leaf functions.

On entry to the function, space for all the function’s arguments has already been allocated at the top of the stack, even though some of those arguments are passed in registers.

The called function will adjust $sp to allocate space for the new stack frame and then store the present return address, $ra, and frame pointer, $fp, on the stack. The frame pointer is then set to the same address as the stack pointer. The code looks a bit like this when a frame of N bytes is allocated:

       addiu    $sp,$sp,-N      # adjust stack for N-byte frame
       sw       $ra,N-4($sp)    # save return address on stack
       sw       $fp,N-8($sp)    # save frame pointer on stack
       add      $fp,$sp,$zero   # set frame pointer to present stack pointer

At this point, other registers may be saved in the stack frame. Arguments that were passed in registers may also be saved in slots allocated for them in the caller’s stack frame.

Inside the function

All local variables stored in the stack frame will be accessed by positive offsets from $fp, the frame pointer. Most parameters will be accessed in the registers in which they were passed. Some will be accessed at offsets from the frame pointer.

Function epilogue

Before the function exits, any modified callee-saved registers must be restored and return values must be placed in appropriate registers.

Then the frame pointer is copied to the stack pointer. At that point, the return address and previous frame pointer are restored from the stack frame and the stack pointer is incremented to pop the stack frame. Finally, a jr instruction is used to return to the caller. It looks a bit like this:

       add      $sp,$fp,$zero   # Copy frame pointer to stack pointer
       lw       $ra,N-4($sp)    # restore return address
       lw       $fp,N-8($sp)    # restore frame pointer
       addiu    $sp,$sp,N       # pop stack frame
       jr       $ra             # return
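
To check that the prologue and epilogue really undo each other, the two sequences can be replayed on a toy machine. The sketch below models only $sp, $fp, $ra and a small word-addressed memory, with N fixed at 32. The struct cpu type and the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 256
#define N 32                       /* frame size in bytes, the N of the text */

struct cpu {
    uint32_t sp, fp, ra;           /* the three registers the frame code touches */
    uint32_t mem[MEM_BYTES / 4];   /* simulated memory, word addressed */
};

static void sw(struct cpu *c, uint32_t value, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < MEM_BYTES);
    c->mem[addr / 4] = value;
}

static uint32_t lw(const struct cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < MEM_BYTES);
    return c->mem[addr / 4];
}

void prologue(struct cpu *c)
{
    assert(c->sp >= N);
    c->sp -= N;                    /* addiu $sp,$sp,-N    */
    sw(c, c->ra, c->sp + N - 4);   /* sw    $ra,N-4($sp)  */
    sw(c, c->fp, c->sp + N - 8);   /* sw    $fp,N-8($sp)  */
    c->fp = c->sp;                 /* add   $fp,$sp,$zero */
}

void epilogue(struct cpu *c)
{
    c->sp = c->fp;                 /* add   $sp,$fp,$zero */
    c->ra = lw(c, c->sp + N - 4);  /* lw    $ra,N-4($sp)  */
    c->fp = lw(c, c->sp + N - 8);  /* lw    $fp,N-8($sp)  */
    c->sp += N;                    /* addiu $sp,$sp,N     */
    /* jr $ra would transfer control to the caller here */
}
```

Because the epilogue begins by copying $fp into $sp, the function body may lower $sp for temporaries and may overwrite $ra with a nested call, and the caller still gets all three registers back unchanged.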

Actions of the calling function

Function invocation

Before the function is called, all its arguments must be evaluated and placed in the appropriate registers or stack locations. Generally the stack frame was created with enough space to hold arguments for the function’s future calls.

Now a jump-and-link instruction, with the called function as its target, is executed. The jump-and-link stores the return address, PC+8, into $ra and then transfers control to the target function.
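
The return address is PC+8 rather than PC+4 because the instruction following the jump-and-link sits in its branch delay slot and executes before the called function begins, so the return must skip both instructions. A tiny sketch, assuming 4-byte MIPS32 instructions (not MIPS16e) and a made-up function name:

```c
#include <assert.h>
#include <stdint.h>

/* Address placed in $ra by a jump-and-link located at jal_address.
   The word at jal_address + 4 is the delay slot. */
uint32_t return_address(uint32_t jal_address)
{
    assert(jal_address % 4 == 0);   /* MIPS32 instructions are word aligned */
    return jal_address + 8;
}
```

A jump-and-link at address 0x9D000010, for instance, leaves 0x9D000018 in $ra.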

Before making the call, the calling function must also save copies of any caller-saved registers whose values it needs after the call is completed.

Function return

If values were returned, the calling function will find them in register $v0 and, for 64-bit returned values, $v1. As mentioned earlier, there is a special case dealing with functions that return a C struct. The calling function will also need to restore any caller-saved registers it was using.

An example

Let’s look at an example, a very inefficient recursive function for squaring a non-negative integer.

int square(int n) {
    int r = 0 ;
    if (n != 0) {
        r = square(n-1) + n + n - 1 ;
    }
    return r ;
}
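
The recursion works because n*n = (n-1)*(n-1) + 2n - 1, so each level adds n + n - 1 to the square of n-1. The sketch below repeats the function and adds a small checker that compares it against ordinary multiplication. The checker's name is made up for this example.

```c
#include <assert.h>

/* The function from the text, with the same behavior. */
int square(int n)
{
    int r = 0;
    if (n != 0) {
        r = square(n - 1) + n + n - 1;
    }
    return r;
}

/* Returns 1 if square(k) equals k*k for every k from 0 to limit. */
int square_matches_product(int limit)
{
    assert(limit >= 0);
    for (int k = 0; k <= limit; k++) {
        if (square(k) != k * k)
            return 0;
    }
    return 1;
}
```

Note that a negative argument never reaches the n == 0 base case, so the function is only meaningful for n of zero or more.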

Implemented on the PIC32

Here’s some assembly code that was instructor-optimized from the output of the xc32 compiler.

      .align  2
      .global square
      .ent    square
square:
# debug info:
#   $fp is virtual frame pointer
#   $fp is 32 bytes less than $sp on entry
#   $ra contains return address
      .frame  $fp,32,$ra
# debug info:
#   registers $31 ($ra) and $30 ($fp) [from mask]
#   at offset of -4 from $sp on entry
      .mask   0xc0000000,-4
# prologue
      addiu   $sp,$sp,-32
      sw      $ra,28($sp)
      sw      $fp,24($sp)
      add     $fp,$sp,$zero
# save n in its argument slot
      sw      $a0,32($fp)
# r = 0
      sw      $zero,16($fp)
# if (n != 0)
      beq     $a0,$zero,retPnt
      nop
# invocation: square(n-1)
      addiu   $a0,$a0,-1
      jal     square
      nop
# return: square(n-1)
      add     $t1,$v0,$zero
# r = square(n-1) + n + n - 1
      lw      $a0,32($fp)
      addu    $t1,$v0,$a0
      addu    $t1,$t1,$a0
      addiu   $t1,$t1,-1
      sw      $t1,16($fp)
retPnt:
# return r
      lw      $v0,16($fp)
# epilogue
      add     $sp,$fp,$zero
      lw      $ra,28($sp)
      lw      $fp,24($sp)
      addiu   $sp,$sp,32
      j       $ra
      nop
      .end    square
      .size   square, .-square
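
Each activation of square allocates one 32-byte frame, so the stack cost of a call can be estimated by counting nested activations. The sketch below instruments the C version to do that. FRAME_BYTES is taken from the addiu $sp,$sp,-32 in the listing, the helper names are invented for this example, and the estimate ignores whatever the original caller already has on the stack.

```c
#include <assert.h>

#define FRAME_BYTES 32   /* frame size from the prologue in the listing */

static int depth;        /* activations currently on the stack */
static int max_depth;    /* deepest nesting seen so far */

static int square_traced(int n)
{
    int r = 0;

    assert(n >= 0);      /* a negative n would never reach the base case */
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    if (n != 0) {
        r = square_traced(n - 1) + n + n - 1;
    }
    depth--;
    return r;
}

/* Peak number of stack bytes occupied by the frames of square(n). */
int peak_stack_bytes(int n)
{
    depth = 0;
    max_depth = 0;
    (void)square_traced(n);
    return max_depth * FRAME_BYTES;
}
```

For n = 3 there are four nested activations (n = 3, 2, 1, 0), which comes to 128 bytes of frames at the deepest point.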

The calls — before the prologue

n    Registers                            Stack
3    [figure: registers at entry, n=3]    [figure: stack at entry, n=3]
2    [figure: registers at entry, n=2]    [figure: stack at entry, n=2]
1    [figure: registers at entry, n=1]    [figure: stack at entry, n=1]
0    [figure: registers at entry, n=0]    [figure: stack at entry, n=0]

The calls — before the epilogue

n    Registers                           Stack
0    [figure: registers at exit, n=0]    [figure: stack at exit, n=0]
1    [figure: registers at exit, n=1]    [figure: stack at exit, n=1]
2    [figure: registers at exit, n=2]    [figure: stack at exit, n=2]
3    [figure: registers at exit, n=3]    [figure: stack at exit, n=3]

The calls — all done

n    Registers                        Stack
0    [figure: registers at return]    [figure: stack at return]

Implementing Java methods

If you put the keyword static in front of the square function header, you get a Java static method. Here is the JVM bytecode for that method. When this method is running, local variable 0 is parameter n and local variable 1 is r. The operand stack is shown to the right of the JVM instructions. Also, the #2 operand of invokestatic is the constant-pool entry of the class that refers to square.

  static int square(int);
    Code:
       0: iconst_0                  //      [[ 0
       1: istore_1                  //      [[
       2: iload_0                   //      [[ n
       3: ifeq          19          //      [[
       6: iload_0                   //      [[ n
       7: iconst_1                  //      [[ n               1
       8: isub                      //      [[ n-1
       9: invokestatic  #2          //      [[ sq(n-1)
      12: iload_0                   //      [[ sq(n-1)         n
      13: iadd                      //      [[ sq(n-1)+n
      14: iload_0                   //      [[ sq(n-1)+n       n
      15: iadd                      //      [[ sq(n-1)+n+n
      16: iconst_1                  //      [[ sq(n-1)+n+n     1
      17: isub                      //      [[ sq(n-1)+n+n-1
      18: istore_1                  //      [[
      19: iload_1                   //      [[ r
      20: ireturn
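
To connect the bytecode back to C, the sketch below executes the same instruction sequence by hand, one statement per bytecode, using an explicit operand stack and a two-entry local variable array. It is only an illustration with invented names, not how a JVM is implemented. The C recursion stands in for the fresh frame (own locals, own operand stack) that each invokestatic creates.

```c
#include <assert.h>

#define MAX_STACK 4   /* more than the two operand slots this method needs */

int square_jvm(int n)
{
    int locals[2];                 /* locals[0] is n, locals[1] is r */
    int stack[MAX_STACK];          /* operand stack */
    int top = 0;                   /* number of operands currently pushed */

    assert(n >= 0);
    locals[0] = n;

    stack[top++] = 0;                             /*  0: iconst_0     */
    locals[1] = stack[--top];                     /*  1: istore_1     */
    stack[top++] = locals[0];                     /*  2: iload_0      */
    if (stack[--top] != 0) {                      /*  3: ifeq 19      */
        stack[top++] = locals[0];                 /*  6: iload_0      */
        stack[top++] = 1;                         /*  7: iconst_1     */
        top--; stack[top - 1] -= stack[top];      /*  8: isub         */
        stack[top - 1] = square_jvm(stack[top - 1]); /* 9: invokestatic, pops n-1, pushes result */
        stack[top++] = locals[0];                 /* 12: iload_0      */
        top--; stack[top - 1] += stack[top];      /* 13: iadd         */
        stack[top++] = locals[0];                 /* 14: iload_0      */
        top--; stack[top - 1] += stack[top];      /* 15: iadd         */
        stack[top++] = 1;                         /* 16: iconst_1     */
        top--; stack[top - 1] -= stack[top];      /* 17: isub         */
        locals[1] = stack[--top];                 /* 18: istore_1     */
    }
    stack[top++] = locals[1];                     /* 19: iload_1      */
    return stack[--top];                          /* 20: ireturn      */
}
```

Unlike the PIC32 version, no registers are named: every intermediate value lives on the operand stack, which is empty again after each istore.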

Chapter 6 of the JVM specification contains a detailed description of the Java Virtual Machine instruction set.

Calling main

We do need some assembly language code to get a C program running.

Typical C startup with crt0 on a multi-user system

Startup for calling main on the PIC

If the PIC assembly language programmer does not provide a __reset entry point, the following actions will be performed to call main:

This list is a summary of a more detailed description that can be found in Chapter 14 of the MPLAB® XC32 C/C++ Compiler User’s Guide.

The mainframe and the PC

Implemented on the IBM/360

Mark Smotherman of Clemson University is an expert on the history of computer architecture. His page on IBM S/360 Subroutines describes a calling convention that uses a linked list rather than a stack.

Intel architectures