Implementing C Scope

Scope

Three files: Many different x’s

/* First file: defines the shared x and main */
#include <stdio.h>
#include <stdlib.h>

int x ;

int OddFacts(int x) ;

int main(int argc, char *argv[]) {
  int y ;
  x = atoi(argv[1]) ;
  for (y=0; y<10; ++y) {
    int x = OddFacts(y) ;
    printf("%d %d\n", x, 2*x) ;
  }
  return (EXIT_SUCCESS) ;
}

/* Second file: refers to the x defined in the first file */
extern int x ;

int FetchAndAdd(int y) {
  x = x + y ;
  return x ;
}

/* Third file: has its own private x */
static int x = 0 ;

int FetchAndAdd(int) ;

int Factorial(int n) ;

int OddFacts(int y) {
  ++x ;
  if (x%2 == 1) {
    return FetchAndAdd(y) ;
  } else {
    return Factorial(y) ;
  }
}

int Factorial(int x) {
  if (x<=1) {
    return 1 ;
  } else {
    return x*Factorial(x-1) ;
  }
}

Function prototype scope

These variables are known only within the function prototype.

double sin(double x) ;
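
As a small sketch of prototype scope: a parameter name in a prototype goes out of scope at the closing parenthesis, so a later prototype or the definition may use a different name, or none at all. The function DegreesToRadians and its parameter names are hypothetical, chosen only for illustration.

```c
/* The name "degrees" is in scope only until the closing parenthesis. */
double DegreesToRadians(double degrees) ;

/* A compatible prototype may omit the parameter name entirely. */
double DegreesToRadians(double) ;

/* The definition is free to pick yet another name. */
double DegreesToRadians(double angle) {
  return angle * (3.14159265358979323846 / 180.0) ;
}
```

All three declarations describe the same function; the compiler checks only the types.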

Block scope

These variables are known only within their block, except for “holes” created when a variable name is re-used within a nested function prototype or block. As a special case, a function’s parameters are known throughout the function's header and block. Variables with block scope are often called local variables.

{
  int t ;
  t = x ;
  x = y ;
  y = t ;
}
double Dsine(double x) {
  return sin(x*(M_PI/180)) ;
}
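
A minimal sketch of such a "hole": the inner block re-declares x, so the outer x is hidden until the inner block ends. The function name ShadowDemo and the variable seen_inside are hypothetical.

```c
int ShadowDemo(void) {
  int x = 1 ;            /* outer x, known in the whole function body */
  int seen_inside ;
  {
    int x = 2 ;          /* inner x creates a hole in the outer x's scope */
    seen_inside = x ;    /* refers to the inner x */
  }
  /* Back outside the inner block, x is the outer x again. */
  return 10*seen_inside + x ;
}
```

Calling ShadowDemo() returns 21: the inner block saw 2, and the outer x is still 1 afterwards.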

File scope

Variables that are declared outside a function have file scope. These are known throughout a file except for "holes" created when the same variable is re-declared within a function prototype or block. Function names also have file scope.

There is one exception to the rule about the "hole". If a variable is declared as extern within a block, the declaration refers to a variable of file scope. If an identically named variable of file scope already exists, the two are the same variable. Maybe it’s best just not to do this.
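
The exception can be sketched as follows, assuming the file scope variable is not declared static. The names tally and ReadBoth are hypothetical.

```c
int tally = 5 ;                  /* file scope */

int ReadBoth(void) {
  int tally = 100 ;              /* block scope: hides the file scope tally */
  int local_copy = tally ;       /* the local value, 100 */
  {
    extern int tally ;           /* refers to the file scope tally again */
    return local_copy + tally ;  /* 100 from the local, 5 from the global */
  }
}
```

ReadBoth() returns 105. The code is legal, but as noted it is hard to read, which is a good reason to avoid the construct.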

Variables with file scope are often called global variables. Following the terminology of Java, file scope may also be called compilation unit scope.

int accumulator = 0 ;

int FetchAndAdd(int y) {
  return accumulator += y ;
}

Duration

Variables can exist throughout the execution of a block or throughout the execution of a program.

Static variables

Static variables exist throughout the execution of a program. All variables with file scope have static duration. Variables with block scope that are declared with the static storage class specifier also have static duration. This is very different from the static of Java.

int FetchAndAdd(int y) {
  static int accumulator = 0 ;
  return accumulator += y ;
}

Automatic variables

Automatic variables exist only throughout the execution of a block. Variables with block scope, except those declared as static, are of automatic duration.

It is also possible to declare automatic block variables with the auto or register storage class specifier. There is absolutely no good reason for doing this.

int Factorial(int x) {
  if (x<=1) {
    return 1 ;
  } else {
    return x*Factorial(x-1) ;
  }
}
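
The difference between the two durations can be seen in a short sketch. Both functions below are hypothetical and differ only in the storage class of calls.

```c
/* Static duration: calls is created once and keeps its value between calls. */
int CountCallsStatic(void) {
  static int calls = 0 ;
  return ++calls ;
}

/* Automatic duration: calls is created and initialized on every call. */
int CountCallsAuto(void) {
  int calls = 0 ;
  return ++calls ;
}
```

Three successive calls of CountCallsStatic() return 1, 2 and 3, while CountCallsAuto() returns 1 every time.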

Linkage

Some variable names and most function names need to be known in several files. Linkage is the property that makes this possible.

Generally linkage is supported by special operating system software, appropriately called linkers, that combine object files produced by compilers. In Unix, the linker is usually a program called ld. ld can join object files written in different programming languages.

No linkage

Block variables, including static block variables, are known only within a block. Consequently, they have no linkage.

Internal linkage

If a function or a variable is declared as static, it has internal linkage. Consequently, it can only be used within the file where it is defined.

This makes it possible to define functions and global variables that are used only within one file. Using static in these cases can avoid problems of variable name clashes in code written by different programmers.

It is quite likely that most linkers treat variables with internal linkage and static variables with no linkage the same way.

static int FetchAndAdd(int y) {
  static int accumulator = 0 ;
  return accumulator += y ;
}
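
A rough sketch of how internal linkage keeps helpers private to one file: only AddClamped is visible to other files, while the variable and the helper function cannot clash with identically named items elsewhere. All three names, and the limit of 100, are hypothetical.

```c
static int running_total = 0 ;         /* internal linkage: private to this file */

static int ClampToLimit(int value) {   /* internal linkage: private helper */
  if (value < 0) return 0 ;
  if (value > 100) return 100 ;
  return value ;
}

int AddClamped(int y) {                /* external linkage: callable from other files */
  running_total = ClampToLimit(running_total + y) ;
  return running_total ;
}
```

Another file could define its own running_total or ClampToLimit without any conflict at link time.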

External linkage

Variables and functions with file scope that are not declared as static have external linkage.

Linkers really must do some linkage for these variables. They must allocate storage for external variables and deal with possible inconsistencies, such as (1) an external variable being declared as an int in one file and as a double in another, or (2) an external variable being initialized to 202 in one file and to 255 in another. Problems like these will not be discovered when the files are compiled, since they are compiled separately.

If a variable with file scope is declared with an extern storage class specifier, it is considered to reference a global variable that is actually created within another file. For this reason, an extern declaration should not include an initializer.

extern char *State ;

A file scope variable declaration without the extern is a real variable declaration and, if appropriate, can be initialized.

char *State = "North Carolina" ;

The rule is to put the extern in front of the variable declarations in every file but one. However, you'll find that many systems are a bit more forgiving. Perhaps it's just best to avoid variables with external linkage.
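
The rule can be sketched in a single file so that it compiles on its own. In a real program the extern declaration would usually live in a header included by every file, and the definition would appear in exactly one source file. The names HomeState and HomeStateLength are hypothetical.

```c
#include <string.h>

/* Declaration: says the variable exists somewhere, allocates nothing. */
extern const char *HomeState ;

/* Definition: appears in exactly one file and may be initialized. */
const char *HomeState = "North Carolina" ;

/* Any file that sees the extern declaration can use the variable. */
size_t HomeStateLength(void) {
  return strlen(HomeState) ;
}
```

Keeping the declaration in one shared header also lets the compiler catch a type mismatch in the file that holds the definition.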

C variables at the machine level

Most computer languages and systems divide program memory into large segments. Read the Wikipedia article on program memory before going on. At least look at it!

The automatic variables of C are stored in stack frames, small sections of memory allocated for each function call. The local variables of a function are addressed relative to a frame pointer containing the starting address of the stack frame. In the MIPS architecture, the frame pointer is register $fp. The frame pointer is adjusted in each function call to provide space for the function’s local variables.

On the MIPS32, many automatic variables are allocated at fixed locations relative to the beginning of the stack frame. For example, if x is stored at offset 48, it can be loaded in one memory access:
        lw   $t5,48($fp)

Many automatic variables will be stored in registers. For example, the first few arguments to the function will be passed in registers $a0 to $a3. Also, most optimizing compilers and all assembler programmers will use registers $t0 to $t9 to store frequently used local variables.

Static variables will be allocated in the data region and are addressed at offsets from the global pointer register $gp. The addresses of static variables cannot be determined until all the functions of an application are linked or, in the case of dynamic libraries, loaded. This means the assembler must generate tables that reveal which instructions reference global variables.

In the best case, a variable will be located at a fixed offset from $gp. This allows the use of machine code similar to that used to access local variables.
        lw   $t5,24000($gp)
However, this is only feasible when the programmer promises the compiler that no more than 65536 bytes of global memory will be used. (Look at the instruction format to see why it’s 65536.)
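
The 65536 figure follows from the 16-bit offset field of the load instruction. As a rough sketch, assuming the field is interpreted as a signed value (as it is for the MIPS lw instruction), the reachable window can be computed as below. The helper names are hypothetical.

```c
#include <stdint.h>

/* Number of distinct byte offsets a signed 16-bit field can express. */
long ReachableBytes(void) {
  return (long)INT16_MAX - (long)INT16_MIN + 1 ;
}

/* Nonzero when offset fits in the 16-bit signed field of a single lw. */
int FitsInOffsetField(long offset) {
  return offset >= INT16_MIN && offset <= INT16_MAX ;
}
```

ReachableBytes() yields 65536, covering offsets from -32768 to 32767. Because the field is signed, $gp is commonly set to point into the middle of the region it addresses so that both halves of the window are usable.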

In the usual case, each function has its own global offset table (GOT) for accessing global variables. Each GOT contains the address of every global variable used by the function. These addresses are filled in as the functions are loaded into memory. When a global variable is accessed, a code sequence similar to the following is used:
        lw   $t5,%got(X)($gp)
        nop
        lw   $t5,0($t5)
where %got(X) is a pointer to where the address of the global variable X is stored within the function’s global offset table. When a function is called, register $gp is changed to point to that function’s global offset table.
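
The two loads can be imitated in C as a double indirection. This is only an illustrative model, not what a compiler emits: the array GlobalOffsetTable, the slot numbers and the function LoadViaGot are all hypothetical, and the table is filled in statically here rather than by a loader.

```c
static int X = 42 ;
static int Y = 7 ;

/* Stand-in for a GOT: one slot per global, each holding that global's address.
   A real loader would fill these slots in at load time. */
static int *GlobalOffsetTable[] = { &X, &Y } ;

enum { GOT_X = 0, GOT_Y = 1 } ;   /* slot numbers, playing the role of %got(X) */

int LoadViaGot(int **got, int slot) {
  int *address = got[slot] ;      /* like: lw $t5,%got(X)($gp) */
  return *address ;               /* like: lw $t5,0($t5)       */
}
```

LoadViaGot(GlobalOffsetTable, GOT_X) returns 42. The extra memory access is the price paid for not knowing the variable's address until load time.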

Take a look at the Global Offset Tables section of Computer Science from the Bottom Up for more information.