CSCI 431 Lecture Notes - Data Types

How do we describe a programming language?

How do we implement a programming language?

A language implementation provides a couple of very standard capabilities:

A language implementation can take two forms:

The structure of a translator.

Identifiers and Binding

Name Spaces

   +---------------------+----global name space
   |int x;               |
   |                     |
   |main (...) {         |
   |  +----------------+------main name space
   |  |int x;          | |
   |  |  {             | |
   |  |  +-----------+--------block inside main name space
   |  |  |int x;     | | |
   |  |  |           | | |
   |  |  +-----------+ | |
   |  |  }             | |
   |  |  ...           | |
   |  +----------------+ |
   |  }                  |
   |                     |
   |foo  (...) {         |
   |  +----------------+------foo name space
   |  |int x;          | |
   |  |  ...           | |
   |  +----------------+ |
   |  }                  |
   +---------------------+
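
The nesting in the diagram can be sketched in C. This is only an illustration: the function names global_x, main_like, and foo_x and the values 1 through 4 are assumptions made here so that each x can be told apart; they are not part of the original notes.

```c
int x = 1;                      /* global name space */

/* No local x is declared here, so x means the global x. */
int global_x(void) {
    return x;
}

/* Stands in for main in the diagram; records which x is visible where. */
void main_like(int seen[3]) {
    int x = 2;                  /* function name space: hides the global x */
    seen[0] = x;
    {
        int x = 3;              /* block name space: hides the function's x */
        seen[1] = x;
    }
    seen[2] = x;                /* block has ended: the function's x again */
}

/* foo has its own name space, unrelated to main_like's. */
int foo_x(void) {
    int x = 4;
    return x;
}
```

Each declaration of x creates a new binding that hides the outer ones only until its enclosing block ends.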

Scope of Declarations and Blocks

   {--- Pascal example}
   procedure cat (...);

   var dog : Integer;  {--- override any external binding of "dog"}

   begin  {--- start of the block}       
      ...
   end;   {--- end of the block, "dog" returns to external binding (if any)}

Free vs. Bound variables

A variable in a program block can be thought of as free if it is not bound within the block (e.g., an access to a global variable from within a function is an access to a free variable). Consider the following C program fragment.

   int x = 3;

   void foo(int y) {

     x = y;             /* x is free, y is bound */
   }

It is important to identify free vs. bound variables since free variables can have different bindings/meanings each time the block is evaluated (the binding depends on something external to the function).
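
A small sketch of that last point, assuming a hypothetical function add_x that reads the global x instead of assigning to it: the same call gives different results once the external binding of x changes.

```c
int x = 3;                 /* external binding used by the free variable below */

/* y is bound (it is a parameter); x is free, so its value comes from outside. */
int add_x(int y) {
    return x + y;
}
```

Calling add_x(1) yields 4 while x is 3, and 11 after some other code sets x to 10, even though the body of add_x has not changed.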

More about Bindings

Environments

Static and Dynamic Binding

There are two obvious interpretations of bindings, called static binding and dynamic binding. Under static binding (or static scoping), the function body is evaluated in the environment of the function definition, also called the lexically enclosing environment. This is what we have seen in the examples so far. Under dynamic binding (or dynamic scoping), the function body is instead evaluated in the environment of the function call.

For example, consider the following Pascal code:

  const s = 2;
  const h = 10;

  function scaled (d : Integer) : Integer;
    begin
      scaled := d * s;
    end;

  function call_scaled(blah : Integer) : Integer;
    const s = 3;
    begin
      ... scaled(h);
    end;

  begin {---main}
    ...
  end.

If the above program uses static binding, s is resolved at the point of the function definition (i.e., where scaled is declared, so s = 2) and the value of scaled(h) is 20. If dynamic binding is used, s is resolved at the point of the function call (i.e., in the function call_scaled, so s = 3) and the value of scaled(h) is 30.

With static binding, we can determine the bindings of identifiers at compile time, but with dynamic binding we must delay this until run time. Largely for this reason, most programming languages use static binding.
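
The two results can be reproduced in C, under some assumptions. C itself is statically scoped, so the first half is a direct translation of the Pascal example. The second half only emulates dynamic binding by saving and restoring a global; the names dyn_s, scaled_dyn, and call_scaled_dyn are hypothetical and introduced here for illustration.

```c
static const int s = 2;     /* outer binding of s */
static const int h = 10;

/* Static binding: s here is the s visible where scaled is defined. */
int scaled(int d) {
    return d * s;
}

int call_scaled(void) {
    const int s = 3;        /* local s; scaled cannot see it */
    (void)s;                /* silence the unused-variable warning */
    return scaled(h);       /* 10 * 2 */
}

/* Emulated dynamic binding: the most recent rebinding made by a caller wins. */
static int dyn_s = 2;

int scaled_dyn(int d) {
    return d * dyn_s;
}

int call_scaled_dyn(void) {
    int saved = dyn_s;      /* remember the caller's binding */
    dyn_s = 3;              /* rebind s for the duration of this call */
    int result = scaled_dyn(h);   /* 10 * 3 */
    dyn_s = saved;          /* restore the old binding on exit */
    return result;
}
```

The save and restore steps are work that must happen at run time, which is the cost the paragraph above refers to.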

The Run-Time Stack

  • an example:
      #include <stdlib.h>     /* malloc, free */

      int x;                  /* storage for global variables */

      int foo(int z);         /* prototype, since foo is defined after main */
    
      int main(void) {
        int y;                /* stack storage for local variables */
        char *str;            /* stack storage for local variables */
    
        str = malloc(100);    /* allocates 100 bytes of dynamic heap storage */
    
        y = foo(23);
        free(str);            /* deallocates 100 bytes of dynamic heap storage */
        return 0;
      }                       /* y and str deallocated and stack frame popped */
    
      int foo(int z) {        /* z is allocated stack storage */
        char ch[100];         /* ch is allocated stack storage */
    
        if (z == 23)
          foo(7);
    
        return 3;             /* z and ch are deallocated as stack frame is
                                 popped, 3 is put on top of the stack */
      }
    

  • at the start of the program:

    Image: stack example one

  • after the first call to foo:

    Image: stack example two

  • after the second call to foo:

    Image: stack example three
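
The same idea can be checked from inside a program. The sketch below is an illustration only; descend, live, distinct_at_bottom, and MAX_DEPTH are names assumed here. Each activation records the address of its own local variable, and the deepest activation, while every frame is still on the stack, confirms that the addresses are all different.

```c
#define MAX_DEPTH 8

static const int *live[MAX_DEPTH];   /* addresses of locals in frames still on the stack */
static int distinct_at_bottom;       /* set to 1 by the deepest activation if all differ */

/* Recurses until depth + 1 == limit; every activation has its own `local`. */
int descend(int depth, int limit) {
    int local = depth * 10 + 7;      /* lives in this activation's stack frame */

    if (limit > MAX_DEPTH)
        limit = MAX_DEPTH;           /* stay inside the live[] array */
    live[depth] = &local;

    if (depth + 1 < limit) {
        descend(depth + 1, limit);   /* pushes a new frame on top of this one */
    } else {
        /* Deepest call: all frames 0..depth are live, so the pointers are valid. */
        distinct_at_bottom = 1;
        for (int i = 0; i <= depth; i++)
            for (int j = i + 1; j <= depth; j++)
                if (live[i] == live[j])
                    distinct_at_bottom = 0;
    }
    return local;                    /* unchanged by the deeper activations */
}
```

After descend returns, the entries in live point at popped frames and must not be used.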

An Aside on Cleanup

  • storage objects in stack storage die (are deallocated) when the stack frame is popped.
  • storage objects in heap storage can be explicitly killed (as in C); in other languages they are implicitly killed (garbage collected) when they are no longer referenced. For example, explicit deallocation looks like this:
      /* C example */
      char *str;
      str = malloc(100);  /* allocate a 100 byte storage object in heap storage
                             and put the pointer to it into str */
      free(str);          /* explicitly release the allocated storage object */
    
  • it is also possible to end up with pointers to dead objects. These pointers are called dangling references. For example:
      /* C example */
      char *str;
      str = malloc(100);  /* allocate a 100 byte storage object in heap storage
                             and put the pointer to it into str */
      free(str);          /* explicitly release the allocated storage object */
      *str = 'h';         /* str is now a dangling reference!  This instruction
                             attempts to put the character 'h' into what is
                             pointed at by str, but str points to storage
                             that is no longer allocated to the program */
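
One common defensive habit, sketched here under the assumption of two hypothetical helpers, release and put_char, is to set a pointer to NULL as soon as its storage is freed, so that a later use can be detected instead of touching dead storage.

```c
#include <stdlib.h>

/* Frees the object *pp points to and clears the pointer itself. */
void release(char **pp) {
    free(*pp);
    *pp = NULL;             /* str can no longer dangle through this variable */
}

/* Stores c through p only if p still refers to a live object.
   Returns 1 if the store happened, 0 if p was NULL. */
int put_char(char *p, char c) {
    if (p == NULL)
        return 0;
    *p = c;
    return 1;
}
```

This protects only the one pointer variable passed to release; any other copy of the same address is still a dangling reference.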
    

Data Types

Composite Data Types

A value of a composite type consists of components that may be inspected selectively, and these components in turn may have components that may be inspected selectively.

Features:

Implementation Issues

Type Checking

When are types checked?

How are types checked - What is type equivalence?

An example similar to your homework

The program:

    
    PROGRAM MAIN(INPUT, OUTPUT) ;
    
      PROCEDURE P(V: INTEGER) ;
      VAR
         X: INTEGER ;
    
        PROCEDURE Q(V: INTEGER) ;
        BEGIN
          X := V ;
          WRITELN('In Q -- X = ', X:2, ', V = ', V:2) ;
        END ;
    
        PROCEDURE R(V: INTEGER) ;
          VAR
            X: INTEGER ;
    
          PROCEDURE S(V: INTEGER) ;
          BEGIN
             X := V ;
             Q(V+1) ;
             WRITELN('In S -- X = ', X:2, ', V = ', V:2) ;
          END ;
      
        BEGIN
          X := V ;
          S(V+1) ;
          WRITELN('In R -- X = ', X:2, ', V = ', V:2) ;
        END ;
    
      BEGIN
        X := V ;
        R(V+1) ;
        WRITELN('In P -- X = ', X:2, ', V = ', V:2) ;
        IF (V < 45000) THEN
           P(48048) ;
      END ;
    
    BEGIN
       P(43680) ;
    END .
    
    

The output of this program:

    
    In Q -- X = 43683, V = 43683
    In S -- X = 43682, V = 43682
    In R -- X = 43682, V = 43681
    In P -- X = 43683, V = 43680
    In Q -- X = 48051, V = 48051
    In S -- X = 48050, V = 48050
    In R -- X = 48050, V = 48049
    In P -- X = 48051, V = 48048
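
Why Q changes P's X while S changes R's X follows from static scoping: Q is declared inside P, and S is declared inside R. Standard C has no nested procedures, so the sketch below makes the enclosing environment explicit by passing a pointer to the enclosing frame (often called a static link). The struct and function names (p_frame, r_frame, proc_p, and so on) are assumptions made for this illustration, and long is used so the values fit on any platform.

```c
#include <stdio.h>

struct p_frame { long x; };                         /* P's local X */
struct r_frame { long x; struct p_frame *link; };   /* R's local X and its link to P */

/* Q is declared in P, so the X it assigns is P's X. */
void proc_q(struct p_frame *p, long v) {
    p->x = v;
    printf("In Q -- X = %2ld, V = %2ld\n", p->x, v);
}

/* S is declared in R, so the X it assigns is R's X. It reaches Q via R's link to P. */
void proc_s(struct r_frame *r, long v) {
    r->x = v;
    proc_q(r->link, v + 1);
    printf("In S -- X = %2ld, V = %2ld\n", r->x, v);
}

/* R is declared in P and has its own X, which hides P's X inside R and S. */
void proc_r(struct p_frame *p, long v) {
    struct r_frame fr = { v, p };                   /* X := V */
    proc_s(&fr, v + 1);
    printf("In R -- X = %2ld, V = %2ld\n", fr.x, v);
}

/* Returns the value P printed for its own X, for checking. */
long proc_p(long v) {
    struct p_frame fp = { v };                      /* X := V */
    proc_r(&fp, v + 1);
    printf("In P -- X = %2ld, V = %2ld\n", fp.x, v);
    long printed_x = fp.x;
    if (v < 45000)
        proc_p(48048);                              /* new activation, new X */
    return printed_x;
}
```

Calling proc_p(43680) prints the eight lines shown above: Q overwrites P's X with V+3, while R's X ends up holding the value S assigned, V+2.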