Symbol Table
---------------

Uses data structures that allow efficient name lookup operations in
the presence of scope changes. 

We can use 
 -- hash tables to lookup attributes for each name
 -- a scope stack that keeps track of the current scope (in
     which a symbol to be looked up appears) and its surrounding
     scopes. The top most element in the scope stack corresponds to
     the current (ie innermost) scope, while the bottommost element will 
     correspond to the outermost (ie global) scope.

Support for scopes:
  lexical scopes can be supported using a scope stack as follows:
  a) symbols in a program reside in multiple hash tables: in particular, 
      symbols within each scope are contained in a single hash table
      for that scope
  b) at any point in the program, the scope stack keeps track of all the 
      scopes surrounding that program point, as described above. The 
      elements of the stack contain pointers to the corresponding
      hash table.
  c) to lookup a name, we start from the hash table pointed to by the 
     top element of the stack. If the symbol is not found, we try 
     hash table pointed by the next lower entry in the stack. This
     process is repeated until we find the name, or we reach the bottom
     of the stack (this means that the name has never been declared.)
  d) scope entry and exit operations modify the scope stack appropriately.
     Specifically, when a new scope is entered, a corresponding hash
     table is created. A pointer to this hash table is pushed onto the
     scope stack. When we exit a scope, the top of the stack is popped off.

Example:
               <------------------ (1)
  float y = 1.0
               <------------------ (2)
  void f(int x) {
               <------------------ (3)
     for (int x = 0; ...) {
               <------------------ (4)
         {
               <------------------ (5)
             int y = 1;
               <------------------ (6)
         }
               <------------------ (7)
         {
               <------------------ (8)
             float x = 1.0;
               <------------------ (9)
         }
               <------------------ (10)
     }
               <------------------ (11)
  }
               <------------------ (12)
  main() {
               <------------------ (13)
     float y = 10.0;
               <------------------ (14)
     f(1);
}  
               <------------------ (15)


Illustration:

At (1), we have a single hash table, which is the global hash table.
The scope stack contains exactly one entry, which points to this
global hash table.

When the compiler moves from (1) to (2), the name y is added to the 
hash table for the current scope. Since the top of scope stack points
to the global table, this means that "y" is being added to the global
table.

When compiler moves from (2) to (3), the name "f" is added to the
global table, a new hash table for f's scope is created.
A pointer to f's table is pushed on the scope stack. Then
the parameter "x" added to hashtable for the current scope (ie f)

(Students should work out the rest of the moves.)

Static or lexical scoping: associations are determined at compile time,
using a sequential processing of program. 

Dynamic scoping: associations are determined at runtime --- processing of
program statements follows the execution order of different statements.
Example: 

if we added a new function "g" to the above program as follows:
   void g() {
      int y ;
      f() ;
   }

Consider references to the name "y" at (3). With static scoping, this
name always refers to the global variable y defined between (1) and (2).
With dynamic scoping, the search for non-local names follows the execution
nesting, rather than static nesting. This means that if "f" is called
from main, then at (3), "y" will refer to the float variable declared
in main. If "f" is invoked from within g, the same name will refer to 
the integer variable "y" defined in g. Thus, the same name may refer to
different entities, possibly of different types, in a dynamically
scoped language.

Since the type associated with the name "y" at (3) can differ depending
upon the point of call, we cannot statically determine the type of y. This
suggests that dynamic scoping does not fit well with static typing.

Since static typing has now been accepted to be the right approach,
almost all current languages (C/C++/Java/SML/LISP) use static scoping

Scopes in SML:
--------------

-- most names are at the "top-level," which corresponds to global scope.
-- formal parameters of functions are within the scope of the function
-- let statement introduces new bindings whose scope extends from the
   point of binding to the end of the let-block
    e.g.,
            val v = let
                       val x = 2
                       val y = x * y
                    in
                       x*y
                    end


Allocation, Extent, Environment
--------------------------------

A variable is stored in memory at a location
(determined by the environment) corresponding to the variable.

Constants do not need to be stored in memory.
(Constants Vs Literals)

Environment stores the binding between variable names and the
corresponding locations in memory.

The process of setting up this binding is known as storage allocation.


1. static allocation: 
          allocation performed at compile time. Compiler translates
          all names to corresponding location in the code generated by it.
          Examples: all variables in original FORTRAN
                    all global and static variables in C/C++/Java

2. stack allocation: 
         needed in any language that supports the notion of local
         variables for procedures. Also called "automatic allocation"
         but this is somewhat of a moisnomer now.

         Examples: all local variables in C/C++/Java procedures and blocks

         Implementation: Compiler translates all names to relative offsets
         from a location called the "base pointer" or "frame pointer".
         The value of this pointer varies will, in general, be different
         for different procedure invocations.

         The pointer refers to the base of the "activation record" (AR) for an
         invocation of a procedure. The AR holds such information as parameter
         values, local variables, return address (where execution should
         resume on exit from the current procedure) etc.

         Example:
                  int fact (int n) {
                   if n = 0 then 1
                   else {
                       int rv = n * fact (n-1) ;
                       return rv ;
                   }
                }
                main() {
                  fact (5) ;
                }
         
     An activation record is created on the stack for each a call to
    function. The start of activation record is pointed to by a
    register called BP. The other elements (e.g., parameters, local
    variables) are accessed by offset from BP (e.g., first parameter
    may be accessed as BP, second parameter as BP+2, assuming each
    parameter occupies 2 bytes etc). Then on the first call to fact,
    BP is decremented (since stack grows downwards, towards address 0
    in the memory) to point to new activation record, n is initialiaed 
    to 5, rv is pushed but not initialized (since fact has to compute
    factorial of 4 yet) and then new activation record (again with n
    initialized to 4, rv is pushed on top of n) is created for the
    next recursive call and so on until when n becomes 0, at which
    time, stack is unrolled where successive rv's are assigned the
    value of n at that stage and the rv of previous stage (thus
    computing the factorial).

    Dynamic allocation
    - explicit allocation, freeing
         e.g., malloc/free in C, new/delete in C++
    - explicit allocation, automatic free
          Java
    - automatic allocation, automatic free
          Lisp, SML


Extent, Lifetime: duration of allocation of a variable

Variables and Constants
    Variables are stored in memory, whereas constants need not be
      (they may be replaced at the compile time by the compiler with
       the value they refer to).
    Value of variables can change at runtime. Variables have a
      location and value.
    Location refers to l-value (the value on left of assignment) and
       value refers to r-value (the value on right)

    e.g., in x = y, r-value of y is stored in location that is l-value 
           of x.