Programming Languages: Chapter 6: Binding and Scope


Checkpoint

Maintain perspective: this is a course on the concepts of programming languages.

What is a PL concept?

Our approach: study those concepts by building interpreters, written in Scheme, for languages that embody them.

Why Scheme?

  • you do not know it and therefore will learn something new
  • ideal vehicle to study programming language concepts because it forces us to focus on fundamental language concepts
  • very simple and consistent, yet powerful language (see HW2...DSs <= 100 LOC)

Powerful languages are those which support the creation of new languages.

  • LISP is a language for designing languages.
  • XML is a language for designing languages (e.g., VoiceXML).
  • This usually implies no syntactic distinction between programs and data (such languages are called homoiconic).


Bindings of variables

  • references vs. declarations
  • denotation
  • a reference is bound to a declaration
  • declarations have limited scope
  • references are (statically or dynamically) bound to declarations (which have limited scope)
  • the scope of a declaration is the region of the program (a range of statements) where that variable is visible (i.e., can be referenced)
  • local vs. nonlocal references
  • binding rules or scope rules specify to which declaration a reference is bound
  • languages where that binding can be determined by analyzing the program text are said to use static scoping
  • languages where that binding cannot be determined until run-time are said to use dynamic scoping
  • binding rule for λ-calculus expressions ([EOPL2] Definition 1.3.1 on p. 29)
  • qualifiers, concepts, and operators such as private, public, friend, and the scope resolution operator (::) in C++ give the programmer finer control over scope


Static scoping

  • introduced in ALGOL 60
  • local + ancestor blocks: sometimes called lexical scoping
  • declaration associated with a referenced variable can be determined statically (i.e., before run-time)
  • scope of a variable reference is the code constituting its static ancestors
  • advantages of static scoping
    • readability
    • predictability
    • type checking/validation
  • disadvantages of static scoping
    • scope of a variable tends to be larger than necessary; see [COPL9] p. 232
    • sometimes leads to several globals or all subprograms residing at the same level
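
To make the static rule concrete, here is a minimal sketch in Scheme. The names show-x and caller are hypothetical and chosen only for illustration. The reference to x inside show-x is resolved from the program text alone, so the local x in caller (which is not a static ancestor of show-x) has no effect on it.

```scheme
(define x 10)

;; The reference to x here is bound to the top-level declaration,
;; the nearest enclosing declaration of x in the program text.
(define (show-x) x)

;; This local x is declared in a block that does not enclose show-x,
;; so it does not capture the reference inside show-x.
(define (caller)
  (let ((x 20))
    (show-x)))

(define static-result (caller))   ; 10 under static scoping
```

Under dynamic scoping the same call would instead see the most recent live binding of x, namely 20.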


Dynamic scoping

  • scope determined "based on the calling sequence of subprograms, not on their spatial relationship to each other" ([COPL9] p. 232), which implies it can only be resolved at run-time
  • used in McCarthy's original version of LISP as well as APL and SNOBOL4
  • Scheme, a popular dialect of LISP, adopted static scoping; an example of mutation in the evolution of programming languages
    • Perl and COMMON LISP leave it up to the programmer
    • example in Perl:
      $l = 10;
      $d = 10;
      
      # reads an integer from standard input
      $input = <STDIN>;
      
      if ($input == 5) {
         print "Before the call to sub1 -- l: $l, d: $d\n";
         &sub1();
         print "After the call to sub1 --  l: $l, d: $d\n";
      } else {
         print "Before the call to sub2 -- l: $l, d: $d\n";
         &sub2();
         print "After the call to sub2 --  l: $l, d: $d\n";
      }
      
      exit(0);
      
      sub sub1 {
         my $l; # only in this block (statically scoped)
      
         local $d; # visible to subroutines called from here (dynamically scoped)
      
         $l = 5;
         $d = 20;
      
         print "Inside the call to sub1 -- l: $l, d: $d\n";
      
         print "Before the call to sub2 -- l: $l, d: $d\n";
         &sub2();
         print "After the call to sub2 --  l: $l, d: $d\n";
      
      }
      
      sub sub2 {
         print "Inside the call to sub2 -- l: $l, d: $d\n";
      }
      
  • advantages of dynamic scoping
    • flexibility
    • sometimes makes things easy (e.g., no need to pass parameters if they are present in an outer scope)
    • often parameters passed from one subprogram to another are simply variables local to the caller
  • disadvantages of dynamic scoping
    • readability
    • reliability
    • type checking; can we use static type checking in a dynamically scoped language?
    • can be less efficient to implement than static scoping
    • difficult to debug
    • no locality of access
    • no way to protect local variables
    • subprograms are always executed in the environment of all previously called subprograms which have not yet completed their execution
    • can have unintended consequences
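
The contrast between my and local in the Perl example can be approximated in Scheme. The following is only a sketch: it assumes an implementation that provides make-parameter and parameterize (R7RS, or SRFI 39), and it reuses the names l, d, sub1, and sub2 from the Perl program purely for comparison.

```scheme
(define l 10)                    ; ordinary variable: statically scoped
(define d (make-parameter 10))   ; parameter object: dynamically bound

;; sub2 has no local declarations, so l refers to the top-level l,
;; while (d) reads whichever binding of d is current at call time.
(define (sub2)
  (list l (d)))

(define (sub1)
  (let ((l 5))                   ; like Perl's my: invisible to sub2
    (parameterize ((d 20))       ; like Perl's local: visible to callees
      (sub2))))

(define from-sub1 (sub1))        ; (10 20)
(define direct    (sub2))        ; (10 10): the binding d = 20 is gone
```

The call through sub1 sees d = 20 only because sub1 is still active on the call chain, which is the behaviour that makes dynamically scoped code harder to read from the text alone.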


Referencing environment

  • the referencing environment is the set of variables (and their bindings) which are visible at any given point in a program
  • examples from [COPL9] pp. 235-237
  • scope and referencing environments are inverses of each other
    • scope(<declaration>) = {a set of program points}
    • refenv(a program point) = {a set of variable bindings}


Free or bound?

  • for any programming language (see [EOPL2] Definition 1.3.2 on p. 29)
  • value of an expression depends only on its free variables
  • value of an expression is independent of its bound variables
  • value of an expression with no free variables is fixed
    • such expressions are called combinators
    • for instance, identity function or application combinator
  • for λ-calculus (see [EOPL2] Definition 1.3.3 on p. 31)
  • occurs-free? and occurs-bound? (see [EOPL2] Fig. 1.1 on p. 32)
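
As a rough illustration of these rules, the sketch below defines occurs-free? and occurs-bound? for λ-calculus expressions represented as Scheme data. It is written in the spirit of the cited figure and is not a verbatim copy of it. It assumes every lambda has exactly one parameter and one body, and every application has exactly two subexpressions.

```scheme
;; Does var occur free in exp?
(define (occurs-free? var exp)
  (cond ((symbol? exp) (eqv? var exp))
        ((eqv? (car exp) 'lambda)
         ;; free in (lambda (p) body) if var is not p and is free in body
         (and (not (eqv? (caadr exp) var))
              (occurs-free? var (caddr exp))))
        (else
         ;; application (rator rand): free in either part
         (or (occurs-free? var (car exp))
             (occurs-free? var (cadr exp))))))

;; Does var occur bound in exp?
(define (occurs-bound? var exp)
  (cond ((symbol? exp) #f)
        ((eqv? (car exp) 'lambda)
         ;; bound deeper in the body, or bound by this lambda
         (or (occurs-bound? var (caddr exp))
             (and (eqv? (caadr exp) var)
                  (occurs-free? var (caddr exp)))))
        (else
         (or (occurs-bound? var (car exp))
             (occurs-bound? var (cadr exp))))))

(define y-free?  (occurs-free?  'y '((lambda (x) x) y)))   ; #t
(define x-bound? (occurs-bound? 'x '((lambda (x) x) y)))   ; #t
```

An expression is a combinator exactly when occurs-free? is false for every variable appearing in it.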


Determining the declaration associated with a reference

  • notion of a block-structured language
    • a block is a group of statements with associated declarations (scope)
    • sometimes involves nested subprograms
    • Scheme, C, and Perl are each block-structured, statically-scoped languages
  • lexical binding
  • scope of a variable declaration is the text within which references to the variable refer to the declaration [EOPL2] p. 33
  • scope is therefore a subset of the program
  • an inner declaration may shadow an outer declaration of the same name; equivalently, the inner declaration creates a scope hole in the outer one
  • visibility
  • procedure for determining the declaration to which a variable reference is bound
  • lexical depth; use zero-based indexing
  • declaration position; also use zero-based indexing
  • can associate each variable reference with a (lexical depth, declaration position) pair (i.e., (v : d p))
  • lexical address makes variable name unnecessary
  • replace formal parameter lists with their length
  • identifiers are necessary for writing programs, but unnecessary for executing them
  • contour diagrams
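
The procedure for finding the declaration of a reference (search the enclosing regions from the innermost outward) can be sketched as a small translator that annotates each reference with its lexical address. The names index-of, address-of, and lexical-address are hypothetical. The sketch handles only variables, lambda expressions with a single body, and applications, and it marks unbound variables as (v free).

```scheme
;; Zero-based position of sym in lst, or #f if it is absent.
(define (index-of sym lst)
  (let loop ((lst lst) (i 0))
    (cond ((null? lst) #f)
          ((eqv? (car lst) sym) i)
          (else (loop (cdr lst) (+ i 1))))))

;; scopes is a list of formal parameter lists, innermost first.
;; Search inside-out; the number of scopes crossed is the lexical depth.
(define (address-of sym scopes)
  (let loop ((scopes scopes) (depth 0))
    (cond ((null? scopes) (list sym 'free))
          ((index-of sym (car scopes))
           => (lambda (pos) (list sym ': depth pos)))
          (else (loop (cdr scopes) (+ depth 1))))))

;; Replace every variable reference in exp with (v : d p).
(define (lexical-address exp)
  (let walk ((exp exp) (scopes '()))
    (cond ((symbol? exp) (address-of exp scopes))
          ((eqv? (car exp) 'lambda)
           (list 'lambda (cadr exp)
                 (walk (caddr exp) (cons (cadr exp) scopes))))
          (else (map (lambda (e) (walk e scopes)) exp)))))

(define annotated
  (lexical-address '(lambda (x y) ((lambda (a) (x (a y))) x))))
```

For this input, annotated is (lambda (x y) ((lambda (a) ((x : 1 0) ((a : 0 0) (y : 1 1)))) (x : 0 0))).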


Evolution of computer languages


Deep, Shallow, and Ad-hoc Binding

referencing environment for a passed function:
  • deep binding: uses the environment at the time the passed function was created (line a below)
  • shallow binding: uses the environment of the expression which invokes the passed function (line b below)
  • ad hoc binding: uses the environment of the invocation expression in which the subprogram is passed as an argument (line c below)
  • example:
    (let  ((y 3))
      (let ((x 10)
    
            ;; to which declaration of y is the reference to y bound?
            (f (lambda (x) (* y (+ x x))))) ; line a
    
        (let ((y 4))
          (let ((y 5)
                (x 6)
                (g (lambda (x y) (* y (x y))))) ; line b
            (let ((y 2))
              (g f x)))))) ; line c
    
    results:
    • deep binding:
        (g f 6)
        
        ((lambda (x y) (* y (x y))) f 6)
        
        (* 6 (f 6))
        
        (* 6 ((lambda (x) (* y (+ x x))) 6))
        
        (* 6 (* y? (+ 6 6)))
        
        (* 6 (* 3 (+ 6 6)))
        
        216
        
    • shallow binding (in the environment of the invoking expression (x y), y is g's parameter y, which is bound to 6):
        (* 6 (* y? (+ 6 6)))
        
        (* 6 (* 6 (+ 6 6)))
        
        432
        
    • ad-hoc binding:
        (* 6 (* y? (+ 6 6)))
        
        (* 6 (* 2 (+ 6 6)))
        
        144
        
summary of results:
  • deep binding: 216
  • shallow binding: 432
  • ad-hoc binding: 144
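
Scheme itself uses deep binding, because a lambda expression evaluates to a closure that captures the environment in which it was created. Evaluating the example directly therefore gives a way to check the deep binding result. The sketch below only wraps the expression in a definition; the name deep-result is introduced here for illustration.

```scheme
(define deep-result
  (let ((y 3))
    (let ((x 10)
          ;; f closes over the y declared above (y = 3)
          (f (lambda (x) (* y (+ x x)))))
      (let ((y 4))
        (let ((y 5)
              (x 6)
              (g (lambda (x y) (* y (x y)))))
          (let ((y 2))
            (g f x)))))))

;; deep-result is (* 6 (* 3 (+ 6 6))) = 216
```

None of the later declarations of y affect the result, since f's reference to y was fixed when the closure was created.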


The Funarg problem

    When McCarthy and his students at MIT were developing the first version of LISP, they really wanted static scoping, but implemented pure dynamic scoping by accident.

    Their second version of LISP (the patched version) also did not implement static scoping, but rather ad-hoc binding, which is closer to dynamic scoping, but still not quite pure dynamic scoping.

    This became known as the (downward) funarg problem (i.e., the functional argument problem). The upward funarg problem, which is more difficult, involves returning functions from functions (rather than passing functions to functions).


Queue abstraction

  • brings us closer to object-oriented programming (OOP)
  • implementation in a purely functional setting would require passing and returning queues from procedure to procedure
  • an alternative is to use a queue shared among all of the procedures
  • this sounds imperative and it is
  • we still want the representation of the queue to be hidden
  • we can create an interface with procedures that will return each of the operations that will act on the shared hidden state of the queue
  • each of those returned procedures is a closure (recall analogy to OOP)
  • interface on [EOPL2] p. 66
  • single queue is simulated by two lists
  • imperative features in the implementation:
    • (set! ...) (Scheme's assignment statement; works by side effect)
    • (begin ...) creates statement blocks
    • create-queue is the queue constructor; returns a vector of queue operations
  • implementation in [EOPL2] Fig. 2.5 (p. 67)
  • this queue is an object and the operations on it are called methods
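
A minimal sketch of such a queue follows. It is modeled on the description above rather than copied from the cited figure, so the internal names q-in and q-out, the order of operations in the returned vector, and the error message are assumptions. It also assumes the implementation provides when and error (as R7RS does).

```scheme
(define (create-queue)
  (let ((q-in '())     ; newly enqueued items, most recent first
        (q-out '()))   ; items ready to be dequeued, oldest first
    (letrec ((reset-queue
              (lambda ()
                (set! q-in '())
                (set! q-out '())))
             (empty-queue?
              (lambda ()
                (and (null? q-in) (null? q-out))))
             (enqueue
              (lambda (x)
                (set! q-in (cons x q-in))))
             (dequeue
              (lambda ()
                (if (empty-queue?)
                    (error "dequeue: queue is empty")
                    (begin
                      ;; Refill q-out from q-in only when q-out runs dry.
                      (when (null? q-out)
                        (set! q-out (reverse q-in))
                        (set! q-in '()))
                      (let ((front (car q-out)))
                        (set! q-out (cdr q-out))
                        front))))))
      ;; Each operation is a closure over the hidden q-in and q-out.
      (vector reset-queue empty-queue? enqueue dequeue))))

;; Usage: select operations from the vector by position.
(define q (create-queue))
(define enqueue! (vector-ref q 2))
(define dequeue! (vector-ref q 3))
(enqueue! 'a)
(enqueue! 'b)
(define first-out (dequeue!))   ; a
```

The two lists are reachable only through the four closures, which is what makes the representation hidden from clients.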


Overview of lecture

You may not have realized it, but in learning let, let*, and letrec, you have been studying a concept called scope.

Identifiers may appear in two different contexts:

    as references: in (f x y), f, x, and y are references

    as declarations: in (lambda (x) ...) or (let ((x ...)) ...) the occurrence of x is a declaration

The value named by an identifier is called its denotation.

Each reference is (statically or dynamically) bound to a declaration (which has limited scope in most languages).

Languages have binding rules.

In Scheme, the relationship between a reference and its declaration is a static property.

Static scoping: can determine scope by examining the text of the program.

Dynamic scoping: can only determine scope at run-time.

McCarthy's original version of LISP used dynamic scoping.

Perl and COMMON LISP let the programmer choose the scoping method used for each variable.

Examples in Perl: dynamic.pl and dynamic2.pl.

Binding rule for lambda calculus: [EOPL2] p. 29.

free or bound (in general for any PL)?

((lambda (x) x) y)
(x bound, y free)

(lambda (y)
   ((lambda (x) x) y))
(x and y now both bound)

The meaning of an expression with no free variables is fixed.

Lambda calculus expressions without free variables are called combinators and are useful programming tools.

;;; application combinator
(lambda (f)
   (lambda (x)
      (f x)))

free or bound in lambda calculus, [EOPL2] p. 31

occurs-free? and occurs-bound? on [EOPL2] p. 32 implement those rules.

Relationship between references and declarations:

(lambda (x) ...)

(define x ...)

nesting

block-structured language

language rules: scoping rules

> (define x               ; line 1
   (lambda (x)            ; line 2
      (map
         (lambda (x)      ; line 4
            (+ x 1))      ; line 5; reference x refers to declaration x on line 4
         x)))             ; line 6; reference x refers to declaration x on line 2

> (x '(1 2 3))            ; line 7; reference x refers to declaration x on line 1
(2 3 4)

scope of x on line 1 = {line 7}

scope of x on line 2 = {line 6}

scope of x on line 4 = {line 5}

Scope of a variable declaration is the text within which references to the variable refer to that declaration.

Scope of a declaration of v includes all references to v which occur free in the region associated with that declaration.

References to v which occur bound in that region are shadowed by inner declarations of v.

Algorithm: search the regions enclosing the reference inside-out (i.e., from the innermost block to the outermost block).

Lexical depth (use zero-based indexing):

(lambda (x y)
   ((lambda (a)
      (x (a y)))    ; line 3
   x))              ; line 4

0: x on line 4 and a on line 3

1: x and y on line 3

Declaration position (use zero-based indexing)

Variable's lexical address: (v : d p)

(lambda (x y)
   ((lambda (a)
      ((x : 1 0) ((a : 0 0) (y : 1 1))))
   (x : 0 0)))

Lexical address is all we need; (identifier) names are superfluous!

Formal parameter lists are replaced by their length.

(lambda 2
   ((lambda 1
      ((: 1 0) ((: 0 0) (: 1 1))))
   (: 0 0)))
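
To see why the names are not needed at run-time, consider a sketch in which the environment is a list of frames (innermost first) and each frame is a list of argument values. The procedure name lookup, the variable inner-env, and the sample values va, vx, and vy are hypothetical.

```scheme
;; Fetch the value at lexical address (depth pos).
(define (lookup env depth pos)
  (list-ref (list-ref env depth) pos))

;; Environment while evaluating the body of the inner lambda:
;; frame 0 holds the argument for a, frame 1 the arguments for x and y.
(define inner-env '((va) (vx vy)))

(define a-value (lookup inner-env 0 0))   ; va, for (: 0 0)
(define x-value (lookup inner-env 1 0))   ; vx, for (: 1 0)
(define y-value (lookup inner-env 1 1))   ; vy, for (: 1 1)
```

Each reference is resolved with two index operations and no name comparison.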

Lexically-bound identifiers are useful for writing and understanding programs, but are unnecessary for executing programs.


References

    [COPL9] R.W. Sebesta. Concepts of Programming Languages. Addison-Wesley, Ninth edition, 2010.
    [EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.
