CSE 428: Lecture notes 3 

Dangling-else

Imperative languages often allow two kinds of conditional commands (or
statements): the if-then and the if-then-else. Let us consider a possible
grammar generating these commands: 

      Cmd ::= if Exp then Cmd | if Exp then Cmd else Cmd | ...
      (other cmds) 

This grammar is ambiguous. In fact, the command 

      if x > 0 then if x = 1 then print(1) else print(2) 

can be interpreted both as 

      if x > 0 then ( if x = 1 then print(1) else print(2) )

and as 

      if x > 0 then ( if x = 1 then print(1) ) else print(2)

This ambiguity clearly matters for the semantics: if the value of x is 2,
for example, then under the first interpretation the machine should print 2,
while under the second it should do nothing. 
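
The difference between the two readings can be checked with a small
sketch. Python is used here only for illustration; the function names
first_reading and second_reading are hypothetical, and each function returns
the list of printed values instead of printing them, so the results can be
compared directly.

```python
def first_reading(x):
    """if x > 0 then ( if x = 1 then print(1) else print(2) )"""
    printed = []
    if x > 0:
        # The else belongs to the inner if.
        if x == 1:
            printed.append(1)
        else:
            printed.append(2)
    return printed


def second_reading(x):
    """if x > 0 then ( if x = 1 then print(1) ) else print(2)"""
    printed = []
    if x > 0:
        if x == 1:
            printed.append(1)
    else:
        # The else belongs to the outer if.
        printed.append(2)
    return printed


if __name__ == "__main__":
    for x in (0, 1, 2):
        print(x, first_reading(x), second_reading(x))
```

The two readings agree only for x = 1; they differ both for x = 2 and for
any x that is not greater than 0.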

This ambiguity arises whenever a command contains an unbalanced
number of then and else (i.e. more then than else). In order to eliminate
it, we must establish a rule which determines, for each else, its matching
then. Usually the convention is the following: 

      Each else matches the last (from left-to-right) unmatched
      then 

In order to impose this rule, one possibility is to modify the productions in
the following way: 

Cmd ::= Bal_Cmd | Unbal_Cmd 
Bal_Cmd ::= if Exp then Bal_Cmd else Bal_Cmd | ... (other cmds) 
Unbal_Cmd ::= if Exp then Cmd | if Exp then Bal_Cmd else Unbal_Cmd 

In this new grammar, the sample command above can only be generated by
a tree imposing the first kind of structure. 
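
As a rough illustration, the matching convention can also be applied
directly in a recursive-descent parser: after parsing "if Exp then Cmd", the
parser consumes an else immediately if one follows. This sketch does not
encode the Bal_Cmd / Unbal_Cmd productions themselves; it only applies the
same convention. It assumes, for simplicity, that tokens are separated by
blanks, that an Exp is a single token, and that any non-keyword token is an
"other command". The names parse and parse_cmd and the tuple representation
of trees are hypothetical.

```python
KEYWORDS = {"if", "then", "else"}


def parse_cmd(tokens, pos=0):
    """Parse one command starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        # Assumption: any non-keyword token is an atomic "other command".
        if tokens[pos] in KEYWORDS:
            raise SyntaxError("unexpected keyword " + tokens[pos])
        return tokens[pos], pos + 1
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
        raise SyntaxError("expected 'if Exp then'")
    cond = tokens[pos + 1]  # assumption: Exp is a single token
    then_branch, pos = parse_cmd(tokens, pos + 3)
    # Greedy step: an else that follows is attached to this (innermost
    # still open) if, i.e. to the last unmatched then.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_cmd(tokens, pos + 1)
        return ("if", cond, then_branch, else_branch), pos
    return ("if", cond, then_branch), pos


def parse(text):
    """Parse a whole command and reject trailing tokens."""
    tokens = text.split()
    tree, pos = parse_cmd(tokens)
    if pos != len(tokens):
        raise SyntaxError("unexpected token " + tokens[pos])
    return tree


if __name__ == "__main__":
    print(parse("if c1 then if c2 then p1 else p2"))
```

For the input "if c1 then if c2 then p1 else p2" the else ends up inside the
inner if, which corresponds to the first kind of structure.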

The role of parse trees in the implementation of
programming languages

Parse trees are used as an internal representation of the source program
by the interpreter or the compiler. In very schematic terms, we can
represent the various phases of the implementation as follows: 

Interpreter case

           _________     ________            __________     _____________
          |         |   |        |          |          |   |             |
 source ->| scanner |-->| parser |- parse ->| static   |-->| interpreter |-> results
          |_________|   |________|  tree    | analyzer |   |_____________|
                                            |__________|          ^
                                                                  |
                                                                data

Compiler case

           _________     ________            __________     __________                _________
          |         |   |        |          |          |   |          |              |         |
 source ->| scanner |-->| parser |- parse ->| static   |-->| compiler |-> compiled ->| machine |-> results
          |_________|   |________|  tree    | analyzer |   |__________|    code      |_________|
                                            |__________|                                  ^
                                                                                          |
                                                                                        data


      Scanner: takes the source file (a sequence of characters) and
      generates the corresponding sequence of tokens. 
      Parser: checks the syntactic correctness and generates the
      parse tree. 
      Static analyzer: checks the static correctness and possibly performs
      some optimizations. Optimization can also be a separate phase. The
      static analyzer might enrich the parse tree with some information,
      typically information about the types of the identifiers. 
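
To make the scanner phase more concrete, here is a minimal sketch of a
scanner for commands like the one in the dangling-else example. The token
kinds (KW, ID, NUM, SYM), the keyword set and the function name scan are
assumptions made for this illustration, not part of any particular
implementation.

```python
import re

KEYWORDS = {"if", "then", "else"}

# One named group per token kind; whitespace is matched so it can be skipped.
TOKEN_RE = re.compile(
    r"(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<WS>\s+)|(?P<SYM>.)"
)


def scan(source):
    """Turn a string of characters into a list of (kind, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "WS":
            continue  # blanks only separate tokens
        if kind == "ID" and text in KEYWORDS:
            kind = "KW"  # reserved words are not identifiers
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for token in scan("if x > 0 then print(1)"):
        print(token)
```

The parser would then work on this token sequence rather than on the
individual characters.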

In practice, real implementations are often a combination of compilation and
interpretation: the code gets compiled into an intermediate language
(something in between the source language and the machine language) and
then interpreted.