e1 op e2 op' e3 (respectively e1 op' e2 op e3) is interpreted only as
(e1 op e2) op' e3 (respectively e1 op' (e2 op e3)). In other words, op binds tighter than op'.
From the point of view of derivation trees, the fact that e1 op e2 op' e3 is interpreted as (e1 op e2) op' e3 means that op must be introduced at a level strictly lower than op', i.e. in a sub-tree whose root is introduced by the same production that introduced op'. In order to modify the grammar so that it generates only this kind of tree, a possible solution is to introduce a new syntactic category producing expressions of the form e1 op e2, and to force a hierarchical order with respect to the main category of expressions of the form e1 op' e2.
- e1 * e2
- Exp ::= Exp + Exp | Term
- Term ::= Term * Term | Num
        Exp
        /|\
       / | \
      /  |  \
    Exp  +  Exp
     |       |
    Term    Term
     |      /|\
     |     / | \
     |    /  |  \
    Num Term * Term
     |    |      |
     2   Num    Num
          |      |
          3      5
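As a quick check, the following sketch enumerates every derivation tree that the grammar Exp ::= Exp + Exp | Term, Term ::= Term * Term | Num assigns to a list of tokens. The function name all_trees and the encoding of trees as nested tuples are assumptions made for illustration only.

```python
from functools import lru_cache


def all_trees(tokens):
    """Return every derivation tree of `tokens` from Exp, as nested tuples."""
    toks = tuple(tokens)

    @lru_cache(maxsize=None)
    def exp(i, j):
        # Exp ::= Exp + Exp : try every '+' in toks[i:j] as the top operator.
        trees = tuple(("+", left, right)
                      for k in range(i + 1, j - 1) if toks[k] == "+"
                      for left in exp(i, k) for right in exp(k + 1, j))
        # Exp ::= Term
        return trees + term(i, j)

    @lru_cache(maxsize=None)
    def term(i, j):
        # Term ::= Num
        if j - i == 1 and toks[i].isdigit():
            return (toks[i],)
        # Term ::= Term * Term : try every '*' in toks[i:j] as the top operator.
        return tuple(("*", left, right)
                     for k in range(i + 1, j - 1) if toks[k] == "*"
                     for left in term(i, k) for right in term(k + 1, j))

    return list(exp(0, len(toks)))


print(all_trees("2 + 3 * 5".split()))       # exactly one tree: * sits below +
print(len(all_trees("2 + 3 + 5".split())))  # two trees: grouping of + is still open
```

The string 2 + 3 * 5 gets a single tree, the one drawn above, while 2 + 3 + 5 still gets two, one for each way of grouping the two occurrences of +.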
In the particular case of the + and the * operators, this kind of ambiguity does not cause problems semantically, because they are both associative, i.e. (2 + 3) + 5 and 2 + (3 + 5) have the same value, and analogously for *. In general, however, an operator might not be associative. This is for instance the case for the - and ^ (exponentiation) operators: (5 - 3) - 2 and 5 - (3 - 2) have different values, and so do (5 ^ 3) ^ 2 and 5 ^ (3 ^ 2).
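The values mentioned above can be checked directly. The snippet below is only a worked calculation; it uses Python's ** for the exponentiation operator written ^ in these notes, and the variable names are illustrative.

```python
# Addition and multiplication: both groupings give the same value.
add_left, add_right = (2 + 3) + 5, 2 + (3 + 5)      # 10 and 10
mul_left, mul_right = (2 * 3) * 5, 2 * (3 * 5)      # 30 and 30

# Subtraction: the two groupings disagree.
sub_left, sub_right = (5 - 3) - 2, 5 - (3 - 2)      # 0 and 4

# Exponentiation: the two groupings disagree as well.
pow_left, pow_right = (5 ** 3) ** 2, 5 ** (3 ** 2)  # 15625 and 1953125

print(add_left == add_right, mul_left == mul_right)  # True True
print(sub_left, sub_right)                           # 0 4
print(pow_left, pow_right)                           # 15625 1953125
```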
In order to eliminate this kind of ambiguity, we must establish whether the operator is left-associative or right-associative. Left-associative means that e1 op e2 op e3 is interpreted as (e1 op e2) op e3 (op associates to the left). Vice versa, right-associative means that it is interpreted as e1 op (e2 op e3) (op associates to the right).
We can impose left-associativity (resp. right-associativity) by using a left-recursive (resp. right-recursive) production for op. For instance, in the example of arithmetic expressions, we can enforce left-associativity of + and * in the following way:
Exp ::= Exp + Term | Term
Term ::= Term * Num | Num
This grammar is now unambiguous.
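To see the single tree that the grammar Exp ::= Exp + Term | Term, Term ::= Term * Num | Num assigns to a string, here is a minimal parser sketch. It is not a literal transcription of the productions: a recursive-descent parser cannot call a left-recursive production directly, so each left recursion is replaced by a loop that builds the same left-leaning tree. The names parse, exp, term and num, and the tuple encoding of trees, are assumptions for illustration.

```python
def parse(tokens):
    """Build the derivation tree (as nested tuples) for a list of tokens."""
    pos = 0

    def num():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"number expected at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def term():
        # Term ::= Term * Num | Num : the loop grows the tree to the left.
        nonlocal pos
        tree = num()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            tree = ("*", tree, num())
        return tree

    def exp():
        # Exp ::= Exp + Term | Term : again, the tree grows to the left.
        nonlocal pos
        tree = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            tree = ("+", tree, term())
        return tree

    tree = exp()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


print(parse("2 + 3 + 5".split()))  # ('+', ('+', '2', '3'), '5')
print(parse("2 + 3 * 5".split()))  # ('+', '2', ('*', '3', '5'))
```

In both outputs the operator that binds tighter, or that comes first among equals, ends up lower in the tree.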
- Exp ::= Num | Exp - Exp
- Exp ::= Num | Exp - Num
- Exp ::= Num | Exp ^ Exp
- Exp ::= Num | Num ^ Exp
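Among the productions listed above, Exp ::= Num | Exp - Num is left-recursive and Exp ::= Num | Num ^ Exp is right-recursive. The sketch below evaluates a token list according to each of them, to show which grouping each shape of recursion produces. The function names are hypothetical, and the left recursion is again realised as a loop.

```python
def eval_minus(tokens):
    """Exp ::= Num | Exp - Num (left-recursive, so - groups to the left)."""
    value = int(tokens[0])
    for i in range(1, len(tokens), 2):
        if tokens[i] != "-" or i + 1 >= len(tokens):
            raise SyntaxError(f"'- Num' expected at token {i}")
        # Everything read so far is the left operand of this '-'.
        value = value - int(tokens[i + 1])
    return value


def eval_power(tokens):
    """Exp ::= Num | Num ^ Exp (right-recursive, so ^ groups to the right)."""
    base = int(tokens[0])
    if len(tokens) == 1:
        return base
    if tokens[1] != "^" or len(tokens) < 3:
        raise SyntaxError("'^ Exp' expected after the first number")
    # Everything after '^' is the right operand, evaluated first.
    return base ** eval_power(tokens[2:])


print(eval_minus("5 - 3 - 2".split()))  # 0, i.e. (5 - 3) - 2
print(eval_power("5 ^ 3 ^ 2".split()))  # 1953125, i.e. 5 ^ (3 ^ 2)
```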
Stm ::= if Exp then Stm | if Exp then Stm else Stm | ... (other stms)
This grammar is ambiguous. In fact, the statement
if x > 0 then if x = 1 then print(1) else print(2)
can be interpreted both as
if x > 0 then ( if x = 1 then print(1) else print(2) )
and as
if x > 0 then ( if x = 1 then print(1) ) else print(2)
This ambiguity is clearly relevant for the semantics: if the value of x is 2, for example, then in the first case the machine should print 2, while in the second case it should do nothing.
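The difference between the two interpretations can be made concrete by writing each of them out explicitly. In the sketch below, Python's indentation plays the role of the parentheses; the function names are hypothetical, and each function returns the list of printed values instead of printing, so that the two results are easy to compare.

```python
def first_reading(x):
    """if x > 0 then ( if x = 1 then print(1) else print(2) )"""
    printed = []
    if x > 0:
        if x == 1:
            printed.append(1)
        else:                  # the else belongs to the inner if
            printed.append(2)
    return printed


def second_reading(x):
    """if x > 0 then ( if x = 1 then print(1) ) else print(2)"""
    printed = []
    if x > 0:
        if x == 1:
            printed.append(1)
    else:                      # the else belongs to the outer if
        printed.append(2)
    return printed


print(first_reading(2))   # [2]
print(second_reading(2))  # []
```

The two readings also differ for values such as x = 0, where the first prints nothing and the second prints 2.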
This ambiguity arises whenever a statement contains an unbalanced number of then and else keywords (i.e. more occurrences of then than of else). In order to eliminate it, we must establish a rule which determines, for each else, its matching then. Usually the convention is the following:
Each else matches the last (from left to right) unmatched then.
In order to impose this rule, one possibility is to modify the productions in the following way:
Stm ::= Bal_Stm | Unbal_Stm
Bal_Stm ::= if Exp then Bal_Stm else Bal_Stm | ... (other stms)
Unbal_Stm ::= if Exp then Stm | if Exp then Bal_Stm else Unbal_Stm
In this new grammar, the statement above can only have the first kind of structure.
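The same convention can also be obtained in a hand-written parser. The sketch below is not a transcription of the Bal_Stm / Unbal_Stm productions: it is a recursive-descent parser in which the innermost if still being parsed takes any else that follows its then-branch, which is a common way to realise the rule that each else matches the last unmatched then. It assumes, for simplicity, that a condition is whatever lies between if and then and that every other statement is a single token; the name parse_stm and the tuple encoding are illustrative.

```python
def parse_stm(tokens):
    """Parse a statement; return nested tuples ('if', cond, then[, else])."""
    pos = 0

    def stm():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ("then", "else"):
            raise SyntaxError(f"statement expected at token {pos}")
        if tokens[pos] != "if":
            pos += 1
            return tokens[pos - 1]      # any other statement (one token)
        pos += 1                        # consume 'if'
        start = pos
        while pos < len(tokens) and tokens[pos] != "then":
            pos += 1
        if pos == len(tokens):
            raise SyntaxError("'then' expected")
        cond = " ".join(tokens[start:pos])
        pos += 1                        # consume 'then'
        then_branch = stm()
        # The innermost open 'if' gets the first chance to take the 'else'.
        if pos < len(tokens) and tokens[pos] == "else":
            pos += 1
            return ("if", cond, then_branch, stm())
        return ("if", cond, then_branch)

    tree = stm()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


source = "if x > 0 then if x = 1 then print(1) else print(2)"
print(parse_stm(source.split()))
# ('if', 'x > 0', ('if', 'x = 1', 'print(1)', 'print(2)'))
```

The else ends up inside the inner if, which is the first of the two structures discussed above.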