e1 op e2 op' e3 (respectively e1 op' e2 op e3) is interpreted only as
(e1 op e2) op' e3 (respectively e1 op' (e2 op e3)). In other words, op binds tighter than op'.
From the point of view of derivation trees, the fact that e1 op e2 op' e3 is interpreted as (e1 op e2) op' e3 means that op must be introduced at a level strictly lower than op', i.e. in a sub-tree whose root is introduced by the same production that introduced op'. To modify the grammar so that it generates only this kind of tree, one possible solution is to introduce a new syntactic category producing expressions of the form e1 op e2, and to force a hierarchical order with respect to the main category, which produces expressions of the form e1 op' e2.
We can eliminate the ambiguities from the grammar in the example of the arithmetic expressions by introducing a new syntactic category Term producing expressions of the form
e1 * e2
where e1 and e2 may contain * again, but not +. This can be done by organizing the productions hierarchically as follows:
Exp ::= Exp + Exp | Term
Term ::= Term * Term | Num
This modification corresponds to assigning * a higher priority than + (following the mathematical convention). Consider again the string 2 + 3 * 5. It is easy to see that in the new grammar there is only one derivation tree for it:
         Exp
         /|\
        / | \
       /  |  \
    Exp   +   Exp
     |         |
    Term      Term
     |        /|\
     |       / | \
     |      /  |  \
    Num  Term  *  Term
     |    |        |
     2   Num      Num
          |        |
          3        5
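The uniqueness claim can be checked mechanically by counting derivation trees. The following is a minimal sketch, not part of the original notes: `count_trees`, `FLAT` and `STRATIFIED` are hypothetical names, and `FLAT` assumes that the ambiguous grammar of the earlier example is Exp ::= Exp + Exp | Exp * Exp | Num. The counter only handles productions of the shapes used here (a single symbol, or two nonterminals around one operator).

```python
from functools import lru_cache

# Assumed ambiguous grammar (not shown in this section).
FLAT = {"Exp": [("Exp", "+", "Exp"), ("Exp", "*", "Exp"), ("Num",)]}

# The hierarchical grammar from the text.
STRATIFIED = {
    "Exp": [("Exp", "+", "Exp"), ("Term",)],
    "Term": [("Term", "*", "Term"), ("Num",)],
}

def count_trees(tokens, grammar, start="Exp"):
    """Count the derivation trees of `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        # Number of trees deriving tokens[i:j] from nonterminal `sym`.
        total = 0
        for rhs in grammar[sym]:
            if len(rhs) == 1:
                (x,) = rhs
                if x in grammar:
                    total += count(x, i, j)  # unit production, e.g. Exp ::= Term
                else:
                    # "Num" stands for a single numeric token.
                    total += int(j - i == 1 and tokens[i].isdigit())
            else:
                left, op, right = rhs
                # Try every position of `op` that leaves both sides non-empty.
                for k in range(i + 1, j - 1):
                    if tokens[k] == op:
                        total += count(left, i, k) * count(right, k + 1, j)
        return total

    return count(start, 0, len(tokens))

flat_count = count_trees("2 + 3 * 5".split(), FLAT)
stratified_count = count_trees("2 + 3 * 5".split(), STRATIFIED)
```

Under these assumptions, `flat_count` is 2 and `stratified_count` is 1 for the string 2 + 3 * 5.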
The grammar above is still ambiguous in another respect: a string of the form e1 op e2 op e3, where the same operator occurs twice, can be read either as (e1 op e2) op e3 or as e1 op (e2 op e3). In the particular case of the + and the * operators, this kind of ambiguity does not cause problems semantically, because they are both associative, i.e. (2 + 3) + 5 and 2 + (3 + 5) have the same value, and analogously for *. In general, however, an operator might not be associative. This is for instance the case for the - and ^ (exponentiation) operators: (5 - 3) - 2 and 5 - (3 - 2) have different values, as do (5 ^ 3) ^ 2 and 5 ^ (3 ^ 2).
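The values involved can be worked out directly. The snippet below is only an arithmetic check, with the assumption that ^ is written as Python's `**` operator; the variable names are illustrative.

```python
# Associative operators: the grouping does not change the value.
assert (2 + 3) + 5 == 2 + (3 + 5) == 10
assert (2 * 3) * 5 == 2 * (3 * 5) == 30

# Non-associative operators: the grouping matters.
left_sub = (5 - 3) - 2      # 2 - 2   = 0
right_sub = 5 - (3 - 2)     # 5 - 1   = 4
left_pow = (5 ** 3) ** 2    # 125 ** 2 = 15625
right_pow = 5 ** (3 ** 2)   # 5 ** 9   = 1953125
```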
In order to eliminate this kind of ambiguity, we must establish whether the operator is left-associative or right-associative. Left-associative means that e1 op e2 op e3 is interpreted as (e1 op e2) op e3 (op associates to the left). Vice versa, right-associative means that it is interpreted as e1 op (e2 op e3) (op associates to the right).
We can impose left-associativity (resp. right-associativity) by using a left-recursive (resp. right-recursive) production for op. For instance, in the example of arithmetic expressions, we can enforce left-associativity of + and * in the following way:
Exp ::= Exp + Term | Term
Term ::= Term * Num | Num
This grammar is now unambiguous.
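To see how these productions fix both precedence and associativity, here is a minimal parser sketch. It is an illustration under stated assumptions, not the notes' own method: a recursive-descent parser cannot call itself on a left-recursive production, so each left recursion is unrolled into a loop, which is a common substitute that builds the same left-leaning trees. The function names and the tuple representation of trees are hypothetical, and tokens are assumed to be separated by spaces.

```python
def parse_num(tokens, pos):
    # Num: a single token made of digits.
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"number expected at token {pos}")
    return int(tokens[pos]), pos + 1

def parse_term(tokens, pos):
    # Term ::= Term * Num | Num, with the left recursion unrolled into a loop.
    tree, pos = parse_num(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_num(tokens, pos + 1)
        tree = ("*", tree, right)  # the tree built so far becomes the LEFT operand
    return tree, pos

def parse_exp(tokens, pos=0):
    # Exp ::= Exp + Term | Term, unrolled in the same way.
    tree, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        tree = ("+", tree, right)
    return tree, pos

def parse(text):
    """Parse a space-separated expression into a nested tuple."""
    tokens = text.split()
    tree, pos = parse_exp(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

precedence_tree = parse("2 + 3 * 5")   # ("+", 2, ("*", 3, 5))
left_assoc_tree = parse("2 + 3 + 5")   # ("+", ("+", 2, 3), 5)
```

Because `parse_exp` only ever calls `parse_term` for its operands, a * can never have a + below it, which mirrors the hierarchy between Exp and Term.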
Consider the following grammar (productions) for numerical expressions constructed with the - operation:
Exp ::= Num | Exp - Exp
This grammar is ambiguous, since it allows both interpretations (5 - 3) - 2 and 5 - (3 - 2). If we want to impose left-associativity (following the mathematical convention), it is sufficient to modify the productions in the following way:
Exp ::= Num | Exp - Num
Consider the following grammar (productions) for numerical expressions constructed with the ^ operation:
Exp ::= Num | Exp ^ Exp
This grammar is ambiguous, since it allows both interpretations (5 ^ 3) ^ 2 and 5 ^ (3 ^ 2). If we want to impose right-associativity (following the mathematical convention), it is sufficient to modify the productions in the following way:
Exp ::= Num | Num ^ Exp
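A right-recursive production maps directly onto a recursive function. The sketch below follows Exp ::= Num | Num ^ Exp literally; `parse_pow` and `evaluate` are hypothetical names, tokens are assumed to be separated by spaces, and ^ is evaluated with Python's `**`.

```python
def parse_pow(tokens, pos=0):
    # Exp ::= Num | Num ^ Exp
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"number expected at token {pos}")
    left = int(tokens[pos])
    pos += 1
    if pos < len(tokens) and tokens[pos] == "^":
        # The recursive call parses everything to the right first,
        # so the rest of the string becomes the RIGHT operand.
        right, pos = parse_pow(tokens, pos + 1)
        return ("^", left, right), pos
    return left, pos

def evaluate(tree):
    """Evaluate a tree built by parse_pow."""
    if isinstance(tree, int):
        return tree
    _, base, exponent = tree
    return evaluate(base) ** evaluate(exponent)

pow_tree, _ = parse_pow("5 ^ 3 ^ 2".split())   # ("^", 5, ("^", 3, 2))
pow_value = evaluate(pow_tree)                  # 5 ** 9 = 1953125
```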
Consider the following productions for conditional statements:
Stm ::= if Exp then Stm | if Exp then Stm else Stm | ... (other stms)
This grammar is ambiguous. In fact, the statement
if x > 0 then if x = 1 then print(1) else print(2)
can be interpreted both as
if x > 0 then ( if x = 1 then print(1) else print(2) )
and as
if x > 0 then ( if x = 1 then print(1) ) else print(2)
This ambiguity is clearly relevant for the semantics: if the value of x is 2, for example, in the first case the machine should print 2, while in the second case it should do nothing.
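The difference between the two readings can be made concrete by transcribing each of them into Python, where indentation makes the grouping explicit. This is only an illustration: the function names are hypothetical, and the printed values are collected in a list instead of being printed, so that the two results are easy to compare.

```python
def first_reading(x):
    # if x > 0 then ( if x = 1 then print(1) else print(2) )
    printed = []
    if x > 0:
        if x == 1:
            printed.append(1)
        else:
            printed.append(2)
    return printed

def second_reading(x):
    # if x > 0 then ( if x = 1 then print(1) ) else print(2)
    printed = []
    if x > 0:
        if x == 1:
            printed.append(1)
    else:
        printed.append(2)
    return printed

first_result = first_reading(2)    # [2]
second_result = second_reading(2)  # []
```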
This ambiguity arises whenever a statement contains an unbalanced number of then and else (i.e. more then than else). In order to eliminate it, we must establish a rule which determines, for each else, its matching then. Usually the convention is the following:
Each else matches the last (from left to right) unmatched then.
In order to impose this rule, one possibility is to modify the productions in the following way:
Stm ::= Bal_Stm | Unbal_Stm
Bal_Stm ::= if Exp then Bal_Stm else Bal_Stm | ... (other stms)
Unbal_Stm ::= if Exp then Stm | if Exp then Bal_Stm else Unbal_Stm
In this new grammar, the statement above can only have the first kind of structure.
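The matching rule itself can be sketched as a small recursive-descent parser. Note the assumption: this sketch does not implement the Bal_Stm / Unbal_Stm productions literally. Instead it attaches each else greedily to the innermost if that is still open, which is a common way to realize the same convention. The name `parse_stm`, the tuple representation, and the simplification that every condition and every other statement is a single token are all hypothetical.

```python
def parse_stm(tokens, pos=0):
    """Parse one statement and return (tree, next_pos).

    A tree is ("if", cond, then_branch), ("if", cond, then_branch, else_branch),
    or a plain string for any other statement.
    """
    if pos >= len(tokens):
        raise SyntaxError("statement expected")
    if tokens[pos] != "if":
        return tokens[pos], pos + 1  # "other stms": a single token in this sketch
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
        raise SyntaxError("'then' expected")
    cond = tokens[pos + 1]
    then_branch, pos = parse_stm(tokens, pos + 3)
    # The innermost call returns first, so it sees a following 'else'
    # before any enclosing 'if' does: the else goes to the nearest 'then'.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_stm(tokens, pos + 1)
        return ("if", cond, then_branch, else_branch), pos
    return ("if", cond, then_branch), pos

tokens = ["if", "x>0", "then", "if", "x=1", "then", "print(1)", "else", "print(2)"]
stm_tree, _ = parse_stm(tokens)
```

For the example statement, the else ends up inside the inner if, i.e. `stm_tree` has the first kind of structure.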