Example: a subset of the English sentences

This example illustrates a context-free grammar that generates a subset of English sentences.

We are interested in simple sentences like "the cat drinks the milk" and "John likes Ann", that is, sentences composed of a subject, a transitive verb, and a direct object. We also want sentences like "the cat sleeps" and "John walks", that is, sentences composed of a subject and an intransitive verb.

The starting symbol of our grammar will be <Sentence>. We will have two productions, corresponding to the two cases illustrated above:

   <Sentence> ::= <Subject> <Trans_Verb> <Object>
                   | <Subject> <Intrans_Verb>
Subjects and objects have a similar structure: each is either an article followed by a noun, like "the cat", or a proper name, like "John":
   <Subject> ::= <Article>  <Noun> |  <Proper_Name>
    <Object> ::= <Article>  <Noun> |  <Proper_Name>
Let us consider only the article "the". Thus we have:
   <Article> ::= the 
Note that <Sentence>, <Subject>, <Article>, etc. are syntactic categories: each of them stands for a collection of strings. By contrast, "the" is a terminal symbol and represents only the string itself.

As for nouns, let us consider the following (of course we can enrich the productions so as to allow more nouns if we wish):

   <Noun> ::= cat | milk | sun

   <Proper_Name> ::= John | Ann

As for verbs, let us consider the following (again, we could add more verbs):

   <Trans_Verb> ::= drinks | likes

   <Intrans_Verb> ::= sleeps | walks | drinks 
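
Since this grammar has no recursive productions, the language it generates is finite and can be listed exhaustively. The following is a minimal sketch of that idea, assuming we encode the productions above as a Python dictionary; the names GRAMMAR and expand are illustrative choices, not part of the original notes.

```python
from itertools import product

# The productions from the text. Each nonterminal maps to a list of
# alternatives; each alternative is a list of symbols. Any symbol that is
# not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "<Sentence>": [["<Subject>", "<Trans_Verb>", "<Object>"],
                   ["<Subject>", "<Intrans_Verb>"]],
    "<Subject>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Object>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Article>": [["the"]],
    "<Noun>": [["cat"], ["milk"], ["sun"]],
    "<Proper_Name>": [["John"], ["Ann"]],
    "<Trans_Verb>": [["drinks"], ["likes"]],
    "<Intrans_Verb>": [["sleeps"], ["walks"], ["drinks"]],
}


def expand(symbol):
    """Return the set of terminal strings derivable from `symbol`."""
    if symbol not in GRAMMAR:
        # A terminal represents only the string itself.
        return {symbol}
    results = set()
    for alternative in GRAMMAR[symbol]:
        # Combine every possible expansion of each symbol in the alternative.
        for parts in product(*(expand(s) for s in alternative)):
            results.add(" ".join(parts))
    return results


language = expand("<Sentence>")
print(len(language))                          # 65
print("the cat drinks the milk" in language)  # True
```

This exhaustive approach only works because no syntactic category refers back to itself; a grammar with recursive productions would generate infinitely many sentences.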
Let us now see some examples. The sentence "the cat drinks the milk", for instance, can be derived from the grammar in the following way:
                      <Sentence>
                     /     |     \
                    /      |      \
                   /       |       \
          <Subject>  <Trans_Verb>   <Object>
         /        |        |        |        \
        /         |        |        |         \
       /          |        |        |          \
 <Article>    <Noun>    drinks    <Article>    <Noun>
      |           |                 |           |
      |           |                 |           |
      |           |                 |           |
     the         cat               the         milk
This figure, showing the structure of the derivation, is called a "derivation tree". The language defined by our grammar is the set of all sentences derivable from the grammar.
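
A derivation tree can also be written down as a data structure and checked against the grammar. The sketch below assumes a nested-tuple encoding in which an internal node is (label, child, child, ...) and a leaf is a plain string, with <Sentence> as the root; the encoding and the helper names is_valid and leaves are our own illustrative choices.

```python
# The productions from the text, one list of alternatives per nonterminal.
GRAMMAR = {
    "<Sentence>": [["<Subject>", "<Trans_Verb>", "<Object>"],
                   ["<Subject>", "<Intrans_Verb>"]],
    "<Subject>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Object>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Article>": [["the"]],
    "<Noun>": [["cat"], ["milk"], ["sun"]],
    "<Proper_Name>": [["John"], ["Ann"]],
    "<Trans_Verb>": [["drinks"], ["likes"]],
    "<Intrans_Verb>": [["sleeps"], ["walks"], ["drinks"]],
}

# The derivation tree of "the cat drinks the milk".
TREE = ("<Sentence>",
        ("<Subject>", ("<Article>", "the"), ("<Noun>", "cat")),
        ("<Trans_Verb>", "drinks"),
        ("<Object>", ("<Article>", "the"), ("<Noun>", "milk")))


def label(node):
    """Return the symbol at the root of `node`."""
    return node if isinstance(node, str) else node[0]


def is_valid(node):
    """True if every internal node is expanded by a production of GRAMMAR."""
    if isinstance(node, str):
        return node not in GRAMMAR  # leaves must be terminal symbols
    head, children = node[0], node[1:]
    used = [label(child) for child in children]
    return used in GRAMMAR.get(head, []) and all(is_valid(c) for c in children)


def leaves(node):
    """Read the derived sentence off the leaves, from left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in leaves(child)]


print(is_valid(TREE))          # True
print(" ".join(leaves(TREE)))  # the cat drinks the milk
```

Reading the leaves from left to right gives back the sentence, while the internal nodes record which production was applied at each step.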
Other sentences derivable from the grammar (and therefore belonging to the language) include "John likes Ann", "the cat sleeps", and "John walks".

A sentence that cannot be derived from the grammar, on the contrary, does not belong to the language.

Some sentences are syntactically correct, in the sense that they are derivable from the grammar, although they are semantically "questionable".

A sentence can also be syntactically correct but semantically ambiguous: in a sentence containing "the sun", for example, it is not clear whether "the sun" is a star or a computer.
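
Whether a given sentence is derivable can be checked mechanically. The following sketch assumes a simple top-down recognizer that expands the leftmost symbol and matches terminals against the words of the sentence; the functions derives and in_language, and the test sentences, are our own illustrative choices rather than examples taken from the original notes. The check is purely syntactic, so it accepts semantically questionable sentences as well.

```python
# The productions from the text, one list of alternatives per nonterminal.
GRAMMAR = {
    "<Sentence>": [["<Subject>", "<Trans_Verb>", "<Object>"],
                   ["<Subject>", "<Intrans_Verb>"]],
    "<Subject>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Object>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Article>": [["the"]],
    "<Noun>": [["cat"], ["milk"], ["sun"]],
    "<Proper_Name>": [["John"], ["Ann"]],
    "<Trans_Verb>": [["drinks"], ["likes"]],
    "<Intrans_Verb>": [["sleeps"], ["walks"], ["drinks"]],
}


def derives(symbols, words):
    """True if the sequence `symbols` can derive exactly the list `words`."""
    if not symbols:
        return not words  # success only if every word has been consumed
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:
        # A terminal must match the next word of the sentence.
        return bool(words) and words[0] == first and derives(rest, words[1:])
    # A nonterminal is replaced by each of its alternatives in turn.
    return any(derives(alt + rest, words) for alt in GRAMMAR[first])


def in_language(sentence):
    """True if `sentence` is derivable from <Sentence>."""
    return derives(["<Sentence>"], sentence.split())


print(in_language("John likes Ann"))           # True
print(in_language("cat the sleeps"))           # False: wrong word order
print(in_language("John sleeps Ann"))          # False: "sleeps" is not transitive
print(in_language("the milk drinks the cat"))  # True, although questionable
```

This recognizer terminates here because the grammar has no recursive productions; a left-recursive grammar would need a different parsing strategy.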

The ambiguities in this language can only be semantic. It is possible to show that there is no syntactic ambiguity, in the sense that every sentence of the language can be derived in one way only. Note that the fact that the verb "drinks" can be interpreted both as transitive and as intransitive is not a source of ambiguity: a transitive verb is always followed by an object, whereas an intransitive verb always ends the sentence, so each occurrence of "drinks" can be read in only one way.
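
Because the language is finite, the absence of syntactic ambiguity can be checked by brute force. The sketch below lists the sentence produced by every derivation tree, with repetitions, and counts how many trees produce each sentence; the function yields and the variable trees_per_sentence are illustrative names of our own.

```python
from collections import Counter
from itertools import product

# The productions from the text, one list of alternatives per nonterminal.
GRAMMAR = {
    "<Sentence>": [["<Subject>", "<Trans_Verb>", "<Object>"],
                   ["<Subject>", "<Intrans_Verb>"]],
    "<Subject>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Object>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Article>": [["the"]],
    "<Noun>": [["cat"], ["milk"], ["sun"]],
    "<Proper_Name>": [["John"], ["Ann"]],
    "<Trans_Verb>": [["drinks"], ["likes"]],
    "<Intrans_Verb>": [["sleeps"], ["walks"], ["drinks"]],
}


def yields(symbol):
    """List the string derived by each derivation tree rooted at `symbol`.

    The list has one entry per tree, so a string derivable in two
    different ways would appear twice.
    """
    if symbol not in GRAMMAR:
        return [symbol]
    out = []
    for alternative in GRAMMAR[symbol]:
        # One tree for every choice of subtrees under this alternative.
        for parts in product(*(yields(s) for s in alternative)):
            out.append(" ".join(parts))
    return out


trees_per_sentence = Counter(yields("<Sentence>"))
print(max(trees_per_sentence.values()))               # 1
print(trees_per_sentence["the cat drinks"])           # 1
print(trees_per_sentence["the cat drinks the milk"])  # 1
```

A maximum of 1 means that no sentence is produced by two different trees, including the sentences that use "drinks".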

We can illustrate syntactic ambiguity by using a more complicated grammar. Consider for instance the following sentence, which can be generated with a grammar that allows subordinate clauses and conjunctions:

   John thinks that Ann walks and the cat sleeps
This sentence is syntactically ambiguous because it could be interpreted as
   John thinks that (Ann walks and the cat sleeps)
as well as
   John thinks that (Ann walks) and the cat sleeps
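
The two readings correspond to two different derivation trees, which we can count. The text does not give the more complicated grammar, so the sketch below assumes a hypothetical one in which a sentence may also be "<Sentence> and <Sentence>" or "<Subject> thinks that <Sentence>". Because this grammar is recursive, the sketch counts trees over spans of the sentence with memoization; count_trees and its helpers are illustrative names.

```python
from functools import lru_cache

# A hypothetical extension of the grammar (an assumption, not from the text):
# it adds conjunction and a subordinate clause introduced by "thinks that".
GRAMMAR = {
    "<Sentence>": [["<Sentence>", "and", "<Sentence>"],
                   ["<Subject>", "thinks", "that", "<Sentence>"],
                   ["<Subject>", "<Intrans_Verb>"]],
    "<Subject>": [["<Article>", "<Noun>"], ["<Proper_Name>"]],
    "<Article>": [["the"]],
    "<Noun>": [["cat"]],
    "<Proper_Name>": [["John"], ["Ann"]],
    "<Intrans_Verb>": [["sleeps"], ["walks"]],
}


def count_trees(sentence, start="<Sentence>"):
    """Count the derivation trees of `sentence` rooted at `start`."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def symbol_count(symbol, i, j):
        """Number of trees rooted at `symbol` whose leaves are words[i:j]."""
        if symbol not in GRAMMAR:
            return 1 if j == i + 1 and words[i] == symbol else 0
        return sum(sequence_count(tuple(alt), i, j) for alt in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def sequence_count(symbols, i, j):
        """Number of ways the sequence `symbols` derives words[i:j]."""
        if not symbols:
            return 1 if i == j else 0
        first, rest = symbols[0], symbols[1:]
        # Every symbol covers at least one word (there are no empty
        # productions), so `first` ends at k, leaving room for the rest.
        return sum(symbol_count(first, i, k) * sequence_count(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return symbol_count(start, 0, len(words))


print(count_trees("John thinks that Ann walks and the cat sleeps"))  # 2
print(count_trees("Ann walks and the cat sleeps"))                   # 1
```

The count of 2 matches the two bracketings above: in one tree the conjunction lies inside the clause introduced by "that", in the other it joins two complete sentences.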