Example: a subset of English sentences

This example illustrates a context-free grammar that generates a subset of English sentences.

We are interested in simple sentences like "the cat drinks the milk" and "John likes Ann", that is, sentences composed of a subject, a transitive verb, and a direct object. We also want sentences like "the cat sleeps" and "John walks", that is, sentences composed of a subject and an intransitive verb.

The starting symbol of our grammar will be <Sentence>. We will have two productions, corresponding to the two cases illustrated above:

   <Sentence> ::= <Subject> <Trans_Verb> <Object>
                   | <Subject> <Intrans_Verb>

Subjects and objects have the same structure: each is either an article followed by a name, like "the cat", or a proper name, like "John":

   <Subject> , <Object> ::= <Article>  <Name> |  <Proper_Name>

Let us consider only the article "the". Thus we have:

   <Article> ::= the

Note that <Sentence>, <Subject>, <Article>, etc. are syntactic categories: each stands for a collection of strings. In contrast, "the" is a terminal symbol and represents only the string itself.

As for names, let us consider the following (of course, we can enrich the productions so as to allow more names if we wish):

   <Name> ::= cat | milk | sun

   <Proper_Name> ::= John | Ann

As for verbs, let us consider the following (again, we could add more verbs):

   <Trans_Verb> ::= drinks | likes

   <Intrans_Verb> ::= sleeps | walks | drinks
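
To make the grammar concrete, here is a minimal sketch in Python (our choice of language, not part of the original notes). The dictionary GRAMMAR and the function expand are illustrative names of our own. The shorthand production for <Subject> and <Object> is written out as two separate entries, and the enumeration relies on the assumption that the grammar is not recursive, which holds for the productions above.

```python
from itertools import product

# Each nonterminal maps to its list of alternatives;
# each alternative is a list of symbols (nonterminals or terminal words).
GRAMMAR = {
    "Sentence": [["Subject", "Trans_Verb", "Object"],
                 ["Subject", "Intrans_Verb"]],
    "Subject": [["Article", "Name"], ["Proper_Name"]],
    "Object": [["Article", "Name"], ["Proper_Name"]],
    "Article": [["the"]],
    "Name": [["cat"], ["milk"], ["sun"]],
    "Proper_Name": [["John"], ["Ann"]],
    "Trans_Verb": [["drinks"], ["likes"]],
    "Intrans_Verb": [["sleeps"], ["walks"], ["drinks"]],
}


def expand(symbol):
    """Yield every sequence of terminal words derivable from symbol."""
    if symbol not in GRAMMAR:
        # Anything that is not a key of GRAMMAR is treated as a terminal.
        yield (symbol,)
        return
    for alternative in GRAMMAR[symbol]:
        # Combine every expansion of each symbol in this alternative.
        for parts in product(*(list(expand(s)) for s in alternative)):
            yield tuple(word for part in parts for word in part)


sentences = [" ".join(words) for words in expand("Sentence")]
print(len(sentences))
```

Since there are 5 possible subjects, 5 possible objects, 2 transitive verbs and 3 intransitive verbs, the enumeration produces 5 x 2 x 5 = 50 transitive sentences and 5 x 3 = 15 intransitive ones.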

Let us now see some examples. The following sentences are all derivable from the grammar (and therefore belong to the language):

the cat drinks the milk
the cat likes the milk
the cat likes John
John drinks the milk
John likes Ann
Ann likes the cat
the cat sleeps
the cat walks
John walks
John drinks

The following sentences, in contrast, are not derivable from the grammar (and therefore do not belong to the language):

the cat likes
the John walks
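
The membership claims above can be checked mechanically. The following is a sketch of a small backtracking recognizer in Python; the names GRAMMAR, derive and is_derivable are our own, and the approach assumes the grammar has no left recursion (true here). It is one possible way to test derivability, not the only one.

```python
# Each nonterminal maps to its list of alternatives (our own encoding).
GRAMMAR = {
    "Sentence": [["Subject", "Trans_Verb", "Object"],
                 ["Subject", "Intrans_Verb"]],
    "Subject": [["Article", "Name"], ["Proper_Name"]],
    "Object": [["Article", "Name"], ["Proper_Name"]],
    "Article": [["the"]],
    "Name": [["cat"], ["milk"], ["sun"]],
    "Proper_Name": [["John"], ["Ann"]],
    "Trans_Verb": [["drinks"], ["likes"]],
    "Intrans_Verb": [["sleeps"], ["walks"], ["drinks"]],
}


def derive(symbol, words, pos):
    """Yield every position reachable after deriving symbol from words[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next word exactly.
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        positions = [pos]
        for s in alternative:
            # Advance every surviving position through the next symbol.
            positions = [q for p in positions for q in derive(s, words, p)]
        yield from positions


def is_derivable(sentence):
    """Return True if the whole sentence can be derived from <Sentence>."""
    words = sentence.split()
    return any(end == len(words) for end in derive("Sentence", words, 0))


for s in ["the cat drinks the milk", "John drinks", "the cat likes", "the John walks"]:
    print(s, "->", is_derivable(s))
```

The sentence "the cat likes" is rejected because "likes" is only a transitive verb and no object follows it; "the John walks" is rejected because "John" is not a <Name> and so cannot follow an article.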

The following sentences are syntactically correct, in the sense that they are derivable from the grammar, although they are semantically "questionable":

the milk likes John
the milk drinks the cat

The following sentence is also syntactically correct, but it is semantically ambiguous:

John likes the sun

In fact, it is not clear whether "the sun" refers to the star or to a computer.

These ambiguities, however, are only semantic. It is possible to show (by structural induction) that there is no syntactic ambiguity, i.e. that the above grammar is unambiguous. Note that the fact that the verb "drinks" can be interpreted as both transitive and intransitive is not a source of ambiguity: the transitive reading requires an object after the verb, while the intransitive reading requires the sentence to end with the verb.
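
As an illustration (not a proof, which would still require the inductive argument), the following Python sketch builds all parse trees of a given sentence. The names GRAMMAR, parses and parse_trees are our own, and trees are represented as nested tuples of the form (nonterminal, child, ...), which is just one convenient choice.

```python
# Each nonterminal maps to its list of alternatives (our own encoding).
GRAMMAR = {
    "Sentence": [["Subject", "Trans_Verb", "Object"],
                 ["Subject", "Intrans_Verb"]],
    "Subject": [["Article", "Name"], ["Proper_Name"]],
    "Object": [["Article", "Name"], ["Proper_Name"]],
    "Article": [["the"]],
    "Name": [["cat"], ["milk"], ["sun"]],
    "Proper_Name": [["John"], ["Ann"]],
    "Trans_Verb": [["drinks"], ["likes"]],
    "Intrans_Verb": [["sleeps"], ["walks"], ["drinks"]],
}


def parses(symbol, words, pos):
    """Yield (tree, next_pos) for every way symbol derives a prefix of words[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal: a leaf of the tree, matched against the next word.
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        partial = [([], pos)]  # (children built so far, current position)
        for s in alternative:
            partial = [(kids + [tree], q)
                       for kids, p in partial
                       for tree, q in parses(s, words, p)]
        for kids, end in partial:
            yield (symbol, *kids), end


def parse_trees(sentence):
    """Return all parse trees that cover the whole sentence."""
    words = sentence.split()
    return [tree for tree, end in parses("Sentence", words, 0) if end == len(words)]


for s in ["John drinks", "John drinks the milk"]:
    trees = parse_trees(s)
    print(s, "->", len(trees), "tree(s):", trees)
```

Each of the two sentences has exactly one parse tree: in "John drinks" the verb is parsed as <Intrans_Verb>, and in "John drinks the milk" it is parsed as <Trans_Verb>.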