## *Fall 2000, CSE 468: Lecture 5 (Sep 6)*

# The notions of alphabet, string, and language

## Alphabet

An alphabet A is any (finite) set of symbols. Examples: {a,b,c},
the English alphabet {a,b,c,...,z}, the alphabet of digits
{0,1,2,...,9}, etc.
## String

A string is a finite sequence of symbols from the given alphabet.
Examples: lambda (the empty string), a, ab, aba, etc.

### String concatenation

If x and y are strings, then xy is the string obtained
by concatenating x and y (first x then y). Example:
if x = ab and y = bc, then
xy = abbc.
### Length of a string

The length of a string x, which we will denote by |x|, is the
number of symbols occurring in x, where a repeated symbol is counted once for each occurrence.
Example: |abbc| = 4.
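
As a small illustration, the two operations above can be mirrored with ordinary Python strings. This is only a sketch: representing lambda by the empty Python string `""` and the name `LAMBDA` are our own choices, not part of the definitions.

```python
# Strings over an alphabet modeled as Python str values.
# Assumption: the empty string lambda is represented by "".
LAMBDA = ""

x = "ab"
y = "bc"
xy = x + y  # concatenation: first x, then y

assert xy == "abbc"
assert len(xy) == 4                    # |abbc| = 4, each occurrence of b counts
assert len(xy) == len(x) + len(y)      # |xy| = |x| + |y|
assert LAMBDA + x == x == x + LAMBDA   # concatenating lambda changes nothing
```

The last two checks are consequences of the definitions: lengths add up under concatenation, and lambda has length 0.
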
## Language

A language is any set of strings over the given alphabet. Example
(over the alphabet {a,b,c}): L = {lambda, a, ab, abcaa}.

### Language concatenation

If L_{1} and L_{2} are languages, then we define

L_{1}L_{2} = {x_{1}x_{2} | x_{1} is in L_{1}, x_{2} is in L_{2}}
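
For finite languages, this definition can be sketched directly as a set comprehension. The function name `concat` and the sample languages below are hypothetical, chosen only for illustration.

```python
def concat(l1, l2):
    """Return L1L2 = {x1x2 | x1 in L1, x2 in L2} for finite languages."""
    return {x1 + x2 for x1 in l1 for x2 in l2}

# Sample languages (our own choice); "" plays the role of lambda.
L1 = {"", "a"}
L2 = {"b", "ab"}

print(sorted(concat(L1, L2)))  # ['aab', 'ab', 'b']
```

Note that the result has three strings rather than four: ab is obtained both as lambda followed by ab and as a followed by b, and a set contains each string only once.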

### Exponential and Kleene's star

The exponential is defined inductively (or recursively) as follows:

L^{0} = {lambda}

L^{n+1} = L^{n}L

Note that concatenation is associative, hence we could
equivalently have defined L^{n+1} = LL^{n}.

The star is defined as follows:

L^{*} = the union of all L^{n} for n greater than or equal to 0

Sometimes we want to exclude L^{0} from this construction,
so we also define

L^{+} = the union of all L^{n} for n greater than or equal to 1

Note that the set of all strings over an alphabet A is A^{*}
(viewing each symbol of A as a string of length 1).
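
The inductive definition translates almost literally into code. Since L^{*} is infinite whenever L contains a nonempty string, the sketch below only computes a finite approximation, namely the union of L^{0}, ..., L^{n} for a chosen bound n. The names `concat`, `power`, `star_up_to`, and `plus_up_to` are hypothetical helpers, not standard functions.

```python
def concat(l1, l2):
    """L1L2 for finite languages given as Python sets of str."""
    return {x + y for x in l1 for y in l2}

def power(lang, n):
    """L^0 = {lambda}; L^(n+1) = L^n L."""
    result = {""}              # L^0, with "" standing for lambda
    for _ in range(n):
        result = concat(result, lang)
    return result

def star_up_to(lang, max_n):
    """Union of L^0, ..., L^max_n: a finite approximation of L*."""
    out = set()
    for n in range(max_n + 1):
        out |= power(lang, n)
    return out

def plus_up_to(lang, max_n):
    """Union of L^1, ..., L^max_n: a finite approximation of L+."""
    out = set()
    for n in range(1, max_n + 1):
        out |= power(lang, n)
    return out

L = {"a", "bb"}
print(sorted(power(L, 2)))    # ['aa', 'abb', 'bba', 'bbbb']
print("" in star_up_to(L, 2), "" in plus_up_to(L, 2))  # True False
```

With the alphabet itself as the language, `star_up_to({"a", "b"}, 2)` yields the 1 + 2 + 4 = 7 strings over {a,b} of length at most 2, in line with the remark that A^{*} is the set of all strings over A.
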
### Definition of languages

The first part of the course will focus
on the formal definition of languages (i.e. the formal specification of
a particular language as a particular set of strings), the
study of properties of a language, and the recognition of
the strings of a language.
We will start with the class of regular languages, which are rather simple
from the point of view of definition and recognition,
yet interesting, in the sense that they can be infinite
and can contain strings of arbitrary length.

# Regular Expressions and Regular Languages

## Regular expressions

The regular expressions are all those expressions that can
be built from the following constants and operators:
- lambda
- symbols of the alphabet
- + (binary)
- concatenation (binary)
- * (Kleene's star, unary)

We will use parentheses to represent the structure
of an expression, and we will assume that * has precedence
over concatenation, concatenation has precedence over +, and
that concatenation and + are associative.
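
The precedence convention can be tried out with Python's `re` module, as a rough sketch. Keep in mind one notational difference, which is a property of Python and not of these notes: Python writes union as `|` (in Python, `+` means "one or more repetitions"), while `*` and juxtaposition keep their meaning and their relative precedence. So the expression a + bc^{*} from the notation above becomes `a|bc*` below.

```python
import re

# a + bc*  in the notation of these notes, written with Python's | for union.
implicit = re.compile(r"a|bc*")
# The same expression with the structure made explicit: a + (b(c*)).
explicit = re.compile(r"a|(b(c*))")
# A different grouping, (a + b)c*, which represents a different language.
regrouped = re.compile(r"(a|b)c*")

samples = ["", "a", "b", "bc", "bccc", "ac", "bcbc"]
for s in samples:
    # fullmatch requires the whole string to match, not just a prefix.
    same = bool(implicit.fullmatch(s)) == bool(explicit.fullmatch(s))
    assert same

assert implicit.fullmatch("ac") is None       # ac is not in a + bc*
assert regrouped.fullmatch("ac") is not None  # but it is in (a + b)c*
```
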
## Language represented by a regular expression

The expression lambda stands for {lambda}, an alphabet symbol a stands for {a}, + stands for union,
and concatenation and * stand for the operations of the same name on languages.

**Example** The set of strings over {a,b}
of even length can be represented by the regular expression
(aa + ab + ba + bb)^{*}, or equivalently by the regular expression
((a + b)(a + b))^{*}.
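
The correspondence between expressions and languages can be made concrete with a small interpreter. The sketch below is one possible encoding, not a standard one: expressions are nested tuples with the hypothetical tags `"lam"`, `"sym"`, `"+"`, `"."` and `"*"`, and because a starred language is usually infinite, `denote` returns only the strings of length at most `max_len`.

```python
from itertools import product

def denote(expr, max_len):
    """Strings of length <= max_len in the language represented by expr."""
    op = expr[0]
    if op == "lam":                    # lambda stands for {lambda}
        return {""}
    if op == "sym":                    # a symbol a stands for {a}
        return {expr[1]} if max_len >= 1 else set()
    if op == "+":                      # + stands for union
        return denote(expr[1], max_len) | denote(expr[2], max_len)
    if op == ".":                      # concatenation of languages
        left = denote(expr[1], max_len)
        right = denote(expr[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if op == "*":                      # Kleene star, built up level by level
        base = denote(expr[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:                # stops: only finitely many short strings
            new = {x + y for x in frontier for y in base
                   if len(x + y) <= max_len} - result
            result |= new
            frontier = new
        return result
    raise ValueError(f"unknown operator: {op!r}")

a, b = ("sym", "a"), ("sym", "b")
a_or_b = ("+", a, b)

# (aa + ab + ba + bb)*
expr_1 = ("*", ("+", ("+", (".", a, a), (".", a, b)),
                     ("+", (".", b, a), (".", b, b))))
# ((a + b)(a + b))*
expr_2 = ("*", (".", a_or_b, a_or_b))

# All strings over {a,b} of even length up to 6, enumerated directly.
even = {"".join(p) for n in range(0, 7, 2) for p in product("ab", repeat=n)}

assert denote(expr_1, 6) == denote(expr_2, 6) == even
```

The final check confirms, for strings of length up to 6 only, that the two expressions of the example represent the same language and that it is exactly the set of even-length strings. It is a finite test, not a proof of equivalence.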

## Regular Languages

The class of regular languages consists of all (and only) the
languages represented by regular expressions.