Definitions

The CPU (Central Processing Unit) of a computer is where all the logical and arithmetic tests, loops and decisions take place, and where control commands and data exchanges are issued to devices such as memory, disk and screen. The behaviour of the CPU is determined by its state, which is described by the content of all its registers and internal memory caches. Letting $S$ be the set of possible CPU states, the CPU acts like a deterministic function $f:S\to S$. According to this model, to each state $s\in S$ there corresponds a next state $s'=f(s)$. The rate at which the CPU changes state is governed by the system clock (usual rates are between 1 and 3 GHz). Thus, roughly every billionth of a second, the CPU changes its state.

The form of the function $f$ obviously depends on the CPU make and model. CPUs usually contain some extremely fast but very small memory chunks called ``registers'', which are specifically designed to store either values or memory addresses. The state of the CPU at each clock tick is then determined by the values contained in each of its registers. The CPU is designed in such a way that at each clock tick the memory address contained in a certain register is automatically incremented, and the value found at the new address is read and interpreted as a ``machine code instruction''. This allows us to interpret the function $f$ in a different way: we can consider the next state $s'$ of the CPU as given by a function $p:I\times S\to S$ with $s'=p(i,s)$, where $i$ is a machine code instruction in the set $I$ of all possible CPU instructions. Although each basic instruction in $I$ is rather simple, this interpretation of $f$ makes it possible to group several simple instructions into more complex ones¹. As some of the instructions concern logical tests and loops, it becomes apparent that the full semantics of any modern computer language (including C++) can indeed be carried out by a CPU after a suitable translation of the complex, high-level language into the simple machine code instruction set $I$.

Loosely speaking, the set $I$ can be partitioned into the following instruction categories.

Input: transfer data from external device to processor
Output: transfer data from processor to external device
Storage: transfer data from processor to memory
Retrieval: transfer data from memory to processor
AL operation: perform arithmetic/logical operation on data
Test: verify condition on data and act accordingly
Loop: repeat a sequence of operations

In practice, these instructions are encoded in machine language, i.e. sequences of bits. The length of each instruction depends on the width of the CPU registers. The width of each register is measured in terms of the number of BInary digiTs $\{0,1\}$ (bits) it can contain. Traditionally, on Intel 16-bit architectures (32- and 64-bit architectures are evolutions thereof, and each new version is guaranteed to retain backward compatibility) there are four general-purpose registers: AX (accumulator), BX (base), CX (counter), DX (data); four pointer registers: SI (source index), DI (destination index), BP (base pointer), SP (stack pointer); four segment registers: CS (code segment), DS (data segment), ES (extra segment), SS (stack segment); and finally, one instruction pointer IP. The machine code instruction $i$ loaded at each clock tick to compute the next state $s'=p(i,s)$ is the value found at the address CS:IP. More information can be found at http://www.ee.hacettepe.edu.tr/~alkar/ELE414/ and http://ourworld.compuserve.com/homepages/r_harvey/doc_cpu.htm.

Consider now the following (informal) definitions:

Program: set of instructions that can be interpreted by a computer
Instructions: well-formed sequences of characters (syntax)
Interpretation: sequence of operations performed by the computer hardware (semantics)
Programming language: set of rules used to form valid instructions
Algorithm: a program which terminates (though one sometimes finds the expression ``non-terminating algorithm'', with some abuse of terminology)

The well-formedness of the sequence of characters in each instruction is determined by the C++ syntax, which is one of the subjects of these notes and will therefore be explained in more detail later. The same holds for the semantics of each C++ program.

Leo Liberti 2008-01-12