Dynamic variables are used for creating dynamic structures such as lists and trees, that is, structures which can expand or shrink at run-time.
A variable x of type "pointer to type T" can be declared in C++ as follows:
T* x;
Meaning of the operator "new T": allocate a variable from the heap (i.e. remove a location from the FL) and return its address. Depending on the size of T, more than one location may actually be removed, but we will not be concerned with this issue for the time being.
Thus an instruction of the form
x = new T;
in C++ assigns to the pointer x the address of the location taken from the FL. Note that there are two locations involved: the location l associated to x by the declaration T* x; and the location l' taken from the FL. The content of l is the address of l'.
In order to access the location l', we must use x, dereferenced. The dereferencing operator in C++ is the unary *. Thus an assignment to l', or the access to the value of l', can be done by using *x.
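The allocation and dereferencing steps described above can be put together in a small sketch. The function name store_and_read is hypothetical (it does not appear in these notes), and the final delete, which is discussed further on, is there only so that the sketch itself does not waste a heap location.

```cpp
#include <cassert>

// Hypothetical helper, not part of the notes: allocates an int in the heap,
// writes to it and reads it back through the dereferenced pointer.
int store_and_read(int value) {
    int* x = new int;   // location l (the variable x) now contains the address of l'
    // l and l' are two distinct locations: x lives in the activation record
    // of this function, while the location it points to lives in the heap.
    assert(static_cast<void*>(&x) != static_cast<void*>(x));
    *x = value;         // assignment to l' via the dereferenced pointer
    int copy = *x;      // access to the value of l'
    delete x;           // give l' back before x disappears
    return copy;
}
```

A call such as store_and_read(7) returns the value that was stored in the heap location, here 7.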
A memory leak is a heap location that can no longer be reached from the active variables and yet is not in the FL.
Clearly a memory leak represents an undesirable situation, because it is a waste of memory. If too much leakage is generated, then the execution might abort due to lack of free memory.
For instance, we have a memory leak if, after the instruction x = new T; we execute any of the following instructions
void p(){ int* x; x = new int; } // when p returns, x is deallocated and the location pointed to by x becomes a memory leak
Finally, a common situation in which the risk of a memory leak may arise is when the pointer is itself in the heap, and gets deallocated. This is typical when we have dynamic structures, like lists and trees. We will discuss this situation later.
delete x;
Meaning of the instruction "delete x;": place back in the FL the location pointed to by x.
For instance, in the procedure p above, we could avoid the memory leak by adding the instruction delete x; before p returns. Analogously, we could add such an instruction before x = NULL; etc.
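The effect of adding delete x; to p can be observed with a small sketch. Everything named here (the struct Cell, its counter live, and the functions p_leaky, p_fixed and live_cells) is a hypothetical instrumentation introduced only for illustration; the counter records how many heap cells have been allocated and not yet given back.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts the cells currently allocated.
struct Cell {
    static int live;
    int value;
    Cell() : value(0) { ++live; }
    ~Cell() { --live; }
};
int Cell::live = 0;

// Same shape as p in the notes: the cell is never given back.
void p_leaky() {
    Cell* x = new Cell;
    x->value = 1;
}   // x is deallocated here, the cell it pointed to becomes a leak

// p with the suggested fix: delete x; before returning.
void p_fixed() {
    Cell* x = new Cell;
    x->value = 1;
    delete x;           // the cell goes back to the free list
}

int live_cells() {
    assert(Cell::live >= 0);   // sanity check on the counter
    return Cell::live;
}
```

After a call to p_fixed the counter is back to its previous value, while each call to p_leaky leaves it one higher.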
class tree{
public:   // the fields are public because the examples below access left and right from outside the class
   int info;
   tree* left;
   tree* right;
   tree(int n){ info = n; left = right = NULL; }
   tree(int n, tree* l, tree* r){ info = n; left = l; right = r; }
};
Suppose that we create a tree with an instruction like the following:
tree* t = new tree(2, new tree(1), new tree(3));
Now, when we want to destroy the tree, we cannot simply use
delete t;
because this would deallocate only the node pointed to by t, and would leave the nodes pointed to by t->left and t->right as memory leaks. Note that the pointers t->left and t->right were living in the heap.
In order to deallocate the whole tree we should write
delete (t->left); delete (t->right); delete t;
A similar problem occurs when t is actually on the stack, and goes out of scope. For instance:
void q(){ tree t = tree(2, new tree(1), new tree(3)); } // when q returns, the nodes pointed to by t.left and t.right become memory leaks
In the latter case the pointers t.left and t.right were living in the stack.
In order to avoid memory leaks in this second case, we should write:
void q(){ tree t = tree(2, new tree(1), new tree(3)); delete (t.left); delete (t.right); } // no memory leaks when q returns
~tree(){ delete left; delete right; }
This destructor will deallocate all nodes in a tree of arbitrary size. The deletion of each node, in fact, recursively causes the application of the destructor to the left and right subtrees. The terminal case is when left and right are NULL (leaf): delete applied to NULL has no effect.
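As a rough check of the recursive behaviour, the following sketch adds a counter to the tree class. The counter live and the function nodes_left_after_delete are hypothetical additions made only for this illustration; the rest follows the class given earlier.

```cpp
#include <cassert>
#include <cstddef>

class tree {
public:
    static int live;   // hypothetical counter of nodes currently allocated
    int info;
    tree* left;
    tree* right;
    tree(int n) : info(n), left(NULL), right(NULL) { ++live; }
    tree(int n, tree* l, tree* r) : info(n), left(l), right(r) { ++live; }
    // Deleting a node first deletes its subtrees, which recursively
    // runs this same destructor on them; delete on NULL does nothing.
    ~tree() { delete left; delete right; --live; }
};
int tree::live = 0;

// Builds a four-node tree, deletes only the root, and reports
// how many nodes are still allocated afterwards.
int nodes_left_after_delete() {
    tree* t = new tree(2, new tree(1), new tree(3, new tree(4), NULL));
    assert(tree::live == 4);   // four nodes were allocated
    delete t;                  // one delete releases the whole tree
    return tree::live;
}
```

Note that once such a destructor is defined, the explicit delete (t.left); delete (t.right); in q must be dropped: the destructor already runs when t goes out of scope, and deleting the same subtrees twice is an error.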
A dangling pointer is a pointer to a location which is considered free and may be reallocated later.
A dangling pointer is considered a dangerous situation, because it is a potential source of errors that are difficult to detect. Consider for instance the following situation:
int* x = new int;
delete x;
int* y = new int;
*y = 5;
*x = *x + 1;
cout << *y; // it may print 6 instead of 5
The cout instruction in the code above prints 6 in case the location allocated for the pointer y is the one which was returned to the FL by delete x;. (This will be the case if the FL is handled with a LIFO discipline.) In C++ delete x; does not erase from x the address of the location, hence *x still refers to the same location that it was referring to before delete was executed.
Note that, even if delete x were to "cancel" the content of x, delete could still cause dangling pointers. Consider for instance the following fragment:
int* x = new int;
int* y = x;
delete x; // y becomes a dangling pointer
In general, delete cannot cancel the content of all pointers which are pointing to the location which is being returned to the FL: it would be too expensive.
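To make the point concrete, here is a minimal sketch of a delete that does "cancel" its argument. The helper destroy and the function x_is_reset are hypothetical names, not a standard facility; the sketch only assumes that the pointer is passed by reference so that it can be overwritten.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the location to the free list and
// cancels the content of the pointer it was given.
void destroy(int*& p) {
    delete p;
    p = NULL;
}

bool x_is_reset() {
    int* x = new int;
    int* y = x;        // second pointer to the same heap location
    assert(x == y);
    destroy(x);        // resets x only: destroy has no way of finding y
    (void)y;           // y still holds the old address and must not be dereferenced
    return x == NULL;
}
```

The helper protects against reuse of x, but y is left dangling exactly as in the fragment above.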
As another example, consider the following program, which uses a list. In the main function, after the instruction delete L; the pointer L1 becomes a dangling reference and the instruction L2 = new list(2,L1) may create a circular list.
#include <iostream>
#include <cstddef>   // defines NULL
using namespace std;

class list {
public:
   int info;
   list* next;
   list(int n, list* l){ info = n; next = l; }
};

int main(){
   list* L1 = new list(1,NULL);
   list* L = L1;
   delete L;                    // L1 becomes a dangling reference
   list* L2 = new list(2,L1);   // It may create a circular list
   while (L2 != NULL) {         // It will loop forever if L2 is a circular list
      cout << L2->info;
      L2 = L2->next;
   }
}
void p(){
   int y;
   x = &y; // x is a global variable of type int*
}
During a call to p, x is set to the address of y, which is on the stack. After p returns, y is deallocated and x therefore becomes a dangling pointer to the stack.
A dangling pointer to the stack is a very dangerous situation, which may lead to catastrophic and not easily detectable errors, such as changing the value of the control link or of the static link.
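One way to keep the global pointer valid after the call, sketched here under the assumption that the value is meant to outlive p, is to take the location from the heap instead of the stack. The names p_heap, read_x and release_x are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

int* x = NULL;   // global pointer, as in the example above

// Variant of p: the pointed-to location is taken from the heap,
// so it is not deallocated when the call returns.
void p_heap() {
    x = new int;
    *x = 42;
}

// Reads the value after p_heap has returned.
int read_x() {
    assert(x != NULL);
    return *x;
}

// The price to pay: the location must be given back explicitly.
void release_x() {
    delete x;
    x = NULL;
}
```

After p_heap() the call read_x() is safe, at the cost of having to call release_x() later to avoid a memory leak.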
In Pascal the operations on pointers have been restricted so that dangling pointers are confined to the heap. Namely, a dangling pointer to the stack can never occur: Pascal does not allow pointer arithmetic, and the referencing (address-of) operator does not exist.
Garbage collection is a very convenient mechanism from the programmer's point of view. The obvious disadvantage is that it is expensive. In fact, it is in general costly to determine whether a location is a memory leak (garbage) or not, because we need to check whether or not it can be reached from the active variables. Typically, there might be locations that are linked in structures like trees or lists, and in order to determine whether they are leaks or not we need to trace the whole chain of links.
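The reachability check can be illustrated with a minimal sketch of a marking traversal. This is only an illustration under simplifying assumptions, not the algorithm of any particular collector: every heap cell is assumed to have at most two outgoing links, and the set of all cells and the set of roots (the active variables) are assumed to be given explicitly. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap cell with up to two links, like a tree or list node.
struct Cell {
    Cell* left;
    Cell* right;
    bool marked;
    Cell() : left(NULL), right(NULL), marked(false) {}
};

// Marks every cell reachable from c by tracing the chain of links.
// Already marked cells are skipped, so circular structures terminate.
void mark(Cell* c) {
    if (c == NULL || c->marked) return;
    c->marked = true;
    mark(c->left);
    mark(c->right);
}

// Counts the cells of the heap that cannot be reached from the roots.
int count_garbage(const std::vector<Cell*>& heap,
                  const std::vector<Cell*>& roots) {
    for (std::size_t i = 0; i < heap.size(); ++i) {
        assert(heap[i] != NULL);
        heap[i]->marked = false;          // clear marks left by earlier runs
    }
    for (std::size_t i = 0; i < roots.size(); ++i)
        mark(roots[i]);                   // trace from each active variable
    int garbage = 0;
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (!heap[i]->marked) ++garbage;  // unmarked means unreachable
    return garbage;
}
```

The cost is visible in the structure of the code: every reachable cell has to be visited, and every cell of the heap has to be inspected, each time the check is run.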
Garbage collection is used in Java, in some implementations of Pascal, and in most
functional and logic languages, where heap allocation and deallocation are
transparent to the user. Additionally, it is used in the implementation of languages that
(because of their features) cannot be implemented in a stack-based manner
and need to allocate the activation records in the heap. One particular feature that makes
stack-based allocation impossible is the presence of higher-order functions, in combination
with static scope. We will discuss this problem later in the course, when we introduce
functional programming in ML.