The ALCOOL Analyzer
ALCOOL is a static analyzer whose input language, namely the Parallel Automata Meta Language, is a fragment of the Dijkstra's PV language [6]. PAML comes with a parallel random-access abstract machine (PRAM) based on shared memory following the concurrent read exclusive write convention (CREW).
The ALCOOL input language
PAML is a concurrent toy language. A PAML program is made of several sequential processes running in parallel and sharing resources. There are three kinds of resources:
- A memory location or variable v is declared using
the following syntax, where <value> indicates the initial value of
the variable, it can be any decimal number.
var: v = <value>
The content of the variable v can be modified by an assignment where ε is any PAML expression.v := ε
PAML does not support pointer handling. In particular two distinct identifiers always refer to different memory location. In doing so one avoids aliases and we can decide, at compile time, is a race condition can occur. - A semaphore s is declared using the following syntax, where
arity is an integer indicating how many tokens of type
s are available in the pool of resources at the beginning of the
execution of the program.
sem: <arity> s
A process tries to take a token of type s by the instruction P(s) which should be read as "take a token of type s from the common pool of resources". If such a token is available in the pool of resources, then the process obtains it. Otherwise the process is stalled until a token of type s is available. A process can hold several tokens of the same type. A token of type s is released using the instruction V(s) which shoul dbe read as "put th token back to the pool". If the process which performs the instruction V(s) does not hold a token of type s, the instruction is ignored and the execution of the process keeps on going. - A synchronization barrier b is declared using the following syntax, where
arity is an integer indicating how many processes the barrier
b can stop.
sync: <arity> b
The instruction W(b), which should be read as "wait behind the barrier
b", indeed stops any process that meets it until a certain
number of processes get blocked by it. When the threshold, viz the arity
of the barrier b, is crossed, the executions of all the
processes stalled by the instruction resume. The "Wait" instruction can
be seen as a weakened form of parallel compound since the latter creates
the processes that it synchronizes while the former does not.
A process p is declared using the following syntax, where body is a sequence of instructions.
proc: p = <body>The constructions offered by PAML to describe sequential processes are rather unusual. PAML is actually much closer to the intermediate language of a compiler like GCC or LLVM than to a human friendly one. In that spirit, processes should be understood as "basic blocks". The branchings follows the syntax thereinafter, where ε_{k} is a PAML expression for all k in {1,...,n} and <body_{k}> is a sequence of instructions for all k in {1,...,n+1}. The value 0 being interpreted as false, the branching instructions selects the first k such that ε_{k} evaluates to a nonzero value.
<body_{1}>+[ε_{1}]+...+<body_{n}>+[ε_{n}]+<body_{n+1}>Moreover, PAML does not explicitly offer looping instruction. Instead it has a "Call" instruction C(p) which leaves the execution of the current process and starts the execution of the process p. It should be compared with the Unix instructoin exec which replaces the current process by the one that is "called". In particular their is no return mechanism.
A PAML program is this a sequence of declarations. The final one should be a bootup i.e. a declaration of the following form where <proc_{k}> is process identifier and α_{k} is an integer indicating how many copies of the process identified by <proc_{k}> should be launched at the beginning of each execution.
init: α_{1}<proc_{1}> ... α_{n}<proc_{n}>All the details about the PAML syntax are specified by its grammar, which is given in Extended Backus-Naur Form.
The geometric model of a PAML program
The geometric model of a PAML program is a mathematical object which, in some sense, generalizes flowcharts [9] and control flow graphs [1] to parallel programs. The main feature of ALCOOL is to produce it.
ALCOOL can deduce several properties of a PAML program from its geometric model like the presence of deadlocks [7] and the factorization into independent PAML programs [2].
The geometric model is built independently from the value analysis. Therefore ALCOOL approach differs from the ones suggested in [3], [4], and [5]. In fact, the geometric model of a PAML program is designed to be used as the control flow structure of the program to analyze.
The geometric model of a PAML program Π actually requires that each process apprearing in Π is conservative in the sense that the amount of resources it holds at a given point of its control flow graph does not depend on the path that led to it. This property is thus related to the control flow graph of a process instead of the process itself. For a given control flow graph G, it is decided at compile time by an algorithm whose compexity is linear in the size of G. In fact a process is said to be conservative when it admits a (finite) conservative control flow graph. A finite loop-free control flow graph can always be turned into a finite conservative one.
Theoretical details about geometric models of concurrency can be found in [8].
Examples of conservative and nonconservative processes
A conservative loop.
At each turn a token is taken and released.
A nonconservative loop and its conservative (but infinite)
unfolding.
At each turn the process takes a token so it accumulates them until there is
no more left.
A conservative branching.
The process takes a token before the branching and release it after the
outputs of the branches have been joined.
Another conservative branching.
A token is taken along only one of the branches. In order to have a conservative
control flow graph, one has to keep the outputs of the branches separated.
Examples of PAML programs and their geometric models
The Hasse-Syracuse algorithm.
This program is purely sequential, it illustrates how the standard
"while" construction is emulated by PAML "Call" instructions. In C
language, the Hasse-Syracuse algorithm looks like this.
int v = 3 ; while (v <> 1) { if (v % 2 = O) v := v / 2 else v : = 3 * v + 1 }The PAML version is as follows.
var: v = 7 proc: does_it_continue = C(continue)+[v<>1]+() , continue = (v:=x/2 ; C(does_it_continue))+[v % 2 = 0]+ (v:=3*v+1; C(does_it_continue)) init: does_it_continue
The forbidden square.
The process p gets the token s and releases it. Two copies of p are launched
in parallel. The coordinates of a point of the plane should be understood as
the instruction pointers of the two copies. In this spirit, a point on the
grayed square represents a situation where both copies hold a token of type
s, which is impossible since there is only one such token. The
grayed square is the forbidden region of the program.
sem 1 s proc: P(s);V(s) init: 2p
The tetrahemihexacron (a.k.a. "3D Swiss cross").
We alter the preceding example by running an extra copy p.
sem 1 s proc: P(s);V(s) init: 3p
The floating cube.
We modify the preceding program by adding one token of type s.
sem: 2 s proc: p = P(s);V(s) init: 3p
The binary synchronization.
The process p is blocked by the synchronization barrier b. Two
copies of p are launched in parallel. When both copies are blocked by the
barrier b, it gives away and the executions of both copies are
resumed. Each point of the lines A and B but the synchronization point s represents a
situation where one of the copies of p tries to execute W(b) alone, which is
impossible. The synchronization point
s represents the situation where both
copies of p execute W(b) simultaneously. The forbidden region
of the program is thus the union A∪B without s.
sync: 1 b proc: p = W(b) init: 2p
The Swiss cross.
The simplest PAML program in which a deadlock occurs. The process
p tries to get the token a before the token b.
Symetrically, the process q tries to get the token b before the token a.
If both processes p and q take a token the program is blocked.
The points on the red square correspond to that situation.
sem 1 a b proc: p = P(a);P(b);V(b);V(a) , q = P(b);P(a);V(a);V(b) init: p q
The three dining philosophers.
The 3-dimensional version of the Swiss cross example.
sem: 1 a b c proc: p = P(a);P(b);V(b);V(a) , q = P(b);P(c);V(c);V(b) , r = P(c);P(a);V(a);V(c) init: p q r
The Lipski algorithm.
The model of the Lipski algorithm, depicted below, was introduced in [10] as
an example of program without deadlock though its "request graph" has cycles.
A careful examination reveals that there is indeed a "tunnel" going through
the forbidden region.
sem: 1 u v w x y z proc: p = P(x);P(y);P(z);V(x);P(w);V(z);V(y);V(w) , q = P(u);P(v);P(x);V(u);P(z);V(v);V(x);V(z) , r = P(y);P(w);V(y);P(u);V(w);P(v);V(u);V(v) init: p q r
An artificial obstruction to parallelization.
Each execution of the following PAML program runs four processes: two copies
of p and two copies of q. The geometric model is thus
4-dimensional.
sem: 1 a b sem: 2 c proc: p = P(a);P(c);V(c);V(a) , q = P(b);P(c);V(c);V(b) init: 2p 2qOne remarks that due to the mutual exclusion a, the two copies of p cannot hold more that one token of the semaphore c. The same observation can be made about the mutual exclusion b and the two copies of q. Hence no lack of tokens of type c can occur during the execution of the program. In geometric terms, it means that the forbidden region generated by semaphore c is contained in the forbidden region generated by the semaphores a and b. Consequently, the semaphore c has no influence on the execution of the program which is, in some sense, equivalent to the following program.
sem: 1 a b proc: p = P(a);V(a) , q = P(b);V(b) init: 2p 2qTherefrom, one readily deduces that the original program could have been written as the parallel composition of the following ones.
sem: 1 a proc: p = P(a);V(a) init: 2pand
sem: 1 b proc: q = P(b);V(b) init: 2qThis example was extracted from [2].
Development and availability
ALCOOL is entirely written in OCaml. Its developement has been on and off for almost a decade. Though it is pretty advanced now, it still lacks a user-friendly interface and a documentation. We hope we can make its sources available as soon as possible on github.
References
- Allen, F. E. Control Flow Analysis. In Proceedings of a Symposium on Compiler Optimization, pages 1–19. ACM, 1970.
- Balabonski, T. and Haucourt, E. A Geometric Approach to the problem of Unique Decomposition of Processes. In Concurrency Theory 21th International Conference, vol. 6269 of Lecture Notes in Computer Science, pages 132–146. Springer, 2010.
- Bouajjani, A., Esparza, J., and Touili, T. A Generic Approach to the Static Analysis of Concurrent Programs with Procedures. International Journal of Foundations of Computer Science, 14(4):551–582, 2003.
- Cousot, P. and Cousot, R. Semantic analysis of communicating sequential processes. In Seventh International Colloquium on Automata, Languages and Programming, vol. 85 of Lecture Notes in Computer Science, p. 119–133. Springer, 1980.
- Cridlig, R. Implementing a Static Analyzer of Concurrent Programs: Problems and Perspectives. In Analysis and Verification of Multiple-Agent Languages, selected papers from the 5th LOMAPS workshop, vol. 1192, p. 244–259. Springer, 1997.
- Dijkstra, E. W. Cooperating sequential processes. Technical report, Technological University, Eindhoven, The Netherlands, 1965. Reprinted in Programming Languages: NATO Advanced Study Institute, p.43–112. Academic Press, 1968.
- Fajstrup, L., Goubault, É., and Raussen, M. Detecting Deadlocks in Concurrent Systems. In CONCUR’98 Concurrency Theory, vol. 1466 of Lecture Notes in Computer Science, pages 332–347. Springer, 1998.
- Fajstrup, L., Goubault, É., Haucourt E., Mimram S. and Raussen, M. Directed Algebraic Topology and Concurrency, vol. ? of Springer Briefs in Applied Science and Technology - Mathematical Methods. Springer, 2016.
- Floyd, R. W. Assigning meanings to programs. In Mathematical Aspects of Computer Science, vol.19 of Proceedings of Symposia in Applied Mathematics, p.19-32. AMS, 1967.
- Lipski, W. and Papadimitriou, C. H. A Fast Algorithm for Testing for Safety and Detecting Deadlocks in Locked Transaction Systems. Journal of Algorithms, pages 211–226, 1981.