The ALCOOL Analyzer

ALCOOL is a static analyzer whose input language, namely the Parallel Automata Meta Language, is a fragment of the Dijkstra's PV language [6]. PAML comes with a parallel random-access abstract machine (PRAM) based on shared memory following the concurrent read exclusive write convention (CREW).

The ALCOOL input language

PAML is a concurrent toy language. A PAML program is made of several sequential processes running in parallel and sharing resources. There are three kinds of resources:

  1. A memory location or variable v is declared using the following syntax, where <value> indicates the initial value of the variable, it can be any decimal number.
            var: v = <value>
    The content of the variable v can be modified by an assignment where ε is any PAML expression.
            v := ε
    PAML does not support pointer handling. In particular two distinct identifiers always refer to different memory location. In doing so one avoids aliases and we can decide, at compile time, is a race condition can occur.
  2. A semaphore s is declared using the following syntax, where arity is an integer indicating how many tokens of type s are available in the pool of resources at the beginning of the execution of the program.
            sem: <arity> s
    A process tries to take a token of type s by the instruction P(s) which should be read as "take a token of type s from the common pool of resources". If such a token is available in the pool of resources, then the process obtains it. Otherwise the process is stalled until a token of type s is available. A process can hold several tokens of the same type. A token of type s is released using the instruction V(s) which shoul dbe read as "put th token back to the pool". If the process which performs the instruction V(s) does not hold a token of type s, the instruction is ignored and the execution of the process keeps on going.
  3. A synchronization barrier b is declared using the following syntax, where arity is an integer indicating how many processes the barrier b can stop.
            sync: <arity> b
  4. The instruction W(b), which should be read as "wait behind the barrier b", indeed stops any process that meets it until a certain number of processes get blocked by it. When the threshold, viz the arity of the barrier b, is crossed, the executions of all the processes stalled by the instruction resume. The "Wait" instruction can be seen as a weakened form of parallel compound since the latter creates the processes that it synchronizes while the former does not.

A process p is declared using the following syntax, where body is a sequence of instructions.

        proc: p = <body>
The constructions offered by PAML to describe sequential processes are rather unusual. PAML is actually much closer to the intermediate language of a compiler like GCC or LLVM than to a human friendly one. In that spirit, processes should be understood as "basic blocks". The branchings follows the syntax thereinafter, where εk is a PAML expression for all k in {1,...,n} and <bodyk> is a sequence of instructions for all k in {1,...,n+1}. The value 0 being interpreted as false, the branching instructions selects the first k such that εk evaluates to a nonzero value.
      <body1>+[ε1]+...+<bodyn>+[εn]+<bodyn+1>  
Moreover, PAML does not explicitly offer looping instruction. Instead it has a "Call" instruction C(p) which leaves the execution of the current process and starts the execution of the process p. It should be compared with the Unix instructoin exec which replaces the current process by the one that is "called". In particular their is no return mechanism.

A PAML program is this a sequence of declarations. The final one should be a bootup i.e. a declaration of the following form where <prock> is process identifier and αk is an integer indicating how many copies of the process identified by <prock> should be launched at the beginning of each execution.

    init: α1<proc1> ... αn<procn>
All the details about the PAML syntax are specified by its grammar, which is given in Extended Backus-Naur Form.

The geometric model of a PAML program

The geometric model of a PAML program is a mathematical object which, in some sense, generalizes flowcharts [9] and control flow graphs [1] to parallel programs. The main feature of ALCOOL is to produce it.

ALCOOL can deduce several properties of a PAML program from its geometric model like the presence of deadlocks [7] and the factorization into independent PAML programs [2].

The geometric model is built independently from the value analysis. Therefore ALCOOL approach differs from the ones suggested in [3], [4], and [5]. In fact, the geometric model of a PAML program is designed to be used as the control flow structure of the program to analyze.

The geometric model of a PAML program Π actually requires that each process apprearing in Π is conservative in the sense that the amount of resources it holds at a given point of its control flow graph does not depend on the path that led to it. This property is thus related to the control flow graph of a process instead of the process itself. For a given control flow graph G, it is decided at compile time by an algorithm whose compexity is linear in the size of G. In fact a process is said to be conservative when it admits a (finite) conservative control flow graph. A finite loop-free control flow graph can always be turned into a finite conservative one.

Theoretical details about geometric models of concurrency can be found in [8].

Examples of conservative and nonconservative processes

A conservative loop.
At each turn a token is taken and released.



A nonconservative loop and its conservative (but infinite) unfolding.
At each turn the process takes a token so it accumulates them until there is no more left.



A conservative branching.
The process takes a token before the branching and release it after the outputs of the branches have been joined.



Another conservative branching.
A token is taken along only one of the branches. In order to have a conservative control flow graph, one has to keep the outputs of the branches separated.

Examples of PAML programs and their geometric models

The Hasse-Syracuse algorithm.
This program is purely sequential, it illustrates how the standard "while" construction is emulated by PAML "Call" instructions. In C language, the Hasse-Syracuse algorithm looks like this.

    int v = 3 ; 
    while (v <> 1) {
      if (v % 2 = O) 
        v := v / 2
      else 
        v : = 3 * v + 1 }
The PAML version is as follows.
    var: v = 7
    proc:
      does_it_continue = 
        C(continue)+[v<>1]+() 
      ,
      continue = 
        (v:=x/2 ; C(does_it_continue))+[v % 2 = 0]+
        (v:=3*v+1; C(does_it_continue))
    init: does_it_continue

The forbidden square.
The process p gets the token s and releases it. Two copies of p are launched in parallel. The coordinates of a point of the plane should be understood as the instruction pointers of the two copies. In this spirit, a point on the grayed square represents a situation where both copies hold a token of type s, which is impossible since there is only one such token. The grayed square is the forbidden region of the program.

  sem 1 s
  proc: P(s);V(s)
  init: 2p

The tetrahemihexacron (a.k.a. "3D Swiss cross").
We alter the preceding example by running an extra copy p.

  sem 1 s
  proc: P(s);V(s)
  init: 3p

The floating cube.
We modify the preceding program by adding one token of type s.

  sem: 2 s
  proc: 
    p = P(s);V(s)
  init: 3p

The binary synchronization.
The process p is blocked by the synchronization barrier b. Two copies of p are launched in parallel. When both copies are blocked by the barrier b, it gives away and the executions of both copies are resumed. Each point of the lines A and B but the synchronization point s represents a situation where one of the copies of p tries to execute W(b) alone, which is impossible. The synchronization point s represents the situation where both copies of p execute W(b) simultaneously. The forbidden region of the program is thus the union A∪B without s.

  sync: 1 b
  proc: 
    p = W(b)
  init: 2p

The Swiss cross.
The simplest PAML program in which a deadlock occurs. The process p tries to get the token a before the token b. Symetrically, the process q tries to get the token b before the token a. If both processes p and q take a token the program is blocked. The points on the red square correspond to that situation.

  sem 1 a b
  proc: 
    p = P(a);P(b);V(b);V(a) ,
    q = P(b);P(a);V(a);V(b)
  init: p q

The three dining philosophers.
The 3-dimensional version of the Swiss cross example.

  sem: 1 a b c
  proc: 
    p = P(a);P(b);V(b);V(a) ,
    q = P(b);P(c);V(c);V(b) ,
    r = P(c);P(a);V(a);V(c)
  init: p q r

The Lipski algorithm.
The model of the Lipski algorithm, depicted below, was introduced in [10] as an example of program without deadlock though its "request graph" has cycles. A careful examination reveals that there is indeed a "tunnel" going through the forbidden region.

 
    sem: 1 u v w x y z
    proc:
      p = P(x);P(y);P(z);V(x);P(w);V(z);V(y);V(w) ,
      q = P(u);P(v);P(x);V(u);P(z);V(v);V(x);V(z) ,
      r = P(y);P(w);V(y);P(u);V(w);P(v);V(u);V(v)
    init: p q r

An artificial obstruction to parallelization.
Each execution of the following PAML program runs four processes: two copies of p and two copies of q. The geometric model is thus 4-dimensional.

    sem: 1 a b
    sem: 2 c
    proc:
      p = P(a);P(c);V(c);V(a) ,
      q = P(b);P(c);V(c);V(b)
    init: 2p 2q
One remarks that due to the mutual exclusion a, the two copies of p cannot hold more that one token of the semaphore c. The same observation can be made about the mutual exclusion b and the two copies of q. Hence no lack of tokens of type c can occur during the execution of the program. In geometric terms, it means that the forbidden region generated by semaphore c is contained in the forbidden region generated by the semaphores a and b. Consequently, the semaphore c has no influence on the execution of the program which is, in some sense, equivalent to the following program.
    sem: 1 a b
    proc:
      p = P(a);V(a) ,
      q = P(b);V(b)
    init: 2p 2q
Therefrom, one readily deduces that the original program could have been written as the parallel composition of the following ones.
    sem: 1 a
    proc:
      p = P(a);V(a)
    init: 2p 
and
    sem: 1 b
    proc:
      q = P(b);V(b)
    init: 2q
This example was extracted from [2].

Development and availability

ALCOOL is entirely written in OCaml. Its developement has been on and off for almost a decade. Though it is pretty advanced now, it still lacks a user-friendly interface and a documentation. We hope we can make its sources available as soon as possible on github.

References

  1. Allen, F. E. Control Flow Analysis. In Proceedings of a Symposium on Compiler Optimization, pages 1–19. ACM, 1970.
  2. Balabonski, T. and Haucourt, E. A Geometric Approach to the problem of Unique Decomposition of Processes. In Concurrency Theory 21th International Conference, vol. 6269 of Lecture Notes in Computer Science, pages 132–146. Springer, 2010.
  3. Bouajjani, A., Esparza, J., and Touili, T. A Generic Approach to the Static Analysis of Concurrent Programs with Procedures. International Journal of Foundations of Computer Science, 14(4):551–582, 2003.
  4. Cousot, P. and Cousot, R. Semantic analysis of communicating sequential processes. In Seventh International Colloquium on Automata, Languages and Programming, vol. 85 of Lecture Notes in Computer Science, p. 119–133. Springer, 1980.
  5. Cridlig, R. Implementing a Static Analyzer of Concurrent Programs: Problems and Perspectives. In Analysis and Verification of Multiple-Agent Languages, selected papers from the 5th LOMAPS workshop, vol. 1192, p. 244–259. Springer, 1997.
  6. Dijkstra, E. W. Cooperating sequential processes. Technical report, Technological University, Eindhoven, The Netherlands, 1965. Reprinted in Programming Languages: NATO Advanced Study Institute, p.43–112. Academic Press, 1968.
  7. Fajstrup, L., Goubault, É., and Raussen, M. Detecting Deadlocks in Concurrent Systems. In CONCUR’98 Concurrency Theory, vol. 1466 of Lecture Notes in Computer Science, pages 332–347. Springer, 1998.
  8. Fajstrup, L., Goubault, É., Haucourt E., Mimram S. and Raussen, M. Directed Algebraic Topology and Concurrency, vol. ? of Springer Briefs in Applied Science and Technology - Mathematical Methods. Springer, 2016.
  9. Floyd, R. W. Assigning meanings to programs. In Mathematical Aspects of Computer Science, vol.19 of Proceedings of Symposia in Applied Mathematics, p.19-32. AMS, 1967.
  10. Lipski, W. and Papadimitriou, C. H. A Fast Algorithm for Testing for Safety and Detecting Deadlocks in Locked Transaction Systems. Journal of Algorithms, pages 211–226, 1981.