Seminar Reports: Occam-Pi programming Language and Rmox

Showing posts with label Occam-Pi programming Language and Rmox. Show all posts

Tuesday, December 2, 2008

INTRODUCTION
HISTORY
General Syntax
Communicating Process Paradigm
Occam & CSP
Mobile Channel Bundles
RMoX Operating System
Conclusion & Future Work
References

ABSTRACT

The occam [Inm84a, Inm84b, Inm95, Bar92] programming language is a concurrent programming Language based on the CSP [Hoa78, Hoa85, Ros97] process algebra. CSP provides a rigorous parallel model, whereby parallel processes engage in synchronous events. A process may also select between several events on other (external choice). In occam, CSP events appear mostly as channels (wherewo processes synchronise and data is copied) and at the ends of PARallel blocks — to ensure that l nested parallel processes have terminated. occam was developed between Oxford University and Inmos, with the Transputer [Inm93] in mind. The Transputer combined an on-chip micro-coded scheduler with on-chip external communications (on DS links), resulting in highly scalable multi-processor platforms for parallel processing.The T9000 Transputer additionally provided a virtual channel capability, allowing several virtual inks to be multiplexed into one physical link [MTW93]. Coupled with crossbar routing, namely theC104, this provided a solid platform for the distribution and execution of occam programs.Then we can have a look at the RMoX operating sysyem and its architecture.

INTRODUCTION

The occam programming language is designed to express concurrent algorithms and their implementation on a network of processing components. The occam reference manual serves to provide a single reference, and definition of the language occam. The manual describes each aspect of the language, starting with the most primitive components of an occam program, and moving on to cover the whole language in detail. The manual is addressed to the wider audience, including not only the computer scientist, software engineer and programmer, but also the electronics engineer and system designer.

Programming in occam is easy. Occam enables an application to be described as a collection of processes, where each process executes concurrently, and communicates with other processes through channels. Each process in such an application describes the behaviour of a particular aspect of the implementation, and each channel describes the connection between each of the processes. This approach has two important consequences. Firstly, it gives the program a clearly defined and simple structure. Secondly, it allows the application to exploit the performance of a system which consists of many parts. Concurrency and communication are the prime concepts of the occam model. occam captures the hierarchical structure of a system by allowing an interconnected set of processes to be regarded as a unified, single process. At any level of detail, the programmer is only concerned with a small, manageable set ofprocesses.

HISTORY

Occam is an ideal introduction to a number of key methodologies in modern computer science. Occam programs can provide a degree of security unknown in conventional programming languages such as C, FORTRAN or Pascal. occam simplifies the task of program verification, by allowing application of mathematical proof techniques to prove the correctness of programs. Transformations, which convert a process from one form to a directly equivalent form, can be applied to the source of an occam program to improve its efficiency in any particular environment. occam makes an ideal language for specification and behavioural description. occam programs are easily configured onto the hardware of a system or indeed, may specify the hardware of a system. The founding principle of occamis a minimalist approach which avoids unnecessary duplication of language mechanism, and is named after the 14th century philosopher William of Occam who proposed that invented entities should not be duplicated beyond necessity. This proposition has become known as “Occam’s razor”. The occam programming language arises from the concepts founded by David May in EPL (Experimental Programming Language) and Tony Hoare in CSP (Communicating Sequential Processes). Since its conception in 1982 occam has been, and continues to be under development at INMOS Limited, in the United Kingdom, under the direction of David May. The support for large programs provided by occam3 is based on principles found in many current programming languages which have been refined at INMOS by Geoff Barrett. The development of the INMOS transputer, a device which places a microcomputer on a single chip, has been closely related to occam, its design and implementation. The transputer reflects the occam architectural model, and may be considered an occam machine. occam is the language of the

transputer and as such, when used to program a single transputer or a network of transputers, provides the equivalent efficiency to programming a conventional computer at assembler level. However, this manual does not make any assumptions about the hardware implementation of the language or the target system. occam is a trademark of the INMOS group of companies.

Aims of This Work

The primary objective of this work is to provide support, at the language, kernel and operating system level, for highly concurrent dynamic parallel systems based on occam/CSP.This objective is sought in a number of ways. Firstly, by the extension and general enhancement of the occam programming language, largely by adding dynamic capabilities, plus a number of other (often trivial) extensions that bring it closer to languages such as C and Java (which rely heavilyon dynamic memory allocation), whilst remaining secure against aliasing and parallel usage errors. Second to provide support for data, channel and process mobility, using a movement semantics. And finally, by improving the interface between occam programs and the operating-system environment, allowing programmers to make full use of the UNIX/POSIX [Int96] environment. As a further objective, this work aims to improve the maintainability and safety of occam code, particularly in light of the new facilities added.

Other Approaches to Compiling occam

This thesis is concentrated on the KRoC occam system. However, other ways of compiling occamprograms exist. The most common of these is SPoC[DHWN94] — theSouthampton Portable occam

1 Compiler .SPoC takes a different approach to compiling occam programs from KRoC. Instead of using the Inmos occam compiler, SPoC uses a compiler written (from scratch) using the GMD compiler compiler toolkit [GE90], to produce a compiler that generates portable ANSI C code from the occam.

In a Google search for “occam compiler”, circa October 2002, SPoC appeared at the top of the list, followed by KRoC in 2nd place. Currently, KRoC appears in first place, followed by SPoC in second. This change of ordering is most likely due to the addition of KRoC to the popular ‘FreshMeat’ website (http://freshmeat.net/), that acts as a public project-management and bulletin-board system, mostly for free software.

Parallel programming is too often seen as a “hard” discipline, and one area which the majority of programmers try to avoid. This has not come about through the non-availability of parallel hardware, such as the Transputer — current hardware is more than adequate — but through the lack of appropriate software tools and infrastructures to build concurrent systems. Traditional languages like C, and more recently Java, suffer when parallelism is “bolted on”, frequently resulting in more harm than good. This arises because there is little or no control over how that parallelism is used — the programmer can write all sorts of race-hazardous code, often without realizing it. The problem of managing parallelism in these languages increases with the size of the system, leading to catastrophic failures in large systems that are almost impossible to pin down. Large non-parallel systems also suffer from such problems, that can easily be caused by variable/pointer aliasing errors. The occam language, with CSP semantics for parallelism, solves the majority of these problems: occam does not permit uncontrolled aliasing and CSP provides composable semantics for building parallel systems.

Despite this clear advantage, occam suffers from an inability to interact fully with the surrounding operating-system and hardware environments. The principal reason for this, in the KRoC occam system, is twofold. Firstly, adding significant amounts of code to hand-coded assembly language, of which the majority of KRoC run-time kernels are implemented in, is a difficult and daunting task. Secondly, the surrounding operating-system environment is often poorly adapted for fine-grain parallelism — for instance, executing a blocking system-call from within a (KRoC) occam process will cause all parallel occam processes to be suspended, while the program is blocked in the OS kernel. This makes writing programs that utilise inter-process communication (IPC) and networking difficult. Difficult to the point where it may be preferable to use C, or another language, and risk race-hazard and aliasing errors.

Another of occam’s limiting factors is its lack of dynamic behaviour. Traditional occam programs can be viewed as static process graphs, nodes representing processes and arcs representing channels. Even though parts of a program may come into existence then disappear, all the graphs can be defined statically. One of the reasons for this static model stems from the Transputer, that had real finite memory (as opposed to virtual memory), and by today’s standards, a much lower processing capacity Transputers were designed to be assemble din networks to increase processing yield.

General Syntax

Syntactic notation

The syntax of occam programs is described in a modified Backus-Naur Form (BNF). As an example, the following shows the syntax of assignment
assignment = variable := expression

This means “An assignment is a variable followed by the symbol :=, followed by an expression”. A vertical bar (|) means “or”, so for example:

Action= assignment
| input
| output
is the same as
action =assignment
action = input
action =output

The meaning of this syntax is “An action is an assignment, an input, or an output”.

The written structure of occam programs is specified by the syntax. Each statement in an occam program normally occupies a single line, and the indentation of each statement forms an intrinsic part of the syntax of the language. The following example shows the syntax for sequence:
Sequence = SEQ
{process}

The syntax here means “A sequence is the keyword SEQ followed by zero or more processes, each on a separate line, and indented two spaces beyond SEQ”. Curly brackets and are used to indicate the number of times some syntactic object occurs. process means, “zero or more processes, each on a separate line”. Similarly, 0 , expression , means “A list of zero or more expressions, separated by commas”, and 1 , expression , means “A list of one or more expressions, separated by commas”.

A complete summary of the syntax of the language is given at the end of the main body of the manual

Continuation lines

A long statement may be broken immediately after one of the following:
an operator i.e. +, -, *, / etc..
a comma ,
a semi-colon ;
assignment :=
the keyword IS, FROM or FOR

A statement can be broken over several lines providing the continuation is indented at least as much as the first line of the statement.

The annotation of occam programs

As the format of occam programs is significant, there are a number of rules concerning how programs are annotated. A comment is

introduced by a double dash symbol (--), and extends to the end of the line.

4 Syntax and program format
Consider the following sequence:
SEQ
-- This example illustrates the use of comments
-- A comment may not be indented less than
-- the following statement
...
SEQ -- A sequence
...
Comments may not be indented less than the following statement.

Names and keywords used in occam programs

Names used in occam programs must begin with an alphabetic character. Names consist of a sequence ofalpha numeric characters and dots. There is no length restriction. occam is sensitive to the case of names, i.e. Say is considered different from say. With the exception of the names of channel protocols, names in the examples presented in this manual are all lower case. However, the following are all valid names in
occam:
PACKETS
vector6
LinkOut
NOT.A.NUMBER
transputer
terminal.in
terminalOut

All keywords are upper case (e.g. SEQ). All keywords are reserved, and thus may not be used by the programmer.

Primitive processes

1.1 Assignment

occam programs are built from processes. The simplest process in an occam program is an action. An action is either an assignment, an input or an output. Consider the following example:
x := y + 2
This simple example is an assignment, which assigns the value of the expression y + 2 to the variable x. The syntax of an assignment is:

assignment variable := expression

The variable on the left of the assignment symbol (:=) is assigned the value of the expression on the right of the symbol. The value of the expression must be of the same data type as the variable to which it is to be assigned, otherwise the assignment is not valid.

A multiple assignment assigns values to several variables, as illustrated in the following example:
a, b, c := x, y + 1, z + 2

This assignment assigns the values of x, y + 1 and z + 2 to the variables a, b and c respectively. The expressions on the right of the assignment are evaluated, and the assignments are then performed in parallel. Consider the following example:
x, y := y, x

The effect of this multiple assignment is to swap the values of the variables x and y.
The syntax of multiple assignment extends the syntax for assignment:

assignment variable.list := expression.list
variable.list 1 , variable
expression.list 1 , expression

A list of expressions appearing to the right of the assignment symbol (:=) is evaluated in parallel, and then each value is assigned (in parallel) to the corresponding variable of the list to the left of the symbol. The rules which govern the names used in a multiple assignment therefore follow from those for names used in parallel constructions (see page 16). Practically, this means that no name may appear twice on the left side of a multiple assignment, as the name of a variable or as the name of a variable and the name of a subscript expression which selects a component from an array .The expression on the right of the assignment symbol (:=) may be a function. A multiple result function can be an expression list in a multiple assignment..

1.2 Communication

Communication is an essential part of occam programming. Values are passed between concurrent processes by communication on channels. Each channel provides unbuffered, unidirectional point-to-point communication between two concurrent processes. The format and type of communication on a channel is .Two actions exist in occam which perform communication on a channel. They are: input and output.

1.2.1 Input

An input receives a value from a channel and assigns the received value to a variable. Consider the following
example:
keyboard ? char

This simple example receives a value from the channel named keyboard and assigns the value to the variable char. The input waits until a value is received.
The syntax of an input is:

input channel ? variable

An input receives a value from the channel on the left of the input symbol (?), and assigns that value to the variable on the right of the symbol. The value input must be of the same data type as the variable to which it is assigned, otherwise the input is not valid.

1.2.2 Output

An output transmits the value of an expression to a channel. Consider the following example:
screen ! char

This simple example transmits the value of the variable char to the channel named screen. The output waits until the value has been received by a corresponding input.The syntax of an output is:
output channel ! expression
An output transmits the value of the expression on the right of the output symbol (!) to the channel named on the left of the symbol.

1.3 SKIP and STOP

The primitive process SKIP starts, performs no action and terminates.The primitive process STOP starts, performs no action and never terminates.To explain how SKIP behaves, consider the following sequence:
SEQ
keyboard ? char
SKIP
screen ! char
This sequence executes the input keyboard ? char, then executes SKIP, which performs no action. The sequence continues, and the output screen ! char is executed. The behaviour of STOP is illustrated by the following sequence:
SEQ
keyboard ? char
STOP
screen ! char
This sequence performs the input keyboard ? char before, then executes STOP, which starts but does not terminate and so does not allow the sequence to continue. The output screen ! char is never executed.

1.4 Summary

The primitive occam processes are assignments, inputs, outputs, SKIP and STOP:
Process = assignment
| input
| output
| SKIP
| STOP

Communicating Process Paradigm

Systems are built from layered networks of communicating parallel processes
Synchronous point-to-point communication via ‘channels’
The three sub-components here are ‘plus’, ‘prefix’ and ‘delta’
that could also be process networks
Implementation in occam; semantics from Hoare’s CSP.
Individual processes are completely isolated
interacting with the environment only through channels visible
interfaces

The synchronous nature of channel communication means that implementations may behave differently w.r.t. the environment
• asynchronous behavior is possible, however:

Occam & CSP

The occam language provides for ‘clean’ implementations of such processes

• developed by David May (and others) at Inmos [1] (1983), last commercial revision was occam2.1 [2] (1995)

• strict parallel-usage and aliasing checks give strong safety guarantees
i.e. freedom from race-hazard errors

• remaining issues include: deadlock, live lock and starvation CSP is the underlying process algebra, used to formally reason about the
behavior of occam processes

• developed by Tony Hoare [3, 4], standard text (currently) by Bill Roscoe [5] The mapping between CSP and occam is not an exact fit

• but is sufficiently complete for reasoning about occam programs.

Building Process Networks

The implementation of process networks such as ‘integrate’ is trivial:

PROC integrate (CHAN INT in?, out!)
CHAN INT a, b, c:
PAR
plus (in?, c?, a!)
prefix (0, b?, c!)
delta (a?, b!, out!)
:

Occam and Occam pi Programming Language

Occam can express a wide range of behaviours, but some aspects are lacking

• largely as a result of occam’s original application for the Transputer
• dynamic process creation
• channel mobility
• shared channels
Primary aims of occam-p:
• to provide the above features at a language level
• not break the existing good safety guarantees of occam
• make the language “programmer friendly”, i.e. something that people will actually want to use

overview:-

Extending the classical occam language with ideas of mobility and dynamic network reconfiguration, ideas from Milner’s p-calculus [6]

• we have found ways of implementing these extensions that involve significantly less resource overhead than imposed by the higher-level — but less structured, information and non-compositional — concurrency primitives of existing languages (such as Java) or libraries (such as POSIX threads)

As a result, we can run applications with the order of millions of concurrent processes on modestly powered PCs

• we have plans to extend the system, without sacrifice of too much efficiency and none of logic, to simple clusters of workstations, wider networks such as the Grid and small embedded devices

In the interests of proveability, we have been careful to preserve the distinction between the original point-to-point synchronised communication of occam and the dynamic asynchronous multiplexed communication of the p-calculus; in this, we have been prepared to sacrifice the elegant sparsity of the p-calculus

• we conjecture that the extra complexity and discipline introduced will make the task of developing, proving and maintaining concurrent and distributed programs easier

occam-pi features:

• mobile data, channels and processes; dynamic process creation
• shared channels, channel bundles, recursion, no race-hazards, no garbage, protocol inheritance, extended rendezvous, process priority, ...

Mobile Channel Bundles

Defined as ‘client’ and ‘server’ ends of a “mobile channel-type”,

Directions specified in the type are server-relative

• and may carry data both ways Allowing data to flow both ways is merely convenience — can get (mostly same effect using many individual mobile channels Grouping channels together, for related use, makes good sense however.

Communicating Mobile Bundles

The main use of mobile channel bundles is to support the run-time reconfiguration of process networks

P and Q are now directly connected

Process networks may be arbitrarily reconfigured using mobile channels, but still only reconfiguring static networks

• Dynamic process creation makes this much more interesting

Communication and assignment in traditional occam systems incur an O(n) run-time cost for datacopying. In small localised systems (i.e. those which share a common memory), this run-time cost is often undesirable and often unnecessary. A common solution to this, in occam, is to implement a memory pool which uses RETYPEs and various compiler-builtins to generate pointers from variables. These pointers (usually represented by the INT type) can then be communicated and assigned in O(1) time. The disadvantage of this method is that the compiler can no longer perform parallel-usage and alias checks on the pointers— it just treats them as plain INTs. Thus the potential for introducing parallel race-hazards is high. However, in traditional occam, this is often the only solution. Mobiles, with their movement semantics, provide a simple and effective solution to this — with an implementation that uses communication and assignment of references and has aliasing strictly controlled. These semantics are enforced by the compiler both at compile-time (in the undefineness checker, section 4.6), and also at run-time, by invalidating ‘lost’ references.

Mobiles are not a new concept — movement semantics exist in other languages, for exampleNIL [SY85], although these tend to be few in number. The programming language Icarus [MM98b, MM98a, MM01] also supports a movement semantics, for both data and channels. Icarus

also introduces the concept of a borrowing semantics, that provides an explicit mechanism for temporarily moving data/ channe ls to another process and back at defined points. A similar mechanism also exists for mobile types in occam, although it is somewhat implicit — in procedure calls and abbreviations. For channels, communication and assignment using a movement semantics is the only option— copy semantics have no meaning when applied to channels, and are therefore banned in the standard occam language. Mobile channel-types provide this functionality, and also extend it — by allowing explicitly shared channel-types (for the construction of one-to-any, any-to-one and any-to-process

Dynamic Process Creation

Recursion in occam

Recursion in occam has traditionally been prevented for two main reasons — one practical and one engineered. Firstly, the lack of dynamic memory allocation would have imposed a restriction on the depth of recursion. Secondly, the scoping of names in occam is such that they only become visible at the end of their declaration (where the ‘:’ appears). For PROCs, this means that any use of its own name inside the code body is invalid, unless a different PROC of the same name is in scope — in which case it will be used. It is possible to fake recursion however, often quite convincingly, by using the scoping of names to an advantage [Poo92]. In an arbitrary PROC called ‘foo’, any previously defined PROCs also called ‘foo’ are in scope and perfectly valid. This technique will fail on some KRoC implementations however — when the same name is defined at the outermost level multiple times, the UNIX linker will mostly likely stop with a “multiply defined symbol” error. The tranx86 translator partially handles this by only allowing one entry-point

for any given name per file processed. However, the same top-level PROC name in separate source files will also usually result in a similar error on UNIX platforms — which is a problem generally, not just for recursion. A version of unbounded recursion using a special locally defined PROC — with a very similar name — has been implemented in a development version of the KRoC/Sparc by Wood in [Woo00]. The Linux version of KRoC implements recursion using a different mechanism, by adding support directly to the occam compiler. The support for recursion in KRoC/Sparc also uses a Brinch-Hansen style memory allocator, but with with all support for such in the translator (octran) and run-time kernel only. For this implementation, recursive PROCs are declared using the ‘RECURSIVE’ or

‘REC’ keywords

— the latter being shorthand for the former. For example:

RECURSIVE PROC thing (...)
.
.. body of thing
:
This has the effect of bringing the name ‘thing’ into scope early, thereby permitting its use within the body of ‘thing’. A classic example for parallel recursion in occam is the parallel recursive Sieve of Eratosthenes. The original code for this is given in Appendix A of Moores’s CCSP paper, [Moo99], and requires very little change — just the addition of the ‘RECURSIVE’ keyword. The newcode for the ‘sieve’ process, augmented with channel-direction specifiers, is:

RECURSIVE PROC sieve (VAL INT count, CHAN INT in?, out!)
IF
count = 0
id (in?, out!) -- just copy in to out
TRUE
INT n:
SEQ
in ? n
out ! n
CHAN INT c:
PAR
filter (n, in?, c!)
sieve (count-1, c?, out!)
:

The ‘count’ parameter is used to limit the recursion. An initial value of 169 will allow for all the primes between 2 and 1000000 to be generated. When the ‘sieve’ process receives its first input (from ‘in?’), it outputs it down the prime output channel ‘out!’, then starts a network of subprocesses: a filter to filter out all multiples of ‘n’, in parallel with a recursive call to sieve. When ‘count’ reaches zero, the recursion stops. Some time after new filter processes stop being created, the pipeline will start to produce invalid primes

Seminar Reports