Fencing off Go: Liveness and safety for channel-based programming, Lange et al. POPL 2017

In the true spirit of POPL (Principles of Programming Languages), I present today’s summary of ‘Fencing off Go’ :

What more do you need to know?

Let’s try again 🙂

Fencing off Go: Liveness and safety for channel-based programming, Lange et al. POPL 2017

POPL papers can be very intimidating for those not steeped in type theory – and I’m sure they’re still pretty hard work even for those who are! In Fencing off Go though, Lange et al. have done something which should be of wide and practical interest to Go programmers.

This work develops a static verification framework for liveness and safety in Go programs, able to detect communication errors and partial deadlocks in a general class of realistic concurrent programs, including those with dynamic channel creation, unbounded thread creation and recursion.

And it’s more than just a pretty paper – the tool chain is also available for you to try on your own programs. The test programs used by the authors are all pretty small though (max 112 loc!):

Background: concurrency, liveness, and safety in Go

Go is a statically typed language that uses channels and goroutines (lightweight threads) for concurrency. Instead of chains of asynchronous callbacks, concurrent Go programs use logically structured flows of messages on channels. This avoids typical problems with locks and callback-hell…

On the other hand, Go inherits most problems commonly found in concurrent message-passing programming such as communication mismatches and deadlocks, offering very little in terms of compile-time assurances of correct structuring of communication.

The Go runtime does include a global deadlock detector, but this can’t detect partial deadlocks (involving only a strict subset of a program’s goroutines), and is ‘ultimately inadequate for complex, large scale applications that may easily be undermined by trivial mistakes or benign changes to the program structure.’

The Go type system can ensure that the values sent and received on channels are of appropriate types, but it cannot give any static guarantees about liveness or safety. Consider the following Go implementation of a concurrent prime sieve. It has several elements making it hard to reason about, including unbounded iteration (L3, 6, 13), dynamic channel creation (L11, 15), and spawning of concurrent threads (L12, 16).

GoInfer and Gong combine to provide static verification of liveness and the absence of communication errors in such programs. GoInfer extracts concurrent behavioural types from a Go program, and Gong then performs analysis on those types for verification.

Type inference with GoInfer

GoInfer is written in Go, using the go/ssa (Static Single Assignment) package. The SSA intermediate representation is transformed into a system of type equations by converting each SSA block into an individual type equation.

To give a flavour, for the prime sieve program above we end up with the following type equations:

• The overall program has the type $(new a)(\mathbf{g}\langle a \rangle | \mathbf{r} \langle a \rangle)$. Interpret this as ‘create a new channel ‘a’, and then behave as the generator process ‘g’ and a recursive process ‘r’ in parallel’.
• The generator process has type $\mathbf{g}(x) \hat{=} \bar{x};\mathbf{g}\langle x \rangle$. (Output a value on channel ‘x’, then behave as g(x) again).
• The recursive process ‘r’ has type $\mathbf(r) \hat{=} x;(new b)(\mathbf{f} \langle x, b \rangle | \mathbf{r} \langle b\rangle)$. (Receive a value on channel ‘x’, then spawn a new channel b and behave as a filter process ‘f’ in parallel with ‘r’, both using the new channel ‘b’).
• The filter process ‘f’ has type $\mathbf{f}(x,y) \hat{=} x;(\hat{y};\mathbf{f}\langle x,y \rangle \bigoplus \mathbf{f} \langle x,y \rangle)$ (Receive a value on channel ‘x’ and then either behave as a process that outputs a value on channel ‘y’, followed by filtering, or as a filtering process (without outputting a value on ‘y’ first).

Underpinning these type equations is a language called MiGo (mini-Go) which models the concurrency passing features of Go itself. MiGo uses the familiar ‘!’ and ‘?’ for message send and receive and has the following compact syntax which I present without further explanation in the interests of space:

Expressed in MiniGo, the prime sieve program looks like this:

Given MiniGo, the authors define an operational semantics and a behavioural typing system for it.

Go’s channel types are related to those of the Π-calculus, where the type of a channel carries the type of the objects that threads can send and receive along the channel. Our typing system augments Go’s channel types by also serving as a behavioural abstraction of a valid MiGo program, where types take the form of CCS processes with name creation.

(CCS = Calculus of Communicating Systems)

Verification with Gong

[Gong] checks for liveness and detects communication errors even in the presence of highly dynamic and unconstrained communication topologies, containing infinitely many recursive processes and channels, which can often be out of scope for existing behavioural type-based analyses.

Gong is written in Haskell (of course it is 😉 ), and inputs a system of type equations representing a Go program’s concurrent behaviour (as produced by GoInfer). First it checks that the system is fenced, and if so it generates and checks all of the (k) reachable terms for liveness and channel safety.

If (the type of) a program is fenced, then we know that even if it spawns infinitely many processes, the program will actually consist of a finite number of communication patterns (that may themselves be repeated infinitely many times). The coloured regions in the diagram below show the fenced regions for the concurrent prime sieve example:

Given that a program is fenced, a symbolic semantics can be defined in terms of a labelled transition system (LTS) which we know will have finite state. Liveness is shown for three classes of programs:

1. Those with a path to termination. Programs in this class that are typeable with a live type can always satisfy liveness.
2. Those that do not contain infinitely occurring conditional branches – if such a process is assigned a live type, then it is itself live.
3. Infinitely running programs which contain recursive variables in conditional branches. At the type level, the choice as to which branch to take is abstracted away, so we don’t know what branch will really be taken based on the evaluation of the expression presented to the conditional. If the program is live when non-deterministically reducing to either branch of the conditional, then it must itself be live.

Where next?

In future work we plan to extend our approach to account for channel passing, and also lock-based concurrency control, enabling us to verify all forms of concurrency present in Go. The results of §4 suggest that it should be possible to encode our analysis as a model checking problem, allowing us to: (1) exploit the performance enhancements of state of the art model checking techniques; (2) study more fine-grained variants of liveness; (3) integrate model checking into the analysis of conditionals to, in some scenarios, decide the program class (viz. § 5.3).

Type checking has come a long way… automated liveness and safety checking for all forms of concurrency in Go would be an incredibly useful tool, no doubt saving hours of frustration trying to diagnose and debug errant concurrent behaviours.