System programming in Rust: beyond safety

System programming in Rust: beyond safety Balasubramanian et al., HotOS’17

Balasubramanian et al. want us to switch all of our systems programming over to Rust. This paper sets out the case.

Despite many advances in programming languages, clean-slate operating systems, hypervisors, key-value stores, web servers, network and storage frameworks are still developed in C, a programming language that is in many ways closer to assembly than to a modern high-level language. Today, the price of running unsafe code is high… Why are we still using C?

About 2/3 of the 2017 CVEs relating to the Linux kernel can be attributed to the use of an unsafe language, and pervasive use of pointer aliasing, pointer arithmetic and unsafe type casts defeat the use of software verification tools.

So why are we still using C? Because safe languages have overheads that are too high for many use cases argue the authors. (And because of familiarity, and large existing C codebases, I would add).

Is it reasonable to sacrifice safety for performance, or should we prioritize safety and accept its overhead? Recent developments in programming languages suggest that this might be a false dilemma, as it is possible to achieve both performance and safety without compromising on either.

Enter Rust!

Rust achieves both safety and performance by embracing linear types. In the Rust ownership model, a variable that is bound to an object acquires ownership of that object. When the variable goes out of scope, the object is deallocated. Ownership can be transferred to another variable, but doing so destroys the original binding. You can also borrow an object within breaking the binding, but only within the syntactic scope of the declaration, and when it does not exceed the scope of the primary binding.

This ownership model eliminates pointer aliasing. There is an unsafe subset of the language that is not subject to the single ownership restrictions (best left to the standard library, to implement e.g., doubly-linked lists). There is also a mechanism for safe read-only aliasing which involves wrapping the object with a reference counted type, Rc or Arc. And if you must have write aliasing (for example, to a shared resource), you can enforce this dynamically by wrapping the object with the Mutex type:

In contrast to conventional languages, this form of aliasing is explicit in the object’s type signature…

Several projects have demonstrated Rust’s suitability for building low-level high-performance systems, but the authors of this paper want us to go further and consider additional benefits for systems programming that are enabled by Rust’s type system: software fault isolation (SFI), program analysis – especially static information flow control (IFC), and safe traversal of pointer-linked data structures which enables automation of tasks such as checkpointing.

(Recall we looked at timely dataflow as used by Mosaic a couple of weeks ago. Timely dataflow is implemented in Rust).

Software Fault Isolation (SFI)

SFI is the idea of enforcing process-like boundaries around program modules in software, without relying on hardware protection. Modern SFI implementations enable low-enough cost isolation in some applications (e.g., browser plugins and some device drivers), but “their overhead becomes unacceptable in applications that require high-throughput communication across protection boundaries.” Consider for example a network processing framework forwarding packets through a pipeline of filters that should ideally be isolated from each other:

Sending data across protection boundaries requires copying it, which is unacceptable in a line-rate system.

You can avoid copying with a tagged shared heap, but this introduces other runtime overheads (up to 100%) for tag validation on each pointer dereference.

Rust’s single ownership model allows us to implement zero-copy SFI. The Rust compiler ensures that, once a pointer has been passed across isolation boundaries, it can no longer be accessed by the sender.

The authors demonstrate how to build an SFI library in Rust, which supports secure communication across isolation boundaries with negligible overhead. The library exports two data types, protection domains (PDs) and remote references (rrefs).

Arguments and return values of remote invocations follow the usual Rust semantics: borrowed references are accessible to the target PD for the duration of the call; all other arguments change their ownership permanently. The sole exception is remote references: the object pointed to by an rref stays in its original domain and can only be accessed from the domain holding the reference via remote invocation.

Remote references (rrefs) are weak references to a reference table. All remote invocations are proxied through this table.

If a panic occurs inside a domain, its domain reference table is cleared and a user-provided function is invoked to re-initialize it from a clean slate.

The cost of this isolation is accessed in the context of a network processing framework using null filters which forward batches of packets without doing any work on them:

The overhead grows from 90 CPU cycles for 1-packet batches to 122 cycles for 256-packet batches, which is roughly the cost of 2 or 3 L3 cache accesses.

Information flow control

Static information flow control (IFC) enforces confidentiality by tracking the progress of sensitive data through a program. It doesn’t require a great leap of imagination to see that aliasing make this problem much harder – you have to keep track of all aliases to a variable and everywhere that they are used.

Modern alias analysis achieves efficiency by sacrificing precision, posing a major barrier to accurate IFC. By restricting aliasing, Rust sidesteps the problem.

The authors implement a proof-of-concept IFC for Rust. Rust macros transform a program into an abstract representation in which the value of each variable is simply represented by its security label. Input variables are initialised with user-provided labels, arithmetic expressions over secure values compute an upper bound of their arguments, and an auxiliary program counter tracks the flow of information via branching on labeled variables.

The resulting abstract program is verified using a verifier implemented as an extension of SMACK. This is able to detect problems such as the secret data leak in the following program fragment:

Safe traversals

… checkpointing, transactions, replication, multiversion concurrency, etc., involve snapshotting parts of program state. This, in turn, requires traversing pointer-linked data structures in memory. Ideally one would like to generate this functionality automatically and for arbitrary user-defined data types. However, doing so in a robust way can be complicated in the presence of aliasing.

In Rust, the problem is much simpler. By default all references in Rust are unique owners and can be safely traversed without extra checks. If aliasing is present, it is detectable through the use of Rc and Arc wrappers, which make these wrappers a convenient place to deal with aliasing with minimal changes to user code and without expensive lookups.

The authors built an automatic checkpointing library for Rust following this observation using a Checkpointable trait (interface) and a custom implementation for Rc (Arc could be extended similarly). A compiler plugin automatically generates an implementation of the Checkpointable trait for types composed of scalar values and references to other checkpointable types.

Our library adds the checkpointing capability to arbitrary user-defined data types; in particular, it checkpoints objects with internal aliases correctly and efficiently.

The last word

We show that Rust enables system programmers to implement powerful security and reliability mechanisms like SFI, IFC, and automatic checkpointing more efficiently than any conventional language. This is just the tip of the iceberg: we believe that further exploration of linear types in the context of real systems will yield more game-changing discoveries.

20 thoughts on “System programming in Rust: beyond safety”

Leonid Ryzhyk says:

June 14, 2017 at 7:19 am

Thanks for the great post about our HotOS paper! The final version of the paper can be found here: https://research.vmware.com/publications/system-programming-in-rust-beyond-safety
Pingback: System programming in Rust: beyond safety – Full-Stack Feed
Zteve says:

June 14, 2017 at 12:15 pm

It would be helpful for the general reader if a short explanation of the term “linear type” were included in your summary. Wadler used the term (in “Linear types can change the world!”
Philip Wadler. In M. Broy and C. Jones, editors, Programming Concepts and Methods, North Holland, Amsterdam, 1990) and defined them thus:

> Values belonging to a linear type must be used exactly once: […] they can be neither duplicated nor discarded.

and also in “Adoption and Focus: Practical Linear Types for Imperative Programming” by Manuel Fahndrïch Robert DeLine, they say, more vaguely:

> [values of linear type are] those on which we can check protocols, but to which aliasing restrictions apply.

Rust types (and user-defined types in particular) are linear by default (it is possible to define types whose values can be duplicated, but there are pretty severe restrictions) and aliasing (by the use of references) has some tight restrictions as you explained in your review.

Rust is an extremely exciting development tool for the systems programmer.
1. Zteve says:
  
  June 14, 2017 at 12:17 pm
  
  PS: The paper you reviewed doesn’t have an explanation of the term “linear type” that I can see.
Nik Boyd says:

June 14, 2017 at 2:35 pm

This discussion of Rust reminded me a little bit of a design pattern I discovered some time ago. Here’s a link to some notes about that. http://educery.com/papers/patterns/visitors/promised.operations.html

Some advantages I still see in using Promised Operations: it only requires fairly standard “equipment” in the language, it retains encapsulation, it foists boundary enforcement on the underlying platform using type interfaces, and so it can be used across multiple programming languages, esp. those that have 1st class type interfaces, or fully abstract classes. So, Java, C#, and perhaps C++, although I’m not 100% sure about C++, given that it’s underpinnings are usually more primitive. Obviously, there are still trade-offs.

Thanks for the notes about Rust. This blog is awesome, a must read for me, every day.
Stephan Sokolow says:

June 14, 2017 at 6:38 pm

Linear type systems (allow exchange, not weakening or contraction): Every variable is used exactly once.
Affine type systems (allow exchange and weakening, not contraction): Every variable is used at most once.
— https://en.wikipedia.org/wiki/Substructural_type_system

Rust’s type system is affine, not linear. Catching a return value being discarded is accomplished via the must_use type annotation, which is a built-in lint, not a core part of the type system.
1. Leonid Ryzhyk says:
  
  June 15, 2017 at 2:18 am
  
  The exact definition of linear types in the context of procedural languages like Rust is actually a bit subtle. In Rust, once you have a binding or a reference to an object, you can read or (in case of a mutable binding/reference) write it any number of times. So, the single-use constraint does not refer to being able to _access_ the object exactly once. Rather, it refers to being able to transfer the ownership of the object to exactly one recipient.
  
  In Rust objects are silently dropped when they go out of scope (discarding a return value is a special case of this). One can view this as implicit ownership transfer to the memory allocator. In this sense, Rust’s type system is truly linear.
  1. Stephan Sokolow says:
    
    June 15, 2017 at 4:19 pm
    
    A fair point, but I always think in terms of which interpretations make the terms most useful for further discussion… and I’d argue that interpreting things that way makes both terms less useful.
Pingback: What I’m Reading – 6/14 – stevenhelferich.com
Pingback: System programming in Rust: beyond safety | ExtendTree
Pingback: Data Science newsletter – June 15, 2017 | Sports.BradStenger.com
Pingback: End of term, and Orders of Magnitude | the morning paper
charlesdoliver says:

July 12, 2017 at 1:20 pm

[…] System programming in Rust: beyond safety Balasubramanian et al., HotOS’17 Balasubramanian et al. want us to switch all of our systems programming over to Rust. org/wiki/Substructural_type_system

Rust’s type system is affine, not linear.
timothyjcharles says:

July 17, 2017 at 11:25 am

[…] System programming in Rust: beyond safety Balasubramanian et al., HotOS’17 Balasubramanian et al. want us to switch all of our systems programming over to Rust.
Thanks for the notes about Rust.
Scott says:

August 3, 2017 at 7:25 am

Is the source for the SFI, IFC and checkpointing libraries available anywhere? A quick bit of searching around is not turning anything up.
Pingback: Writing parsers like it is 2017 | the morning paper
simbo1905 says:

August 25, 2017 at 8:22 pm

typo “borrow an object within breaking” should be “without breaking”
markmhamann says:

August 29, 2017 at 7:01 am

[…] this year we looked at ‘System programming in Rust: beyond safety‘ which made the case for switching from C to Rust as the default language of choice for […] Wadler used the term (in “Linear types can change the world!
DevBlog says:

October 30, 2017 at 12:21 pm

[…] this class we looked at ‘System programing in Rust: beyond base hit‘ which made the character for switching from C to Rust as the nonremittal linguistic process of alternative for […] Wadler used the term (in “Linear types can alteration the humankind! […] this class we looked at ‘System programing in Rust: beyond base hit‘ which made the character for switching from C to Rust as the nonremittal linguistic process of alternative for […]
Pingback: RustBelt: securing the foundations of the Rust programming language | the morning paper