Last week we saw the benefits of rethinking memory and pointer models at the hardware level when it came to object storage and compression (Zippads). CHERI also rethinks the way that pointers and memory work, but the goal here is memory protection. The scope of the work stands out as particularly impressive:
We have adapted a complete C, C++, and assembly-language software stack, including the open source FreeBSD OS (nearly 800 UNIX programs and more than 200 libraries including OpenSSH, OpenSSL, and bsnmpd) and PostgreSQL database, to employ ubiquitous capability-based pointer and virtual-address protection.
The protections are hardware implemented and cannot be forged in software. The process model, user-kernel interactions, dynamic linking, and memory management concerns are all in scope, and the protection spans the OS/DBMS boundary.
The basic question here is whether it is practical to support a large-scale C-language software stack with strong pointer-based protection… with only modest changes to existing C code-bases and with reasonable performance cost. We answer this question affirmatively.
That ‘reasonable’ performance cost is a 6.8% slowdown, significantly better than e.g. the 50% overheads of Address Sanitizer.
CHERI is guided by two underlying principles:
- The well-known principle of least privilege: running software should have the minimum privileges possible to do what it needs to do, and
- A new principle identified in this work, the principle of intentional use: where a set of privileges is available to a piece of software, an invoked privilege should be selected explicitly rather than implicitly.
The conceptual model that ensues has the following properties:
- Memory accesses are based not just on arbitrary integers (checked against only the process address space), but also on abstract capabilities that confer an appropriate set of memory permissions.
- Abstract capabilities are constructed only through legitimate provenance chains of operations, successively reducing permissions from initial maximally permissive capabilities provided at machine reset.
- Code is not given access to excessive capabilities.
And this all has to work for whole-system executions, not just the C-language portion of user processes. The goal of all this, of course, is to prevent attackers from injecting, manipulating, or abusing pointers in the runtime environment.
CHERI adds a new hardware data type for strongly protected C-language pointers, the CHERI capability (the evaluation uses an FPGA-based implementation). A capability combines a good old-fashioned address pointer with bounds constraining the range of addresses and permissions limiting its use. The resulting pointers are 128 bits wide, together with one out-of-band tag bit. Three properties protect them:
- Provenance validation ensures that only capabilities derived via valid transformations of valid capabilities using capability instructions can be used
- Capability integrity prevents direct in-memory manipulation of architectural capability encodings. If any violation is detected the tag bit is cleared, and the data can no longer be interpreted as a capability.
- Monotonicity prevents the permissions or bounds associated with a capability from being increased.
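To make those three properties concrete, here is a minimal software model in plain C. It is a sketch under stated assumptions, not the real encoding: the `cap_model` struct, the `PERM_*` flags, and all function names are hypothetical, and where this model clears the tag on an illegal derivation, real hardware may raise an exception instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of a capability. Field names are illustrative only
 * and do not reflect the real 128-bit encoding. */
enum { PERM_LOAD = 1u << 0, PERM_STORE = 1u << 1 };

typedef struct {
    uint64_t base;    /* lowest address the capability may touch */
    uint64_t length;  /* number of bytes covered, starting at base */
    uint32_t perms;   /* bitmask of PERM_* values */
    bool     tag;     /* validity bit; kept out-of-band in hardware */
} cap_model;

/* Derive a narrower capability from a parent (provenance). Any attempt
 * to widen bounds or add permissions yields an untagged, unusable
 * result (monotonicity). */
static cap_model cap_narrow(cap_model parent, uint64_t base,
                            uint64_t length, uint32_t perms)
{
    cap_model child = { base, length, perms, false };
    bool in_bounds = base >= parent.base &&
                     length <= parent.length &&
                     base - parent.base <= parent.length - length;
    bool perms_ok = (perms & ~parent.perms) == 0;
    child.tag = parent.tag && in_bounds && perms_ok;
    return child;
}

/* Would a load of `size` bytes at `addr` through `c` be permitted? */
static bool cap_can_load(cap_model c, uint64_t addr, uint64_t size)
{
    return c.tag && (c.perms & PERM_LOAD) != 0 &&
           addr >= c.base && size <= c.length &&
           addr - c.base <= c.length - size;
}

/* Models overwriting a capability's bytes with ordinary data stores:
 * the value changes and the tag is cleared (integrity). */
static cap_model cap_overwrite_as_data(cap_model c, uint64_t new_base)
{
    c.base = new_base;
    c.tag = false;
    return c;
}
```

In this model, narrowing a 1 MiB read/write root to a 64-byte read-only buffer succeeds, while trying to grow that buffer back to 128 bytes, add store permission to it, or patch its base as raw data all produce a value that `cap_can_load` rejects.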
The instruction set contains explicit instructions for working with capabilities, and legacy instructions addressing memory via virtual addresses are indirected through a ‘default data capability’ (DDC) register.
The core of CHERI has been covered in earlier papers. What’s new in this paper is the extension of capability support to the full userspace process environment, including interactions with system calls, signals, dynamic linking, and process debugging. This enables all legacy loads and stores via the DDC to be eliminated.
The abstract capability model is implemented with a subtle combination of architectural capabilities (as provided by the hardware) and the critical systems code involved in managing paging, context switching, linking, memory allocation, and suchlike.
The work includes changes to the CHERI ISA, the C compiler, the C language runtime, the virtual memory APIs, and the CheriBSD kernel.
At hardware reset the boot code is granted maximally permissive architectural capabilities. The kernel then narrows these to ones separately covering userspace, kernel code, and kernel data. When a process address space is then replaced by execve, the kernel establishes new memory mappings for the contents of the address space, subdividing the previously created userspace capability.
On a context switch the kernel saves and restores user-thread register capability state, and updates virtual-physical mappings. Similar housekeeping needs to be done when swapping and on signal delivery.
All standard methods of accessing process memory have been altered to use an explicit capability, so the kernel can only access the memory specified and authorized by the user process, as shown below.
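A rough plain-C model of that idea follows. It is my own sketch rather than the paper's figure or CheriBSD code: `user_cap_model` and `copyout_model` are hypothetical names, and the explicit range check stands in for what the hardware would enforce by faulting on the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user-supplied capability: a pointer plus
 * the number of bytes the process has authorized the kernel to write. */
typedef struct {
    unsigned char *base;
    size_t length;
    bool tag;
} user_cap_model;

/* Sketch of a copyout-style helper. Returns 0 on success, -1 if the
 * request falls outside what the capability authorizes. */
static int copyout_model(const void *kernel_src, user_cap_model dst,
                         size_t offset, size_t len)
{
    if (!dst.tag || offset > dst.length || len > dst.length - offset)
        return -1;  /* real hardware would fault on the access instead */
    memcpy(dst.base + offset, kernel_src, len);
    return 0;
}
```

The point of the design is that the kernel's authority to write comes from the capability the process handed over, not from the kernel's own ability to address all of user memory. A request that overruns an 8-byte buffer is refused even though the target address is perfectly valid in the process address space.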
The CheriABI implementation is used to compile FreeBSD and PostgreSQL, then all of the respective test suites are run. In the table below, the numbers in each column represent the number of test programs, not the number of individual tests. The MIPS rows show the test suite results on a standard mips64 system. Digging into the PostgreSQL test failures, just over half are due to test assumptions about output order or pointer size; the remaining half-dozen or so still need further investigation.
Most programs (almost 800 C programs in the FreeBSD source tree) require no modifications. The following table breaks down the types of changes required for those that do:
The biggest cause of change (42 cases) is calling conventions (CC) in BSD libraries when using variadic arguments. With capabilities these require correct function prototypes, and when programs declare their own callbacks fixing each one is the only solution.
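Standard C already makes it undefined behaviour to call a variadic function through a declaration that lacks the ellipsis; it just happens to work on many conventional ABIs. The sketch below shows the well-formed version of the pattern, with the prototype and the callback type both declaring the variadic signature. The names `sum_ints`, `sum_callback`, and `apply` are hypothetical and not taken from the FreeBSD sources.

```c
#include <assert.h>
#include <stdarg.h>

/* The prototype, including the ellipsis, must be visible at every call
 * site so caller and callee agree on how the arguments are passed. */
int sum_ints(int count, ...);

int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* A callback type for a variadic function must say so too. Casting
 * sum_ints to a fixed-argument type such as int (*)(int, int, int, int)
 * and calling through it is the kind of mismatch that needs fixing. */
typedef int (*sum_callback)(int count, ...);

static int apply(sum_callback cb)
{
    return cb(3, 10, 20, 30);
}
```

Here `apply(sum_ints)` is fine because the function pointer type matches the definition. A program that declares its own fixed-argument callback type for a variadic function has to correct each such declaration by hand.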
Microbenchmarks (MiBench) and the FreeBSD system call timing benchmarks show modest performance impact in some cases, and performance improvements in others (3.4% slower to 9.8% faster for system calls). For a macro-benchmark, PostgreSQL’s initdb tool was used. PostgreSQL runs 6.8% slower as a CheriABI binary.
The memory safety benefits are evaluated using the BOdiagsuite of 291 programs. Each program has three memory-safety violating variants: min is typically an off-by-one error, med is an off-by-8-bytes error, and large is an off-by-4096-bytes error. CheriABI is compared against vanilla mips64 and Address Sanitizer. It shows a very high success rate in detecting these safety violations.
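The three variants can be pictured with a tiny model of a per-pointer bounds check. This is my own illustration, not code from BOdiagsuite: the function names are hypothetical, and I am assuming each overflow distance is measured in bytes past the last valid element of a byte buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Overflow distances for the three violating variants, in bytes past
 * the last valid element (an assumed reading of the description). */
enum { OVERFLOW_MIN = 1, OVERFLOW_MED = 8, OVERFLOW_LARGE = 4096 };

/* Per-pointer bounds check: the index must fall inside the object the
 * pointer was derived from, however far outside it lands. */
static bool store_in_bounds(size_t object_size, size_t index)
{
    return index < object_size;
}

/* Index written by a variant that overshoots the buffer by `overflow`. */
static size_t variant_index(size_t object_size, size_t overflow)
{
    return object_size - 1 + overflow;
}
```

Because the check is tied to the bounds carried by the pointer itself, the distance of the overshoot makes no difference in this model: a one-byte slip and a 4096-byte jump are rejected in exactly the same way.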
Note that Address Sanitizer has high overheads (3x stack memory, 12.5% total memory, and around 50% performance).
In addition to finding test-suite issues, we have found and fixed dozens of bugs including buffer bounds violations and variadic argument misuse in FreeBSD programs, libraries, and tests.
The last word
We have demonstrated a complete memory-safe UNIX system that is practical for general use… our implementation of CheriABI shows the existence of a path forward from our current run-time foundations set on the shifting sands of integer pointers, to a future where strong referential integrity enforces the principles of least privilege and intentionality even on lowest-level software.
And an afterword! It strikes me that you can think of a foreign-key reference in a database a bit like a memory address pointer (the virtual address space it indexes into consists of the rows of the foreign table). Today those foreign keys are just like unprotected integer pointers in the world of virtual memory: they have no associated capabilities or protections, can be directly manipulated, and so on. What if queries returned ‘capabilities’ instead of raw keys? That might be an interesting model for thinking about the four Ps: provenance, purpose, permissions and privacy.