One VM to Rule Them All

One VM to rule them all – Wuerthinger et al. 2013

Building high-performance virtual machines is hard, and a number of widely used languages only have lower-performance implementations. Wuerthinger et al. want to make it easier to build a high-performance VM without having to write a highly complex implementation.

According to the paper:

  • Java and the CLR have high-performance implementations
  • PHP, Python, Ruby, R, Perl, Smalltalk, MATLAB, APL and others have low-performance implementations
  • JavaScript is somewhere in between

We present a new (Virtual Machine) approach and architecture, which enables implementing a wide range of languages within a common framework, reusing many components (especially the optimizing compiler).

The language implementation framework is called Truffle and the compilation infrastructure is called Graal. Both are open source projects under the OpenJDK umbrella.

Only the guest language specific part is implemented anew by the language developer. A core of reusable host services are provided by the framework, such as dynamic compilation, automatic memory management, threads, synchronization primitives, and a well-defined memory model.

The guest language is implemented as an AST interpreter, written in Java using an annotation-based DSL. The combination of node rewriting during interpretation, optimizing compilation, and deoptimization delivers high performance from an interpreter, without requiring a language-specific compiler.

… at no point was the dynamic compiler modified to understand the semantics of the guest language. A guest language developer who operates within our framework gets a high-performance language implementation, with no need to understand dynamic compilation.
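To make the ‘AST interpreter’ idea concrete, here is a minimal, framework-free sketch of what guest-language nodes might look like. This is not Truffle’s actual API; the Node, Frame, and execute names below are purely illustrative, and the add node deliberately does naive, fully generic arithmetic. The specialization machinery shown next is what removes that overhead.

[code lang=java]
// Illustrative only: a hand-rolled AST interpreter, not Truffle's actual API.
// Each guest-language construct is a node; evaluation is a call to execute().
abstract class Node {
    abstract Object execute(Frame frame);
}

// A frame holding the local variables of the guest function being executed.
final class Frame {
    final Object[] locals;
    Frame(int size) { this.locals = new Object[size]; }
}

final class LiteralNode extends Node {
    private final Object value;
    LiteralNode(Object value) { this.value = value; }
    @Override Object execute(Frame frame) { return value; }
}

final class AddNode extends Node {
    private final Node left, right;
    AddNode(Node left, Node right) { this.left = left; this.right = right; }
    @Override Object execute(Frame frame) {
        // Naive and unspecialized: everything is boxed and widened to double.
        Object a = left.execute(frame);
        Object b = right.execute(frame);
        return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }
}
[/code]

Evaluating new AddNode(new LiteralNode(1), new LiteralNode(2)).execute(new Frame(0)) walks the tree and returns 3.0; everything interesting about performance lies in how such nodes specialize themselves and get compiled.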

Supported optimizations include type specialization (e.g. rewriting an add node to work on integers rather than doubles), inline caching, and resolving operations.

Type specialization rules (in this case for an add operation) look like this:

[code lang=java]
// Specializations are tried in order; rewriteOn triggers a node rewrite to a
// more general case when the named exception is thrown (e.g. on int overflow).
@Specialization(rewriteOn = ArithmeticException.class)
int addInt(int a, int b) {
    return Math.addExact(a, b);
}

@Specialization
double addDouble(double a, double b) {
    return a + b;
}

// Fallback covering all remaining argument types.
@Generic
Object addGeneric(Frame f, Object a, Object b) {
    … // handle generic case
}
[/code]
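The inline caching mentioned above can be pictured in the same hand-rolled style. The sketch below is hypothetical (again, not the Truffle API): a call-site node caches the result of the previous method lookup, guarded by the receiver’s class, so repeated calls on the same kind of receiver skip the lookup entirely.

[code lang=java]
// Hypothetical monomorphic inline cache at a guest-language call site.
// In Truffle this would be expressed via node rewriting (an uninitialized call
// node rewrites itself to a cached call node); the guard-and-fallback shape
// is the same.
final class CallSiteNode {
    interface GuestMethod {
        Object invoke(Object receiver, Object[] args);
    }

    private Class<?> cachedReceiverClass;  // guard for the cached lookup
    private GuestMethod cachedTarget;      // target found by that lookup

    Object call(Object receiver, Object[] args) {
        if (receiver.getClass() == cachedReceiverClass) {
            // Fast path: guard holds, reuse the cached target, no lookup.
            return cachedTarget.invoke(receiver, args);
        }
        // Slow path: perform the guest-language method lookup and cache it.
        cachedReceiverClass = receiver.getClass();
        cachedTarget = lookup(receiver.getClass(), "methodName");
        return cachedTarget.invoke(receiver, args);
    }

    private static GuestMethod lookup(Class<?> receiverClass, String name) {
        // Guest-language method resolution would go here.
        throw new UnsupportedOperationException("illustrative only");
    }
}
[/code]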

The paper contains an assessment of the suitability of Truffle for implementing JavaScript, Ruby, Python, J, R, and functional languages.

Graal is also being used within the context of OpenJDK Project Sumatra to generate code for GPUs. A GPU backend for languages such as J and R, and for array-processing libraries for other languages, offers the potential of high-performance parallel execution and is something we intend to pursue in the near future.

High performance is achieved through a combination of techniques.

  • Node rewriting specializes the AST for the actual types used, eliminating unnecessary boxing etc. (see the sketch after this list)
  • Compilation by automatic partial evaluation leads to highly optimized machine code without the need to write a language-specific dynamic compiler
  • Deoptimization back from the machine code to the AST interpreter handles speculation failures.
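As a rough illustration of the first and third points (and not the real Truffle mechanism), the sketch below shows an add node, in strategy style, that speculates on integer inputs and, on its first overflow or type mismatch, replaces itself with a generic version before re-executing. In the real system, rewriting a node also invalidates any machine code compiled from the old tree, which is where deoptimization back to the interpreter comes in.

[code lang=java]
// Illustrative only, not the real Truffle API: an int-specialized add that
// rewrites itself to a generic add when its speculation fails.
import java.util.function.Consumer;

interface AddStrategy {
    Object add(Object a, Object b);
}

final class IntAdd implements AddStrategy {
    // Callback that swaps this node for its replacement in the parent AST node.
    private final Consumer<AddStrategy> replaceInParent;

    IntAdd(Consumer<AddStrategy> replaceInParent) {
        this.replaceInParent = replaceInParent;
    }

    @Override
    public Object add(Object a, Object b) {
        try {
            // Speculate that both operands are ints and the sum fits in an int.
            return Math.addExact((Integer) a, (Integer) b);
        } catch (ArithmeticException | ClassCastException e) {
            // Speculation failed: rewrite to the generic node and re-execute.
            // (In the real system, compiled code for the old tree would be
            // invalidated at this point.)
            AddStrategy generic = new GenericAdd();
            replaceInParent.accept(generic);
            return generic.add(a, b);
        }
    }
}

final class GenericAdd implements AddStrategy {
    @Override
    public Object add(Object a, Object b) {
        return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }
}
[/code]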

In case you’re wondering how all this compares to LLVM:

A number of projects have attempted to use LLVM as a compiler for high-level managed languages. These implementations have to provide a translator from the guest languages’ high-level semantics to the low-level semantics of LLVM intermediate representation (IR). In contrast, our approach requires only an AST interpreter; our system can be thought of as a High-Level Virtual Machine (HLVM).

I’d love to take this for a spin and try implementing a little language. It’s been too long since I worked on the AspectJ language and compiler…