Towards Practical Gradual Typing – Takikawa et al. 2015
This is the first in a selection of papers this week from the recent ECOOP ’15 conference.
What’s the best building material? It’s fairly easy to see that the answer depends on what you are building (e.g. a garden shed vs. a skyscraper), where you are building it (local stones and other building materials, planning considerations, any site-specific or climatic considerations and so on), how durable you expect the structure to be (and in the face of what adversities – e.g. does it need to be earthquake-resistant), the anticipated costs of construction for different approaches, and the skills of the available workers. There is no single best building material.
Yet of course we can argue for hours about the best programming language! Early in the development of a program we want a building material that is quick to work with and malleable – a modelling clay that is easy to shape and re-shape as we work towards a satisfactory design. As the program grows and layers solidify we’d like to introduce stronger materials designed to structurally reinforce them and ensure they can withstand the stresses and strains placed on them over the long-haul.
This is all of course an oblique reference to the debate between dynamically and statically-typed languages. When you introduce the time dimension, it’s easy to see that there is no single best building material, and moreover that within the same program the amount of typing information you may want likely varies as a function of the maturity of the codebase. Different parts of the system may have different requirements simultaneously too – older ‘load-bearing’ components (i.e. those with a lot of dependencies on them) will benefit from structural reinforcement, whereas new extensions under development want a more malleable material while they are being formed.
We’d like to start out with flexible dynamic typing, and introduce static typing over time where it makes sense. This is what gradual typing supports.
Gradual type systems allow programmers to add type information to software systems in dynamically typed languages on an incremental basis. … The gradual-typing thesis states that a maintenance programmer ought to be able to augment an existing untyped code base with types in order to benefit from their software engineering advantages, and to ensure that future programmers will continue to receive those benefits.
The paper is based on a language called Typed Racket, but the principles uncovered should apply to other dynamic languages too, and future work by the authors intends to apply it to Reticulated Python. The primary
This paper presents the first implementation of a sound gradual type system for a higher-order, class-based, practical OO language that supports runtime class composition. Abstractly, it makes three concrete contributions: (1) design principles for gradual typing in a higher-order, object-oriented context, (2) a new mechanism to make the soundness-ensuring contract system performant, and (3) a novel performance analysis framework for gradual type systems.
Gradual typing is not just an academic concern, it’s also being adopted in industrial programming languages too:
Industrial designers of programming languages have started to adopt ideas from the gradual typing research community. In particular, both Typescript and Hack allow programmers to add types to programs in their respective base languages, JavaScript and PHP. These efforts, like Typed Racket, focus on supporting the idioms in the underlying languages such as traits—an alternative to mixins that emphasize horizontal composition—or prototypes. They make no effort, however, to put the interoperation of typed and untyped code on a sound footing.
The authors demonstrate four theory-derived principles for introducing gradual typing to an untyped object-oriented language. A further five principles are then derived from these in conjunction with practical experience.
Theory-derived Principles
- Class types differ from object types
the type system must distinguish the types of classes from the types of
objects, because expressions can yield either of these.
- Dynamic class composition requires structural types
Since class values may appear anywhere and without a statically-known name, we cannot simply identify classes by their names as in nominal type systems such as those of Java or C#. Instead we introduce class types that are compared structural ly. Put differently, classes are not identified just by a name, but by the types of their members (i.e., methods, fields, etc.).
- Object types need subtyping, class types need row polymorphism
Class types must support polymorphism because mixins are parametrized over base classes. To accommodate this idiom, TR’s type system borrows row polymorphism from the extensible record and object literature. The type system restricts row polymorphism to class types and disallows width subtyping; allowing both is unsound. Conversely, it accommodates existing design patterns via object types that have width subtyping but lack row polymorphism.
- Soundness checks need opaque contracts and contract sealing
Unlike an ordinary type system, a gradual type system must also support the sound interoperation of typed and untyped code. In TR, higher-order contracts mediate these interactions. Concretely, when a typed module in a gradually-typed language exports to/imports from an untyped one, the type system must translate any type specifications into contracts that uphold the type’s invariants at run-time. On top of ordinary class contracts, principle (4) requires sealing contracts for protecting mixins and opaque contracts that prevent untyped programs from accessing hidden methods on objects.
Practice-derived Principles
- Derive instance-types from class types
The separation of class types from object types calls for a linguistic mechanism that derives types of instances from types of classes or vice versa. The key to this syntax design is to choose a convenient default choice.
- Help type definitions specify inheritance
To accommodate sharing of features in class types, TR’s Class type constructor comes with an optional #:implements clause that allows a given class type to copy another class type’s member definitions.
- Make type definitions mutually recursive
While a gradual type system for dynamic classes demands the introduction of structural types, writing down such types imposes a significant notational overhead. In practice, class definitions consist of nests of deeply interwoven “has-a” or “is-a” relationships. For example, one class type from TR’s standard GUI library consists of 245 methods signatures, and moreover, the type refers to instances of 23 other class types plus itself. Not surprisingly, some of these 23 class types refer back to the original class type. In short, a gradual typing system for a dynamic object-oriented language needs a powerful construct for defining large mutually recursive types.
- Infer types for class members (locally)
Real-world programming also means reducing the number of required type annotations. A gradual type system therefore needs an algorithm that reconstructs some types automatically. Module-level type inference à la Hindley & Milner causes too many problems, however. Hence, TR locally infers the types of class members when possible, e.g., fields initialized with literal constants.
- Collapse layers of proxy values and eliminate runtime checks
All implementations of sound gradual type systems with structurally typed classes need a way to merge layers of proxy wrappers.
Runtime efficiency
Soundness calls for run-time checks that enforce the type specifications when a value flows from the typed portion of a program to an untyped one. All theoretical designs choose either casts or contracts for this purpose… Each conversion may put up a new higher-order contract boundary between two components, which may have negative consequences for the performance of the overall system. In theory, a completely converted program—without casts or type assertions—should have the same performance as a completely untyped one, because the type checker simply removes types in this case.
The goal for Typed Racket is to enable types to be added anywhere in the program without ever slowing down the program by more than a factor of 10 (Safe TypeScript reported 72x slow-downs). To achieve this it is necessary to collapse layers of proxies (a naive implementation just keeps on adding wrappers) that have compatible enforcement contracts.
For Typed Racket, we intend to investigate the use of profiling techniques that pinpoint the most expensive boundaries (between typed and untyped code) so that programmers can eliminate those first.