Multi-context TLS (mcTLS): Enabling secure in-network functionality in TLS

Multi-Context TLS (mcTLS): Enabling secure in-network functionality in TLS, Naylor et al., SIGCOMM 2015

We’re rushing to deploy HTTPS everywhere – and about time – but this has interesting implications for middleboxes since it’s hard for them to do their job when traffic is encrypted end-to-end. Say you want to add caching, compression, an intrusion detection system (IDS), parental filter, tracker blocker, or WAN optimizer to TLS sessions – how do you do this? You can’t, at least not without breaking the end-to-end TLS guarantees.

Consider an enterprise network that wants to insert a virus scanner in all employee sessions. A common solution is to install a custom root certificate on the client. The middlebox can then create a certificate for itself purported to be from the intended server and sign it with the custom root certificate. After the client connects to the middlebox, the middlebox connects to the server and passes the data from one connection to the other.

This Split TLS solution provides no mechanism for authenticating the middlebox (or even realising it’s there), provides no guarantees to the client beyond the first hop, and gives middleboxes full read/write access to the data stream while it is decrypted between TLS sessions.

Given these problems, it should not be a surprise that users are concerned about (transparent) middleboxes. One could even argue that using TLS with a middlebox is worse than not using TLS at all, since clients and servers are under the illusion that they have a secure session, while some of the expected security properties do not actually hold.

So it seems we have three choices: (a) the status quo, in which HTTPS is subverted to pretend to give guarantees that aren’t really there, (b) stop using middleboxes and place all functionality at the endpoints, or (c) design a principled mechanism for middleboxes in TLS sessions. This paper, of course, is all about option (c), with a protocol the authors call multi-context TLS (mcTLS).

There’s a camp that argues we should just stop using middleboxes (option b), on the basis that they violate the original Internet architecture and the end-to-end principle. Pragmatically, the authors put forward the case that middleboxes are here to stay:

  • They have proven to be useful, and can add functionality not provided by endpoints such as virus scanners inside an enterprise.
  • Some middlebox functionality is inherently more effective when placed in the network, and relying on client-side implementations may be problematic (just consider the upgrade cycle for a start).
  • Real-world deployments tend to have about the same number of middleboxes as they do L3 routers, and all actors in the Internet use them. “Given this investment, middleboxes are unlikely to go away, so we need a clean, secure way to include them in encrypted sessions.”
  • The Internet is not a centrally managed monopoly, but a market-driven ecosystem with many actors making independent decisions.

The bottom line is simple: just like end-to-end encryption, middleboxes are an integral, useful part of the Internet and they are here to stay.

The basic idea

The first goal is to make middleboxes explicit, so that endpoints have both knowledge and control over which network functions are part of the session. Once this path is set up, mcTLS allows the content flowing over the connection to be divided up into multiple contexts, for example: request headers, request body, response headers, and response body (though any division is possible). Middleboxes may be granted no access, read-only access, or read-write access to each context independently. Endpoints can detect any modifications made, whether legal or illegal.

Consider the following examples of application layer middleboxes and the permissions they need – note that none of them needs read/write access to all of the data.

A data compression proxy, for example, would need write access to HTTP responses. Going further, separate contexts could be used for images (accessible by the proxy) and for the rest of the page artefacts (which may not be). A parental filter may only need access to the HTTP request headers. A corporate firewall may give read-only access to the IDS (and security appliances no longer need to impersonate end servers). In an online banking scenario, a server can prevent unknown middleboxes from participating in the session, regardless of what level of access the client assigns. And with HTTP/2 streams, mcTLS would allow browsers to easily set different access controls for each stream.
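
One way to picture these per-context permissions is as a small access table. The sketch below is illustrative only: the `Access` enum, the context names, and the middlebox names are all hypothetical and say nothing about how mcTLS encodes permissions on the wire.

```python
from enum import Enum


class Access(Enum):
    NONE = 0
    READ = 1
    WRITE = 2  # read-write


# Hypothetical session policy: context name -> {middlebox name -> access level}.
POLICY = {
    "request_headers": {"parental_filter": Access.READ},
    "request_body": {},
    "response_headers": {"compression_proxy": Access.READ},
    "response_body": {"compression_proxy": Access.WRITE},
}


def access_for(policy, context, middlebox):
    """Look up a middlebox's access to a context, defaulting to no access."""
    return policy.get(context, {}).get(middlebox, Access.NONE)


def can_read(policy, context, middlebox):
    """Writers can also read, so both READ and WRITE allow reading."""
    return access_for(policy, context, middlebox) in (Access.READ, Access.WRITE)


def can_write(policy, context, middlebox):
    return access_for(policy, context, middlebox) is Access.WRITE
```

Defaulting to `Access.NONE` for anything not listed mirrors the least-privilege intent: a middlebox sees a context only when it has been granted access explicitly.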

mcTLS upholds the properties provided by TLS (R1-R3 below), and adds two new requirements of its own (R4 and R5):

  • R1 Endpoints should be able to authenticate with each other (and all middleboxes).
  • R2 Only the endpoints (and trusted middleboxes) can read or write the data.
  • R3 All members of the session must be able to detect in-flight modifications by unauthorized third-parties, and endpoints must be able to check whether the data was originated by the other endpoint (vs. having been modified by a trusted middlebox).
  • R4 Endpoints must always be able to see all trusted middleboxes in a session, and middleboxes can only be added with the consent of both endpoints.
  • R5 Middleboxes should be given the minimum level of access needed to do their jobs.

Protocol details

To satisfy the requirements above, mcTLS adds two features to TLS: encryption contexts and contributory context keys.

An encryption context is simply a set of symmetric encryption and message authentication code (MAC) keys for controlling who can read and write the data sent in that context. Applications can associate each context with a purpose (opaque to mcTLS itself) and access permissions for each middlebox.

The two endpoints of the connection each generate half of each context key (hence contributory context keys) and send their halves to each of the middleboxes. Thus a middlebox can only gain access if it receives both halves of a key, ensuring the client and server are both aware of every middlebox and agree on their access permissions.
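
The contributory idea can be sketched in a few lines. This is a minimal illustration, not the paper's key derivation: HMAC-SHA256 is an assumed stand-in for the real derivation function, and `combine_halves`, `middlebox_key`, and the fixed 32-byte half length are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional

HALF_LEN = 32  # assumed fixed length, so concatenating the halves is unambiguous


def combine_halves(client_half: bytes, server_half: bytes, label: bytes) -> bytes:
    """Derive a context key that depends on both endpoints' contributions.

    Assumption: HMAC-SHA256 stands in for the protocol's real key derivation.
    """
    assert len(client_half) == HALF_LEN and len(server_half) == HALF_LEN
    return hmac.new(client_half + server_half, label, hashlib.sha256).digest()


def middlebox_key(client_half: Optional[bytes],
                  server_half: Optional[bytes],
                  label: bytes) -> Optional[bytes]:
    """A middlebox can derive a context key only if both endpoints sent a half."""
    if client_half is None or server_half is None:
        return None
    return combine_halves(client_half, server_half, label)


client_half = os.urandom(HALF_LEN)
server_half = os.urandom(HALF_LEN)
k_readers = combine_halves(client_half, server_half, b"readers")
```

If the server withholds its half, `middlebox_key(client_half, None, b"readers")` returns `None`: the client alone cannot grant access, which is the point of making the keys contributory.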

The initial handshake is very similar to the TLS handshake. Its purpose is to:

  • Allow the endpoints to agree on a cipher suite, a set of encryption contexts, and a list of middleboxes and their permissions.
  • Allow the endpoints to authenticate each other and all of the middleboxes (if they choose to).
  • Establish a shared symmetric key Kendpoints between the endpoints.
  • Establish shared symmetric keys Kwriters and Kreaders for each context.

For full details of the mcTLS extensions to the TLS handshake, see section 3.5 of the paper. Once the handshake has completed, the mcTLS record protocol handles communication.

The TLS record protocol takes data from higher layers (e.g., the application), breaks it into “manageable” blocks, optionally compresses, encrypts, and then MAC-protects each block, and finally transmits the blocks. mcTLS works much the same way, though each mcTLS record contains only data associated with a single context; we add a one byte context ID to the TLS record format. Record sequence numbers are global across contexts to ensure the correct ordering of all application data at the client and server and to prevent an adversary from deleting an entire record undetected.
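
To make the record format concrete, here is a rough sketch of framing with a one-byte context ID and a single sequence counter shared by all contexts. The exact field order is an assumption (the summary above only says that one byte is added), the type and version constants are the TLS 1.2 values used for illustration, and `RecordWriter` and `parse` are hypothetical names. Encryption and MACs are left out entirely.

```python
import struct

# Assumed header layout: content type, version, context ID, payload length.
HEADER = struct.Struct(">BHBH")
APPLICATION_DATA = 23  # TLS content type for application data
VERSION = 0x0303       # TLS 1.2


class RecordWriter:
    def __init__(self):
        self.seq = 0  # one counter shared by every context

    def frame(self, context_id: int, payload: bytes):
        """Return (sequence number, framed record) for one context's data."""
        header = HEADER.pack(APPLICATION_DATA, VERSION, context_id, len(payload))
        seq = self.seq
        self.seq += 1
        return seq, header + payload


def parse(record: bytes):
    """Split a framed record back into (context ID, payload)."""
    _ctype, _version, context_id, length = HEADER.unpack_from(record)
    payload = record[HEADER.size:HEADER.size + length]
    return context_id, payload
```

Because the counter is global, a receiver that sees sequence numbers 0 and 2 knows a record went missing, even if record 1 belonged to a context it cannot read.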

Endpoints can limit read access to contexts by withholding the corresponding Kreaders key from a middlebox. MACs are used to ensure that only legal modifications (made by middleboxes with write-permissions) are made, and that any illegal modifications can be detected. mcTLS follows an endpoint-writer-reader approach, in which every mcTLS record contains three keyed MACs…

  1. An endpoint MAC generated with Kendpoints, which the endpoints can use to check for any modifications,
  2. A writers MAC generated with Kwriters (shared by endpoints and writers), and
  3. A readers MAC generated with Kreaders (shared by endpoints, readers, and writers).

Endpoints generate all three MACs when assembling a record. If a writer middlebox modifies a record, it generates new writers and readers MACs based on its modifications, and forwards the endpoint MAC unchanged. Reader middleboxes, of course, simply forward all MACs unchanged.
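
The MAC handling described above might look roughly like this. It is a sketch under stated assumptions: HMAC-SHA256 stands in for whatever MAC the negotiated cipher suite uses, the MAC input (sequence number, context ID, payload) is a simplification, encryption is omitted, and `Record`, `endpoint_build`, and `writer_modify` are hypothetical names.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace


def mac(key: bytes, seq: int, context_id: int, payload: bytes) -> bytes:
    """Assumed MAC input: 8-byte sequence number, context ID, then payload."""
    msg = seq.to_bytes(8, "big") + bytes([context_id]) + payload
    return hmac.new(key, msg, hashlib.sha256).digest()


@dataclass(frozen=True)
class Record:
    seq: int
    context_id: int
    payload: bytes
    endpoint_mac: bytes
    writers_mac: bytes
    readers_mac: bytes


def endpoint_build(seq, context_id, payload, k_endpoints, k_writers, k_readers):
    """An endpoint computes all three MACs when assembling a record."""
    return Record(
        seq, context_id, payload,
        endpoint_mac=mac(k_endpoints, seq, context_id, payload),
        writers_mac=mac(k_writers, seq, context_id, payload),
        readers_mac=mac(k_readers, seq, context_id, payload),
    )


def writer_modify(record, new_payload, k_writers, k_readers):
    """A writer recomputes the writers and readers MACs over its new payload.

    It does not hold K_endpoints, so the endpoint MAC is forwarded untouched.
    """
    return replace(
        record,
        payload=new_payload,
        writers_mac=mac(k_writers, record.seq, record.context_id, new_payload),
        readers_mac=mac(k_readers, record.seq, record.context_id, new_payload),
    )
```

After `writer_modify`, the endpoint MAC no longer matches the payload. That mismatch is how the receiving endpoint learns the data was changed in transit, while the fresh writers MAC tells it the change came from an authorized writer.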

Each party receiving a record can now determine its status by examining the message received and the MACs:

  • An endpoint can check if any modifications have been made by checking the endpoint MAC. It can confirm that no illegal modifications were made by checking the writers MAC.
  • A writer can check that no illegal modifications have been made in the chain before it by checking the writers MAC.
  • A reader can check that no illegal modifications have been made by checking the readers MAC.
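
These checks can be sketched as follows. As before, this is only an illustration: HMAC-SHA256 is an assumed MAC, `data` stands for everything the MAC covers in a record, and the function names and status strings are hypothetical.

```python
import hashlib
import hmac


def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def valid(key: bytes, data: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking where the tags differ via timing.
    return hmac.compare_digest(mac(key, data), tag)


def endpoint_status(data, endpoint_mac, writers_mac, k_endpoints, k_writers):
    """Classify a received record from an endpoint's point of view."""
    if not valid(k_writers, data, writers_mac):
        return "illegal modification"
    if valid(k_endpoints, data, endpoint_mac):
        return "unmodified"
    return "modified by a writer"


def writer_accepts(data, writers_mac, k_writers):
    return valid(k_writers, data, writers_mac)


def reader_accepts(data, readers_mac, k_readers):
    return valid(k_readers, data, readers_mac)
```

Note that anyone holding the readers key can produce a readers MAC that `reader_accepts` will pass, so a reader's check cannot rule out tampering by another reader; only the writers MAC check further along the path catches that.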

In the protocol as described, readers cannot detect illegal modifications made by other readers, but the moment a writer or endpoint is reached the modification will be detected. The authors describe extensions to provide this additional protection if required, but consider the benefits insufficient to justify the additional overhead.

What about the overhead?

The authors compared mcTLS to an unencrypted session, an end-to-end TLS connection (E2E-TLS) in which middleboxes blindly forward the encrypted data, and a split TLS scenario in which TLS connections are split at middleboxes with decryption and subsequent re-encryption for each leg. The main results are as follows:

  • mcTLS’s handshake is not discernibly longer than Split TLS’s or E2E-TLS’s, based on a measurement of time-to-first-byte.
  • mcTLS transfer times are not substantially higher than Split TLS or E2E-TLS irrespective of link type, bandwidth, or file size.
  • mcTLS has no impact on real-world web page load time.
  • In terms of data volume, apart from the initial handshake overhead (negligible for all but short connections), mcTLS adds less than 2% overhead for web browsing.
  • mcTLS servers can serve 23-35% fewer connections per second than Split TLS, but mcTLS middleboxes can serve 45-75% more.


Introducing middleboxes into TLS while maintaining the security expectations of clients, content providers, and network operators is not easy…

… We show that building such a protocol is not only feasible, but also introduces limited overhead in terms of latency, load time, and data overhead. More importantly, mcTLS can be incrementally deployed and requires only minor modifications to client and server software to support the majority of expected use cases. By using mcTLS, secure communication sessions can regain lost efficiencies with explicit consent from users and content providers.