Théophile Wallez

Understanding MLS (part 0): What is MLS?

Messaging Layer Security

MLS is a secure group messaging protocol, which is being standardized at the IETF. It aims to be usable in messaging apps (such as Signal or WhatsApp), in team communication tools (such as Slack, Matrix or Zulip), and even in conference software (it is already deployed in Cisco’s Webex!).

MLS was designed with the following features in mind:

MLS relies on the following components, which are not fully specified (their required behavior is specified in a separate document, also being standardized at the IETF) and can be implemented differently depending on the requirements of the application:

The “group” part of “secure group messaging” is useful even if only two people want to talk to each other: nowadays, people often want to communicate from several devices (such as phones and laptops). A conversation between Alice and Bob is therefore, in practice, a group conversation between (for example) Alice’s laptop, Alice’s phone, Bob’s phone and Bob’s tablet.

Forward secrecy and post-compromise security

In order to precisely understand the confidentiality guarantees that MLS enjoys, we must explain the two notions of post-compromise security and forward secrecy.

Long-lived protocols

To read this blog post, your web browser used the TLS protocol to establish a secure connection with my web server, which probably took less than a second. Group messaging is a different beast: the protocol runs for a long period, sometimes several years. Secure group messaging protocols must therefore account for the possibility that one of the devices is stolen or compromised during the lifetime of the conversation.

What happens to confidentiality when an attacker learns secret values (such as private keys) during a compromise? The forward secrecy and post-compromise security properties answer this question.

Definitions

Forward Secrecy (FS) and Post-Compromise Security (PCS) refine the confidentiality property: they state what happens to confidentiality when a device is compromised by an attacker.

Forward secrecy means that even if private keys become known to an attacker, the attacker still can’t decrypt messages that were sent in the past. (Fun fact: emails encrypted with PGP don’t have this property! If your private key leaks, then all emails that were encrypted with this key can be decrypted.) In other words, the confidentiality of messages you send now is not affected by a future leak of private keys.

Confidentiality over time with forward secrecy. Messages sent before the compromise stay secret; however, there is no guarantee about messages sent after the compromise.

Post-compromise security is the “reverse” of forward secrecy (because its definition is so similar to that of forward secrecy, post-compromise security is sometimes called “backward secrecy”): even if private keys become known to an attacker, the attacker still can’t decrypt messages that will be sent in the future, after some period of healing (during which those private keys are replaced by new ones). In other words, the confidentiality of messages you send now is not affected by a previous leak of key material.

Confidentiality over time with post-compromise security. After a period of healing, messages sent after the compromise stay secret.
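To make the two definitions concrete, here is a minimal sketch of a key chain in Python. It is not how MLS or Signal derive keys: the `ratchet`, `heal` and `derive_chain` helpers are hypothetical names, SHA-256 stands in for a proper key derivation function, and the fresh secret passed to `heal` is simply assumed to reach the honest devices without the attacker seeing it.

```python
import hashlib
import os


def ratchet(key: bytes) -> bytes:
    """Advance the key with a one-way hash (stand-in for a real KDF)."""
    return hashlib.sha256(b"ratchet" + key).digest()


def heal(key: bytes, fresh_secret: bytes) -> bytes:
    """Mix in fresh secret material (assumed unknown to the attacker)."""
    return hashlib.sha256(b"heal" + key + fresh_secret).digest()


def derive_chain(start: bytes, steps: int) -> list[bytes]:
    """Return [start, ratchet(start), ratchet(ratchet(start)), ...]."""
    chain = [start]
    for _ in range(steps):
        chain.append(ratchet(chain[-1]))
    return chain


# One message key per step: keys[0] .. keys[4].
keys = derive_chain(os.urandom(32), 4)

# The attacker compromises the device at step 2 and learns keys[2].
stolen = keys[2]

# Without healing, every later key is computable from the stolen one.
assert derive_chain(stolen, 2) == keys[2:]

# Forward secrecy: recovering keys[0] or keys[1] from keys[2] would
# require inverting SHA-256, so earlier messages stay confidential.

# Post-compromise security: once fresh material is mixed in, the
# attacker's own ratchet no longer tracks the real key.
healed = heal(keys[4], os.urandom(32))
assert healed != ratchet(ratchet(ratchet(stolen)))
```

The one-way step gives forward secrecy only if old keys are actually deleted from the device, and the healing step gives post-compromise security only if the fresh secret stays out of the attacker’s reach.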

An essential security property with dynamic groups

Even if you think that device compromise is not part of your threat model, forward secrecy and post-compromise security are essential with dynamic groups.

Assume that Alice is working at Company, and thus is in the secure group chat of Company. To decrypt messages sent within the group, Alice was given some group chat secret. When Alice leaves Company, we don’t want her to be able to decrypt messages sent within Company’s secure group chat, even if she knows some older group chat secret. This is guaranteed by post-compromise security! Indeed, we can see things as if Alice “compromised” an old group chat secret. If the healing happens just after Alice leaves the group chat, then post-compromise security will guarantee that she will not be able to decrypt messages sent within Company’s secure group chat after she leaves.

Similarly, when Alice joined Company, she was given secrets to communicate in Company’s secure group chat, but we don’t want her to be able to decrypt past messages sent within Company. In a similar fashion, this is guaranteed by forward secrecy.

Therefore, forward secrecy is an essential property when new people can join the group, and post-compromise security is an essential property when people can be removed from the group.

Confidentiality over time when Alice joins and leaves a group, with forward secrecy and post-compromise security. Although Alice obtained a group chat secret while she was in the group, she is unable to decrypt messages sent before she joined or after she left.
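The join and leave scenario can be modeled with a toy group that starts a new “epoch” with a fresh secret on every membership change. This is only a sketch under strong assumptions: the `Group` class and its fields are hypothetical, each epoch secret is drawn at random, and the question of how that secret reaches the current members (the hard part of a real protocol) is skipped entirely.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Group:
    members: set = field(default_factory=set)
    epoch: int = 0
    # For each member, the epoch secrets they were given: {epoch: secret}.
    known: dict = field(default_factory=dict)

    def _new_epoch(self) -> None:
        """Start a new epoch and hand its secret to current members only."""
        self.epoch += 1
        secret = os.urandom(32)
        for member in self.members:
            self.known.setdefault(member, {})[self.epoch] = secret

    def add(self, member: str) -> None:
        self.members.add(member)
        self._new_epoch()

    def remove(self, member: str) -> None:
        self.members.discard(member)
        self._new_epoch()


company = Group()
company.add("Bob")       # epoch 1: Bob only
company.add("Alice")     # epoch 2: Bob and Alice
company.remove("Alice")  # epoch 3: Bob only

# Alice only holds the secret of the epoch during which she was a member.
assert sorted(company.known["Alice"]) == [2]
assert sorted(company.known["Bob"]) == [1, 2, 3]
```

In this model, Alice missing the epoch 1 secret corresponds to forward secrecy, and her missing the epoch 3 secret corresponds to post-compromise security with healing performed right after she leaves.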

The importance of messaging app design

Forward secrecy says that if an attacker compromises a device at some point, they still can’t decrypt messages that were sent in the past. However, if messages sent in the past are still stored insecurely on the device, then even the best cryptography can’t prevent the attacker from reading them. One solution is to use ephemeral messages, which are deleted automatically after some time; another is to store messages in a secure enclave. Note that this concerns the design of the messaging app and not its underlying cryptography, and it can play a central role in the app’s adoption (see “Collective Information Security in Large-Scale Urban Protests: the Case of Hong Kong”, USENIX Security ’21).

State of the art in secure group messaging, before MLS

The Signal messaging app relies on the Signal protocol, which has the same properties as MLS (such as forward secrecy and post-compromise security), but only works between two devices. It is the state-of-the-art secure messaging protocol: it has been thoroughly analyzed by academia, and nowadays any serious secure messaging app relies on a variation of Signal’s protocol (for example, the widely used messaging app WhatsApp uses Signal’s protocol under the hood).

The attentive reader will have noticed that Signal is not limited to two-person conversations: it also supports group conversations. This can be done with pairwise Signal channels: when you send a message to a group, you encrypt your message separately for every participant in the group. This requires a number of cryptographic operations linear in the size of the group!

The “Sender Keys” protocol (described in a Signal blog post and in the WhatsApp Security Whitepaper, section “Group Messages”) relaxes the post-compromise security properties of Signal to obtain a more efficient secure group messaging protocol. However, healing the group after a compromise still requires using pairwise Signal channels, hence a number of cryptographic operations linear in the size of the group.

So in the state of the art, the best way to heal from a compromise involves a number of cryptographic operations linear in the size of the group. This does not scale well to large groups (e.g. 10k members), which calls for a better protocol.

Here comes MLS: a secure group messaging protocol with sublinear post-compromise security healing.
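To get a feel for the gap between linear and sublinear healing, here is a small back-of-the-envelope sketch. Both function names are hypothetical. The pairwise count assumes one operation per other group member, and the logarithmic count is just one example of a sublinear cost, assuming members are arranged as the leaves of a balanced binary tree; this post does not claim that is MLS’s exact cost.

```python
import math


def pairwise_heal_ops(group_size: int) -> int:
    """One operation per other member, as with pairwise channels."""
    return max(group_size - 1, 0)


def tree_heal_ops(group_size: int) -> int:
    """Illustrative sublinear cost: the height of a balanced binary tree."""
    if group_size <= 1:
        return 0
    return math.ceil(math.log2(group_size))


for n in (2, 100, 10_000):
    print(n, pairwise_heal_ops(n), tree_heal_ops(n))
# prints:
# 2 1 1
# 100 99 7
# 10000 9999 14
```

For the 10k-member group mentioned above, that is the difference between thousands of operations and about a dozen.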

Understanding MLS through a series of blog posts

During my PhD, whose topic is building a formal security proof for MLS, I eventually came to have a good understanding of its inner workings.

Although the RFC has become less and less terse with the latest revisions (it now has nice diagrams and even high-level descriptions!), it is still presented as a monolithic protocol. In my reference implementation, I spent a great deal of effort cleanly separating it into three sub-protocols, to make my proofs as modular as possible. (The decomposition is explained in our USENIX Security ’23 paper, and will also be explained in the next blog posts.)

Furthermore, the RFC presents only the final design of MLS, which contains many mechanisms to improve performance and security. These can be overwhelming when presented all at once. To understand MLS easily, we need to see how a first simple design evolved iteratively into the MLS we have now.

In this series of blog posts, I will present each sub-protocol independently, starting with a simple design, and improve it step by step to obtain the current design.

What next?

In the next blog post, I will present how the Signal protocol works, because MLS borrows many ideas from Signal’s double ratchet algorithm.