This isn’t (necessarily) a security vulnerability; merely an observation that I don’t think has been articulated adequately within the cryptography community.
I thought it would be worth capturing somewhere public so that others can benefit from a small insight when designing cryptosystems.
Background
Once upon a time, there was Symmetric Encryption, which provided confidentiality, but no integrity; and Symmetric Authentication, which provided integrity, but no confidentiality.
Soon, cryptanalysts realized that, in the digital world, “no integrity” often implies “no confidentiality”, and began to consult various Oracles of arcane nature (Padding, Error, etc.).
The cryptographers were very cross with this discovery, and proposed several paradigms of combining Encryption and Authentication to prevent adaptive chosen-ciphertext attacks.
This was called, Authenticated Encryption (AE).
There were three flavors of AE that cryptographers debated. Moxie Marlinspike attempted to settle this debate with the Cryptographic Doom Principle, which lauded Encrypt then MAC as the safest way to combine the two algorithms.
However, not all MAC algorithms are created equally. Cryptographers sought fast message authentication algorithms to prevent denial-of-service attacks for encryption in transit. To that end, several fast message authentication codes were proposed, including GHASH (used today in AES-GCM) and Poly1305 (used with ChaCha20).
Furthermore, cryptographers wanted standardized algorithms that didn’t just authenticate some given ciphertext and nothing else.
Why? There is an entire class of attack against cryptosystems (which is easier to describe against an Encryption at Rest system than Encryption in Transit) called a Confused Deputy Attack.
Confused Deputies
A Confused Deputy can refer to a lot of attacks in computer security. When discussing cryptography, I mostly mean this:
- Take two ciphertexts encrypted under the same key. For example, two fields from the same database record, or the same field from two separate records.
- Swap them.
- Does the application decrypt them without any fuss?
If so, you can (as a legitimate unprivileged user of a system with read/write access to an otherwise encrypted database) simply swap records around and the application will happily decrypt them for you.
The application that performs the decryption is the deputy, and it is vulnerable to confusion.
However, this requires that all fields use the same symmetric key. One mitigation strategy is to derive a per-row or per-field key. Another is to properly use an AEAD mode.
AEAD Saves the Day
The context of a message is often important in decisions concerning whether or not it should be decrypted.
To that end, we don’t have many generic AE constructions that aren’t also AEAD constructions: Authenticated Encryption with Associated Data.
To protect against confusion, simply include the relevant context (i.e. database primary key and/or field name) as the Additional Associated Data parameter when encrypting or decrypting the record.
If an attacker tries to fiddle with these values, they will simply fail to decrypt even though the ciphertext is valid.
Invisible Salamanders
Much has been written about the Invisible Salamanders problem, so I’ll be brief: It’s easy to, when using fast MAC algorithms like GHASH and Poly1305, construct a situation wherein:
If your intuition for how message authenticators work is “HMAC but fast”, this is astonishing. On the decrypt path, if you can trick multiple parties into using different keys ( for you,
for them), you can trick them into decrypting very different plaintexts without any clue that anything is amiss.
There are several papers that discuss mitigating this risk, but it almost always involves somehow publishing a commitment of (at least) the key alongside the ciphertext. Any collision-resistant hash function of sufficient length and security can fulfill this purpose.
For example: If you’re using a Key Derivation Function in your protocol, you might include one more KDF output with a static “info” parameter. On the decrypt path, you would recalculate this from the input key material and compare it with that’s stored (in constant-time!) before proceeding with the normal decrypt procedure.
Towards Lucid Deputies
If you’re designing a system that stores encrypted data, and you intend to protect against the types of Confused Deputy attacks described above, there is a hidden trap that you can fall into one of two ways.
- If your mitigation strategy involves deriving a distinct key per record (or per field on a record) from one input key
- If you’re building a multi-tenant data warehouse where every customer’s data is encrypted under a per-customer key
(There are some variations of the above scenario that might also be valid, but those are the two primary ones.)
If you’re using a standard AEAD mode in such a setup (wherein multiple keys are used), and you aren’t including key-commitment in your protocol design, you’re almost certainly prone to Invisible Salamanders.
The entire reason the Invisible Salamanders attack works is because AEAD modes were designed for the key to be held constant (and secret!) while the nonce, AAD, and plaintext/ciphertext are variable. By making the key variable (and especially by making it attacker-influenced), you’ve exited the scope of the security properties of the algorithms.
AEAD modes are only vulnerable when the key varies.
The moment you make your key variable, and you’re only using a naked AEAD mode without any additional overhead? Invisible salamanders.
Should We Worry?
The good news: Exploiting this without access to the raw key material isn’t trivial (especially if you use AES-GCM-SIV).
Even better news: Exploiting this often results in garbage plaintext rather than anything meaningful, which more often leads to errors and crashes than code execution.
However, it’s still something your cryptographer should keep in mind when you consult them about your encrypted data storage designs.
(You do have a cryptographer on your payroll, don’t you?)