IoT end-to-end encryption with E4 (5/n): cryptography

After our last post on E4’s security model, we’d like to focus on the core cryptographic components that we chose, and provide some rationale for these choices. We realized that there can’t be a single “best” choice of cryptographic algorithms, because it’s hard to establish a unified metric of goodness. In practice, the meaningful factors are not only theoretical security and optimal speed on the target platform (the simplified model often considered), but also possibly:

  • Available support, be it hardware or software, in the platforms to be supported. For example, a product may run on two types of devices, the older generation and the new one, each one having its own set of hardware accelerators and software libraries.
  • Availability and performance of a portable implementation running on all platforms. A cryptographic primitive may have acceptable performance when using platform-optimized code, but may be too inefficient with portable, non-platform-optimized code.
  • Standardization requirements. The most common example is that of a product requiring FIPS 140–2 accreditation, in which case primitives not included in the FIPS 140–2 standard cannot be used.

Whereas it’s fair to say that security is a “commodity”, and that peak speed is rarely a concern on IoT platforms, the above three requirements are quite common. For example, a device might already use AES for purposes other than end-to-end encryption, and it therefore makes sense to reuse it for the encryption layer. This is particularly true when, like E4, you deal with a component that is directly integrated in the software of the platform (be it at the bottom of the application layer or the top of the transport layer), as opposed to approaches that need to run a dedicated application with its own process, memory space, and so on.

What cryptography does E4 use?

E4 supports two main cipher suites, implemented in the C and Go open-source libraries:

  • Symmetric-key mode, which like mobile telephony and TLS’ PSK mode uses pre-shared symmetric keys. In this mode, we use AES-SIV (with a 256-bit key for the PRF and for encryption), SHA-3-256, as well as Argon2id to derive keys from passwords in our client applications. In this mode, the C2 server shares the identity key of each device, and manages the topic keys assigned to each topic (or “conversation”), sharing them with authorized clients.
  • Public-key mode, where each client has a private/public identity key pair: the public key is known to the C2 and the private key remains on the device. Furthermore, devices are pre-provisioned with the C2’s public key, which is used along with the devices’ public keys to send encrypted control messages. We use the “25519 ecosystem” here: X25519 for Diffie-Hellman key establishment, and Ed25519 to sign messages sent by devices. The messages themselves are encrypted using symmetric cryptography. Indeed, since messages are not sent to a specific device but (as in MQTT) tagged with a topic, using asymmetric key establishment or encryption for them wasn’t suitable.

We have also defined additional cipher suites that are not integrated in our client library by default:

  • The FIPS 140–2-friendly crypto, which only uses primitives covered by FIPS 140–2, namely AES-GCM (instead of AES-SIV), and nistp256 elliptic-curve crypto (with ECDSA instead of Schnorr-like Ed25519).
  • The post-quantum mode, as a variant of our public-key mode, using post-quantum signature and key establishment primitives. We have experimented with different algorithms, offering different trade-offs, and can propose different options for different performance and security profiles.

The remainder of this post is structured as a Q&A. It answers some of the most common questions we have heard, as well as questions that we anticipate from users and cryptography experts. If you have other questions, please don’t hesitate to contact us.

Why AES-SIV and not AES-GCM, which is more standard?

Simple: we don’t want to become completely insecure if the device has no PRNG or a weak one. Whereas GCM is broken when a nonce is repeated, SIV with a fixed nonce has optimal security for a deterministic scheme: it only reveals duplicate messages, because it maps the same input to the same output. In E4, this can only occur if the device does not have a clock (because a timestamp is used as associated data).
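The difference can be illustrated with a toy sketch. It is not AES-GCM or AES-SIV, and not E4’s code: the keystream is built from SHA3-256 as a stand-in for AES in counter mode, and the synthetic IV is an HMAC rather than the S2V construction of the real AES-SIV. All function names are hypothetical. It only shows the structural point: with a repeated nonce, a counter-mode scheme leaks the XOR of the plaintexts, whereas an SIV-style scheme derives its IV from the message and associated data, so it repeats only when both are identical.

```python
import hashlib
import hmac


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from SHA3-256 (stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha3_256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def nonce_based_encrypt(key: bytes, nonce: bytes, msg: bytes) -> bytes:
    """Counter-mode encryption: only safe if the nonce never repeats."""
    return xor(msg, keystream(key, nonce, len(msg)))


def _synthetic_iv(mac_key: bytes, associated_data: bytes, msg: bytes) -> bytes:
    # Length-prefix the associated data so the MAC input is unambiguous.
    mac_input = len(associated_data).to_bytes(8, "big") + associated_data + msg
    return hmac.new(mac_key, mac_input, hashlib.sha3_256).digest()[:16]


def siv_like_encrypt(mac_key, enc_key, associated_data, msg):
    """SIV-style encryption: the IV is a MAC over (associated data, message)."""
    iv = _synthetic_iv(mac_key, associated_data, msg)
    return iv + xor(msg, keystream(enc_key, iv, len(msg)))


def siv_like_decrypt(mac_key, enc_key, associated_data, blob):
    """Decrypt, then recompute the IV: it doubles as the authentication tag."""
    iv, body = blob[:16], blob[16:]
    msg = xor(body, keystream(enc_key, iv, len(body)))
    if not hmac.compare_digest(iv, _synthetic_iv(mac_key, associated_data, msg)):
        raise ValueError("authentication failed")
    return msg
```

Encrypting two messages with nonce_based_encrypt under the same key and nonce yields ciphertexts whose XOR equals the XOR of the plaintexts. With siv_like_encrypt, using a timestamp as associated data means that even identical messages produce unrelated ciphertexts as long as the timestamps differ.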

GCM is nonetheless supported when FIPS 140–2 compliance is required, in which case a unique nonce must be used for each message encrypted with the same key. In this case, we combine the output of the available PRNG with additional data in order to mitigate the risk of a poor PRNG.
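One way to picture this mitigation is the following sketch. The inputs chosen here (a message counter, a timestamp, and a device identifier), the function name, and the use of SHA3-256 as the mixing function are assumptions made for illustration, not a description of E4’s actual construction.

```python
import hashlib
import struct

NONCE_LEN = 12  # 96-bit nonce, the usual size for AES-GCM


def derive_nonce(prng_bytes: bytes, counter: int, timestamp: int,
                 device_id: bytes) -> bytes:
    """Hash PRNG output together with additional data (hypothetical inputs).

    Even if the PRNG is stuck or predictable, the nonce changes as long as
    at least one of the other inputs changes between messages.
    """
    h = hashlib.sha3_256()
    fields = (
        prng_bytes,
        struct.pack(">Q", counter),
        struct.pack(">Q", timestamp),
        device_id,
    )
    for field in fields:
        # Length-prefix each field so that distinct inputs cannot collide
        # by shifting bytes from one field to the next.
        h.update(struct.pack(">I", len(field)))
        h.update(field)
    return h.digest()[:NONCE_LEN]
```

Note that this only helps if the additional data really does change between messages: a device with neither a working PRNG, a persistent counter, nor a clock would still repeat nonces.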

Why SHA-3 and not BLAKE2, which is faster?

We’ve nothing against BLAKE2 (nor BLAKE3), but we believed that embedded platforms and their libraries (open-source or proprietary ones) were more likely to support SHA-3 than BLAKE2. They would be even more likely to support SHA-256, but we wanted to avoid its length-extension property. Note that Ed25519 uses SHA-512 internally, as specified.
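For readers unfamiliar with length extension, here is a self-contained sketch of the property on SHA-256. The naive tag SHA-256(secret || message) is used purely as an illustration and is not something E4 does. Given only the tag and the length of the hashed input, anyone can compute a valid tag for an extended message without knowing the secret. The compression function is reimplemented in pure Python because hashlib does not expose the internal state, and the round constants are recomputed from their definition (fractional parts of cube roots of the first 64 primes).

```python
import hashlib
import struct


def _icbrt(n: int) -> int:
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def _primes(count: int) -> list:
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found


# SHA-256 round constants: first 32 fractional bits of the cube roots.
K = [_icbrt(p << 96) & 0xFFFFFFFF for p in _primes(64)]


def _rotr(x: int, n: int) -> int:
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF


def _compress(state: list, block: bytes) -> list:
    """One application of the SHA-256 compression function."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & 0xFFFFFFFF)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & 0xFFFFFFFF
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & 0xFFFFFFFF
        h, g, f, e = g, f, e, (d + t1) & 0xFFFFFFFF
        d, c, b, a = c, b, a, (t1 + t2) & 0xFFFFFFFF
    return [(x + y) & 0xFFFFFFFF
            for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def _padding(msg_len: int) -> bytes:
    """SHA-256 padding for a message of msg_len bytes."""
    zeros = b"\x00" * ((55 - msg_len) % 64)
    return b"\x80" + zeros + struct.pack(">Q", msg_len * 8)


def extend(digest: bytes, orig_len: int, suffix: bytes):
    """Forge SHA-256(original || glue || suffix) from SHA-256(original)."""
    state = list(struct.unpack(">8I", digest))  # resume from the digest
    glue = _padding(orig_len)
    total_len = orig_len + len(glue) + len(suffix)
    data = suffix + _padding(total_len)
    for i in range(0, len(data), 64):
        state = _compress(state, data[i:i + 64])
    return glue, struct.pack(">8I", *state)
```

For example, with tag = hashlib.sha256(secret + msg).digest(), calling extend(tag, len(secret + msg), b"&admin=1") returns the padding "glue" and a digest equal to hashlib.sha256(secret + msg + glue + b"&admin=1").digest(). SHA-3’s sponge construction does not output its full internal state, so this trick does not apply to it.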

Why 256-bit symmetric keys?

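We use 256-bit symmetric keys, which provide a 256-bit security level for AES-SIV, whereas 128-bit keys would arguably be enough (and would make AES a bit faster). Another objection is that Curve25519 only provides about 128-bit security, so in public-key mode the larger symmetric keys do not raise the overall security level. An argument for 256-bit symmetric keys is that they keep a comfortable margin against quantum attacks, since Grover’s algorithm at best reduces the effective security of a 256-bit key to about 128 bits. A more concrete motivation for us was the demand of certain customers for 256-bit symmetric keys instead of just 128-bit.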

What side-channel defenses do you have, if any?

Our code aims to run in constant time with respect to sensitive data (except for the length of encrypted data). Defenses against side channels such as electromagnetic emanations, or against invasive attacks such as laser fault attacks, tend to be specific to the platform and threat model. For example, you might protect against single-fault attacks by using some basic level of redundant computation, though this might be insufficient against attacks injecting two or three faults simultaneously. Furthermore, such fault attacks could also be mitigated by physical means such as fault detection mechanisms.
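The redundant-computation idea mentioned above can be sketched as follows. This is a generic illustration with hypothetical names, not E4’s code: the operation runs twice and the result is released only if both runs agree. The make_glitchy helper simulates a fault by flipping one bit of the result on selected calls.

```python
import hashlib
import hmac


class FaultDetected(Exception):
    """Raised when two runs of the same computation disagree."""


def compute_twice(operation, *args) -> bytes:
    """Run the operation twice and release the result only if both agree."""
    first = operation(*args)
    second = operation(*args)
    if not hmac.compare_digest(first, second):
        raise FaultDetected("redundant computations differ")
    return first


def make_glitchy(operation, faulty_calls):
    """Simulate fault injection: flip one result bit on the given call numbers."""
    calls = {"count": 0}

    def wrapped(*args) -> bytes:
        calls["count"] += 1
        result = operation(*args)
        if calls["count"] in faulty_calls:
            result = bytes([result[0] ^ 0x01]) + result[1:]
        return result

    return wrapped


def tag(key: bytes, msg: bytes) -> bytes:
    """Example sensitive operation: an HMAC-SHA3-256 tag."""
    return hmac.new(key, msg, hashlib.sha3_256).digest()
```

A single fault, as in make_glitchy(tag, {1}), makes compute_twice raise FaultDetected. Two identical faults, as in make_glitchy(tag, {1, 2}), pass the check and return a corrupted result, which is why simple duplication is not enough against an attacker who can inject several faults.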

A specific class of side-channel attacks is timing attacks on AES that exploit secret-dependent memory look-ups. A variety of techniques exist to defend against this. In public-key cryptography, tricks such as masking are commonly used: they add harmless information before the operation and remove it afterwards, so as to obscure the sensitive values being processed.

This is harder to do in a symmetric context, and in our embedded context as well, since it relies on good randomness. We must therefore look to the implementation itself. The naive implementation of AES from its description, like table-based AES, uses tables held in main memory, whose access times depend on previous access patterns and on which addresses have already been loaded into the caches. Since the table indices depend on secret data, this clearly cannot be used.
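To make the problem concrete, the sketch below contrasts a direct table look-up, whose memory address depends on the secret index, with a look-up that reads every entry and selects the wanted one with a mask. The table is an arbitrary stand-in permutation rather than the AES S-box, the function names are hypothetical, and Python itself gives no timing guarantees, so this only illustrates the access pattern.

```python
# Stand-in 8-bit substitution table (a permutation, since 7 is odd).
TABLE = [(i * 7 + 3) % 256 for i in range(256)]


def leaky_lookup(table, secret_index: int) -> int:
    """Direct look-up: the address touched depends on the secret index."""
    return table[secret_index]


def scanning_lookup(table, secret_index: int) -> int:
    """Read all 256 entries and keep only the wanted one using a mask."""
    result = 0
    for i, value in enumerate(table):
        # mask is 0xFF when i == secret_index and 0x00 otherwise,
        # computed without branching on the secret.
        mask = (((i ^ secret_index) - 1) >> 8) & 0xFF
        result |= value & mask
    return result
```

The scanning version touches the same addresses in the same order whatever the index, at the cost of 256 reads per look-up, which is one reason why the alternatives described next are preferred in practice.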

Techniques to implement ciphers in “constant time” exist and have been known for some time. Mike Hamburg applied a vector permutation technique to AES, which is suitable for platforms with SIMD-type registers, such as SSSE3 on x86. OpenSSL supports this for pre-AES-NI processors. Clearly, however, this is not ideal on Cortex-M and 8-bit AVR processors, which lack such vector instructions.

Another technique was pioneered by Eli Biham and named “bitslicing” by Matthew Kwan. It reduces the standard algorithm description to a logic circuit, implementing the same algorithm in terms of “logic gates”, i.e. AND, OR, NOT, and XOR instructions. In order to do this efficiently, we often have to process multiple blocks of the original algorithm in one go. During the AES competition, the Serpent submission was self-bitslicing and could process a single block at a time; unfortunately, when the technique is applied to AES, multiple blocks must be processed together. Boyar and Peralta applied the technique to AES, and Thomas Pornin implemented it in his excellent BearSSL library.
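The idea can be illustrated with a toy example. The S-box below is a 3-bit map in the style of Keccak’s chi step, chosen because its circuit is tiny. It is not the AES S-box, and the code is not BearSSL’s. In the bitsliced version, bit j of each “slice” word belongs to block j, so a handful of AND, NOT, and XOR operations evaluates the S-box on many blocks at once, with no table look-up.

```python
def chi3_table() -> list:
    """Reference look-up table: out_i = x_i XOR (NOT x_{i+1} AND x_{i+2})."""
    table = []
    for x in range(8):
        bits = [(x >> i) & 1 for i in range(3)]
        out = 0
        for i in range(3):
            out |= (bits[i] ^ ((bits[(i + 1) % 3] ^ 1) & bits[(i + 2) % 3])) << i
        table.append(out)
    return table


def to_slices(blocks: list) -> list:
    """Transpose: slice i collects bit i of every block (block j at bit j)."""
    slices = [0, 0, 0]
    for j, block in enumerate(blocks):
        for i in range(3):
            slices[i] |= ((block >> i) & 1) << j
    return slices


def from_slices(slices: list, count: int) -> list:
    """Inverse transposition: rebuild the 3-bit blocks from the slices."""
    blocks = []
    for j in range(count):
        blocks.append(sum(((slices[i] >> j) & 1) << i for i in range(3)))
    return blocks


def chi3_bitsliced(slices: list) -> list:
    """Evaluate the S-box on all blocks at once using only logic operations."""
    x0, x1, x2 = slices
    # ~x is negative in Python, but ANDing with a non-negative slice keeps
    # the result within the bits that are actually in use.
    return [x0 ^ (~x1 & x2), x1 ^ (~x2 & x0), x2 ^ (~x0 & x1)]
```

With 32 blocks packed into three integers, from_slices(chi3_bitsliced(to_slices(blocks)), 32) gives the same outputs as looking up each block in chi3_table(). The cost of the transposition is one reason why bitslicing pays off mainly when several blocks are processed together.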

Hardware support, if it exists, should be constant-time, so if available this can also be used. If not, we use the best available option: Thomas Pornin’s bitsliced implementation.

Can I use the platform’s primitive implementations instead of those from the E4 client library?

Yes. For example, if you wish to use hardware accelerators on your platform, or implementations already available in your application, we can tweak our client library to use those.

How do you store secret keys?

Storing secret keys depends to a certain extent on what is available on the platform in question. An embedded system with a full Linux system might use a scheme not too dissimilar to that of a desktop. However, keys need to be protected as well as possible. Secure EEPROM components might be used, and in this case E4 is written in such a way as to support writing to such a location. If technology such as ARM’s TrustZone is available, E4’s logic can run in the Secure world to protect key material from potentially exploitable components that run in the Non-secure world.

Since E4 is built on well-tested cryptographic standards, we can also make use of secure elements if present. These provide smartcard-like functionality to protect keys from theft while providing a signing facility. This is particularly useful in public key mode.

Finally, as you would expect, if hardware acceleration is available for symmetric primitives, E4 can be configured to use this. AES acceleration is typically available in ARMv8-A processors and onwards, although care must be taken: the Raspberry Pi 4, for example, uses a Broadcom BCM2711 system-on-chip without AES support. Alternatively, something like the i.MX6’s Cryptographic Acceleration and Assurance Module (CAAM) could be used on NXP’s chipsets, or other accelerators on other chipsets.

Are your implementations formally verified?

No, not yet at least. But as mentioned above, other implementations of AES, SHA-3, or Curve25519 can be used. One option, for example, is to use the formally verified implementations from the HACL* project. We are currently working on formal models of our protocol in order to formally verify its security properties.

Are there any backdoors or “golden keys”?

No, and you can check this in the code of our open-source libraries. If you use E4 in your products, you can have full access to the source code of the software deployed on your platform.