Weird crypto concepts

At Teserakt, we solve real security engineering problems by leveraging cryptography, aiming for the best choices in terms of security, performance, and ease of integration. Being part of the community for many years, we actively follow the theoretical and applied research in cryptography and security, in order to build innovative yet robust solutions to modern IoT systems.

If you know a bit about cryptography, you surely know basic concepts such as encryption and hashing, as well as notions such as preimage resistance, forward secrecy, semantic security, zero-knowledge proofs, and others that we routinely encounter in modern applications. However, cryptography is a rich field with a broad unexplored territory, with many notions that haven’t yet made it to popular applications. In this post we’d like to give an overview of some of these notions, and provide links to further reading, should you be interested in learning more about them.

AES-GCM-SIV and SIV-AES

AES-GCM-SIV is a variant of AES-GCM where the IV used for encryption (the initial counter block) is derived from the tag computed by authenticating the plaintext (and any associated data). AES-GCM-SIV’s authenticator is built on POLYVAL, a polynomial hash that is slightly different from the GHASH function used by GCM. The benefit of AES-GCM-SIV compared to AES-GCM is that it remains secure if the same nonce is reused, a property known as nonce-misuse resistance: repeating a nonce only reveals whether the same message was encrypted twice.

SIV-AES is a different thing from AES-GCM-SIV.

For some reason, AES in SIV mode is not called AES-SIV, but goes by the official name of Synthetic Initialization Vector (SIV) Authenticated Encryption Using the Advanced Encryption Standard (AES), abbreviated to SIV-AES—having AES-CCM, AES-GCM, AES-GCM-SIV, and AES-SIV wasn’t confusing enough.

Like AES-GCM-SIV, the main reason for using SIV-AES is to avoid the hazard of repeated nonces. Unlike AES-GCM-SIV, SIV-AES does not use a MAC based on binary polynomial multiplication, but instead the AES-based CMAC, a variant of CBC-MAC. This makes SIV-AES simpler than AES-GCM-SIV, but also slightly slower.
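To make the "synthetic IV" idea concrete, here is a minimal sketch in Python. It is neither SIV-AES nor AES-GCM-SIV: since the standard library has no AES, we assume HMAC-SHA-256 as the MAC and a SHA-256 counter keystream as the cipher, and the names toy_siv_encrypt and toy_siv_decrypt are ours. It only illustrates that the IV is computed from the message itself, so a repeated nonce does not repeat the keystream unless the whole input repeats.

```python
import hashlib
import hmac


def _keystream(key, iv, length):
    """Toy keystream: SHA-256(key || iv || counter), a stand-in for AES-CTR."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _synthetic_iv(mac_key, nonce, ad, plaintext):
    """MAC over the length-prefixed nonce, associated data and plaintext."""
    fields = b"".join(len(x).to_bytes(8, "big") + x for x in (nonce, ad, plaintext))
    return hmac.new(mac_key, fields, hashlib.sha256).digest()[:16]


def toy_siv_encrypt(mac_key, enc_key, nonce, ad, plaintext):
    # The IV depends on the plaintext, and doubles as the authentication tag.
    iv = _synthetic_iv(mac_key, nonce, ad, plaintext)
    stream = _keystream(enc_key, iv, len(plaintext))
    return iv + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_siv_decrypt(mac_key, enc_key, nonce, ad, ciphertext):
    iv, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(enc_key, iv, len(body))
    plaintext = bytes(c ^ s for c, s in zip(body, stream))
    # Recompute the IV from the recovered plaintext and compare.
    if not hmac.compare_digest(iv, _synthetic_iv(mac_key, nonce, ad, plaintext)):
        raise ValueError("authentication failed")
    return plaintext
```

Encrypting two different messages under the same nonce yields two different IVs, whereas encrypting the same message twice yields the same ciphertext, which is exactly the residual leak of a misuse-resistant scheme.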

More:
https://tools.ietf.org/html/rfc5297
https://tools.ietf.org/html/rfc8452

Catalytic space computation

A form of computation where the memory required does not need to be completely empty, but may contain information that is restored after the computation is completed. This has been leveraged in proofs of catalytic space, for example proposed as a proof-of-resource for blockchain protocols.
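As a toy illustration of "borrowing" memory that is already full, the sketch below adds a product to an output register while using two registers whose contents are arbitrary and must be handed back unchanged. The function name and the register layout are our own assumptions; the alternating add/subtract pattern is in the spirit of the register programs used in the catalytic space literature, not a reproduction of any specific construction.

```python
def catalytic_multiply(registers, x1, x2, modulus=2**32):
    """Add x1 * x2 to registers[2].

    registers[0] and registers[1] hold arbitrary data that we may modify
    during the computation but must restore before returning.
    """
    r = registers
    r[2] = (r[2] + r[0] * r[1]) % modulus
    r[0] = (r[0] + x1) % modulus           # borrow register 0
    r[2] = (r[2] - r[0] * r[1]) % modulus
    r[1] = (r[1] + x2) % modulus           # borrow register 1
    r[2] = (r[2] + r[0] * r[1]) % modulus
    r[0] = (r[0] - x1) % modulus           # restore register 0
    r[2] = (r[2] - r[0] * r[1]) % modulus
    r[1] = (r[1] - x2) % modulus           # restore register 1


registers = [123456, 987654, 0]   # the first two values are "someone else's data"
catalytic_multiply(registers, 6, 7)
print(registers)
```

Expanding the four updates of registers[2] shows that every term involving the borrowed contents cancels, leaving only the original value plus x1 * x2.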

More:
https://iuuk.mff.cuni.cz/~koucky/papers/catalytic.pdf
https://eprint.iacr.org/2018/194

Dolev–Yao model

Cryptographers sometimes pedantically refer to the “Dolev–Yao model” when they just mean the active attacker adversarial model, wherein the attacker can eavesdrop, intercept, and modify data transmitted. But the Dolev–Yao model is much more than this. It is the first formal model for cryptographic protocols, and a symbolic framework to describe and analyze their security.
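To give a flavor of what "symbolic" means here, the following sketch treats messages as abstract terms rather than bit strings, and computes what an attacker can deduce from observed messages. The term encoding (tuples tagged "pair" and "enc") and the function names are our own assumptions, and the deduction rules are limited to pairing and symmetric encryption; real symbolic tools are far richer.

```python
def can_build(term, known):
    """Can the attacker synthesize `term` from what it already knows?"""
    if term in known:
        return True
    if isinstance(term, tuple) and term[0] in ("pair", "enc"):
        return can_build(term[1], known) and can_build(term[2], known)
    return False


def analyze(observed):
    """Saturate the attacker's knowledge: split pairs, decrypt with known keys."""
    known = set(observed)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            if isinstance(term, tuple) and term[0] == "pair":
                new = {term[1], term[2]}
            elif isinstance(term, tuple) and term[0] == "enc" and can_build(term[2], known):
                new = {term[1]}      # ("enc", message, key): key known, so message leaks
            else:
                new = set()
            if not new <= known:
                known |= new
                changed = True
    return known


def attacker_can_derive(term, observed):
    return can_build(term, analyze(observed))


# The attacker sees a secret encrypted under k1, and k1 encrypted under a leaked k2.
observed = [("enc", "secret", "k1"), ("pair", "k2", ("enc", "k1", "k2"))]
print(attacker_can_derive("secret", observed))
```

In this model encryption is perfect by definition: without the key term, a ciphertext term reveals nothing, which is what makes automated analysis tractable.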

More:
https://www.cse.huji.ac.il/~dolev/pubs/dolev-yao-ieee-01056650.pdf
https://cseweb.ucsd.edu/classes/sp05/cse208/lec-dolevyao.html

Indelibility

The property of a timed transaction record (such as a blockchain transaction) that cannot be back-dated. In the context of IoT transactions, this can be an important property in situations with extreme network latencies and unreliable clocks.

More:
https://eprint.iacr.org/2018/664.pdf

INT-CTXT

Integrity of ciphertexts, a security notion applicable to authenticated encryption schemes that formalizes the practical impossibility for an attacker to create a valid ciphertext even if they know many valid ciphertexts for messages of their choice. If an authenticated cipher is both IND-CPA and INT-CTXT, then it is also IND-CCA.
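The notion can be phrased as a game, sketched below under our own assumptions: ToyAE is a toy encrypt-then-MAC scheme built from SHA-256 and HMAC (a stand-in for a real authenticated cipher, not something to deploy), and int_ctxt_game is a hypothetical harness in which the adversary wins if it outputs a ciphertext that decrypts successfully and was never produced by the encryption oracle.

```python
import hashlib
import hmac
import os


class ToyAE:
    """Toy encrypt-then-MAC scheme, for illustration only."""

    def __init__(self):
        self.enc_key, self.mac_key = os.urandom(32), os.urandom(32)

    def _stream(self, nonce, length):
        out, i = b"", 0
        while len(out) < length:
            out += hashlib.sha256(self.enc_key + nonce + i.to_bytes(4, "big")).digest()
            i += 1
        return out[:length]

    def encrypt(self, msg):
        nonce = os.urandom(16)
        body = bytes(a ^ b for a, b in zip(msg, self._stream(nonce, len(msg))))
        tag = hmac.new(self.mac_key, nonce + body, hashlib.sha256).digest()
        return nonce + body + tag

    def decrypt(self, ct):
        if len(ct) < 48:
            return None
        nonce, body, tag = ct[:16], ct[16:-32], ct[-32:]
        expected = hmac.new(self.mac_key, nonce + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None                      # invalid ciphertext
        return bytes(a ^ b for a, b in zip(body, self._stream(nonce, len(body))))


def int_ctxt_game(adversary):
    """Return True if the adversary forges a fresh, valid ciphertext."""
    scheme, seen = ToyAE(), set()

    def oracle(msg):
        ct = scheme.encrypt(msg)
        seen.add(ct)
        return ct

    forgery = adversary(oracle)
    return forgery not in seen and scheme.decrypt(forgery) is not None


def bit_flipping_adversary(oracle):
    ct = bytearray(oracle(b"pay 10 EUR"))
    ct[16] ^= 0x01                           # flip one bit of the encrypted body
    return bytes(ct)


print(int_ctxt_game(bit_flipping_adversary))
```

Replaying a ciphertext obtained from the oracle does not count as a win, which is why the game tracks the set of ciphertexts already seen.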

More:
https://www.cosic.esat.kuleuven.be/school-iot/slides/AuthenticatedEncryptionII.pdf
https://eprint.iacr.org/2018/135.pdf

Invisible and anonymous signatures

Invisibility is the property of a public-key signature that cannot be identified as valid or invalid unless the signer has agreed to reveal that information. This may sound like it makes the signature anonymous (that is, the signature does not reveal the signer’s identity, or public key), but it does not necessarily (counterexample: additionally sign the signature with a non-invisible signature scheme). However, any anonymous signature is invisible.

More:
https://en.wikipedia.org/wiki/Undeniable_signature
https://www.researchgate.net/publication/338794416_A_Note_on_the_Invisibility_and_Anonymity_of_Undeniable_Signature_Schemes

Non-outsourceability

The property of a proof-of-work system whose “work” cannot be outsourced to third parties without also sharing the outsourcer’s private key, and therefore access to mining reward. This was proposed to prevent pools and hosted mining. More generally, non-outsourceability can be the property of computations that cannot be delegated without compromising some sensitive data.

More:
http://soc1024.ece.illinois.edu/nonoutsourceable_full.pdf
https://eprint.iacr.org/2020/044.pdf

Indistinguishability obfuscation (iO)

Obfuscation is about taking as input a program and producing a second program that in some sense hides how the first program works: its internal variables, secret arguments, and so on. Cryptography sees a program as one of the possible abstract representations, typically a Boolean circuit with AND, OR, and NOT gates. iO can be seen as a raw encoding of the input–output relation that hides the “implementation details”, such as sub-procedures or intermediate variables.

The notion of indistinguishability is just a way to formalize the intuitive security notion of secure obfuscation, by saying that obfuscations of two distinct, yet equivalent programs, should not tell which of the two programs has been obfuscated. iO is very powerful and sounds like the solution to many problems, but in practice it’s not because of the high complexity and inefficiency. For example, iO gives you a straightforward way to create functional encryption and proxy re-encryption schemes, by obfuscating the decrypt-and-reencrypt process (interestingly, you can also get iO from functional encryption).
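For tiny programs, the “raw encoding of the input–output relation” can be taken literally: writing out the full truth table hides everything about the implementation, at a cost exponential in the number of input bits. The sketch below uses names of our own choosing and only illustrates the definition; it says nothing about how practical iO candidates work.

```python
from itertools import product


def truth_table_obfuscate(program, n_inputs):
    """Return the complete input-output table of a Boolean function (n_inputs >= 1)."""
    return tuple(bool(program(*bits)) for bits in product((False, True), repeat=n_inputs))


def run_obfuscated(table, *bits):
    """Evaluate the 'obfuscated program' by a table lookup."""
    index = int("".join("1" if b else "0" for b in bits), 2)
    return table[index]


# Two different implementations of the same function (XOR).
def xor_v1(a, b):
    return (a or b) and not (a and b)


def xor_v2(a, b):
    return (a and not b) or (not a and b)


table1 = truth_table_obfuscate(xor_v1, 2)
table2 = truth_table_obfuscate(xor_v2, 2)
print(table1 == table2)   # identical outputs: nothing reveals which program was used
```

Because equivalent programs yield the very same table, no distinguisher can tell which one was obfuscated; the whole difficulty of iO is achieving this without the exponential blow-up.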

More:
https://eprint.iacr.org/2013/454.pdf
https://eprint.iacr.org/2015/163.pdf

On C and embedded platforms

The C language is still prominent in the industrial embedded world, where “IoT” often refers to platforms much more limited than a Raspberry Pi. Often having to deal with such environments, we wrote the following informal explainer about C for internal company needs, and thought it could be of interest for more readers. This is basic material, mixing C and operating systems knowledge, aimed at readers with no or limited understanding of how you go from C source code to an executable. We could expand on many points, but for now we just share this meandering overview.

C compilation

C is really a kind of portable-ish assembler with an abstract model of memory. When we say compilation, we specifically mean the C-to-object translation phase. The input to this phase, a source file together with everything it includes, is called a “translation unit” in C-speak. Firstly, the file to be compiled is read. Then the preprocessor is executed: any preprocessor directives are expanded and removed, meaning headers are literally included directly in the file. Once this is done (and if it does not error), then and only then is the compilation phase run. The job of this phase is to turn C code into assembly and store it in a format for later use.

Linking

This is probably the most undocumented black magic all programmers rely on but don’t know about. From the above compilation phase we typically have an “object file”. So what you see is typically .c to .o, which you’re used to, all well and good. However, linking into the overall binary requires some further work and we must first take a digression into program loading.

Program loading

Again, operating systems are hiding a lot of complexity here. Programs on Linux are now ELF files (previously a.out, hence why GCC without a -o option when compiling produces a file named this way) and on Windows the executable format has gone through several iterations: flat DOS files were COM and had no structure at all. COFF followed and has structure, as does its successor PE (Portable Executable).

What these files do is tell the program loader where in the file the various sections of program code reside, and where the program would like sections to be loaded in memory. Modern operating systems often do not respect this request, partly because flags have been added to mark the code as “position independent” (-fPIC, or /dynamicbase) and so the code contains no addresses that need to be “fixed” by the program loader. An early performance problem in Windows programs was that DLLs had fixed base addresses and using the same one with multiple DLLs in the same process meant all subsequent loads also required rebasing, which was slow in the pre-1GHz processor days.

So to recap: PE and ELF files describe their internal contents to the operating system, such that code can be marked read+execute, data read only, bss read-write and so on. These files also contain two other important pieces of information: external library dependencies and information marking what architecture they run on. This allows the operating system to deny loading a program quickly if it isn’t the right architecture (it would likely simply crash if loaded).
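As a rough illustration of the architecture check, the sketch below reads the first bytes of an ELF header: the magic number, the word size and byte order from the identification bytes, then the 16-bit type and machine fields at offset 16. The function name and the small lookup tables are ours and deliberately incomplete; a real loader checks much more.

```python
import struct

# A few well-known values of the e_type and e_machine header fields.
ELF_TYPES = {1: "relocatable object", 2: "executable", 3: "shared object or PIE", 4: "core dump"}
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64", 243: "RISC-V"}


def parse_elf_header(data):
    """Extract word size, byte order, file type and architecture from an ELF header."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    word_size = {1: 32, 2: 64}.get(data[4])            # EI_CLASS
    byte_order = {1: "little", 2: "big"}.get(data[5])  # EI_DATA
    if word_size is None or byte_order is None:
        raise ValueError("unsupported ELF class or data encoding")
    fmt = ("<" if byte_order == "little" else ">") + "HH"
    e_type, e_machine = struct.unpack_from(fmt, data, 16)
    return {
        "word_size": word_size,
        "byte_order": byte_order,
        "type": ELF_TYPES.get(e_type, f"unknown ({e_type})"),
        "machine": ELF_MACHINES.get(e_machine, f"unknown ({e_machine})"),
    }


# A synthetic 64-bit little-endian header, so the example runs anywhere.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
print(parse_elf_header(sample))
```

On a Linux machine, passing the first 64 bytes of any binary (for instance one found under /bin) to parse_elf_header should report the architecture of that system.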

Linking again

So now we’ve talked about object formats, the job of the linker is to take the objects produced by compilation and put them together into the desired output. Since C supports functions implemented in other translation units (of course) and even in external libraries, part of the job of the linker is “symbol resolution”. It will try to find where these “symbols” are and match them up when producing the final binary. Specifically, it wants to know what address to encode for using call instructions, or if it should emit an entry saying “this program depends on an external library and wants function X from it, please load this before loading this program”.
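Symbol resolution can be modeled in a few lines. In the sketch below, each object is reduced to the set of symbols it defines and the set it needs, and shared libraries to the set of symbols they export; this data model and the function name link are simplifications of ours, ignoring addresses, relocations, weak symbols, and search order.

```python
def link(objects, shared_libraries):
    """Resolve symbols across objects; record what must come from shared libraries.

    objects: {object name: {"defines": set of symbols, "needs": set of symbols}}
    shared_libraries: {library name: set of exported symbols}
    """
    defined = {}
    for obj_name, obj in objects.items():
        for symbol in obj["defines"]:
            if symbol in defined:
                raise ValueError(f"duplicate symbol {symbol!r} in {obj_name} and {defined[symbol]}")
            defined[symbol] = obj_name

    dynamic_imports = {}
    for obj_name, obj in objects.items():
        for symbol in sorted(obj["needs"]):
            if symbol in defined:
                continue                      # resolved inside the program itself
            for lib, exports in shared_libraries.items():
                if symbol in exports:
                    dynamic_imports[symbol] = lib   # "please load this library first"
                    break
            else:
                raise ValueError(f"undefined reference to {symbol!r} in {obj_name}")
    return defined, dynamic_imports


objects = {
    "main.o": {"defines": {"main"}, "needs": {"helper", "printf"}},
    "helper.o": {"defines": {"helper"}, "needs": set()},
}
print(link(objects, {"libc.so.6": {"printf", "fopen"}}))
```

The two failure modes in the sketch correspond to the linker errors most C programmers have met: a symbol defined twice, and an undefined reference.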

There are in general two types of object produced by a linker: an ELF binary executable with an entry point (by convention this is a symbol called start or _start, but this jumps into libc and it is libc which eventually calls main after doing its own initialization) and a shared object (.so files) which basically exports a list of functions other code may use. The shared object concept comes from the days when disk space was limited. This allowed the same piece of code to be loaded by multiple programs, but only need to exist once on disk.

There is a third type of output, but the linker itself isn’t technically responsible for it. This is a “static library”. It is essentially a bit like a zip-archive (but not a zip) of object files. Linkers generally treat these files just as they treat other object files and will look in them and perform resolution as normal. This allows for the code to be entirely included in another executable or shared object without any external dependencies.

A full discussion of dynamic linking (how shared objects are eventually loaded) is incredibly complex and we won’t go into it here. What you need to know is that “soname” is used to allow multiple versions of the same object to exist on Unixes. You might have: /usr/lib64/libe4.so that symlinks to /usr/lib64/libe4.so.1, which symlinks to /usr/lib64/libe4.so.1.2.7. This allows applications to link at two levels. They may link: to /usr/lib64/libe4.so, which means “use the latest libe4” or to /usr/lib64/libe4.so.1, which means “use the latest libe4 with major version 1”.

As an aside, you might wonder to what degree you can control the output of the linker given you mostly just use it without ever thinking. Well, the linker has its own entire scripting language and ld --verbose will dump the default script it uses for your system. Here are some examples from the Xen hypervisor (which is essentially a kernel): https://github.com/xen-project/xen/blob/16bbf8e7b39b50457bb2f6547f166bd54d50e4cd/xen/arch/x86/xen.lds.S and https://github.com/xen-project/xen/blob/16bbf8e7b39b50457bb2f6547f166bd54d50e4cd/xen/arch/arm/xen.lds.S
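Coming back to the soname symlink chain, the sketch below models it with a plain dictionary instead of a real filesystem (the resolve helper and the version 1.3.0 are made up for the example). It shows why a minor upgrade only needs to retarget one symlink.

```python
def resolve(path, symlinks, max_hops=16):
    """Follow a chain of symlinks, modeled as a {link: target} dictionary."""
    hops = 0
    while path in symlinks:
        path = symlinks[path]
        hops += 1
        if hops > max_hops:
            raise RuntimeError("too many levels of symbolic links")
    return path


symlinks = {
    "/usr/lib64/libe4.so": "/usr/lib64/libe4.so.1",
    "/usr/lib64/libe4.so.1": "/usr/lib64/libe4.so.1.2.7",
}
print(resolve("/usr/lib64/libe4.so", symlinks))

# Installing a hypothetical 1.3.0 release only retargets the major-version link.
symlinks["/usr/lib64/libe4.so.1"] = "/usr/lib64/libe4.so.1.3.0"
print(resolve("/usr/lib64/libe4.so.1", symlinks))
```

Programs that recorded libe4.so.1 as their dependency pick up the new minor version at their next start, without being relinked.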

Static vs. dynamic executables

You might have heard of static vs. dynamic linking. This is quite simple: if you run “file” on a binary and it says “statically linked” then it has no dependencies on shared objects at all. If you try “ldd” it will say “not a dynamic executable”. By contrast, a dynamic executable will say so, and ldd will print the list of libraries required at load-time (more may be loaded in both cases by dlopen). It is not technically possible to have a static binary on Windows at all. In this case it tends to mean “the libc shipped with Windows is linked in statically, rather than as a DLL”. Similarly, with glibc, static linking is very-hard-to-impractical on modern Linux systems.

Lastly, static executables are not magically super portable. They still use system calls and so require a minimal kernel version that implements these calls. They are also bound to the operating system they are compiled for, generally speaking. A static binary can typically be produced with C using musl or uclibc, and Golang does this when cgo is not invoked.

Toolchains

“Toolchain” is the term we use for all the tools needed to build an executable in C. We’ll give an overview of the GNU tools, as these copied earlier Unix tools and are generally everywhere.

GCC & binutils

Firstly, there is binutils. This is a set of tools for working with binaries, specifically including an assembler, linker and objcopy. Secondly, there is the GNU Compiler Collection (GCC). We normally think of GCC as a C compiler (this is called the frontend) but technically it is simply a command line interface to the backend compilers, which are invoked depending on the file type. Now, GNU toolchains have two properties: triplets and platforms. A triplet specifies information about the machine to be produced, e.g. riscv64-none-elf: this says the code should be risc-v, there’s no OS expected, and the output format is ELF.

Platforms specify where the toolchain is to be built, where it will run, and what it will produce code for. These are respectively the GNU build, host, and target options. Yes, this means you can compile GCC on x86-64, where the GCC will run on aarch64, and produce code for riscv32. “Cross compilers” are generally those that target other platforms. Usually build==host in this case (normally x86-64) but the compiler produces output for another platform. GCC+binutils is not the only compiler suite in town, but it’s by far the most common because it runs absolutely everywhere. This is one of the most successful GNU projects and was a strong enabler of the Linux ecosystem.
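The build/host/target combinations can be summarized with a small helper. The function name is ours, and the labels follow the terminology we recall from GCC’s documentation on configure terms (native, cross, crossed native, crossback, canadian); treat them as indicative.

```python
def classify_toolchain(build, host, target):
    """Name a compiler configuration from its build, host and target platforms."""
    if build == host == target:
        return "native"
    if build == host:
        return "cross"            # runs here, produces code for another platform
    if host == target:
        return "crossed native"   # a native compiler, built on a different machine
    if build == target:
        return "crossback"
    return "canadian"             # all three platforms differ


print(classify_toolchain("x86_64", "x86_64", "riscv32"))
print(classify_toolchain("x86_64", "aarch64", "riscv32"))
```

The second call is the example given above: GCC compiled on x86-64, running on aarch64, producing code for riscv32.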

LLVM & Clang

LLVM and Clang are another pairing. They’re somewhat different in that they were designed to be more modular, which is what happens when you start your project in the 2000s having observed 20 years of GCC mistakes. Here’s how they fit together: LLVM is a virtual machine, but not the VMware sort. It has its own “instruction set” called intermediate representation, or IR. From this it has “backends” that translate that to the assembly of target architectures, and finally assemble them to machine code/objects. Originally it used GNU’s linker to link these objects, but has since grown its own. It has its own assembler too.

Clang is the C-language frontend. Specifically, it knows how to produce LLVM IR from C (and also C++ and Objective-C) and invoke the rest of the toolchain to get the linker to work and produce binaries. There are of course other LLVM frontends: Rust and Swift are the two best known, but there’s an Ada one and an OCaml one too.

Microsoft

Finally, Microsoft’s compiler is also widely used for Windows platforms. The architecture of their tools is pretty uninteresting from our point of view, but they have: cl.exe (compiler frontend), link.exe (linker), lib.exe (library tool), ml.exe (assembler) etc.

Run time symbols and ABI

Let’s briefly talk about run time symbols. When we’re outputting shared or static objects for the linker, the ELF and PE formats do not support arbitrarily-named functions as they were all invented when you were either quite young or not born. Just like internationalized domain names are encoded to fit into 80s-era DNS, so too are function names for languages that are not C or do not follow its naming conventions. In particular, a C++ function in a namespace like this:
    mylibrary::myfunction()
would be encoded as a mangled name such as _ZN9mylibrary10myfunctionEv (under the Itanium C++ ABI used by GCC and Clang).
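To show where such names come from, here is a deliberately tiny mangler covering one case only: functions taking no parameters, optionally nested in namespaces, under the Itanium C++ ABI scheme used by GCC and Clang. The function is our own sketch; it ignores parameter types, templates, substitutions, and the special treatment of the std namespace.

```python
def mangle(qualified_name):
    """Mangle a parameterless C++ function name, e.g. 'ns::func'."""
    parts = qualified_name.split("::")
    # Each identifier is encoded as its length followed by its characters.
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    if len(parts) == 1:
        return f"_Z{encoded}v"        # 'v' stands for an empty parameter list
    return f"_ZN{encoded}Ev"          # N ... E brackets a nested name


print(mangle("mylibrary::myfunction"))   # _ZN9mylibrary10myfunctionEv
```

Tools such as c++filt perform the reverse operation, which is handy when reading linker errors.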

The ABI is the application binary interface. This is what we expect programs to do when calling functions, and how they should pass arguments in registers, and so on. These days there are two standards for x86-64: AMD64 for Unix designed by AMD+Unix people, and Microsoft’s, who felt the need to be different. Since different platforms have different registers and even vastly different features, they typically have their own ABI. Sometimes, as was the case for 32-bit x86, there are multiple competing calling conventions produced by different compiler vendors.

When we talk about the ABI, we generally conflate both of the above. There are no standards in C and C++ for how processes should behave at this level, it’s simply by convention. However there are some important points to note: only the C convention is widely respected. Binary C++, Rust, Go, etc. distributions of libraries generally don’t happen. Hence #[repr(C)] and #[no_mangle] in Rust, and cgo in general.

libc

C doesn’t need a runtime (it is of course portable assembler) but the C standard does actually specify quite a lot that assumes a full operating system, including things like fopen for opening files. This necessitates a standard library to accompany the language. Rust has std and core, Golang is batteries included – this is well understood.

There is some blurring, particularly on Unix, as to what is “libc” i.e. ISO C, what is POSIX i.e. “Unix-like” and what is just plain Linux. A particular case where the roles are thoroughly and entirely blurred together is the dynamic loading of libraries. That is outsourced to /lib/ld-linux.so.2 (on libc6 systems), which is hardcoded into the ELF binary. This library is developed under the auspices of glibc. More information here: https://linux.die.net/man/8/ld-linux.so. In short, libc also provides the dynamic loading resolution, which is sort-of a core part of Linux/Unix. You can see some of the problems this causes with: https://www.musl-libc.org/faq.html and you will also find that a dynamically linked binary built against glibc will generally fail to run on Alpine Linux for the same reason, whereas a static binary will run (Alpine uses musl even for dynamic linking).

Like most standard libraries, some initialization is required and hence libc typically hijacks the default entry point symbol _start or start, and defines “main” as the starting point for consuming programs. Just as there are multiple competing compiler implementations, so there are multiple competing libc implementations. glibc is the typical Linux libc, from the GNU people. musl and uclibc are alternatives. Since libc is integral to programs and because there is some blurring between roles, libc also has a role to play as the program loader on Linux, especially when dynamic libraries are used. libc is typically implemented as a shared object (and as an example of soname, I vaguely remember the crossover from libc.so.5 to libc.so.6 from my early Ubuntu days). It is possible not to use libc by using the -ffreestanding option. This assumes the executable will provide everything it needs itself (including implementing its own entry point).

Microcontrollers and platforms other than the PC

If your environment has an operating system, you’re done. The rules are the same as for that operating system and the ABI it uses for that platform. How to do stuff on Linux on ARM is relatively well defined. The only exception to the standard development process is that you will probably use a cross compiler, because your x86 build machine is usually far more powerful than the ARM target. I strongly recommend against trying to compile large projects on a Raspberry Pi. The NDK for Android, for example, contains a cross compiler capable of targeting common Android targets, particularly armv7-a and aarch64.

Without an OS, you are in the same situation as a kernel engineer. You have to be careful how much of libc you rely on and you must define your entry point as the platform in question requires. You may have to set up interrupt handlers. There is no program loader: you will have to convert your ELF output to something like Intel’s HEX, or just a flat binary file that’ll be flashed to storage. You can use an RTOS kernel (previously discussed) that’ll do some of this work for you if it supports those chipsets. See https://en.wikipedia.org/wiki/Intel_HEX#Format and https://www.zephyrproject.org/

Kernel engineers generally have things slightly easier, namely that common bootloaders like grub actually can read ELF files too. UEFI firmware executes PE images (same as Windows) with a specific subtype to describe them as UEFI components. As a specific example of this, one can build the Linux kernel as an EFI binary. How much work needs to be done really depends on the platform.
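The Intel HEX format mentioned above is simple enough to sketch: each record is a line made of a byte count, a 16-bit address, a record type, the data, and a checksum that makes all bytes sum to zero modulo 256. The helper names below are ours, and the sketch only handles data and end-of-file records (no extended addressing beyond 64 KiB).

```python
def ihex_record(address, record_type, data=b""):
    """Encode one Intel HEX record as a text line."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF            # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def to_ihex(image, base_address=0, chunk=16):
    """Split a flat binary image into data records, followed by an end-of-file record."""
    records = [
        ihex_record(base_address + offset, 0x00, image[offset:offset + chunk])
        for offset in range(0, len(image), chunk)
    ]
    records.append(ihex_record(0, 0x01))      # end-of-file record
    return records


print(ihex_record(0x0030, 0x00, bytes.fromhex("02337A")))   # :0300300002337A1E
print("\n".join(to_ihex(bytes(range(20)))))
```

In practice you would let objcopy produce this file from the ELF output rather than write the encoder yourself.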

Depending on the system in use, the toolchain may integrate and try to “hide” some of the complexity of initializing hardware by hand. For example, Atmel Studio is a Visual Studio Shell-based project (uses VS but implements its own project logic) that has various wizards and emits appropriate code for each board. More commonly, a “board support package” is provided (https://en.wikipedia.org/wiki/Board_support_package) that may look like anything but commonly contains a makefile-based configuration system. For embedded Linux, Yocto (based on OpenEmbedded) and Buildroot are now quite common. Buildroot in particular is a Linux kernel menuconfig style system where you pick your configuration and then type make and wait.

Some concepts in amongst this are important. First is the “hardware abstraction layer”. In Windows, this tries to hide the details of the CPU setup in use, in particular between uniprocessor (old), simultaneous multithreading (one core running multiple hardware threads) and symmetric multiprocessing (multiple independent processors or cores, possibly with multiple threads each). More generally, it means trying to make higher level code as agnostic as possible with regards to the hardware. Second, the concept of scheduling simply does not exist at the very low level. On a typical operating system, your kernel sets an interrupt timer to take back control periodically from various processes, decides what to schedule next and then does that, even with kernel threads (this is called a fully pre-emptive system). The opposite extreme is cooperative scheduling, where the process must indicate that it wants to give up control at certain points. If it does not, e.g. if it gets stuck in a spin-loop, that’s game over. Most typical microcontrollers are uni-processor (single threaded).
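Cooperative scheduling is easy to model with generators, where each yield is the point at which a task voluntarily hands control back. The task and scheduler below are a toy of ours, not how any particular RTOS implements it.

```python
from collections import deque


def task(name, steps, log):
    """A task that does `steps` units of work, yielding after each one."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                      # voluntarily give control back to the scheduler


def run_cooperative(tasks):
    """Round-robin over tasks until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the task until its next yield
        except StopIteration:
            continue               # task finished, drop it
        ready.append(current)


log = []
run_cooperative([task("A", 2, log), task("B", 1, log)])
print(log)   # ['A:0', 'B:0', 'A:1']
```

A task that loops forever without reaching a yield would never return to run_cooperative, which is the “game over” scenario described above; a pre-emptive system relies on the timer interrupt to avoid it.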

Debugging and flashing

This last section will be brief, although there’s a lot to say, which might appear in a subsequent post. Essentially the common debug protocol is U(S)ART or the ARM-extension SWD. Modern PC BIOSes are typically flashed over SPI. Most “development kits” for microcontrollers come with an “On Circuit Debugger”, which means the USB port provides three things: power to the device, sometimes a serial port, and usually a UART-over-USB port. This allows for easy debugging. You can buy dedicated debugging kit, however.

IoT end-to-end encryption with E4 (6/n): deployment

In this post, we’ll present how E4 could be deployed in your infrastructure. We’ll go through its requirements, its components, and the additional services that we offer for monitoring and observability. We’ll describe how everything is set up, from application deployment to initial provisioning of devices and the start of operations.

E4 is mainly offered as an on-premise solution, such that our applications will run on your own infrastructure, as opposed to a managed version. Our philosophy is to have a minimal footprint, and a tight exposed surface from our services. We also carefully designed E4’s back-end to be modular, so we can easily adapt to your constraints. We can also interface with your existing services, such as your own database server or logging system, if you already have them in place.

Deploying E4 is a 4-step process:

  1. Installing applications and services
  2. Preparing the device software
  3. Enrolling the devices
  4. Operating the devices

Below we’ll review each of these steps:

Installing applications and services

E4 is composed of several components and services, where the bare minimum required is:

  • Our E4 core application (C2)
  • An MQTT broker (typically the one already used by your IoT devices); we use VerneMQ, but any MQTT 3.1-compliant broker will do
  • A relational database management system (any *SQL, such as SQLite, PostgreSQL, MySQL, and so on). Teserakt can provide the database (including failover replication and backups) or reuse one you already have.

Additional components typically required for production environments are:

  • Our automation engine application, enabling key rotation policies
  • A web UI, to ease device and topic management (or two command line clients, providing the same set of features)
  • ElasticSearch, Logstash/Filebeat, Kibana instances for the application logs and/or MQTT messages collection, processing, and analysis
  • Jaeger and OpenCensus agent instances, for observability metrics collection and analysis

The high-level architecture looks as described in the diagram below:

[Diagram: E4 high-level architecture]

Hosting requirements

Sizing the hosts for deploying all the stack is mostly dependent on the number of devices you are managing, but our applications are very lightweight, and usually require few resources to operate (that is, less than 500 MB of memory). Other services such as storage and monitoring require more, but all in all everything would fit on a single standard virtual machine, with 4 cores and 8 GB of memory.

Our back-end applications are built in Go and are statically compiled by default. This allows them to run on a wide range of operating systems and architectures, be it Linux, macOS, Windows, or Arm platforms, without any additional libraries or packages.

Additionally, our stack can fully be deployed in containers, such as Docker. Application images are built from our continuous integration on each release, and can be pulled from our private registry. Containers make it really easy to deploy and update, but are not always an option. We can adapt to what works the best for you.

Automated installation and updates

We want to offer repeatable, reproducible, and verifiable deployments. This is why we chose Ansible and scripted the full installation process of our services. This requires at least one host, and SSH access with a user having administration privileges. All the services are configured from a single file, making first installation or maintenance updates as simple as the push of a button.

Monitoring

Although we can provide our monitoring stack, we don’t exclude integrating with your own to leverage your existing processes and avoid having to define new ones. All our applications produce structured logs in JSON format, making it easy to adapt and transfer to any log management solution or SIEM. Observability traces can also be forwarded to any OpenTracing compatible back-end of your choice.

Scalability and high availability

Our deployment supports horizontal scalability, by load-balancing requests to our applications. But it would require a large number of operated devices before hitting the limits of a single instance, as our applications generate no load or traffic when not provisioning new devices or performing key rotation operations. For high availability, we still suggest deploying failover instances of the C2, automation engine, and database in order to ensure service continuity. This being highly dependent on your infrastructure and use cases, we’ll provide advice tailored to your needs prior to the initial deployment.

Preparing the device software

Here, the goal is to add the E4 client library in your application, in order to support message protection. We’ve already discussed the client library integration in our previous post, and here’s a quick reminder:

  • The application must load a saved version of an E4 client during initialization, or create it if it does not exist yet
  • The application must subscribe to its device’s control topic upon initialization
  • Outgoing MQTT messages must be protected before being published
  • Incoming messages must be unprotected before letting the application process them

Our client library is open-source and available in various languages and formats to allow a wide range of applications to include it. But in case you don’t see one fitting your needs, feel free to contact us.

Enrolling the devices

When integrating the client library in your application, you will probably notice that creating an E4 client requires an initial key. Such a key will for example be generated and burned into the device at production time. Then, you will be ready to provision the device.

We’ve identified two main scenarios to provision device keys in E4:

  • Manually: where you will need to hold the device identifier and its key (either a symmetric key or a public key, depending on the preferred mode of operation), and manually provide that information to the C2, via the CLI client, the web interface, or by directly calling the API.

  • Automatically: based on a trust-on-first-use (TOFU) mechanism, whereby the device will be first started inside a trusted environment, and register itself in E4 by making a C2 API call, with its identifier and key.
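The trust-on-first-use logic boils down to “accept the first key seen for an identifier, and refuse any different key afterwards”. The sketch below is a hypothetical in-memory registry of ours; it is not the C2 API, and a real deployment would also restrict registration to the trusted environment.

```python
import hmac


class EnrollmentRegistry:
    """Toy trust-on-first-use registry mapping device identifiers to keys."""

    def __init__(self):
        self._keys = {}

    def register(self, device_id, key):
        """Return True if the device is accepted, False on a conflicting key."""
        known = self._keys.get(device_id)
        if known is None:
            self._keys[device_id] = key      # first use: trust and remember
            return True
        # Later registrations must present the very same key.
        return hmac.compare_digest(known, key)


registry = EnrollmentRegistry()
print(registry.register("sensor-42", b"\x01" * 32))   # first use, accepted
print(registry.register("sensor-42", b"\x02" * 32))   # different key, rejected
```

The security of this scheme rests entirely on the first registration happening where no attacker can race the legitimate device.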

Operating the devices

Once registered on the C2, the device is almost ready to be operated. The last step is to assign it the topics it must publish to or receive messages from, so that the C2 server can transmit the keys of those topics to the device. With a proper client library integration, and the device powered up and enrolled, this should only require a few clicks from the web UI, and can also be automated in the enrollment phase.

To deal with reboots or power failures, the E4 client saves a copy of its state inside the provided storage when it is first created. The device application can then restore the client from this saved state when it initializes, without having to go through the E4 client creation steps again. The E4 client updates the saved data automatically each time it is modified by a control message, such as when receiving a new key (from a key rotation), when a topic is assigned or removed, or any other operations where information is transmitted to the client and must be persisted.
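The load-or-create and save-on-change pattern can be sketched as follows. Everything here is hypothetical (the class name, the JSON file layout, the command names) and does not reflect the actual E4 client library; in particular, a real client would protect the stored keys rather than write them in the clear.

```python
import json
import os


class ToyClientStore:
    """Toy client state, restored from disk if present, created otherwise."""

    def __init__(self, path, initial_key_hex):
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)          # restore after a reboot
        else:
            self.state = {"device_key": initial_key_hex, "topic_keys": {}}
            self._save()                           # first creation

    def _save(self):
        # Write to a temporary file, then rename, so that a power failure
        # mid-write does not corrupt the previous state.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp_path, self.path)

    def on_control_message(self, command, topic=None, key_hex=None):
        if command == "set_topic_key":
            self.state["topic_keys"][topic] = key_hex
        elif command == "remove_topic":
            self.state["topic_keys"].pop(topic, None)
        elif command == "set_device_key":
            self.state["device_key"] = key_hex
        else:
            raise ValueError(f"unknown command {command!r}")
        self._save()                               # persist every change
```

Creating a second ToyClientStore on the same path after a simulated reboot returns the state left by the last control message, and ignores the initial key passed to the constructor.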

Conclusion

From here, your devices are ready to communicate securely, with minimal changes to your infrastructure and applications. Thanks to our modular approach, E4 can be easily adapted to your needs and constraints. Contact us if you’d like to learn more about it or have any questions.

IoT end-to-end encryption with E4 (5/n): cryptography

After our last post on E4’s security model, we’d like to focus on the core cryptography components that we chose, and provide some rationale for these choices. We realized that there can’t be a “best” choice of cryptographic algorithms, for it’s hard to establish a single unified metric of goodness when, in practice, meaningful factors can be not only theoretical security and optimal speed on the target platform (which is the simplified model often considered) but also possibly:

  • Available support, be it hardware or software, in the platforms to be supported. For example, a product may run on two types of devices, the older generation and the new one, each one having its own set of hardware accelerators and software libraries.
  • Availability and performance of a portable implementation running on all platforms. A cryptographic primitive may have acceptable performance when using platform-optimized code, but may be too inefficient with portable, non-platform-optimized code.
  • Standardization requirements. The most common example is that of a product requiring FIPS 140–2 accreditation, in which case primitives not included in the FIPS 140–2 standard cannot be used.

Whereas it’s fair to say that security is a “commodity”, and that peak speed is rarely a concern on IoT platforms, the above three requirements are quite common. For example, a device might already use AES for other purposes than end-to-end encryption and it therefore makes sense to reuse it for the encryption layer. This is particularly true when, like E4, you deal with a component that is directly integrated in the software of the platform (be it at the bottom of the application layer or top of the transport layer), as opposed to approaches that need to run a dedicated application with its own process, memory space, and so on.

What cryptography does E4 use?

E4 supports two main cipher suites, implemented in the C and Go open-source libraries:

  • Symmetric-key mode, which like mobile telephony and TLS’ PSK mode uses pre-shared symmetric keys. In this mode, we use AES-SIV (with a 256-bit key for the PRF and for encryption), SHA-3-256, as well as Argon2id to derive keys from passwords in our client applications. In this mode, the C2 server shares the identity key of each device, and manages the topic keys assigned to each topic (or “conversation”), sharing them with authorized clients.
  • Public-key mode, where each client has a private/public identity key pair: the public key is known to the C2 and the private key remains on the device. Furthermore, devices are pre-provisioned with the C2’s public key, which is used along with the devices’ public keys to send encrypted control messages. We use the “25519 ecosystem” here: X25519 for Diffie-Hellman key establishment, and Ed25519 for signing messages sent by devices. These messages are still encrypted using symmetric cryptography: since messages are not sent to a specific device but (as in MQTT) tagged with a topic, using asymmetric key establishment or encryption wasn’t suitable.

In addition, we have defined further cipher suites that are not integrated in our client library by default:

  • The FIPS 140–2-friendly crypto, which only uses primitives covered by FIPS 140–2, namely AES-GCM (instead of AES-SIV), and nistp256 elliptic-curve crypto (with ECDSA instead of Schnorr-like Ed25519).
  • The post-quantum mode, as a variant of our public-key mode, using post-quantum signature and key establishment primitives. We have experimented with different algorithms, offering different trade-offs, and can propose different options for different performance and security profiles.

The remainder of this post is structured as a Q&A, and answers some of the most common questions we have heard, as well as questions that we anticipate from users as well as cryptography experts. If you have other questions, please don’t hesitate to contact us at [email protected].

Why AES-SIV and not AES-GCM, which is more standard?

Simple: we don’t want to become completely insecure if a device has no PRNG or a weak one. Whereas GCM is broken when a nonce is repeated, SIV with a fixed nonce has optimal security for a deterministic scheme: it only reveals duplicate messages, because it then transforms the same input into the same output. In E4, this can only occur if the device does not have a clock (because a timestamp is used as associated data).
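To make the deterministic behaviour concrete, here is a toy sketch of the SIV idea in Python. It is not AES-SIV: HMAC-SHA-256 stands in for the CMAC-based PRF and an HMAC-based counter keystream stands in for AES-CTR, so that the example runs with the standard library only. The function names are made up for illustration, and the code must not be used to protect real data.

```python
import hashlib
import hmac


def _keystream(key, iv, length):
    # Stand-in for AES-CTR: MAC successive (iv, counter) blocks until long enough.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _synthetic_iv(mac_key, associated_data, plaintext):
    # The IV is a MAC over the associated data and the plaintext. The length
    # prefix keeps the (associated_data, plaintext) split unambiguous.
    data = len(associated_data).to_bytes(8, "big") + associated_data + plaintext
    return hmac.new(mac_key, data, hashlib.sha256).digest()[:16]


def toy_siv_encrypt(mac_key, enc_key, associated_data, plaintext):
    iv = _synthetic_iv(mac_key, associated_data, plaintext)
    ks = _keystream(enc_key, iv, len(plaintext))
    # The IV doubles as the authentication tag and is sent with the ciphertext.
    return iv + bytes(p ^ k for p, k in zip(plaintext, ks))


def toy_siv_decrypt(mac_key, enc_key, associated_data, ciphertext):
    iv, body = ciphertext[:16], ciphertext[16:]
    ks = _keystream(enc_key, iv, len(body))
    plaintext = bytes(c ^ k for c, k in zip(body, ks))
    # Recompute the IV from the decrypted data; a mismatch means tampering.
    if not hmac.compare_digest(iv, _synthetic_iv(mac_key, associated_data, plaintext)):
        raise ValueError("authentication failed")
    return plaintext
```

Encrypting the same plaintext twice with the same associated data yields the same ciphertext, which is the only thing an observer learns. As soon as the associated data changes (for example, a different timestamp), the synthetic IV and the whole ciphertext change.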

GCM is nonetheless supported when FIPS 140–2 compliance is required, in which case a unique nonce must be used for each message encrypted with the same key. In this case, we use the PRNG available with additional data in order to mitigate the risk of a poor PRNG.
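One way to picture this mitigation is sketched below. It is only an assumption about how PRNG output could be combined with additional data, since the exact construction is not spelled out here. The function name, the per-key message counter, and the use of SHA-3-256 for mixing are all illustrative choices.

```python
import hashlib
import os


def derive_gcm_nonce(counter, timestamp, prng=os.urandom):
    """Return a 96-bit nonce mixing PRNG output with a counter and a timestamp.

    Hypothetical construction: even if `prng` returns constant bytes, the hash
    input still differs whenever the (counter, timestamp) pair differs.
    """
    material = prng(16) + counter.to_bytes(8, "big") + timestamp.to_bytes(8, "big")
    return hashlib.sha3_256(material).digest()[:12]
```

With a healthy PRNG this behaves like a random nonce. With a broken one, uniqueness falls back on the counter and the timestamp rather than being lost entirely.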

Why SHA-3 and not BLAKE2, which is faster?

We’ve nothing against BLAKE2 (nor BLAKE3), but we believed that embedded platforms and their libraries (open-source or proprietary ones) were more likely to support SHA-3 than BLAKE2. They would be even more likely to support SHA-256, but we wanted to avoid the length-extension property. Note that Ed25519 uses SHA-512 internally, as specified.

Why 256-bit symmetric keys?

We use 256-bit symmetric keys, which provide a 256-bit security level for AES-SIV, when 128-bit might arguably be enough (and make AES a bit faster). Another objection is that Curve25519 provides 128-bit security. An argument for 256-bit symmetric keys might be that they guarantee post-quantum security. A more concrete motivation for us was the demand of certain customers for 256-bit symmetric keys instead of just 128-bit.

What side-channel defenses do you have if any?

Our code aims to run in constant time with respect to sensitive data (except for the length of encrypted data). Defenses against side channels such as electromagnetic emanations, or against invasive attacks such as laser fault attacks, tend to be specific to the platform and threat model. For example, you might protect against single-fault attacks by using some basic level of redundant computation, though this might be insufficient against attacks injecting two or three faults simultaneously. Furthermore, such fault attacks could also be mitigated by physical means such as fault detection mechanisms.

A specific class of side-channel attacks is timing attacks on AES that exploit secret-dependent memory look-ups. A variety of techniques exist to defend against these. Tricks such as masking, which add and remove harmless information during the operation so as to obscure the desired information, are commonly used in public-key cryptography.

This is harder to do in a symmetric context, as well as in our embedded context as it relies on good randomness. We must therefore look to the implementation itself. The naive implementation of AES from its description, along with table-based AES, uses tables held in main memory that have different access times depending on previous access patterns and which addresses have been moved into higher caches. This clearly cannot be used.

Techniques to implement ciphers in “constant-time” exist and have been known for some time. Mike Hamburg et al. applied a vector permutation technique to AES which is suitable for platforms with SIMD-type registers, such as SSSE3 on x86. OpenSSL supports this for pre-AESNI processors. Clearly, however, this is not ideal on Cortex-M and 8-bit AVR processors.

Another technique was pioneered by Eli Biham and named “bitslicing” by Matthew Kwan. It reduces the standard algorithm description to a series of logic circuits, implementing the same algorithm in terms of “logic gates”, i.e. AND, OR, NOT, and XOR instructions. In order to do this, we often have to output multiple blocks from the original algorithm in one go. During the AES competition, the Serpent submission was self-bitslicing and could output a single block of the primitive; unfortunately, when the technique is applied to AES, multiple blocks must be output. The same technique has been applied to AES by Boyar and Peralta, and Thomas Pornin implemented this in his excellent BearSSL library.
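As a small illustration of the bitslicing idea, the sketch below evaluates a 3-bit S-box on many inputs at once using only bitwise operations. The S-box is a 3-bit mapping where each output bit is x_i XOR (NOT x_{i+1} AND x_{i+2}), chosen here only because its circuit is tiny. It is not the AES S-box, whose circuit is far larger. All function names are made up for the example, and Python itself gives no timing guarantees: the point is the structure, not the actual side-channel resistance.

```python
def sbox3_reference(x):
    """Reference version on a single 3-bit value, written bit by bit."""
    bits = [(x >> i) & 1 for i in range(3)]
    out = 0
    for i in range(3):
        out |= (bits[i] ^ ((bits[(i + 1) % 3] ^ 1) & bits[(i + 2) % 3])) << i
    return out


# A table-based implementation would index this list with secret data,
# which is exactly the memory access pattern that leaks through caches.
TABLE = [sbox3_reference(x) for x in range(8)]


def to_bitplanes(values):
    """Transpose the inputs: plane i holds bit i of every input value."""
    planes = [0, 0, 0]
    for pos, v in enumerate(values):
        for i in range(3):
            planes[i] |= ((v >> i) & 1) << pos
    return planes


def from_bitplanes(planes, count):
    """Inverse transposition: rebuild one 3-bit value per bit position."""
    return [sum(((planes[i] >> pos) & 1) << i for i in range(3))
            for pos in range(count)]


def sbox3_bitsliced(values):
    """Evaluate the S-box on all inputs at once with AND/XOR/NOT only."""
    mask = (1 << len(values)) - 1          # keep NOT within the used bit positions
    a, b, c = to_bitplanes(values)
    out = [
        a ^ (~b & c & mask),
        b ^ (~c & a & mask),
        c ^ (~a & b & mask),
    ]
    return from_bitplanes(out, len(values))
```

Each bit position of the three planes carries one independent input, so a single pass through the three-line "circuit" processes every input in parallel. This is also why a bitsliced AES naturally produces several blocks at a time.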

Hardware support, if it exists, should be constant-time, so if available this can also be used. If not, we use the best available option: Thomas’ bitsliced implementation.

Can I use the platform’s primitives implementations instead of those from the E4 client library?

Yes, for example if you wish to use hardware accelerators on your platform, or implementations already available in your application, we can tweak our client library to use those.

How do you store secret keys?

Storing secret keys depends to a certain extent on what is available on the platform in question. An embedded system with a full Linux system might use a scheme not too dissimilar to the desktop. However, keys need to be protected as best as is possible. Secure EEPROM components might be used and in this case E4 is written in such a way as to support writing to such a location. If technology such as ARM’s TrustZone is available, E4’s logic can run in the Secure Zone to protect key material from normal components that are potentially exploitable and run in the Non-secure Zone.

Since E4 is built on well-tested cryptographic standards, we can also make use of secure elements if present. These provide smartcard-like functionality to protect keys from theft while providing a signing facility. This is particularly useful in public key mode.

Finally, as you would expect, if hardware acceleration is available for symmetric primitives, E4 can be configured to use this. AES acceleration is typically available in ARMv8 A series and onwards, although care must be taken (the Raspberry Pi 4 chipset, for example, uses a Broadcom BCM2711 System-on-Chip without AES support). Alternatively, something like the i.MX6’s Cryptographic Acceleration and Assurance module could be used on NXP’s chipsets, or other accelerators on other chipsets.

Are your implementations formally verified?

No, not yet at least. But as mentioned above, other implementations of AES, SHA-3, or Curve25519 can be used. An option is for example to use the formally verified implementations from the HACL* project. We are currently working on formal models of our protocol to formally verify its security properties.

Are there any backdoors or “golden key”?

No, you can check in the code of our open-source libraries. If you use E4 in your products, you can have full access to the source code of the software deployed on your platform.

IoT end-to-end encryption with E4 (4/n): security model

Is your communication secure? This sounds like a simple question, but a closer look at any communication protocol, boundary or permissions system will show it is anything but simple. You will often see more seasoned security professionals, before implementing security controls, talk about a threat model. More simply put: whether something is secure or not depends on your assumptions and expectations.

Threat modelling

We will use two examples to introduce the concept of threat modelling in the context of encryption applications:

BEAST is an attack discovered several years ago that could decrypt data under specific conditions by misusing the TLS protocol. There are many good summaries and explanations of this attack online, so we will not repeat the details here. The important point is that the capabilities of an attacker were never really put forward, beyond a vague interception capability. More precisely, the model overlooked risks unique to the CBC mode of operation. We thus had higher expectations of the protocol than it actually provided.

Let us pick another, sometimes controversial example. Signal is a state-of-the-art end-to-end encrypted messaging service that requires accounts be bound to a phone number. This could be considered problematic if you expect Signal to hide who you are: you need to acquire a SIM card not associated with your identity, which is difficult in many European countries. You may also have more benign reasons: you might not want to give out a number you also use for WhatsApp and on which people can call you. However, if you start from the assumption that either this is not Signal’s concern, or that your adversary always knows who you are, then Signal suddenly looks much better. It depends on your perspective and your expectations.

A reasoning for the E4 security model

When we set out to design E4, we set ourselves the goal to bring state-of-the-art end-to-end encryption to the internet of things; or “Signal for IoT” if you like, yet technically very different than Signal. However embedded devices have a different set of constraints than mobile devices or computers. There is the obvious, namely that they tend to have a less powerful processor and much more limited memory. They may even work on battery. Less obvious is perhaps the fact that the device might be a sensor or other remote device, so human interaction to fix issues or accept key rotations may either be prohibitively expensive or outright impossible.

The Signal protocol allows you to verify the identity keys of the party you are talking with by comparing fingerprints. Once done, you can be sure that you are exchanging messages with the correct party – any change will be detected. Further, thanks to the double ratchet, your messages have “post-compromise security” – if the key for a single message is compromised, the damage is limited to a few messages, not the entire conversation. Signal also does not require heavy parsing of complicated data formats like ASN.1. It is widely recognized as the state of the art in messaging security.

It may therefore seem logical to adopt Signal directly for embedded and IoT devices. Unfortunately it is not quite as straightforward as this. What we want, therefore, is to get as close to the level of security provided by Signal as is possible given the constraints of the embedded world. Also note that, instead of just “encryption”, we often prefer to talk of data protection, which covers other security aspects such as authenticity, integrity, and replay protection.

A high level view of requirements

What specifically do we want? After much research and discussion with our partners and customers, we came up with a list of requirements for E4. We will begin with some of the more obvious ones:

  • Data must be protected from its producer to its consumer(s), keeping in mind that there may be one or many data consumers, and that this set may change over time.
  • Devices may be remote and must be autonomous. There may be no capacity for a human to validate keys or otherwise take decisions on the devices.
  • A fleet, or perhaps multiple fleets, of devices are managed by a single organization. This organization’s devices may have to interoperate with legacy systems, and potentially external systems.
  • The organization has the capacity to deploy a management component, but may otherwise use infrastructure they do not trust and in particular wishes to trust the minimal amount of infrastructure possible.
  • The network between devices and from device to any controlling servers may cross multiple protocols. For example, the first “hop” may be LTE (“4G”) or satellite for some devices, followed by the public cloud.
  • Devices don’t necessarily address messages to another device or entity, but instead may tag messages with a “topic”, such that all clients subscribed to that topic will receive the message.

Some consequences of these choices are immediate in our design. In order to trust the minimum possible infrastructure, we made the choice to provide application layer end-to-end encryption – as opposed to a transport-specific protocol. This means our cryptography can be adapted to the protocol in question and if necessary wrapped in multiple different protocols. This also means only the two devices communicating actually need to read the message.

The lack of operator/human interaction also implies keys need to be managed. As a result, we engineered a command-and-control server for this purpose, which we call C2 for short. The server needs to be deployed somewhere with sufficient security controls, but otherwise acts as a client within the network and does not need to remain online continuously.

The C2 occupies a privileged position in our protocol. As will be discussed later, in some deployments it will possess secret keys for the devices, and in all deployments it replaces the human operator in deciding what keys to trust, when to change them and so on. Its instructions are then sent over a channel that is secured separately for each device. We assume that the C2 is owned and operated by, or on behalf of, the same organization running the devices and the operators of the C2 are trusted. This is in contrast to the cloud and network infrastructure that might be used for communication, including brokers and gateways.

An obvious objection to this in a secure messenger scenario such as Signal would be that it amounts to key escrow, and the C2 can decrypt any messages on the network. This is true, but the scenario under consideration is different. In a secure messaging scenario, individuals may not be part of the same organisation and may not completely trust each other and as such key escrow is inappropriate. In the E4 model, human interaction with devices cannot be assumed: the device cannot ask a human to approve a fingerprint, verify another device, join or leave a group. We must offload this work to a third party. We also make the assumption that all the devices belong to the same organization and it is acceptable for that organization to have access to all keys of its own devices, should it desire.

We have talked about trust for a whole organization. It is of course entirely possible that an organization could run multiple, siloed C2s if they wished to enforce an entirely separate trust boundary. Nothing in E4 prevents this.

The challenges of embedded devices

Now we will look at the specific challenges posed by embedded devices.

  • Embedded devices may be capable of limited cryptographic primitives, or may only support a restricted set of algorithms. We will cover our specific cryptographic choices in a later post.
  • Devices may have very limited storage space. Within the constraint that something must be stored, E4 should still work.
  • We may not be able to rely on the clock on the device. Perhaps the clock is unreliable, or perhaps there simply is no clock. On the other hand, if a reliable clock exists we wish to exploit this fact.
  • The device may not have a cryptographic coprocessor or other “secure element”. Customers may for example be retrofitting older devices. In particular, while customers may be able to deploy new hardware based on hardware security solutions, we did not want to require this. On the other hand, if such a device exists, we would like to be compatible and able to exploit it.
  • The device may have poor to no entropy. This is a known problem with embedded devices whose complexity compared to desktop computers is much reduced and for which there is little environmental randomness to exploit, or doing so is incredibly slow.

Our desire to bring strong encryption to most embedded devices, even older ones, led to our decision to make E4 a software solution. We have of course not discounted the risk of hardware tampering and attacks and customers who wish to use components that protect against such risk may of course do so. For example, a platform can leverage AES accelerators or ARM TrustZone to make their E4 deployment more secure. We have chosen standard cryptographic primitives that are highly likely to be compatible with secure elements.

Limited storage space and limited cryptographic primitives led us to our most restrictive design: a symmetric-key only mode. There is a tradeoff here, of course, but we are still able to provide excellent levels of security.

Like most protocols, we wanted to provide anti-replay protection and we implement this using a time window during which messages may be accepted. If a clock is not considered reliable, we can use our protocol’s control messages for synchronization. If a clock does not exist, or this protection is not desired, it can be disabled.
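The time-window check can be sketched in a few lines. The function below is a hypothetical illustration, not E4's implementation: the 600-second default is arbitrary, and `clock_offset` models a correction that a device with an unreliable clock could learn from control messages.

```python
import time


def within_replay_window(msg_timestamp, window_seconds=600, now=None, clock_offset=0):
    """Accept a message only if its timestamp is close enough to the local time.

    Passing window_seconds=None disables the check, for devices without a
    clock or deployments that do not want this protection.
    """
    if window_seconds is None:
        return True
    if now is None:
        now = int(time.time())
    # Apply the learned correction before comparing with the message timestamp.
    return abs((now + clock_offset) - msg_timestamp) <= window_seconds
```

A message stamped outside the window is dropped even if its authentication tag is valid, which is what stops an attacker from replaying an old, legitimately protected message later on.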

The lack of clock also introduces another problem: revocation. In the internet’s Public Key Infrastructure this is done by associating a key (certificate) with a window of validity (expiry date). There are documented problems with this approach, namely if keys must be revoked prior to this expiry date. We certainly cannot afford to process lists of expired keys to achieve this end, and if the clock cannot be trusted then expiry itself becomes problematic. Luckily, we can simply take inspiration from the double ratchet and be bullish: when we want to expire a key, we replace it and flush it from memory.

Surmounting the entropy problem has been studied in cryptographic research. Protocols that rely on certain parameter choices being random have led to attacks when they are not, such as the repeated-nonce ECDSA bug in the PlayStation 3. This has led to deterministic encryption designs, and in particular synthetic initialization vector designs (such as deterministic ECDSA and AES-SIV). These designs are secure without the need to provide entropy to the cryptographic code. For key generation, the C2 takes over responsibility and is presumed to run on a server which has more than sufficient entropy – this is true of almost any normal server you have and does not impose any specific requirements.

Device communication

Our next step was to look at how devices actually communicate. There are two obvious cases: receiving firmware updates from the organization’s server, and submitting data back to a collection point.

However, we didn’t want to constrain our protocol to these two use cases and have to add fixes at a later date. Given the need for the C2 to manage keys on devices, we needed a separate control channel that we have discussed previously; this should be separate to a normal message channel that devices may use. Devices may be subscribed to a single message channel in addition to their control channel, or they might be part of a group shared with other devices. Those mappings need to be changed dynamically and we may wish to rotate keys based on various criteria that have been discussed in previous posts.

From the point of view of security constraints, this does not impose anything on us aside from one requirement: we need to be able to rotate keys for all channels, including the device’s own key, as needed.

When we are not so constrained

Not all devices are so constrained. In particular, surprisingly small devices are capable of public key cryptography and post-quantum algorithms are being benchmarked against the ARM Cortex-M4.

Customers may wish to use public-key cryptography if it exists. They may also want to use post-quantum algorithms and we should be ready to deploy these once standardized.

On the other hand, “cipher agility”, where protocols negotiate the cipher suite they will use, is inefficient and has documented downgrade problems. We wish to avoid this, and do so in the most trivial way possible: we have different versions of our protocol for these use cases, which customers may adopt as necessary.

The E4 model: summary

Putting all of this together, E4 is designed to meet the following constraints:

  • Devices in a single organization must be managed together and remotely, and include a key management solution that is easy to operate and scales to many devices and messages.
  • It is possible to deploy a command-and-control server in a suitably protected fashion.
  • The infrastructure used, aside from the C2 and communicating components, is minimally trusted. In particular, end-to-end encryption is required.
  • We wish to still be deployable in very constrained environments using only symmetric key cryptography. However, public key and post quantum support should be possible if desired, but this choice must be made according to risk.
  • We wish to be able to retrofit and deploy inexpensively in software, while remaining compatible with and ready to support state of the art hardware protection with minimal effort.
  • We cannot necessarily trust the clock on the device. If one is present we work around this lack of trust; if not we may degrade the replay protection gracefully.
  • We cannot rely on randomness on the device. The managed keys approach allows us to mitigate this for key generation, and appropriate choice of cryptographic algorithms allows us to remove our reliance on this requirement elsewhere.
  • We wish to be able to rotate and replace keys flexibly and simply, without reliance on revocation schemes and expiry dates that have proven unreliable.

Are the communications of devices secure with E4? We believe they are. Messages cannot be modified or read by any intermediate network node such as a message broker; only the end devices can decrypt. We are able to rotate and replace keys flexibly, limiting the window of exposure and allowing a reasonable level of forward and backward secrecy. Except in very constrained cases, we can supply replay-protection in addition, and we do not require a source of randomness. We can do this in devices that can only support software updates and have limited processing power. However, if the devices are capable of public-key or even post-quantum algorithms, we can deploy a version of our protocol that leverages this. If the customer can deploy hardware security, we can adapt to that too. We can do this without requiring human interaction with the devices, supporting true embedded use cases, where the devices are in remote or inaccessible locations.

IoT end-to-end encryption with E4 (3/n): automation component

Our previous post introduced the architecture and components of E4, Teserakt’s encryption and key management for IoT systems. This post is about one of these components, the automation engine, which we’ll describe in more depth today.

Keys are not forever

Devices managed by E4 rely on symmetric or public-key cryptography to protect their communications. In both modes, each device has its own identity key, plus one key for each topic (you can see a topic as a "conversation", as a data type, as a classification level, and so on). Since devices may be deployed for years, there must be processes to renew these keys remotely and securely. Motivations for rotating keys include:

  • Revocation of a key that has possibly been compromised, or is not to be trusted for various reasons

  • Provide forward secrecy, that is, guarantee that if a key is compromised at some point, then earlier communications (with a different key) cannot be decrypted

  • Provide backward secrecy, or post-compromise security, for example in order to guarantee that, should some topic key be compromised, future topic keys remain secret, and therefore content protected with these keys remains protected.

Such key rotation may be done manually, or partially manually by creating some scripts, creating a custom network service, and so on, and updating the script manually when needed. This is a cumbersome and error-prone procedure that is unlikely to scale and to prove reliable in critical environments. We therefore propose an automation system that is simple to use (both through a graphical UI and scripts), and that easily scales to many topics and many messages.

The E4 automation engine addresses exactly this problem. It automates the rotation of any E4 keys by defining rules following a simple grammar, depicted below:

In the context of our key rotation automation, we then use the following terms:

  • Rule: a rule is the main component of the engine. It holds a list of triggers and targets controlling when the key rotation will happen, and which device or topic will be updated.
  • Trigger: a trigger is a condition that must be fulfilled in order to execute a key rotation for all of the parent rule’s targets. A trigger can be a predefined period of time, or a number of system events, such as a threshold of clients joining or leaving a certain topic.
  • Target: a target designates either a client device or a topic, whose key will be updated when the condition of one of the rule’s triggers is fulfilled.
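These three terms map naturally onto a small data model. The sketch below is our own illustration in Python. The class names, field names, and event type strings are assumptions and do not reflect the actual rule format used by the automation engine.

```python
from dataclasses import dataclass, field
from enum import Enum


class TriggerKind(Enum):
    TIME_INTERVAL = "time_interval"   # fires after a fixed period of time
    EVENT_COUNT = "event_count"       # fires after a number of system events


class TargetKind(Enum):
    CLIENT = "client"
    TOPIC = "topic"


@dataclass
class Trigger:
    kind: TriggerKind
    threshold: int         # seconds for TIME_INTERVAL, event count for EVENT_COUNT
    event_type: str = ""   # hypothetical event name; unused for TIME_INTERVAL


@dataclass
class Target:
    kind: TargetKind
    name: str


@dataclass
class Rule:
    description: str
    triggers: list = field(default_factory=list)
    targets: list = field(default_factory=list)


# Example: rotate one topic key daily, or as soon as 5 clients have left the topic.
rule = Rule(
    description="Rotate the temperature topic key",
    triggers=[
        Trigger(TriggerKind.TIME_INTERVAL, 24 * 3600),
        Trigger(TriggerKind.EVENT_COUNT, 5, "client_unsubscribed"),
    ],
    targets=[Target(TargetKind.TOPIC, "sensors/temperature")],
)
```

A rule fires when any one of its triggers is fulfilled, and the rotation then applies to every target in the rule.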

When to rotate keys

We currently support basic use cases such as:

  • Rotating a device or topic key at a fixed time interval
  • Rotating a topic key after a certain number of clients joined or left this topic

Those cover the most common scenarios requiring a key rotation. We’ve also carefully designed our engine and rules format to be flexible and easily extensible. If you have any other use cases which are not currently covered, we’d be happy to hear about them!

We’re often asked what’s the right time interval for key rotation. There is no single right answer, for it depends on a number of factors including the threat model and risks, the network reliability, the cost of sending control messages, etc.

How it works

As seen in our previous architecture post, the automation engine extends the C2 server’s functionality. When enabled, it starts an internal scheduler that monitors the time-based rules, and also registers a channel on the C2 server to receive system events. On each event, the automation engine checks the defined rules for any triggers that are due, and requests a key rotation on the C2 for each target of the corresponding rule. The trigger state is then reset, and the trigger waits for its conditions to be met again.
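The evaluation loop can be sketched as follows. This is a simplified Python model under our own assumptions: the class names are invented, the `rotate_key` callback stands in for the request made to the C2, and time is passed in explicitly instead of coming from a real scheduler.

```python
from dataclasses import dataclass


@dataclass
class CountTrigger:
    event_type: str
    threshold: int
    count: int = 0

    def observe(self, event_type):
        # Count matching events and report whether the threshold is reached.
        if event_type == self.event_type:
            self.count += 1
        return self.count >= self.threshold

    def reset(self, now=None):
        self.count = 0


@dataclass
class PeriodTrigger:
    period: float
    last_fired: float = 0.0

    def due(self, now):
        return now - self.last_fired >= self.period

    def reset(self, now=None):
        self.last_fired = now


class AutomationEngine:
    def __init__(self, rotate_key):
        self.rotate_key = rotate_key   # callback standing in for the C2 request
        self.rules = []                # list of (triggers, targets) pairs

    def add_rule(self, triggers, targets):
        self.rules.append((triggers, targets))

    def on_event(self, event_type):
        """Called for each system event received from the C2."""
        self._evaluate(lambda t: isinstance(t, CountTrigger) and t.observe(event_type))

    def on_tick(self, now):
        """Called periodically by the scheduler for time-based rules."""
        self._evaluate(lambda t: isinstance(t, PeriodTrigger) and t.due(now), now)

    def _evaluate(self, fired, now=None):
        for triggers, targets in self.rules:
            for trigger in triggers:
                if fired(trigger):
                    for target in targets:
                        self.rotate_key(target)
                    trigger.reset(now)   # wait for the condition to be met again
```

Resetting the trigger right after it fires is what makes the rule repeat: an event-count trigger starts counting from zero again, and a period trigger measures the next interval from the time of the last rotation.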

Client key rotation

A client’s key is its most critical key, and should ideally be better protected than topic keys, which are shared with other devices. In the symmetric-key mode, the client’s key is also known to the C2, and used to protect control messages sent to that client. One of the commands supported by control messages is SetIDKey, which provides a new client key to a remote device.

When a client key is rotated, the C2 server thus issues a SetIDKey command, containing a newly generated key, and protected with the current client key. The command is then transmitted to the client via the MQTT broker, by publishing it on the client control topic.

In the public-key mode, the C2 only knows clients’ public keys, not their private keys. Therefore client keys can’t be rotated in the public key mode.

Topic key rotation

When a topic key is rotated, the C2 server first generates a new random key, then issues a SetTopicKey command for every client subscribed to this topic. Each command is protected using the client key of its recipient, and then transmitted on that client’s control topic.
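The fan-out described above can be sketched as a single function. This is a hypothetical Python illustration: the `protect(key, payload)` callable is a placeholder for the real message protection, and both the control topic naming and the command encoding are invented for the example.

```python
import os


def rotate_topic_key(topic, subscribers, client_keys, protect, new_key=None):
    """Build one SetTopicKey control message per subscribed client.

    Returns the new topic key and a list of (control_topic, protected_command)
    pairs, ready to be published through the broker.
    """
    if new_key is None:
        new_key = os.urandom(32)   # key generation happens on the C2, not on devices
    # Hypothetical encoding: command name, topic, then the raw key bytes.
    command = b"SetTopicKey|" + topic.encode() + b"|" + new_key
    messages = []
    for client_id in subscribers:
        control_topic = "control/" + client_id   # hypothetical naming scheme
        # Each copy is protected with the recipient's own client key.
        messages.append((control_topic, protect(client_keys[client_id], command)))
    return new_key, messages
```

The same command body is protected once per recipient, so the cost of a rotation grows linearly with the number of clients subscribed to the topic.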

When a topic has many subscribed clients, there might be a significant delay between the first and last SetTopicKey commands transmission (and thus reception). Some clients may thus use the old key while others are already using the new one, preventing the decryption of messages in each way. To avoid message loss, we thus need a key transition mechanism, which we implemented by defining a configurable grace period.

Grace period

The grace period is a client parameter, which defines the duration during which both the old and new key can be used, starting from the time when a new topic key is received. The client will store the newly received key, and keep the old one for a configurable amount of time. The device then does the following:

  • When receiving a message, it first tries to unprotect it with the new key. If this fails (as indicated by an invalid authentication tag), the client then tries with the old key if it is still within the validity period.

  • When transmitting a message, the new key is always used. This might prevent some devices from reading the message; however, this is the safest behavior in case the old key was compromised. Furthermore, using the old key would also prevent messages from being decrypted by clients whose grace period has passed.
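The two rules above can be sketched as follows. To keep the Python example short and runnable with the standard library, `protect` and `unprotect` only authenticate messages with HMAC-SHA-256. This is a stand-in for the real authenticated encryption, and all names are our own.

```python
import hashlib
import hmac


def _tag(key, body):
    return hmac.new(key, body, hashlib.sha256).digest()


def protect(key, body):
    # Authentication only, to keep the sketch short; real messages are also encrypted.
    return _tag(key, body) + body


def unprotect(key, message):
    tag, body = message[:32], message[32:]
    if not hmac.compare_digest(tag, _tag(key, body)):
        raise ValueError("invalid authentication tag")
    return body


class TopicKeyState:
    def __init__(self, key, grace_period):
        self.current = key
        self.previous = None
        self.received_at = None
        self.grace_period = grace_period   # in seconds

    def set_key(self, new_key, now):
        # Keep the old key around; the grace period starts when the new key arrives.
        self.previous, self.current, self.received_at = self.current, new_key, now

    def receive(self, message, now):
        try:
            return unprotect(self.current, message)      # new key first
        except ValueError:
            in_grace = (self.previous is not None
                        and now - self.received_at <= self.grace_period)
            if in_grace:
                return unprotect(self.previous, message)  # then the old key
            raise

    def send(self, body):
        return protect(self.current, body)                # always the new key
```

Once the grace period has elapsed, messages protected with the old key are rejected exactly like messages with a forged tag, so a compromised old key stops being useful after a bounded delay.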

Again, when we’re asked what’s the best grace period value, our answer is that it depends on several factors, including the number of clients subscribed to a topic, how often keys are rotated, the network’s latency and reliability, and so on. Ideally, the grace period should be chosen after analyzing empirical data. The best value of the grace period is one that sufficiently minimizes message loss, while minimizing the number of messages encrypted using the old key (thus, the shorter the period, the safer).

Conclusion

Key management, and more particularly secret rotation, can be a tedious task. With E4, we try to make it transparent, fail-safe, and easy to configure and operate. You can configure the rotation of your device and topic keys either from the dedicated page on the web console or from the command-line client, where you can create or view active rotation policies.

You can try it yourself, and start rotating keys using our e4 client, with binaries for common architectures available on the release page. Automation rules can then be created from the demo console.

IoT end-to-end encryption with E4 (2/n): architecture and integration

The previous post in the series introduced the E4 protocol, which brings end-to-end encryption to IoT networks in a scalable way. We now describe how simple and unobtrusive E4 is to integrate into an existing infrastructure. We’ll start with a basic example of an IoT architecture, present its limitations in terms of data security, and then introduce E4 components and their integration.

A typical IoT architecture

A common industrial IoT architecture includes a multitude of connected devices, reporting their data such as measurements to a processing back-end. In the example depicted below, the MQTT transport protocol is used, such that devices and the back-end connect to an MQTT message broker in order to exchange messages according to a publish/subscribe logic.

Because devices directly publish their data as cleartext messages, this data is exposed to the message broker, which can read its content, modify it, and create fake messages on behalf of the devices. This can happen in different scenarios, for example when:

  • The broker is operated by a third party, such as a cloud operator
  • The broker is compromised by an attacker, who then controls its operation
  • The broker database is compromised, and with it all the messages it stores

Other network components (routers, gateways, etc.) can also do the same if the connection between devices and the broker is not protected.

In other words, important security properties are missing in this architecture:

  • Data authenticity and integrity of messages
  • Data confidentiality

The same concerns apply when data is sent to the devices, such as firmware or configuration updates, as well as when devices communicate with each other.

E4 integration

The E4 solution aims to protect IoT data without making any change to the message broker (or similar component), and in a way that is mostly independent of the transport protocol (in this example, MQTT). As you can see in the image below, our server component (C2) connects to the message broker like any other device, and other network components (devices and back-end) integrate our software library in order to support the E4 security layer. The subsequent sections explain how this integration is done.

Terminology

We define a few terms:

  • E4 clients (or just "clients"): In E4, clients are anything able to publish or receive messages. In our examples, the devices, sensors or processing backend are all identified as clients. E4 clients must each have a unique identifier, and a secret key.
  • C2 server (or just "C2"): The back-end component of E4, which is seen by the network as just another device. C2 sends control messages to E4 clients and performs security monitoring operations, among others.
  • Topics: MQTT topics are the message queues where messages will be published and read from. In E4, topics can be associated with a secret key, which clients will use to protect messages tagged with this topic.
  • Protect/Unprotect: Instead of just encryption/decryption, E4’s security layer can also provide integrity, authenticity, and replay protection. That’s why we refer to the message transformation as "protection" rather than only encryption.
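To make the difference between "protection" and plain encryption more concrete, here is a minimal sketch of authenticity, integrity, and a timestamp-based replay check. Everything in it is an assumption made for illustration: the message layout, the MAX_AGE window, and the use of HMAC-SHA256 are not E4's actual construction, and the sketch deliberately leaves out encryption, so it provides no confidentiality. Note also that a timestamp window only bounds how long a captured message can be replayed; it does not rule out replays within that window.

```python
import hashlib
import hmac
import struct
import time

TAG_LEN = 32   # length of an HMAC-SHA256 tag
MAX_AGE = 600  # hypothetical acceptance window, in seconds


def protect(key: bytes, payload: bytes, now=None) -> bytes:
    # Prepend an 8-byte big-endian timestamp and authenticate it with the payload.
    ts = int(time.time() if now is None else now)
    header = struct.pack(">Q", ts)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def unprotect(key: bytes, message: bytes, now=None) -> bytes:
    if len(message) < 8 + TAG_LEN:
        raise ValueError("message too short")
    body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Integrity and authenticity: any change to the timestamp or payload
    # invalidates the tag.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("invalid authentication tag")
    # Replay check: reject messages older than the acceptance window.
    (ts,) = struct.unpack(">Q", body[:8])
    current = time.time() if now is None else now
    if current - ts > MAX_AGE:
        raise ValueError("message too old")
    return body[8:]
```

The optional now parameter exists only so that the age check can be exercised deterministically.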

Modifications to clients

To integrate E4, the main modifications take place at the application level, in the applications that send or receive messages on the clients. They need to integrate the E4 library in order to:

  • Protect a message before publishing it. This ensures that intercepted messages can’t be read or modified by unauthorized parties.
  • Unprotect a message after receiving it. This retrieves the original message payload from a protected message.
  • Subscribe to the device’s control topic. This allows the C2 server to send control messages to this unique client.
  • Store an identifier and related key material.
  • Process control messages sent by the C2.

C2 integration

The C2 is E4’s key management server, whose main roles are to:

  • Keep track of client devices and their associated cryptographic keys.
  • Identify communication topics, and their associated cryptographic keys.
  • Send control messages to the clients.
  • Perform analytics to monitor the network.

The C2 exposes an API (gRPC or HTTP) that lets you register all your clients and topics, and securely provision keys to devices in a manual or automated way. To do so, it only needs to reach the MQTT broker, as a normal client (ideally with the MQTT QoS parameter set to 2).

Besides its core service, C2 provides additional and optional services:

  • A simple web-based UI, to manage clients and topics manually. See our public demo https://console.demo.teserakt.io.
  • A command-line tool (c2cli), allowing scripting and batch processing to manage your clients and topics.
  • An automation engine to define, manage and automate key rotation policies (which will be presented in a future post in this series).

As depicted above, additional services directly connect to the core C2 service, meaning there is still no change to make to your existing infrastructure.

C2 database

The C2 can use popular SQL RDBMSs (such as MySQL, PostgreSQL, SQLite) to store your devices and topics. All the cryptographic keys are stored encrypted, and only decrypted when needed, that is, when transmitting them to clients in control messages.

In its simplest setting, database records are encrypted using a key controlled by the C2 service. Depending on the security needs and available technology, the following components can be used to manage the database encryption key:

  • HSM via PKCS#11 interface
  • Dedicated key management service such as Hashicorp’s Vault
  • Cloud-based key management services such as AWS KMS

Control messages

In order to control each device’s internal state, such as its identity key (used by the C2 server to encrypt the device’s control messages), or its list of topic keys (used to encrypt or decrypt the data transmitted or received by the client), the C2 service emits control messages, which can be of one of the following types:

  • SetTopicKey to set a key for a given topic, allowing the device to read or publish messages on this topic.
  • RemoveTopic to remove a topic key from the client, preventing the device from reading or publishing messages on the topic.
  • ResetTopics to remove all the topic keys stored by the client.
  • SetIDKey to set the device identity key, required to process control messages.

As we’ll describe in a future post, E4 supports symmetric-key and public-key cryptography modes. In the public-key mode (which is the safest), the following messages are used:

  • SetPubKey to add the public key of another device to the local storage, allowing the client to authenticate the message sender.
  • RemovePubKey to remove a stored public key from the client’s storage.
  • ResetPubKeys to remove all the public keys stored by the client.

Control messages are protected using the current device key, and published by the C2 on the device’s control topic (created from the client’s unique identifier we mentioned above).
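As a rough illustration of how a client could apply the symmetric-mode commands once a control message has been unprotected, consider the sketch below. The ClientState class, the apply_command function, and the keyword arguments are hypothetical; the actual wire format and the handling in the E4 client library are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class ClientState:
    """Key material held by a client (hypothetical layout)."""
    id_key: bytes
    topic_keys: dict = field(default_factory=dict)  # topic name -> key


def apply_command(state: ClientState, name: str, **args) -> None:
    """Apply an already unprotected control command to the client state."""
    if name == "SetTopicKey":
        state.topic_keys[args["topic"]] = args["key"]
    elif name == "RemoveTopic":
        # Removing an unknown topic is treated as a no-op in this sketch.
        state.topic_keys.pop(args["topic"], None)
    elif name == "ResetTopics":
        state.topic_keys.clear()
    elif name == "SetIDKey":
        state.id_key = args["key"]
    else:
        raise ValueError(f"unknown command: {name}")
```

For example, apply_command(state, "SetTopicKey", topic="sensors/temp", key=b"...") would make the client able to protect and unprotect messages on that topic, until a RemoveTopic or ResetTopics command removes the key again.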

The processing of control messages is implemented in the client software, as you can see for example in E4’s Go client.

Summary

E4 allows you to protect your IoT data with minimal changes required, and a short integration and testing period. E4 does not require any change to the message broker, nor additional hardware or software capabilities on the clients.

E4 is also robust and resilient to outages: although the C2 services are designed for scalability and high availability, devices can still communicate reliably if these services are unavailable or temporarily down for maintenance.

If you’re interested in learning more or have any question, feel free to contact us!

Links