Stronghold 1.0 Stable Release

A New, Improved Way to Manage Sensitive Data

TL;DR:
The stable release of Stronghold comes with ground-breaking changes. The client interface to interact with the Stronghold runtime now features a types-based approach for easier integration. The management of sensitive data inside memory has been upgraded to provide a more hardened security model.

Stronghold is a secure software library written in Rust to manage sensitive data. Read on for the nitty-gritty of Stronghold, an overview of the steps that brought us to the decisions we took as a team, and last but not least, a discussion of software security.

Security concerns are at the heart of DLT development, with attacks and hacks leading to the innovation of new defenses. The IOTA Foundation experienced its own security crisis in February 2020, when its Trinity software wallet for desktop and mobile operating systems was attacked, resulting in the theft of around 8.55 Ti in IOTA tokens. From this crisis emerged a new determination to develop an open-source security framework that could not only counter the exposure of sensitive data in the IOTA wallet but ultimately be used by anyone, anywhere. That framework is Stronghold, which is now receiving its stable release (find it on GitHub or on Crates.io).

Stronghold was first introduced with the launch of Firefly, the Rust-based IOTA wallet that replaced Trinity. Developed by a team led by Daniel Thompson and Tensor, Stronghold was designed as secure software that could be trusted to guard crypto seeds stored on desktops or mobile devices. The system that protected assets before Stronghold made use of a third-party service.

Now, Stronghold has evolved beyond a feature of the IOTA wallet to become security software that can be integrated by anyone who wants to store secrets safely.

Building Stronghold

Implementing security features is notoriously difficult, and our favorite language, Rust, lacked libraries with the very specific feature set that we needed. Of course, there are some industry-grade cryptographic and security libraries (libsodium, for example) that offer best-in-class features, but none matched all our requirements, which were:

A software library written in Rust.
Data structures that manage sensitive data.
Persistence and exchange of sensitive data.
High scalability.
APIs that provide a rich feature set, while not being an overblown “feature creep”.

In designing the architecture of Stronghold, the team took a layered approach. The core of Stronghold is made of the runtime, which implements primitives like encrypted or guarded memory (explained below in more detail) and contiguous and non-contiguous memory variants (ditto).

On top of the runtime sits the engine. It provides the vault (a simple database structure), buffer types, and the core logic to load and store volatile (read: “in-memory”) sensitive data to and from disk, where it is encrypted with XChaCha20-Poly1305 and compressed with LZ4.

Stronghold's public API resides inside the client, giving access to the main features. With the client, you can write secrets, execute complex cryptographic procedures, and write and read plain unencrypted data in the insecure store. In addition to the locally available client functionality, Stronghold also supports working with remote instances. Its functionality differs from traditional remote password stores, in that secrets and sensitive data are never exposed.

In a way, Stronghold acts like a glove box, the sealed container in a science lab that allows gloved hands to manipulate objects inside the box without breaking containment. But this glove box contains secrets, and the hands are the procedures run on the secret data – they remain gloved when reaching inside the box because the “hands” never “touch” the secrets themselves.

A challenge for a security system like Stronghold is allowing many users to access it simultaneously without compromising security. Solving this challenge was an odyssey in itself and ultimately led us to change the entire architecture, as we’ll discuss in the next section.

Concurrency with actor systems

Today’s computing power derives from many CPU cores working in parallel. Any software that doesn’t use a form of present-day concurrency will perform worse than software that does. Stronghold is no different, and Rust is an excellent programming language that offers many primitives for concurrent and asynchronous code. However, discussing the inherent differences between concurrent, asynchronous, and parallel code execution is beyond the scope of this blog post.

Stronghold employed a well-known concurrency architecture: the actor model. The basic idea of the actor model is to have isolated actors, each taking care of some functionality. Actors receive messages with data to act upon and send data back when they are finished processing it. Since each actor owns its state, and concurrency is achieved by polling messages rather than by calling functions directly, most of the undesirable concurrency problems disappear; in particular, there are no shared locks to deadlock on. So far, so good? While other languages (for example, Elixir) have the actor model pretty much baked in, with an excellent supervisor and so forth, integration with Rust involves a lot of boilerplate code. This approach also meant that users of Stronghold were forced to use an actor model in some capacity. We wanted to find out whether we could provide a simple interface, ideally some primitive types to work on with simple function calls, and still run in a concurrent setup without the headaches that come with locks and mutexes.
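To make the actor idea concrete, here is a minimal sketch using only the Rust standard library. It is not how Stronghold or any actor framework implements actors; the `Message` enum, `spawn_counter`, and `run_counter_demo` are hypothetical names chosen for this illustration. The actor owns its state on its own thread and is only reachable through its mailbox.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Messages the hypothetical counter actor understands.
enum Message {
    /// Add a value to the actor's private state.
    Add(u64),
    /// Ask for the current total; the reply travels over its own channel.
    Get(Sender<u64>),
    /// Ask the actor to stop processing messages.
    Stop,
}

/// Spawns the actor on its own thread and returns its mailbox and join handle.
fn spawn_counter() -> (Sender<Message>, JoinHandle<()>) {
    let (mailbox, inbox) = mpsc::channel::<Message>();
    let handle = thread::spawn(move || {
        // The state lives only inside this thread; no other code can touch it.
        let mut total: u64 = 0;
        for message in inbox {
            match message {
                Message::Add(n) => total += n,
                Message::Get(reply) => {
                    // Ignore the error if the requester has gone away.
                    let _ = reply.send(total);
                }
                Message::Stop => break,
            }
        }
    });
    (mailbox, handle)
}

/// Sends a few messages and returns the total the actor reports.
fn run_counter_demo() -> u64 {
    let (mailbox, handle) = spawn_counter();
    mailbox.send(Message::Add(2)).unwrap();
    mailbox.send(Message::Add(40)).unwrap();

    let (reply_tx, reply_rx) = mpsc::channel();
    mailbox.send(Message::Get(reply_tx)).unwrap();
    let total = reply_rx.recv().unwrap();

    mailbox.send(Message::Stop).unwrap();
    handle.join().unwrap();
    total
}
```

Even this tiny example needs a message type, a reply channel, and a shutdown message, which hints at the boilerplate mentioned above.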

Stronghold is a multi-tenant system, meaning that multiple clients, each with their own sensitive stored data, can access Stronghold (virtually) at the same time. This is known as “concurrency,” and finding the right software to manage concurrency was an important chapter in the development of Stronghold.

*Concurrency enables users to pull up a chair and access Stronghold at the same time as other users.*

Initially, we looked for an actor framework to manage concurrency, and following a trial period with Riker before it got discontinued, we chose Actix. Actix offers an easier interface to implement all the functionality using its actor system, with messages and actors as the main primitives. However, it runs actors on a single thread rather than on multiple threads, which is what we wanted in order to isolate client states. Also, Actix’s governing actor system consumes a supported async runtime, which means that tasks from other systems could not be scheduled on the same runtime.

These drawbacks meant that Actix was ultimately not a good fit for Stronghold, and we had to admit that the actor framework (although great in concept) added needless complexity to the system due to how most Rust actor models are implemented. So we stepped back and took a deep dive into the architecture to see which parts could be replaced and which could be upgraded.

Experimental concurrency synchronization

Enter Software Transactional Memory (STM). What is it, and how does it solve the problem? STMs have been around for quite some time (see Nir Shavit and Dan Touitou, Software Transactional Memory, in Distributed Computing, Volume 10, Number 2, February 1997). The main idea is that each operation on memory happens in an atomic transaction. Whenever memory is modified, this modification is written into a log. While inside a transaction, reading from memory is also done through the log. The transaction has finished when all changes recorded inside the log have been committed to the actual memory. A transaction fails if another thread modifies the targeted piece of memory between operations. A failed transaction can be re-run any number of times. It’s a bit similar to how Git works when multiple developers try to merge changes on the same branch: if they modify the same regions, a conflict occurs, and the transaction is rolled back.

This approach guarantees that modifications to memory are always consistent, but it comes with a restriction. Since transactions can be retried, operations inside a transaction necessarily need to be idempotent and shouldn’t have any side effects. In an extreme case, think of a function that launches a rocket: you simply can’t reverse the process. Another edge case concerning STM-based approaches is interleaving transactions, where reads and writes are alternating between two threads. In a worst-case scenario, both transactions would retry indefinitely.
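The retry behaviour can be illustrated with a deliberately small sketch. It is not Stronghold's STM implementation: it covers a single variable, keeps a version counter instead of a full read/write log, and uses a mutex only for the short read and commit steps. `TVar`, `atomically`, and `concurrent_increments` are hypothetical names for this example. Note that the closure passed to `atomically` may run several times, which is exactly why it must be free of side effects.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// A single transactional variable: a version counter plus the value.
struct TVar<T> {
    inner: Mutex<(u64, T)>,
}

impl<T: Clone> TVar<T> {
    fn new(value: T) -> Self {
        TVar { inner: Mutex::new((0, value)) }
    }

    /// Returns the current version together with a snapshot of the value.
    fn read(&self) -> (u64, T) {
        let guard = self.inner.lock().unwrap();
        (guard.0, guard.1.clone())
    }

    /// Runs `transaction` on a snapshot and commits the result only if no
    /// other thread committed in the meantime; otherwise it retries.
    /// Returns the number of attempts that were needed.
    fn atomically(&self, transaction: impl Fn(&T) -> T) -> u32 {
        let mut attempts = 0;
        loop {
            attempts += 1;
            let (seen_version, snapshot) = self.read();
            // The computation itself runs without holding the lock.
            let proposed = transaction(&snapshot);

            let mut guard = self.inner.lock().unwrap();
            if guard.0 == seen_version {
                // Nobody committed since our snapshot: publish the new value.
                *guard = (seen_version + 1, proposed);
                return attempts;
            }
            // Conflict: discard `proposed` and start over with a new snapshot.
        }
    }
}

/// Increments a shared counter from several threads via transactions.
fn concurrent_increments(threads: usize, per_thread: usize) -> u64 {
    let counter = Arc::new(TVar::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.atomically(|value| value + 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.read().1
}
```

No increment is lost, because a commit only succeeds when it was computed from the latest committed version.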

Upgraded runtime

Stronghold's upgraded runtime includes several aspects, which we’ll explore below.

1.) Locked memory and guarded pages

As already briefly mentioned regarding the architecture of Stronghold, the runtime provides memory types (read: managed allocations) for handling sensitive data. One such type is Boxed. This type locks allocated memory and prevents it from being recorded in a memory dump. Since locking memory depends on the operating system, the Boxed type relies on libsodium’s 'sodium_mlock' function, which calls the mlock function on Linux or equivalent functions on other operating systems. mlock prevents the locked pages of the process's virtual address space from being paged out to a swap area, thus preventing the leakage of sensitive data to disk. Guarded heap allocations, in turn, build on this locked memory.

Guarded heap allocations work by placing a guard page in front and at the end of the locked memory, as well as a canary value at the front.

[Figure: layout of a guarded heap allocation]
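As a rough illustration of such a layout, the following sketch only computes byte offsets; it allocates nothing and protects nothing. The 16-byte canary length and the decision to push the data to the end of its pages (so that an overflow runs straight into the rear guard page) are assumptions made for this example, and `GuardedLayout` and `guarded_layout` are hypothetical names, not part of Stronghold or libsodium.

```rust
use std::ops::Range;

/// Length of the canary in bytes (an assumption for this sketch).
const CANARY_LEN: usize = 16;

/// Byte ranges of the regions that make up one guarded allocation.
#[derive(Debug, PartialEq)]
struct GuardedLayout {
    front_guard: Range<usize>,
    canary: Range<usize>,
    data: Range<usize>,
    rear_guard: Range<usize>,
    total_len: usize,
}

/// Rounds `value` up to the next multiple of `multiple`.
fn round_up(value: usize, multiple: usize) -> usize {
    (value + multiple - 1) / multiple * multiple
}

/// Computes where each region would sit for `data_len` bytes of secret data.
fn guarded_layout(data_len: usize, page_size: usize) -> GuardedLayout {
    // Canary and data share whole pages between the two guard pages.
    let region_len = round_up(CANARY_LEN + data_len, page_size);
    let region_start = page_size;
    let region_end = region_start + region_len;
    // The data ends exactly where the rear guard page begins.
    let data_start = region_end - data_len;
    GuardedLayout {
        front_guard: 0..page_size,
        canary: (data_start - CANARY_LEN)..data_start,
        data: data_start..region_end,
        rear_guard: region_end..(region_end + page_size),
        total_len: region_end + page_size,
    }
}
```

For 100 bytes of data and 4096-byte pages, this yields three pages in total: one guard page, one page holding the canary and the data, and one more guard page.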

Libsodium provides three functions to set the access protection of guarded memory:

sodium_mprotect_noaccess: This makes the protected memory inaccessible. It can neither be read from nor can it be written to.
sodium_mprotect_readonly: This makes the protected memory read-only. Memory can be read but not written to.
sodium_mprotect_readwrite: This enables reading from and writing to protected memory.

Stronghold exposes locked memory via the 'LockedMemory' trait, which exposes two functions that need to be implemented:

/// Modifies the value and potentially reallocates the data.
fn update(self, payload: Buffer<u8>, size: usize) -> Result<Self, MemoryError>;

/// Unlocks the memory and returns a Buffer.
fn unlock(&self) -> Result<Buffer<u8>, MemoryError>;

Currently, the trait is implemented by three types of memory:

RamMemory: The allocated value resides inside the system's RAM.
FileMemory: The allocated value resides on the file system.
NonContiguousMemory: The allocated memory is fragmented across the system's RAM or file system, or a combination of both.

2.) Non-contiguous memory types and the “Boojum” scheme

Non-contiguous memory types split protected memory into multiple fragments, mitigating any memory dumps and making it extremely difficult for attackers to retrieve stored data. The following section describes non-contiguous memory types in more detail with a use case we often encountered and solved with a pretty decent method.

Data, like an array in memory, is usually laid out contiguously. This means that individual elements, or their byte representations, are stored at adjacent addresses. Think of a number sequence as the addresses where the individual elements are stored. This is not always desirable from a security point of view. Imagine sensitive data in contiguous memory: an attacker who finds the starting address can read out the rest without any problems. But we can do better than that. Independently of the operating system splitting available virtual memory into pages of a fixed size (on *nix-based systems this size is typically 4 KiB), we can create a structure that holds references to parts of memory (here, sensitive data), where each part is placed randomly in the memory space. One obvious advantage of this approach is that free memory can be used more easily if the chunk size of the data parts is reasonably small.
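The structure described above can be sketched as follows. This is an illustration, not Stronghold's NonContiguousMemory type: `Fragmented` is a hypothetical name, each chunk is simply its own heap allocation (the allocator, not the sketch, decides where it ends up), and the shuffle uses a small xorshift generator that is not cryptographically secure.

```rust
/// Secret data split into separately allocated chunks, kept in shuffled order.
struct Fragmented {
    /// Pairs of (original chunk index, chunk bytes).
    fragments: Vec<(usize, Box<[u8]>)>,
}

impl Fragmented {
    /// Splits `data` into chunks of at most `chunk_size` bytes and shuffles them.
    fn split(data: &[u8], chunk_size: usize, seed: u64) -> Self {
        assert!(chunk_size > 0, "chunk size must be positive");
        let mut fragments: Vec<(usize, Box<[u8]>)> = data
            .chunks(chunk_size)
            .enumerate()
            .map(|(index, chunk)| (index, chunk.to_vec().into_boxed_slice()))
            .collect();

        // Fisher-Yates shuffle driven by xorshift64.
        // Placeholder randomness only; NOT suitable for real secrets.
        let mut state = seed | 1; // xorshift must not start at zero
        for i in (1..fragments.len()).rev() {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            let j = (state % (i as u64 + 1)) as usize;
            fragments.swap(i, j);
        }
        Fragmented { fragments }
    }

    /// Puts the chunks back into their original order and concatenates them.
    fn reassemble(&self) -> Vec<u8> {
        let mut ordered: Vec<&(usize, Box<[u8]>)> = self.fragments.iter().collect();
        ordered.sort_by_key(|(index, _)| *index);
        ordered
            .into_iter()
            .flat_map(|(_, chunk)| chunk.iter().copied())
            .collect()
    }
}
```

A real implementation would additionally lock and guard each chunk and wipe the reassembled buffer after use.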

One of the major headaches we faced in the development of Stronghold was proper passphrase management. Whenever you need to load a persistent state from a snapshot file, you require a password. If you were a single user of Stronghold, and reading and writing were interactive (providing the password each time), it wouldn’t be much of a problem. The time window in which the passphrase is used to decrypt, and later encrypt, a persisted state would be very small and almost unpredictable. Now, consider an application that requires constant writing into a snapshot: the passphrase to successfully encrypt the snapshot needs to be stored somewhere in memory. This could be a huge problem: an attacker with access to the machine could simply dump the memory of the running process and read out the passphrase in plaintext! Horrible! Luckily there is a solution to that: it’s called the Boojum scheme, as described by Niels Ferguson, Bruce Schneier, and Tadayoshi Kohno in Cryptography Engineering.

The main idea behind the Boojum scheme – which is a special case of the non-contiguous data type – is to split some sensitive data into two or more shards. The shards are fixed in size and placed randomly in memory. Each shard is continuously rotated, which means its content is re-randomized without changing the key that the shards combine to. Whenever the key is needed, the shards are reassembled, so the original key can be reused without any user interaction. To further improve the security of the shards, the memory of each of them is locked and guarded as well, making any attempt to access the contents of the shards very difficult.
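The split, rotate, and reassemble steps can be sketched with a plain XOR split into two shards. This is a simplification for illustration, not the full scheme from the book and not Stronghold's implementation. `Shards` and `next_byte` are hypothetical names, and the xorshift generator is a placeholder that must be replaced by a cryptographically secure source of randomness in any real use.

```rust
/// Placeholder generator (xorshift64). NOT cryptographically secure.
fn next_byte(state: &mut u64) -> u8 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    (*state >> 56) as u8
}

/// Two shards whose XOR equals the key; neither shard alone reveals it.
struct Shards {
    pad: Box<[u8]>,
    masked: Box<[u8]>,
    rng_state: u64,
}

impl Shards {
    /// Splits `key` into a random pad and the key XORed with that pad.
    fn split(key: &[u8], seed: u64) -> Self {
        let mut rng_state = seed | 1; // xorshift must not start at zero
        let pad: Vec<u8> = key.iter().map(|_| next_byte(&mut rng_state)).collect();
        let masked: Vec<u8> = key.iter().zip(&pad).map(|(k, p)| k ^ p).collect();
        Shards {
            pad: pad.into_boxed_slice(),
            masked: masked.into_boxed_slice(),
            rng_state,
        }
    }

    /// Re-randomizes both shards. XORing the same fresh byte into both
    /// leaves their combination, and therefore the key, unchanged.
    fn rotate(&mut self) {
        for i in 0..self.pad.len() {
            let fresh = next_byte(&mut self.rng_state);
            self.pad[i] ^= fresh;
            self.masked[i] ^= fresh;
        }
    }

    /// Recombines the shards into the original key.
    fn reassemble(&self) -> Vec<u8> {
        self.pad
            .iter()
            .zip(self.masked.iter())
            .map(|(p, m)| p ^ m)
            .collect()
    }
}
```

In a real implementation the reassembled key would be kept only for as long as it is needed and wiped immediately afterwards.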

Using this scheme to protect a passphrase residing in memory, we achieve resistance towards cold-boot attacks (a type of attack where memory is retained after a forced shutdown).

Zeroizing memory

One important aspect of secure software engineering is what happens to memory when it gets deallocated. In an ideal world, memory that is no longer in use would be overwritten with either zeroes or truly random values. But the actual truth is that deallocation tends to be lazy. Sensitive data may be retained in memory, even though it has been freed and reclaimed by the OS. Luckily there is already a Rust crate that offers all the conveniences of zeroing out memory when a value is dropped: zeroize.
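To show the underlying idea without pulling in the crate, here is a standard-library-only sketch. `wipe` and `Secret` are hypothetical names for this example. Volatile writes keep the compiler from optimizing the overwrite away as a dead store, and the compiler fence keeps it from being reordered past later code; the zeroize crate packages this kind of technique behind a more convenient and more thoroughly reviewed API.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrites every byte with zero in a way the compiler may not elide.
fn wipe(bytes: &mut [u8]) {
    for byte in bytes.iter_mut() {
        // SAFETY: `byte` is a valid, aligned, exclusive reference to a u8.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Prevent the compiler from moving the writes past this point.
    compiler_fence(Ordering::SeqCst);
}

/// A byte buffer that is wiped before its memory is handed back.
struct Secret {
    bytes: Vec<u8>,
}

impl Drop for Secret {
    fn drop(&mut self) {
        // Runs before the Vec's allocation is freed.
        wipe(&mut self.bytes);
    }
}
```

One caveat of this sketch: if the `Vec` grew and reallocated earlier, stale copies of the data may remain elsewhere on the heap, so a secret buffer should be allocated at its final size up front.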

Putting it all together

With the new runtime, we now have quite a few options to ensure sensitive data in memory is protected. Non-contiguous types are fairly new to Stronghold, and we still have to figure out a good balance between performance (everything is in RAM) and security (fragmented across RAM and file system). The operating system also limits the maximum number of protected memory regions. On some Linux machines, we found empirically that the limit was about 8000 guarded pages. To work around that, we decided not to keep entries inside a vault in guarded pages permanently but to guard them on demand. The amount of sensitive data at rest is presumably larger than the amount of sensitive data in use by cryptographic procedures at any given time. Sensitive entries inside the vault at rest are encrypted with XChaCha20-Poly1305, which gives us some decent security.

Upgraded client interface

Since we dropped the Actix actor framework, the client interface has undergone some changes. All possible operations with a running instance of Stronghold are now reflected via primitive types.

Cryptographic procedures

While the parts of Stronghold described before were mostly concerned with writing secrets, the question arises: “What can you actually do with a secret that is never exposed? Why even store it?”

The main principle of Stronghold is to be an alternative to hardware security tokens. This also implies that cryptographic operations can be executed on the stored data. In this sense we refer to data as key material, a sequence of bytes forming a private key. Cryptographic operations may involve signing data, creating new keys, deriving keys, or even encrypting or decrypting data. Each of these operations is wrapped in a primitive we call “procedure”. A “procedure” is similar to a function call on the secure storage. The procedure’s type determines its usage. An example would be to have a procedure that creates a key at a given location inside a vault or exports the public part of a private key. But procedures alone are not the only way to access secrets.

Stronghold features a framework to build pipelines of cryptographic operations. The pipeline pattern is an abstraction over chained function calls. Each stage of a pipeline can produce a value (e.g. generate a secret such as a BIP39 mnemonic or an Ed25519 key), process an existing value (e.g. derive a secret key from an existing key with SLIP10), or export a value from an existing secret (e.g. export the public key of a key pair). The procedures framework is a neat way to offer a consistent API to work with secrets, but it’s not the only way to access the vault.

The framework is abstracted in a way that combinations of simple and complex cryptographic procedures are possible. One note to mention is that custom procedures are not possible with Stronghold for a single reason: since a procedure accesses secrets, providing a custom procedure would allow a user to expose a secret and return it, which would violate one of Stronghold's core principles.
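The reasoning behind a closed set of procedures can be sketched with a toy model. Nothing here is Stronghold's API: `ToyVault`, `Procedure`, `Output`, and `Location` are hypothetical stand-ins, and the "key generation" and "public key" computations are trivial, reversible placeholders with no cryptographic value. The point is structural: callers can only pick from the enum's variants, and no variant returns the stored secret itself.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a location inside a vault.
type Location = &'static str;

/// The closed set of operations; callers cannot add their own.
enum Procedure {
    GenerateKey { output: Location },
    PublicKey { private_key: Location },
}

/// What a procedure may hand back to the caller. No variant carries a secret.
#[derive(Debug, PartialEq)]
enum Output {
    Unit,
    PublicKey(u64),
}

#[derive(Debug, PartialEq)]
enum ProcedureError {
    MissingKey(Location),
}

/// A toy vault whose secrets are only reachable through `execute`.
struct ToyVault {
    secrets: HashMap<Location, u64>,
    counter: u64,
}

impl ToyVault {
    fn new() -> Self {
        ToyVault { secrets: HashMap::new(), counter: 0 }
    }

    /// Runs one procedure against the stored secrets.
    fn execute(&mut self, procedure: Procedure) -> Result<Output, ProcedureError> {
        match procedure {
            Procedure::GenerateKey { output } => {
                // Placeholder for real key generation; NOT random, NOT secure.
                self.counter += 1;
                let secret = self.counter.wrapping_mul(0x9E37_79B9_7F4A_7C15);
                self.secrets.insert(output, secret);
                Ok(Output::Unit)
            }
            Procedure::PublicKey { private_key } => {
                let secret = self
                    .secrets
                    .get(private_key)
                    .ok_or(ProcedureError::MissingKey(private_key))?;
                // Placeholder for a real one-way public key derivation.
                Ok(Output::PublicKey(secret.rotate_left(17) ^ 0xA5A5_A5A5_A5A5_A5A5))
            }
        }
    }

    /// Runs several procedures in order and stops at the first error.
    fn execute_chain(
        &mut self,
        procedures: Vec<Procedure>,
    ) -> Result<Vec<Output>, ProcedureError> {
        procedures.into_iter().map(|p| self.execute(p)).collect()
    }
}
```

If `execute` accepted an arbitrary closure over the secret instead of a `Procedure`, a caller could simply return the secret from that closure, which is the situation the closed set avoids.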

Operations involving sensitive data make heavy use of the procedures framework. For example:

// .. we initialize 'client' somewhere before the calls

// This constructs a 'GenerateKey' procedure that will generate a key at the
// given output location in the vault
let generate_key_procedure = GenerateKey {
    ty: keytype.clone(),
    output: output_location.clone(),
};

// Even though this procedure does not create a useful output, the result can be
// used to check for errors
let procedure_result = client.execute_procedure(StrongholdProcedure::GenerateKey(generate_key_procedure));

// From the previously generated key, we want to export the public key
let public_key_procedure = stronghold::procedures::PublicKey {
    ty: keytype,
    private_key: output_location,
};

The future of Stronghold

With the stable 1.0 release, we feel confident that Stronghold has reached a level of maturity to be freely used in many applications that need an additional layer of security. But the work on Stronghold will not end here. We see more extensions and low-level changes to the architecture in the near future. Stronghold is only as good as software security can get, and the future points towards the integration of secure hardware modules. They may be platform-provided (like a TPM) or roaming, like a smartcard (e.g. a SoloKey). Many different APIs would be required to support easy integration of hardware-based security. This led us to the decision to aim for a platform-based approach, working on a coherent API that supports basic functionality across hardware platforms with optional opt-in extensions.

One core feature we are eager to generalize is the means to run arbitrary procedures on the vault in a trusted environment. So far, Stronghold ships with quite a few cryptographic functions and utility procedures, yet any further algorithm or utility has to be shipped by the maintainers of Stronghold. We are currently researching a way to relax this “restriction” that offers more flexibility to work with sensitive data but at the same time provides reasonable defaults to protect access to it. We strongly believe that this will require extended hardware support like trusted execution environments, present either in proprietary hardware like Intel’s SGX or in open-source approaches like Keystone for RISC-V. Further down the road, one prospective way to isolate sensitive data in memory could be to integrate hardware security modules, making use of the already mentioned extension to support secure hardware modules.

Last but not least, it is also up to you to make the best of the Stronghold library. What feature would you like to see in Stronghold?
