The release of the DevNet for IOTA 2.0 marked one of the most important milestones in the IOTA Foundation’s history. For the very first time, we were not only able to demonstrate the ideas behind IOTA 2.0 in a real, fully decentralized and Coordinator-free network, but we were also able to study the interaction of the different components.
The ability to measure and evaluate the performance of each of these components has allowed us to discuss and decide on a series of optimizations to make the protocol more robust, efficient and less complex.
The majority of changes involve small optimizations of the data flow and code simplifications. But we have also made adjustments to the communication layer and some crucial changes to the consensus layer.
The improvements to the consensus layer significantly simplify the protocol by deriving all information from the Tangle and handling incoming transactions in an optimistic manner. The first implementation has been completed, and we are simultaneously working on in-depth simulations and research papers. Only a few more changes remain to be implemented, and the path forward is clear.
As for the communication layer, an efficient scheduling algorithm is currently implemented in the IOTA 2.0 DevNet. It works with a rate setter module to ensure fairness and short delays. As a next step, we will implement a mechanism to penalize attackers, which has already been specified. Additionally, we are working on improving the user experience to make the network easier to use.
This more technical blog post dives into the details of the IOTA 2.0 improvements, describes the current status and paints a general picture of the road ahead.
As described in a previous blog post, instead of asking other nodes for their opinion to resolve conflicts, nodes now derive all the information they need simply from observing the Tangle. The resulting protocol is very similar to the original vision of IOTA: a pure Nakamoto-style consensus in which nodes state their opinion by creating messages in the Tangle, without any additional voting overhead.
With FCOB+FPC, every transaction was penalized with a quarantine time to set an initial opinion and break metastability from the get-go in the rare case of conflicts that cannot be resolved in time. In contrast, the changes to the consensus allow nodes to state their opinion without applying a quarantine time. Only when a conflict remains unresolved for some time is a metastability-breaking mechanism, such as Fast Probabilistic Consensus on a set (FPCS), activated to decide the conflict. Not only does this optimistic procedure speed up confirmation times, it also makes the protocol much simpler and leaner.
Where we are right now
The implementation of the improvements to the consensus is progressing at a fast pace. A first implementation of pure On Tangle Voting (OTV) as a modular conflict selection function, together with the like switch, is complete, and we are currently testing, fixing bugs and writing documentation. Immediate next steps involve simplifying the markers used for the Approval Weight (AW) and updating the tooling, i.e., the GUI wallet, the library and their documentation.
The first version of our metastability breaker based on FPC has been thoroughly validated by academia (FPC-BI, Fast Probabilistic Consensus with Weighted Votes). The same results and concepts can be extended to our current optimized proposal, which is a straightforward modification of the original FPC and will soon be available in a rigorous scientific paper. We will continue our validation by preparing a scientific research paper demonstrating the strength of our improved consensus mechanism.
In addition to describing the mechanism of OTV, we will analyze its performance by running extensive simulation studies based on the multiverse simulator, which is currently being extended to serve as a general simulation framework for our consensus. This includes studying confirmation times and resilience in various double-spend scenarios, as well as behavior in an adverse environment. Finally, we will combine OTV with FPC on a set (FPCS) and explore the implications of this combination in detail.
The road ahead
With pure OTV, nodes set their initial opinion on conflicts based on their perception of the heavier branch, which makes the protocol behave similarly to Bitcoin’s longest chain rule. This means that consensus is inherently probabilistic, and the client (e.g., a node, wallet or exchange) chooses the security threshold at which it considers a transaction confirmed (comparable to the 6-block rule in Bitcoin). Accordingly, we will introduce several Grades of Finality (GoF), mainly based on Approval Weight, that define increasing security thresholds. The higher the grade, the higher the security. However, in rare cases (e.g., when a node is eclipsed or partially offline), a node might receive missing information at a later point in time, which could trigger the need to reorganize its local perception, just like in Bitcoin. Thankfully, a high GoF setting makes these occurrences virtually impossible. Nonetheless, this reorg procedure needs to be implemented in the node software to reset components that depend on confirmation.
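To make the idea more concrete, here is a minimal Python sketch of how a client could pick the heavier branch and map Approval Weight to a Grade of Finality. The function names, the tie-breaking rule and the threshold values are illustrative assumptions only; they are not taken from the specification or from GoShimmer.

```python
from typing import Dict

# Hypothetical thresholds: fraction of the total active weight that must
# approve a transaction before it reaches the given grade (highest first).
GOF_THRESHOLDS = [(3, 0.67), (2, 0.50), (1, 0.25)]


def preferred_branch(branch_weights: Dict[str, float]) -> str:
    """Return the conflicting branch with the highest approval weight.

    Ties are broken by the lexicographically smaller branch ID, which is
    an assumption made only to keep the sketch deterministic.
    """
    return min(branch_weights, key=lambda b: (-branch_weights[b], b))


def grade_of_finality(approval_weight: float) -> int:
    """Map an approval weight in [0, 1] to a grade; 0 means no grade yet."""
    for grade, threshold in GOF_THRESHOLDS:
        if approval_weight >= threshold:
            return grade
    return 0


def is_confirmed(approval_weight: float, required_grade: int) -> bool:
    """A client treats a transaction as confirmed at its chosen grade."""
    return grade_of_finality(approval_weight) >= required_grade
```

In this sketch, an exchange could demand `required_grade=3`, while a low-value payment might accept a lower grade in return for a faster confirmation.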
Once we have a stable version of pure OTV with reorg running, we can shift our focus onto implementing a metastability breaking mechanism to go hand in hand with pure OTV, kicking in if a conflict is unresolved for too long. Specifically, we'll first implement Fast Probabilistic Consensus on a set (FPCS) as a metastability breaker, utilizing positive previous research results. The modular conflict selection function allows us to easily exchange the way we set the initial opinion; therefore, even experimenting with other approaches to reach consensus involves only minor adjustments of code.
As a Sybil protection, votes are weighted by cMana. However, absolute cMana values might not yield sufficient information. Instead, we use active cMana to express the weight of a node in relation to all recently active nodes. Therefore, active cMana is a crucial component and needs rigorous study to understand its boundaries and limitations.
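The difference between absolute and active cMana can be illustrated with a short Python sketch. It assumes that "recently active" means a node was last seen within a fixed time window; the function name, the `last_seen` bookkeeping and the window parameter are hypothetical and chosen only for illustration.

```python
from typing import Dict


def active_cmana_weights(
    cmana: Dict[str, float],
    last_seen: Dict[str, float],
    now: float,
    window: float,
) -> Dict[str, float]:
    """Return each recently active node's share of the active cMana.

    A node counts as active if it was seen within `window` time units
    before `now` (an assumption of this sketch). Nodes never seen are
    treated as inactive.
    """
    active = {
        node: mana
        for node, mana in cmana.items()
        if now - last_seen.get(node, float("-inf")) <= window
    }
    total = sum(active.values())
    if total == 0:
        return {}
    return {node: mana / total for node, mana in active.items()}
```

For example, if node `c` holds 20% of all cMana but has been offline for a long time, the remaining nodes' votes are normalized against the 80% that is actually active, so a node holding 60% of all cMana carries a weight of 0.75.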
In traditional Proof of Work-based distributed ledger technologies (DLTs), the right to add a new message onto the ledger is guaranteed by the mining process where one of the nodes is elected after completing some computationally expensive task. As an incentive to perform such a task, the winning node obtains transaction fees and block rewards. In such DLTs, fees act as a filter to select which messages should be added to the ledger: consequently, during congested periods, the amount of fees becomes larger.
IOTA, on the other hand, is meant to be the backbone of emerging digital economies, in particular the machine economy. Therefore, a protocol that requires fees and high-end devices is neither feasible nor sustainable. Removing fees, however, requires the adoption of an explicit mechanism to deal with large traffic bursts.
At the IOTA Foundation, we take an approach inspired by the “standard Internet” to define the IOTA Congestion Control Algorithm (ICCA). Unlike DLTs based on Proof of Work, our approach permits the optimal exploitation of the network resources, capped only by the physical constraints of the network in terms of bandwidth and transaction processing capabilities. Furthermore, ICCA can be efficiently used by low-end nodes, as it does not require any expensive task or fees. Our solution provides several other fundamental features:
- Consistency: all nodes eventually write the same messages to their local ledger.
- Fairness: network access is granted proportionally to a “scarce resource”, namely access Mana (see below).
- Efficiency: if the demand exists, the full potential of the network will be used.
Where we are right now
This subsection describes the actions a node must perform when a new message is created and when a message is received from a neighbor. These actions are currently implemented in the IOTA 2.0 DevNet.
A quantity called access Mana provides writing rights to nodes. From the user perspective, access Mana translates into a certain throughput for the node. Interestingly, this throughput adapts to the current traffic conditions: nodes progressively increase their message generation rate (throughput) until congestion is detected; once this happens, the throughput is reduced. Nodes detect congestion by looking at the number of their own messages on hold in their outbox buffer.
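The behavior described above can be sketched in Python as follows. The text does not state how much the rate is increased or reduced, so this sketch assumes an additive-increase, multiplicative-decrease rule; the class name, parameters and default values are hypothetical.

```python
class RateSetter:
    """Toy rate setter: raise the rate until congestion, then back off.

    Assumption of this sketch: the rate grows additively and is cut
    multiplicatively, and congestion is signalled when the number of the
    node's own messages waiting in its outbox exceeds a fixed threshold.
    """

    def __init__(
        self,
        initial_rate: float = 1.0,
        increase: float = 0.5,
        decrease: float = 0.5,
        outbox_threshold: int = 10,
    ) -> None:
        self.rate = initial_rate
        self.increase = increase
        self.decrease = decrease
        self.outbox_threshold = outbox_threshold

    def update(self, own_messages_in_outbox: int) -> float:
        """Adjust and return the message generation rate (messages/s)."""
        if own_messages_in_outbox > self.outbox_threshold:
            # Congestion detected: reduce the throughput.
            self.rate *= self.decrease
        else:
            # No congestion: progressively increase the throughput.
            self.rate += self.increase
        return self.rate
```

Calling `update` periodically with the current outbox occupancy lets the rate probe for available capacity and back off as soon as the node's own messages start to queue up.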
After a newly received solid message (see the next paragraph) passes certain validation filters (such as signature checks and syntactic validation), it is added to the local ledger. After this, the message is appended to the outbox buffer, which is managed by a scheduler that decides which message is added to the set of tips and gossiped to the neighbors. The scheduler uses a lightweight round-robin algorithm to efficiently deal with a large number of messages simultaneously. Recently, we have moved the scheduler after message booking to optimize the resynchronization procedure.
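As an illustration of such a scheduler, the Python sketch below keeps one queue per issuing node and serves the queues in round-robin order, granting each node credit in proportion to its access Mana. The deficit-counter mechanics, the `quantum` parameter and all names are assumptions made for this example and are not taken from the DevNet code.

```python
from collections import deque
from typing import Deque, Dict, List


class Scheduler:
    """Toy round-robin scheduler over per-issuer outbox queues."""

    def __init__(self, access_mana: Dict[str, float], quantum: float = 4.0) -> None:
        self.access_mana = access_mana
        self.quantum = quantum  # total credit handed out per round (hypothetical)
        self.queues: Dict[str, Deque[str]] = {n: deque() for n in access_mana}
        self.deficit: Dict[str, float] = {n: 0.0 for n in access_mana}

    def enqueue(self, issuer: str, message: str) -> bool:
        """Append a booked message to its issuer's queue.

        Messages from issuers without access Mana are rejected.
        """
        if issuer not in self.queues:
            return False
        self.queues[issuer].append(message)
        return True

    def run_round(self) -> List[str]:
        """Visit every issuer once and return the messages to gossip."""
        scheduled: List[str] = []
        total = sum(self.access_mana.values())
        for issuer, queue in self.queues.items():
            if not queue:
                # Idle issuers do not accumulate credit.
                self.deficit[issuer] = 0.0
                continue
            self.deficit[issuer] += self.access_mana[issuer] / total * self.quantum
            # Each scheduled message costs one unit of credit.
            while queue and self.deficit[issuer] >= 1.0:
                scheduled.append(queue.popleft())
                self.deficit[issuer] -= 1.0
        return scheduled
```

With access Mana of 3 for node `a` and 1 for node `b` and the default quantum, each round schedules up to three messages from `a` and one from `b`, so a busy issuer cannot starve the others.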
Solidification is the process of requesting messages whose past cone is not yet known to a node. Similarly, when dealing with transactions, the consumed outputs (inputs) of a transaction need to be known to a node in order to properly validate the transaction. In the past, the protocol required all inputs of a transaction to be in its containing message’s past cone. Validating whether something is in the past cone might require walking the Tangle, which is an expensive operation and has proven to be infeasible in the DevNet. Therefore, the first version of a different approach to solidification has been implemented, one that not only removes the past cone requirement but also simplifies the data flow and allows for more parallelization.
The road ahead
In DLTs, malicious nodes have a great interest in affecting the functioning of the network. For this reason, each solution must be validated against all potential threats. As we have experimentally shown, ICCA is resilient against attacks: when attackers try to obtain a higher throughput than the one they are allowed, their messages are delayed by honest nodes, as they will not be scheduled according to the current access Mana vector. Hence, messages issued by the attacker will stay in the honest nodes’ outboxes for a long time. While this does not affect the performance of honest messages, it is important to identify such malicious behavior for two reasons: (i) the attacker can inflate the number of messages in the outboxes of its neighbors, making them run out of resources; (ii) honest nodes may have attached their messages to malicious ones, leading to inconsistent ledgers across nodes. To overcome these issues, we will blacklist malicious nodes by dropping their messages. Malicious flows are detected when the number of their messages in the outbox exceeds a given threshold. The way in which this threshold is set (which is critical for quickly detecting malicious behavior while not blacklisting honest nodes) has been described in a recent paper and will soon be implemented in our DevNet.
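The detection rule can be sketched in a few lines of Python. The sketch uses a single fixed threshold for every issuer, which is a simplifying assumption; how the threshold should actually be set is the subject of the paper mentioned above. The function names and the `(issuer, message)` outbox representation are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

Outbox = List[Tuple[str, str]]  # (issuer ID, message) pairs, oldest first


def detect_malicious(outbox: Outbox, threshold: int) -> Set[str]:
    """Return issuers with more than `threshold` messages in the outbox."""
    counts = Counter(issuer for issuer, _ in outbox)
    return {issuer for issuer, n in counts.items() if n > threshold}


def drop_blacklisted(outbox: Outbox, blacklist: Set[str]) -> Outbox:
    """Remove all messages issued by blacklisted nodes, keeping the order."""
    return [(issuer, msg) for issuer, msg in outbox if issuer not in blacklist]
```

A node would run the detection periodically, add the returned issuers to its blacklist and drop their pending messages, which frees outbox resources.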
It is also important to have an accurate estimation of the validity of the timestamps inside the messages. For instance, old messages should not be added to the scheduler as, with high probability, they have already been received by neighbors. Also, attaching to old messages is undesirable, as this would increase the confirmation time for new ones (we assume that old messages have already been validated). In addition, accurate timestamps are needed to guarantee fast resynchronization and discourage attackers from harming the performance of the protocol. We are currently investigating a metric, called inclusion score, which evaluates the likelihood of a message being considered as “recent” and ready to be scheduled: this metric takes into account message timestamp and scheduler occupancy of messages from the issuing node.
The throughput of a node is regulated by the rate setter algorithm. This means that nodes cannot continuously generate messages in case of congestion, as they would be penalized by ICCA. We are currently working on providing estimations of the node throughput computed as a function of the access Mana vector. With a reasonable estimate, it would be possible to answer the following question: if a user (a wallet or a node) has a new message to issue, which node should be chosen so that this message is quickly added to the IOTA ledger? This research direction tries to assign new messages to nodes that are less busy at a given point in time. We formulated this challenge as an optimization problem and are testing a few proposed policies based on message creator occupancy or delay per node.
On the other hand, remember that the time necessary for a message to be delivered to all nodes is (almost) independent of the access Mana of the issuing node. This implies that the access Mana vector only regulates the rate at which new messages are created, but it does not affect the usability of the network. On average, delays are expected to be reasonably low: we estimate that, under normal conditions, messages stay in the scheduler for just fractions of a second; in heavy traffic conditions, this delay increases slightly. In the current proposal, we deal with heavy traffic by setting a minimum Mana threshold that nodes must meet in order to issue messages: trivially, if this threshold is lower, more nodes are able to issue messages and potentially create congestion. We are currently working on opportunistically adapting this threshold to the current traffic.
The Tangle could become very large indeed, and not every user will be able to keep track of its entire history back to the genesis. Therefore, local snapshots should allow nodes to safely prune their database; that is, sufficiently old data can be deleted without harming the network's security. Snapshots depend on many parts of the protocol and have a direct impact on partition tolerance, since partitions cannot be merged beyond local snapshots. They are also closely related to trustless syncing, because nodes without the entire history stretching back to genesis cannot easily verify the ledger state without an additional mechanism, such as Merkle tree or polynomial commitments. These topics are currently under research.
The IOTA 2.0 DevNet has provided our team with valuable insights on our quest to fully decentralize the IOTA network with a leaderless, feeless and permissionless distributed ledger protocol.
We have made great progress in the GoShimmer implementation by adding some of the recent improvements (especially to the consensus) and by beginning to refactor the codebase. While we are excited to see this iterative progress towards a stable network implementation, there are still several key research questions for our team to evaluate, validate and then implement in GoShimmer. We are confident that these changes will make the protocol overall more secure, simpler to implement and faster to use.
Once we have implemented all the required changes, we will upgrade the IOTA 2.0 DevNet and invite the IOTA community together with our team and research partners to participate and to experiment with the improved protocol. Based on the results and feedback from that network, our team will then work on finalizing the IOTA 2.0 specifications and begin with the development of a stable implementation in Go and Rust. This will then lead to the launch of an incentivized testnet and ultimately the “Coordicide” of the IOTA mainnet.