How WANdisco Enables High Availability for Distributed Ledgers

By Ramki Thurimella, Aug 13, 2021

Editor’s note: The author would like to thank Granville Barnett PhD and Yeturu Aahlad PhD for their significant contributions to this article.

This article provides an overview of recent work integrating WANdisco’s Distributed Coordination Engine (DConE) with two of the leading Distributed Ledger Technologies (DLTs). DConE has been the foundation of WANdisco’s solutions, spanning use cases such as multi-site replication of Subversion and Git repositories; Hadoop-compatible file systems, Apache Hive and Databricks; and the migration of on-premises data to cloud providers such as Microsoft Azure and Amazon Web Services.

DLTs can be used in various sectors and are gaining traction particularly in the financial sector. Here we give an overview of the integration of DConE with two popular DLTs: Hyperledger Fabric and R3 Corda.

Introduction

Bitcoin (https://bitcoin.org/bitcoin.pdf) pioneered distributed ledger technology (DLT) and set in motion a number of subsequent innovations. These include Ethereum (https://ethereum.org/en/) and its associated programming language, Solidity, for writing smart contracts. Abstractly, a DLT permits the construction of an append-only log, a digital ledger. Each item or block in this ledger is cryptographically and immutably linked with its predecessor, resulting in a blockchain. Transactions in a DLT do not need to be mediated by a central authority: participants can transact directly with each other. A key innovation of DLTs is that they provide a cryptographic means for determining (and verifying) asset provenance.
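
The linking of each block to its predecessor can be illustrated with a minimal Python sketch. It is not taken from any particular DLT: the block layout, the use of SHA-256 over a JSON encoding, and the function names are all assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # hypothetical placeholder for "no predecessor"


def block_hash(index, prev_hash, payload):
    """Hash a block's contents together with its predecessor's hash."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(ledger, payload):
    """Append a new block that commits to the hash of the current tip."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_PREV_HASH
    index = len(ledger)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    ledger.append(block)
    return block


def verify_ledger(ledger):
    """Return True only if every block's hash and back-link are intact."""
    prev_hash = GENESIS_PREV_HASH
    for index, block in enumerate(ledger):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, "Jane -> John: house")
append_block(ledger, "John -> Paul: house")
ledger_ok_before = verify_ledger(ledger)

# Tampering with an earlier block is detected by verification.
ledger[0]["payload"] = "Jane -> Mallory: house"
ledger_ok_after = verify_ledger(ledger)
```

Because each block's hash covers the previous block's hash, changing any earlier entry invalidates every later link, which is what makes provenance checkable.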

DLTs have the potential to bring new efficiencies to the financial industry. For example, they stand to offer increased transparency owing to the ease of transaction traceability. Also, faster reconciliation means quicker transaction completion time. These improvements can be achieved while operating in a highly available manner. One example use of DLT is in determining the chain of custody of agricultural produce such as cocoa to verify its fair trade compliance, or in tracking the ports that freight passed through in transit and the regional compliance requirements that resulted.

DLTs can generally be classified into two types: permissioned and non-permissioned. In the former, a governing council determines who is allowed to join its network based on trust; by contrast, any node can join a non-permissioned DLT, which strives for complete disintermediation. Bitcoin is the most popular example of a non-permissioned DLT. Another well-known DLT, Ethereum, can be configured as a permissioned or non-permissioned DLT. In a non-permissioned DLT, such as Bitcoin, blocks are appended to a hash-based proof-of-work (PoW) ledger.

A problem that exists in both permissioned and non-permissioned DLTs is agreeing on the order of blocks within the ledger. Permissioned DLTs solve it with a consensus protocol in which all nodes agree on the ordering of blocks before they are appended. In Bitcoin the order is determined by the longest chain rule, i.e. appending the newest block to the longest chain. This works because nodes that would like to add a block to the ledger are required to solve a problem that is computationally hard to solve but easy to verify once a solution is provided; the problem is built on a cryptographic hash function. The act of solving is often referred to as mining. Appending a block to a chain that is not the longest is not in a node’s interest because other nodes are likely to abandon that chain, and the node risks wasting the effort it put into mining that block. The longest chain has the most energy expended (in the form of computing resources) by the miners in constructing it. In Bitcoin, so long as honest nodes control the majority of computational resources, it is computationally impractical for adversaries to produce a ledger with a greater PoW.
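
The "hard to solve, easy to verify" property and the longest chain rule can be sketched in a few lines of Python. This is a toy model under stated assumptions: the difficulty is expressed as a number of leading zero hex digits, the function names are hypothetical, and chain selection simply counts blocks, whereas Bitcoin itself compares accumulated work.

```python
import hashlib


def pow_digest(block_data, nonce):
    """Digest of the block data combined with a candidate nonce."""
    return hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()


def mine(block_data, difficulty):
    """Brute-force search for the first nonce meeting the difficulty target."""
    target = "0" * difficulty
    nonce = 0
    while not pow_digest(block_data, nonce).startswith(target):
        nonce += 1
    return nonce


def verify(block_data, nonce, difficulty):
    """Verification needs a single hash, however long mining took."""
    return pow_digest(block_data, nonce).startswith("0" * difficulty)


def select_chain(chains):
    """Toy longest chain rule: prefer the chain with the most blocks."""
    return max(chains, key=len)


DATA = "block 1: Alice pays Bob"
DIFFICULTY = 3  # hypothetical; about 16**3 = 4096 attempts on average
nonce = mine(DATA, DIFFICULTY)
```

Mining loops over many candidate nonces, while `verify` recomputes one digest, which is the asymmetry the paragraph above relies on.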

The success of Bitcoin and Ethereum has demonstrated the practical application of DLTs and to an extent de-risked the technology, thus increasing its appeal to use cases outside of cryptocurrencies. These generalized DLTs have found applications in a variety of business-specific scenarios. Permissioned DLTs are typically deployed in environments where participants are known and there is mutual trust between the participants and the governing body of that network. For example, a logistics company may have a trusted network whose participants represent different groups within the same company. Each group would be granted appropriate permissions based on the role it plays.

A permissioned DLT requires all participants to be identifiable, often via a digital certificate (e.g. X.509) granted by a certificate authority. In a permissioned DLT, the PoW-based approach is unnecessary for the ordering of transactions and is replaced by the use of an algorithm such as Raft or Paxos (https://lamport.azurewebsites.net/pubs/lamport-paxos.pdf). Two well-known permissioned DLTs are Hyperledger Fabric (https://www.hyperledger.org/use/fabric) and R3 Corda (https://www.r3.com/corda-platform/). Fabric and Corda come with Raft implementations for their ordering and notary services, respectively.

For 15 years WANdisco has been shipping products built upon a Paxos-based Distributed Coordination Engine (DConE) library. These products address use cases such as multi-site replication of Subversion and Git repositories, the migration of on-premises data to the cloud, and keeping Hadoop Distributed File System (HDFS), Apache Hive and Databricks data consistent across on-premises, hybrid cloud, multi-region and multi-cloud environments. We decided to integrate DConE with Fabric and Corda because we believe that DConE offers distinct operational advantages over Raft. Additionally, as a company we have been at the forefront of distributed system trends, so DLTs present a natural application of our technology.

WANdisco DConE and Hyperledger Fabric

In contrast to a public blockchain such as Bitcoin, which has a single global ledger that grows with time (it is about 320 GB at the time of this writing), Fabric maintains separate ledgers, one for each grouping of entities, called a channel. Each channel represents a private subnet of communication between its members, analogous to a home network with many non-routable internal IP addresses and one external-facing IP address. The home devices communicate with the outside world using the one external-facing IP address assigned by the ISP. Similarly, within a channel, members can talk to each other, but an outsider would have to go through an external interface to get access to the private information of a channel. In this sense, Fabric's approach to security and privacy is like that of a private subnet. Note that any given organization can participate in multiple channels. Thus, the channel enables its members to transact with each other, with state updates being persisted to its ledger.

For example, in the figure below there are two channels: one consisting of {Org 1, Org 2}, which persists state changes to Ledger A, and the other consisting of {Org 2, Org 3}, with Ledger B. Here Org 2 participates in both channels and maintains two separate ledgers, A and B.

Figure 1: Fabric network with multiple channels (see Figure 5 in Demystifying Hyperledger Fabric (1/3): Fabric Architecture)
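
The two-channel arrangement described above can be modelled with a small data structure. The sketch below is purely illustrative and does not use the Fabric SDK; the `Channel` class, its `submit` method and the channel names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """A private grouping of organizations with its own ledger."""
    name: str
    members: frozenset
    ledger: list = field(default_factory=list)

    def submit(self, org, update):
        # Only members of the channel may write to (or see) its ledger.
        if org not in self.members:
            raise PermissionError(f"{org} is not a member of {self.name}")
        self.ledger.append((org, update))


def ledgers_held_by(org, channels):
    """Names of the channels whose ledgers an organization maintains."""
    return sorted(c.name for c in channels if org in c.members)


channel_a = Channel("channel-a", frozenset({"Org 1", "Org 2"}))  # Ledger A
channel_b = Channel("channel-b", frozenset({"Org 2", "Org 3"}))  # Ledger B

channel_a.submit("Org 1", "update-1")
channel_b.submit("Org 2", "update-2")
```

Here `ledgers_held_by("Org 2", [channel_a, channel_b])` lists both channels, while an attempt by Org 3 to submit to `channel_a` raises `PermissionError`.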

Updates to the channel’s ledger can be performed only by invoking chaincode, which is Fabric’s terminology for a smart contract. Chaincode can be written in various programming languages such as Go, Java and JavaScript. When chaincode is executed with the goal of modifying the state of the ledger, care is taken to ensure that chaincode executions do not interfere with each other. This is accomplished by running contracts in separate containers.

When a user is registered with a Fabric channel, a role of admin, peer, client, orderer, or member is associated with that user. As per this assignment, each user provides services expected of that role for that channel. This is analogous to a hospital management system wherein there are various roles, e.g. nurse, doctor, administrator, pharmacist etc., each conferred with different privileges based on the responsibilities they are expected to fulfill.

The users submit requests to a peer to invoke chaincode upon a specified channel. Each such request engenders an execute-order-validate cycle, as can be seen in the simplified data flow given below:

  1. The SDK submits a proposal to execute chaincode on a channel to the peers required to satisfy the channel’s endorsement policy, e.g. the chaincode is endorsed by at least two peers. Each of the respective peers executes the chaincode, records the read and write set and ensures that the execution does not invalidate any data invariants, i.e. performs a consistency check.
  2. The SDK collates the proposal responses from the respective peers and verifies that the results of the executions are the same.
  3. If the results agree, the SDK builds a transaction from them. It creates a block by including this and other such reviewed transactions. This block is then forwarded to the ordering service to be sequenced.
  4. The orderer defines the relative sequence number within the channel’s ledger to which the submitted block is assigned. The derivation of this sequence number is via the use of Raft or in our case DConE. Upon assignment of this sequence number, the transactions within the block are annotated as valid or invalid as shown in the next step.
  5. Using the transaction ordering within a block, validating peers check whether each transaction has been invalidated by an earlier transaction and mark it accordingly; this guards against what is popularly referred to as the double-spend problem. The block consisting of transactions marked as valid/invalid is appended to the ledger and broadcast to the channel members. Upon receipt of the newly appended block, nodes update their own copy of the channel’s ledger, preserving the sequencing information defined by the orderer.

Figure 2
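
The execute-order-validate cycle above can be approximated in Python as follows. This is a simplified sketch, not Fabric code: endorsement policies, signatures and block hashing are omitted, the versioned key-value state is an assumption, and `endorse`, `order` and `validate_and_commit` are hypothetical names.

```python
def endorse(state, tx_id, reads, writes):
    """Execute step: record the versions read and the values to be written."""
    read_set = {key: state.get(key, (None, 0))[1] for key in reads}
    return {"tx_id": tx_id, "read_set": read_set, "write_set": dict(writes)}


def order(transactions, next_block_number):
    """Order step: a stand-in for the ordering service (Raft or DConE)."""
    return {"number": next_block_number, "transactions": list(transactions)}


def validate_and_commit(state, block):
    """Validate step: mark transactions in block order, applying valid writes."""
    results = {}
    for tx in block["transactions"]:
        current = {key: state.get(key, (None, 0))[1] for key in tx["read_set"]}
        valid = current == tx["read_set"]  # stale reads invalidate the tx
        if valid:
            for key, value in tx["write_set"].items():
                version = state.get(key, (None, 0))[1]
                state[key] = (value, version + 1)
        results[tx["tx_id"]] = valid
    return results


# World state maps a key to (value, version).
state = {"asset1": ("owner=Alice", 1)}

# Two transactions are endorsed against the same version of asset1.
tx1 = endorse(state, "tx1", reads=["asset1"], writes={"asset1": "owner=Bob"})
tx2 = endorse(state, "tx2", reads=["asset1"], writes={"asset1": "owner=Carol"})

block = order([tx1, tx2], next_block_number=1)
results = validate_and_commit(state, block)
```

Both transactions read version 1 of `asset1`. Once `tx1` commits, the version moves to 2, so `tx2` is marked invalid; the agreed order is what makes every peer reach the same verdict.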

A unique advantage of DConE over Raft (the recommended orderer in Fabric 2.x) as the consensus implementation for the ordering service is the truly distributed nature of the underlying algorithm, in particular the absence of a strong leader per channel. Under Raft, only the leader can order the proposals; by contrast, all DConE nodes collectively perform this function in a decentralized fashion. In Fabric, each channel is represented as a distinct instance of a Raft network which models a distributed state machine (DSM), where only the leader of that DSM can order the proposals. A leader of a DSM is elected per term. Such an election may be necessitated in a number of scenarios, e.g. when the leader of the DSM is unavailable due to a fault such as a network partition or a host failure (for instance, lack of disk space).

By contrast, DConE has no notion of DSM leadership. Our integration uses the same DSM-per-channel deployment model as Raft. A key advantage of DConE over Raft is that transactions can be ordered in scenarios where the Raft-based orderer cannot actively service ordering requests. For example, under the Raft-based orderer, a leadership election will occur should the leader of a channel be considered unavailable. During the leadership election period the orderer cannot service any ordering requests. The duration of unavailability spans the detection of the channel leader being unavailable, the election of a new leader and that new leader beginning to serve ordering requests. This period is also lengthened by adverse network conditions, as leadership election involves nodes communicating with one another, e.g. to cast votes.
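
The unavailability window described above is the sum of three phases, and a rough back-of-the-envelope model makes its effect concrete. The sketch below is a toy calculation, not a model of either Raft or DConE internals; all timings are hypothetical, and `majority_available` only expresses the general majority-quorum condition used by Paxos-style protocols.

```python
def leader_outage_window(fail_at, detect, elect, resume):
    """Interval [start, end) in which a leader-based orderer cannot order."""
    return fail_at, fail_at + detect + elect + resume


def stalled_requests(arrival_times, window):
    """Requests arriving during the outage must wait for the new leader."""
    start, end = window
    return [t for t in arrival_times if start <= t < end]


def majority_available(nodes_up, cluster_size):
    """A majority quorum can still form when more than half the nodes are up."""
    return nodes_up > cluster_size // 2


# Hypothetical timings in milliseconds; real values depend on configuration
# and on network conditions during the election.
window = leader_outage_window(fail_at=1000, detect=500, elect=300, resume=50)
arrivals = list(range(0, 3000, 250))  # one ordering request every 250 ms
stalled = stalled_requests(arrivals, window)
```

With these assumed numbers the window is 850 ms long and four requests are stalled. In a leaderless design the loss of one node out of five leaves a majority in place, so `majority_available(4, 5)` holds and no election is needed.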

In practice a Raft-based orderer services multiple distinct channels. This makes the failure of a node operationally more expensive, as transactions belonging to those channels cannot be serviced while the aforementioned election is ongoing. A DConE-based orderer does not have this disadvantage as it has no notion of a strong leader, resulting in greater availability for servicing transaction ordering requests under the same category of fault. In other words, not all Raft nodes (called orderers in Fabric terminology) are equal, in the sense that the nodes elect one among them to be a leader. The leader decides in which order the peer requests should be processed.

The drawback with this approach is that if the leader fails, the remaining nodes must elect another leader, a costly proposition because leader election takes orders of magnitude more time than processing peer requests. In comparison, the Paxos specification does not require a strong leader, i.e. all nodes can be made equal in an implementation. In particular, DConE does not have the notion of a strong leader. Therefore, the question of a leader's failure followed by an election does not arise when DConE is used.

WANdisco DConE and Corda

R3’s Corda is designed to be a private distributed ledger. This is achieved by restricting transaction information to only the nodes involved. Here is an example of an asset transfer and how Corda would handle it. Consider a scenario where Paul purchases a house from John, who got the house originally from Jane, and now Paul would like to turn around and sell the house. The buyer would ask Paul to prove that the house belongs to him. Paul produces the transaction details of 〈John → Paul〉 signed by the notary assigned to that transaction. The buyer would want to go further and see proof that John got it legitimately, and asks John to produce proof of ownership. John would produce, again, the 〈Jane → John〉 transaction signed by a notary. This way the buyer would traverse the transaction chain backwards all the way to the person who constructed the house. For privacy reasons, Corda lets John keep only the 〈Jane → John〉 and 〈John → Paul〉 transactions. This is known as the "need to know" privacy policy. This way, with the exception of the parties involved, others would not have any idea about the transaction history of Paul’s house.
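
The backward traversal in the house example can be sketched as follows. This is not Corda code: the transaction records, the `notarized` flag standing in for a verified notary signature, and the assumption that the chain starts with an issuance to Jane are all hypothetical.

```python
transactions = {
    # Hypothetical issuance record: the house is first registered to Jane.
    "tx1": {"seller": None, "buyer": "Jane", "prev": None, "notarized": True},
    "tx2": {"seller": "Jane", "buyer": "John", "prev": "tx1", "notarized": True},
    "tx3": {"seller": "John", "buyer": "Paul", "prev": "tx2", "notarized": True},
}


def verify_back_chain(known_txs, tip_id, claimed_owner):
    """Walk from the latest transaction back to issuance, checking each link."""
    tx = known_txs.get(tip_id)
    if tx is None or tx["buyer"] != claimed_owner:
        return False
    while True:
        if not tx["notarized"]:
            return False  # stand-in for a failed notary signature check
        if tx["prev"] is None:
            return tx["seller"] is None  # reached the original issuance
        prev = known_txs.get(tx["prev"])
        if prev is None or prev["buyer"] != tx["seller"]:
            return False  # link unavailable, or seller never owned the asset
        tx = prev


paul_owns_house = verify_back_chain(transactions, "tx3", "Paul")

# If the party holding an intermediate transaction cannot supply it,
# the chain cannot be verified.
without_tx2 = {k: v for k, v in transactions.items() if k != "tx2"}
verifiable_without_tx2 = verify_back_chain(without_tx2, "tx3", "Paul")
```

The second call fails because a link in the chain is missing, which shows that every earlier holder's record is needed to complete the check.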

This kind of privacy is essential in the financial industry. Since Corda's founders and investors came from this sector, its design is heavily influenced by privacy. The issue with this design is that if John or Jane is unavailable during the back chain validation phase, the sale cannot go through until they are back online. In other words, the privacy requirement unfortunately leads to multiple points of failure. DConE cannot fix this problem, as it is inherent to Corda's design. However, DConE is a much better alternative to the default notary that ships with Corda.

As described above, Corda’s point-to-point communication model results in an architecture that is a departure from the channel-based architecture of Fabric. Again, in Corda only the participants involved in a flow (Corda’s terminology for a smart contract) are privy to the transaction. In Fabric, however, all members of a channel get to observe the transaction. A node in Corda hosts a distinct portion of the ledger (known as a vault) which contains only the states that the node observed as part of the point-to-point communication model. Transactions need to be notarized in order to be considered valid. For example, in a flow the participants defined by that flow sign the transaction with their respective signatures and then submit it to the notary. The notary verifies that the states referenced within the transaction have not previously been spent and then adds its own signature, thus finalizing the transaction. In Corda there are two categories of network participants: nodes and notaries. For our discussion we consider nodes as the entities that initiate a flow and the notary service (using consensus) as the entity that prevents double spends. The following simplified data flow represents the interaction between the different entities:

  1. A node initiates a flow which is then executed by the relevant nodes (or countersignatories). A flow is executed within a JVM.
  2. The initiator collates the countersignatures of the parties the flow involves and verifies the transaction signatures.
  3. The initiator sends the transaction to the notary service for notarization. The notary asserts all state references within the transaction have not been previously spent.
  4. The notary informs the initiator of the notarization. The initiator writes the state to its vault.
  5. The initiating node communicates the notarized transaction to the countersignatories of the transaction so they can write the state to their respective vaults.

Figure 3. Transaction flow in Corda
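
The notarization step in the flow above amounts to a uniqueness check over consumed state references. The Python sketch below is an in-memory stand-in with hypothetical names (`Notary`, `notarize`, the `house#0` reference); in a real deployment the table of spent inputs would be replicated across notary nodes by a consensus protocol rather than held in a single dictionary.

```python
class Notary:
    """Toy uniqueness service that tracks which input states are consumed."""

    def __init__(self):
        # Maps a state reference to the id of the transaction that spent it.
        self.spent = {}

    def notarize(self, tx_id, input_refs):
        """Return (accepted, conflicts) for a transaction's input states."""
        conflicts = {
            ref: self.spent[ref]
            for ref in input_refs
            if ref in self.spent and self.spent[ref] != tx_id
        }
        if conflicts:
            return False, conflicts  # at least one input was already spent
        for ref in input_refs:
            self.spent[ref] = tx_id  # record inputs with the transaction id
        return True, {}


notary = Notary()
first = notary.notarize("txA", ["house#0"])   # John sells the house to Paul
second = notary.notarize("txB", ["house#0"])  # John tries to sell it again
retry = notary.notarize("txA", ["house#0"])   # resubmission of the same tx
```

The second request is rejected because `house#0` was already consumed by `txA`, while resubmitting `txA` itself is treated as idempotent in this sketch.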

Corda is bundled with two choices for the notary node type for production use. The crash-resistant option is Raft. Our integration provides DConE as an alternative to Raft. The notary service uses consensus to replicate a database of spent inputs. If there are no conflicts in the notary’s database then the transaction’s state references are recorded along with the transaction identifier. DConE is used to replicate the database. As with the Fabric integration, the benefit over Raft shows when a fault leaves the Raft leader unavailable: a DConE-based notary suffers from no such problem. Fabric benefits from the use of a widely used Raft library (etcd/raft, https://github.com/etcd-io/etcd/tree/master/raft) that is used by a number of high-profile open-source projects such as etcd (https://github.com/etcd-io/etcd) and CockroachDB (https://www.cockroachlabs.com/). By contrast, Corda uses a less widely used implementation of Raft (https://github.com/atomix/atomix), which is perhaps the reason R3 designates it as experimental. DConE has been used by Fortune 100 companies for a number of years, so we believe our integration offers more than just another notary consensus provider.

Conclusion

WANdisco has integrated DConE with a number of high-profile technologies over the last 15 years, spanning source code control, HDFS, cloud and data migration. A natural application of our IP was the integration with permissioned DLTs, which are showing signs of adoption beyond the transacting of currencies like Bitcoin. We have integrated DConE with two popular DLTs, Hyperledger Fabric and R3 Corda, thus offering a solution for enterprises that require high availability of their distributed ledgers.

Ramakrishna “Ramki” Thurimella PhD

VP of Research & Development, WANdisco

Ramki conducts research and development at WANdisco, building patentable technology; conceptualizing, designing, and architecting new products; and authoring technical articles. His technical expertise is in the areas of algorithms and data structures, information security, and distributed systems. He has (co)authored over 50 technical articles in top-tier conferences and journals. He previously served as the Chair of Computer Science and the Director of Cybersecurity Center at the University of Denver. Ramki has a PhD in Computer Science from the University of Texas at Austin and a Master of Technology from the Indian Institute of Technology, Madras.
