GroveDB Sum Trees
One of the innovative features of Dash Platform that we at Dash Core Group are excited to present is a GroveDB feature called sum trees. GroveDB sum trees are Merkle AVL trees whose root nodes hold the sum of all integer values in the tree. Any time a value is added, removed, or updated in a sum tree, every parent node and hence the root’s “sum value” is updated. Sum trees are a very powerful tool for trustlessly and efficiently retrieving the sum of a set or range of values, as it would take a relatively large, often enormous, amount of computing resources to achieve otherwise. In turn, vital security checks are made highly efficient and simple, and a whole new arena of use cases is opened up for dApps.
Architecture
Sum trees have a simple architecture. In addition to the hashes of regular Merkle tree nodes, each node in a sum tree holds a “sum value” and an optional “sum item”. A sum item is a key-value pair where the value is a signed integer. It can represent anything that has an integer value, such as an identity’s credit balance or the number of unsold tickets remaining for a concert. A sum value is the value of a node’s own sum item plus the sum values of its child nodes.
As illustrated in the diagram above, the sum tree leaf nodes’ sum items are equal to their sum values. There’s nothing to sum. Moving up a layer, each parent node has their own sum item (which is optional, as will be explained next) and a sum value, which is the sum of their child nodes’ sum values plus their own sum item. This structure repeats itself from the leaves up to the tree’s root node, where the entire tree’s sum value is held.
Note that all sum tree nodes hold sum values but don’t need to hold a sum item. In other words, you can still hold any type of element in the tree, such as a key-value pair with a string value, a non-sum subtree, or even an integer value that isn’t a sum item. These elements won’t affect the sum value of the tree. You’re therefore able to collect sums in the same tree you use to prove the existence of non-sum elements with Merkle proofs.
Application
The general use case for sum trees is to trustlessly and efficiently retrieve the sum of a set or range of values. Developers can implement sum trees in their projects built on Dash Platform or any other platform which implements GroveDB. This section will walk through some example use cases.
The Dash Platform engine itself uses sum trees to store the balances of all identities and reward distribution pools, and constantly checks them to ensure there are no inflationary or other bugs present. This is actually a huge innovation for blockchains, as inflationary bugs have been rather common and destructive to blockchains in the past. These checks provide a strong layer of security against them. Dash Platform runs these sum tree checks on every block submitted to consensus, and any which don’t pass are prevented from entering the state.
To illustrate the massive efficiency gains sum trees enable, say you have a tree which contains the number of restaurants that accept Dash in each city of a US state. The root of the tree would be the total number of Dash-accepting restaurants in the state. If you’re a true Dash fanatic, you could compare the root sum value of that state’s tree with those of all the other states to help you determine which state to live in. Say that every city in the US has at least one restaurant that accepts Dash. With sum trees, to get this information, you would only need to query the root node of each state’s tree (50 states), verify the proofs, and compare the values. With regular Merkle trees, you’d need to query every city in every state (over 100,000 cities), verify the proofs for each city, add the values up yourself for each state, and then compare.
The efficiency savings in the example above are on the order of 2,000 times. Two other real-world applications where sum trees would be beneficial are:
- Proof of solvency: a topic which has been under much discussion again lately with the recent collapse of FTX. An exchange could store their users’ balances in a sum tree along with their hashed usernames. Users could be provided with Merkle sum proofs showing that their balances are correctly included as part of the total. To prove that they hold at least enough assets to cover the sum tree’s root value, the exchange could then do something like periodically sending a specified amount to a specified address at a specified time. While the current proof system in GroveDB would present some privacy leaks under this design, the future addition of more advanced zero-knowledge proof constructions will make this a very powerful application.
- Custom tokens: Dash Platform enables developers to easily create custom tokens for their data contracts with the address balances automatically stored in a sum tree. Each node of the sum tree holds an address’s balance as its sum item, and the root node can be used to check for bugs. It doesn’t only add an extra layer of security to the token contract, but it’s also a very efficient method of keeping track of address balances compared to current methods.
Looking Forward
Sum trees open the door to a whole new arena of security measures and use cases that trustless systems and dApps can take advantage of. They’re just another example of innovations GroveDB is bringing to the field of blockchain databases, and we at DCG are excited to show off the efficiency and security that Dash Platform will be able to provide utilizing tools like this. For future releases, we are investigating the integration of Verkle trees and zero-knowledge proof constructions, which have the potential to make sum tree applications even more powerful and versatile. For more details on sum trees, stay tuned for a Dash Improvement Proposal coming soon, written by our CTO Samuel Westrich and GroveDB developer Wisdom Ogwu.