Problem
Users need to find data in a DHT (see Anchor), but in some situations you don’t know what sort of anchor taxonomy you need to create at application design time so Symbol Anchor can’t be used. Examples include knowledge bases, category hierarchies, store departments, company directories, and address books that allow users to create custom fields.
Solution
Define three types of data that allow a tree of anchors to be constructed over time:
- An entry type that holds the anchors,
- A link type that links from one anchor to another, and
- A link type that links from an anchor to the data people are interested in.
Users can then find interesting information by traversing the tree, starting at its root node.
The naïve way to do this is with simple strings: each anchor is just a label, it links to its children, and in turn its children link to it. But this introduces a problem: What if two nodes in different parts of the tree have the same label? The DHT de-duplicates them into the same entry, smashing all their parent/child links together. This gets particularly painful if a node shares a label with one of its descendants: you get a cyclical graph, which causes infinite loops if anyone tries to build the whole tree at once.
You can solve this easily by making an each anchor a tuple of its string value and its parent’s hash. That way, two anchors with the same value in different parts of the tree will be completely different entries with different hashes. Here’s how it might look in a simple product category hierarchy. The hash of each anchor is shown after its value.
-
("", null)
→95f071
(root anchor)-
("Fruits", "95f071")
→f863f6
-
("Fresh", "f863f6")
→f0b73f
-
("Dried", "f863f6")
→149323
-
-
("Vegetables", "95f071")
→23920f
-
("Fresh", "23920f")
→e9a0e1
-
("Dried", "23920f")
→dd9148
-
-
("Nuts and seeds", "95f071")
→60061b
-
("Nut butters", "60061b")
→38effa
-
-
Each anchor has a link to each its children, whereas the connection from a child to its parent is explicit in its content. You can see how this simple structure lets you build a tree without clashes.
Warnings
-
As with all types of Anchor, consider how many children any anchor might have. Watch out for DHT hot spots.
-
Because relationship data is duplicated, you have to do a bit of your own work to ensure referential integrity. A child always links explicitly to its parent, but the only connection a parent has to its child is via a link. That means that, if an instances crashes after creating a child but before linking from the parent to the child, that integrity is broken. In the future, Holochain will allow you to create atomic batches of commits, but in the meantime you may want to create code that scans an agent’s source chain on startup and creates any missing links.
-
Building the entire tree, or even a single path through the tree, can get expensive. You may want to create extra links to improve lookup times:
- Descendant links that allow you to retrieve children and grandchildren in one query.
- Ancestry links that allow you to retrieve the entire path from a node to its root.
See the above warning regarding referential integrity.