Anchor Tree

Problem

Users need to find data in a DHT (see Anchor), but in some situations you don’t know what sort of anchor taxonomy you need to create at application design time so Symbol Anchor can’t be used. Examples include knowledge bases, category hierarchies, store departments, company directories, and address books that allow users to create custom fields.

Solution

Define three types of data that allow a tree of anchors to be constructed over time:

  • An entry type that holds the anchors,
  • A link type that links from one anchor to another, and
  • A link type that links from an anchor to the data people are interested in.

Users can then find interesting information by traversing the tree, starting at its root node.

The naïve way to do this is with simple strings: each anchor is just a label, it links to its children, and in turn its children link to it. But this introduces a problem: What if two nodes in different parts of the tree have the same label? The DHT de-duplicates them into the same entry, smashing all their parent/child links together. This gets particularly painful if a node shares a label with one of its descendants: you get a cyclical graph, which causes infinite loops if anyone tries to build the whole tree at once.

You can solve this easily by making an each anchor a tuple of its string value and its parent’s hash. That way, two anchors with the same value in different parts of the tree will be completely different entries with different hashes. Here’s how it might look in a simple product category hierarchy. The hash of each anchor is shown after its value.

  • ("", null)95f071 (root anchor)
    • ("Fruits", "95f071")f863f6
      • ("Fresh", "f863f6")f0b73f
      • ("Dried", "f863f6")149323
    • ("Vegetables", "95f071")23920f
      • ("Fresh", "23920f")e9a0e1
      • ("Dried", "23920f")dd9148
    • ("Nuts and seeds", "95f071")60061b
      • ("Nut butters", "60061b")38effa

Each anchor has a link to each its children, whereas the connection from a child to its parent is explicit in its content. You can see how this simple structure lets you build a tree without clashes.

Warnings

  • As with all types of Anchor, consider how many children any anchor might have. Watch out for DHT hot spots.

  • Because relationship data is duplicated, you have to do a bit of your own work to ensure referential integrity. A child always links explicitly to its parent, but the only connection a parent has to its child is via a link. That means that, if an instances crashes after creating a child but before linking from the parent to the child, that integrity is broken. In the future, Holochain will allow you to create atomic batches of commits, but in the meantime you may want to create code that scans an agent’s source chain on startup and creates any missing links.

  • Building the entire tree, or even a single path through the tree, can get expensive. You may want to create extra links to improve lookup times:

    • Descendant links that allow you to retrieve children and grandchildren in one query.
    • Ancestry links that allow you to retrieve the entire path from a node to its root.

    See the above warning regarding referential integrity.

1 Like

hey @pauldaoust… this is extremely well described and thought out… thankyou. I would like to implement this pattern for a happ … if there is any existing code that would be great… would be great if there were links to implementations of these patterns… a “zome library” to stop people reinventing the wheel… i am found of the node package ecosystem because 9/10 times someone has done what i need already and i can create apps much faster.

1 Like

@nphias I thoroughly agree. I suspect this forum will become a hub for devs from the community and the team to collaborate on building drop-in libraries that implement this pattern. Something similar to this anchor tree idea is going to be built soon, although it only supports two levels of hierarchy.

Hi there! Got a quick question -

  • A link type that links from one anchor to another, and
  • A link type that links from an anchor to the data people are interested in.

Do these links differ between Explicit & Implicit links?

Implicit vs explicit links is a term I’ve heard people using for actual DHT links vs an entry that has a property that contains an Address, but I can’t remember which is which. But let’s assume that children can’t have multiple parents…

  • A parent linking to multiple children (whether they’re anchors or people) would have to use proper Holochain links, with a defined link type and everything. They’re good for 1-to-many relationships.
  • A child linking to a single parent, regardless of whether the child is an anchor or a person, would have a parent_address field baked into it, because it’s a 1-to-1 relationship.

A note about the second point is that the entry will have to be updated (which means it’ll live at a new address) any time it moves to another parent.

1 Like

See this for a full explanation: https://holochain-open-dev.github.io/blog/implicit-explicit-links/.

Maybe implicit links was not the best term I could have come up with… Maybe implicit reference was better.

I was thinking that maybe “native” or “domestic” link for what comes with Holochain (explicit links). I do want to be mindful of cultural sensitivities. What do you think?

1 Like