Anything new in rrDHT? How is it different from Kademlia DHT? Are the routing/searching algorithms different?
Hi @premjeet and welcome to the forum! Nice to see your familiar username.
The Holochain team started out with Kademlia, as you probably know, but felt like it wasn't all that suited to their neighbourhood-based design. I don't know all the details, so here is my understanding.
Here are some things that did work with Kademlia:
- Peer selection for holding data is random (based on unpredictable output of hash function), which is great for randomised peer validation
- Peers are sorted into neighbourhoods (overlapping shards) and know about more nearby peers than about faraway peers (but still keep a small list of faraway peers, which speeds lookup/discovery)
But there was just a bit of a mismatch in how Kademlia's neighbourhoods work, particularly as regards Holochain's assumptions about resilience (the peers in a neighbourhood know about each other and communicate regularly about data health). Nothing insurmountable (the Holochain-specific stuff could have been bolted on), but they started to feel like they were working against Kademlia, not with it.
rrDHT takes Holochain's agent-centric assumptions and embeds them deep into the design. The address space looks like a wheel, and (as with Kademlia) agents and data live in the same address space. Each agent has:
- a "storage arc", which is the portion of the DHT in which the agent claims to know about all data and enough peers whose storage arcs overlap
- a "query arc", which is the portion of the DHT in which the agent claims to know about enough peers whose storage arcs overlap
- a few faraway peers to speed lookup times
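To make the arc idea concrete, here's a minimal Python sketch of how storage and query arcs on a wheel-shaped address space might behave. Everything here is my own illustration, not Holochain's actual API: the address-space size, the class and method names, and the simplifying assumption that an arc extends clockwise from the agent's own location.

```python
# Hypothetical sketch of rrDHT-style arcs on a circular address space.
# All names and sizes are illustrative, not Holochain's actual code.

SPACE = 2**32  # assumed size of the circular address space


def in_arc(addr, start, length, space=SPACE):
    """True if `addr` falls within the arc beginning at `start` and
    extending `length` addresses clockwise (wrapping around zero)."""
    return (addr - start) % space < length


class Agent:
    def __init__(self, location, storage_len, query_len):
        self.location = location        # agent's own address on the wheel
        self.storage_len = storage_len  # small neighbourhood: holds all data here
        self.query_len = query_len      # large neighbourhood: knows peers here

    def holds(self, data_addr):
        # An agent claims to store all data within its storage arc.
        return in_arc(data_addr, self.location, self.storage_len)

    def can_route(self, data_addr):
        # Within the query arc it knows enough peers to resolve a lookup.
        return in_arc(data_addr, self.location, self.query_len)
```

The point of the two lengths is that each agent sizes its own arcs: a well-resourced node can claim a wide storage arc, while a smartphone claims a narrow one.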
The storage and query arcs can be thought of as small and large neighbourhoods. In the small neighbourhood, everyone knows each other well, takes care of each other, shovels snow off each other's sidewalks, watches out for burglars, etc. And each person has a different opinion of who their close neighbours are: I might extend that status to someone seven houses away, but they might not extend the same status to me because they're hardly ever home. In rrDHT terms, I might have plenty of storage space and bandwidth, while that person five houses away is just a smartphone and doesn't have many resources to give away.
The large neighbourhood is the area where I know my way around: I know the street names, and I know who lives in each house. But we don't look out for each other in the same way.
Within the small neighbourhood, agents gossip with each other to maintain resilience: if one of them goes offline, the others will try to contact slightly more distant neighbours to bring them into the small neighbourhood.
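A rough sketch of that resilience behaviour (my own illustration of the idea, with invented names, not actual rrDHT code): each agent keeps the closest online peers as its small neighbourhood, so when one goes offline, a slightly more distant peer naturally gets pulled in to replace it.

```python
# Illustrative sketch of small-neighbourhood maintenance.
# Names and data shapes are invented for this example.

SPACE = 2**32  # assumed size of the circular address space


def ring_distance(a, b, space=SPACE):
    """Shortest distance between two addresses on the wheel,
    measured in whichever direction is shorter."""
    d = (a - b) % space
    return min(d, space - d)


def refresh_neighbourhood(my_addr, known_peers, target_size, space=SPACE):
    """Keep the `target_size` closest *online* peers as the small
    neighbourhood; as peers drop offline, more distant ones move in."""
    online = [p for p in known_peers if p["online"]]
    online.sort(key=lambda p: ring_distance(my_addr, p["addr"], space))
    return online[:target_size]
```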
You can see a very oversimplified (and hence not 100% accurate) explanation here in some introductory material I'm writing: https://hackmd.io/JmPpIE0lT7qD8BYOaO6Lyw
Thanks @pauldaoust. So it's more dynamic, multi-dimensional, and like geometricDHT! It works like Chord, am I correct?
@premjeet I've never heard of geometricDHT; please share more!
rrDHT is very Chord-like in its geometry, and there are some similarities in lookup method, but rrDHT appears to me to be much more flexible, and perhaps more resilient too. Chord doesn't specify a resilience strategy, though it suggests that nodes can map the neighbourhood around them in case one of their peers becomes unreliable; rrDHT, on the other hand, is explicit about how to maintain resilience. Lookups should also be a bit faster, since nodes can search both clockwise and counterclockwise at the same time.
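As a toy illustration of the bidirectional-search point (my own sketch, not rrDHT's actual routing logic): a node can measure the distance to the target in both directions around the wheel and forward the lookup to whichever known peer is closest either way, instead of only chasing the target clockwise as in classic Chord.

```python
# Illustrative only: choose the next hop by the shorter of the
# clockwise and counterclockwise distances around the wheel.

SPACE = 2**32  # assumed address-space size


def ring_dist(a, b, space=SPACE):
    """Shortest distance between two addresses, in either direction."""
    cw = (b - a) % space
    return min(cw, space - cw)


def next_hop(target, known_peers, space=SPACE):
    """Forward a lookup to the known peer nearest the target,
    considering both directions around the wheel."""
    return min(known_peers, key=lambda p: ring_dist(p, target, space))
```

With clockwise-only routing, a target sitting just "behind" a node would force a trip almost all the way around the ring; searching both directions roughly halves the worst-case distance.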
What's rrDHT? Has this been built in-house or is it something external that is being used? All sounds very interesting, thanks…
rrDHT is Holochain's DHT implementation. It's currently being built in-house.
This was exactly what I was curious about regarding small-world networks in another post. Awesome!
What does the rr stand for?
rr = round robin
âRound Robinâ Distributed Hash Table (rrDHT)
Oh, of course! Thanks
We've also heard people calling it "roadrunner" because theoretically it's very fast
cool, look forward to hearing more about it… is it in the latest builds yet?
Not sure, to be honest. I've been trying to track networking development, and it's hard because I don't understand the language, but my general impression is that most of the networking lib is still "full-sync" (everyone gets a copy of everything) and there are a few rrDHT-ish components in R&D.
This is a good thread on rrDHT, and I thought I should add more discussion and expand on the topic. I was reading an article on the scalability of DHTs, and it lists certain "ideal benchmarks" that a truly performant, scalable, and reliable DHT should meet. Listed below are the criteria from "What defines a complete solution?", which I think are very good references for us to compare against, especially when Holochain/Holo/HoloFuel transition to live testing.
- The system should scale to tens of millions of active and reachable nodes. Routing scalability should be demonstrated through the right metrics, one of which should be look-up delay.
- The proposed algorithms should account for both reduced look-up time and reduced delivery delay. In this respect, IPFS should not only be seen as a storage system, but also as a timely content delivery network, similar in nature to present-day CDNs (without necessarily targeting similar SLAs though). Smart content replication and caching should therefore be considered in order to achieve those goals.
- The system should be able to deal with churn in the order of hundreds of thousands of nodes. When this volume of nodes leaves the system, the routing algorithm should still be able to route to the requested content item. This includes both routing stability and content replication.
- In all cases the system should be able to guarantee 100% success in content resolution, i.e., all content should be reachable at all times (even if look-up time is longer than usual).
- The system should be able to deal with a sudden increase in traffic demand, e.g., an order of magnitude more requests than normal. In dealing with increased demand, the system should be able to load-balance between nodes, i.e., not overload a few nodes while others are underutilised.
Article link: https://github.com/libp2p/notes/blob/master/OPEN_PROBLEMS/ROUTING_AT_SCALE.md
I also believe the Holochain DHT falls under the category of "structured but unmanaged" P2P networks?
I actually posed these "benchmark" questions during the AMA, and @artbrock replied to them. Here is the YouTube link for this particular segment; @LifeSMyth had taken the effort to split the 1-hour AMA videos into smaller clips by question.
Well, @artbrock did mention that every hApp will have its own DHT network. In a sense, the load will be spread out among thousands or more DHT networks across the Holochain ecosystem operating in parallel.
Nevertheless, I see that right out of the gate, Holo and HoloFuel will probably be the biggest DHT networks with the most users/nodes. It will therefore be great to see how they perform during live testing in terms of node scaling, short look-up times, short delivery delays, the ability to handle large churn, 100% success in content resolution, and the ability to deal with sudden traffic increases and load balancing.
Very interesting
Yes, looking forward to this. Very close now… the wait is almost finally over… just wish I could get my holoport working in time! Lol
is there any documentation about how it works / will work?