Ideally the tree structure embeds some domain specific granularity
Examples of granularity include:
try to find a way to create buckets of granularity
(That way) we never need to fetch all messages because we can start as deeply down the tree as is appropriate
I get it. Makes perfect sense.
Imagine a clutter-twitter app, where people post tweets, and users should be able to see the latest tweets. It has no concept of Profiles or whatever. Everyone kinda posts in a universal feed.
Given a tree-based (“Path”-based, in HDK’s terminology) solution, users who want to read last hour’s posts can simply retrieve the entry for “1/3/2021: 23.00 to 24.00” and simply do a get-all-links on it and get all the messages posted in that hour (assuming the granularity of partitioning/pagination is 1 hour).
But the problem is that not all use-cases that the end-developers ((h)app developers) have exhibit that sort of a granularity; they simply don’t fit in that domain. Thus, the tree-style solution fails them (as it once failed me)…
Imagine that fresh Ford Model 1 cars are on sale, and that buyers would have to purchase it in exchange for US Dollars. And, for this specific use-case, imagine that no single manufacturer is churning out those cars; rather people who have a 3D-printer are printing those cars (at home) and offering it for sale at an arbitrary price that they think there hard-work is worth for; this transaction has to take place on a Holochain app (Yippee!!!).
Here’s the catch: the user has absolutely no idea what a Ford Model 1 should be worth.
One can see that a tree-based solution won’t be of any help here… For example, you may store “all-cars-for-dollars” entry, and link them to “all-cars-for-0-to-100-dollars”, “all-cars-for-100-to-200-dollars”, “all-cars-for-500-to-600-dollars” and so on… Further down, one may have other subtree-roots that say “all-cars-for-100-to-150-dollars” and so on…
Now what can you, as a rational user, query to this (h)app so as to strike the best deal? Nothing! You’re absolutely hopeless due to such an architecture…
You can make a guess and search for the subtree-root (i.e., an entry) that says “all-cars-for-0-to-100”, in which case you’d just be right on the money (meaning very lucky). But what if those 3D printers costed a billion dollars each? Well then would forever be searching in the wrong places! Plus doing a get-all-links to the rootest-root isn’t an option ('cause what would be the point then; I mean, that way we’d be back to the same problem we started with!)
Imagine another architecture of organizing such diverse prices on the DHT.
There’s an entry that says “best-deal”, to which is linked the Ford Model 1 that’s up for sale for the least amount of money. That car (or rather, the entry of that car) further links to the next best deal, and so on…
The user gets() the “best-deal” entry, gets its link (tagged “next”), and guess what? He’s found the cheapest Ford Model 1!
[Replace Ford Model 1 with, let’s say, Starbucks coffee-coupons, and US Dollars with Amazon-coupons and you’ve got the Vril’s token exchange system. Just so as a side-note…]
[Note that for this use-case, new sellers might have to be placed anywhere within the list… However, for the use case of messages, this is a great approach; in fact, I’d say even better than HDK::Path based approach. The “Latest-Message” entry would, upon the arrival (i.e., the posting) of a new message, would simply delete the previous link that it held, and rather create a new one that points to this newer message, while the new message would now (and forever from then on) hold the link of the last latest message (the previous latest message, in other words). This way, a user can read the latest message, the one posted before it, the one posted before that one, and so on… Trust me, nobody reads the messages posted on “Tuesday, July 21st 2019”; they simply scroll up and up until they reach what they had been looking for. And my experience with distributed databased has taught me to always “store data the way it would be retrieved”, i.e. time-sorted “latest-first” manner, in this example.
One doubt I have is whether deleting and creating (or simply updating, as updates are nothing but sequential deletes and creates) the link on the “latest-message” entry would be an efficient thing to do, especially from a storage perspective. Because there would always be some remnant leftovers (such as the link-headers, etc)… But, from a storage-optimization perspective, your Tree solution also ends up creating a great many “roots” and “sub-roots”, such as “all-cars-for-0-to-100-dollars” which do not hold any valuable information in and of themselves… Plus, you’ve already assured me that it’s best to ignore such silly storage-costs (How exactly does holochain support micro-payments?).
As for the hotspot thing, then yeah, I would mostly be hitting the “latest-messages-between-the-a-man-and-david” entry (substitute the string with a well-formatted chat-room entry that stores the hashes of the participants and not their names; you get the idea…) because that’s what I care about (i.e., I care about the latest messages). But note that it always has just one link. Plus it (or rather, the neighbourhood of that entry) would only be receiving get() requests from me and you; not that hot a spot. Even if the whole world were in that room, the same argument would apply to, let’s say, a famous person’s tweet/message when everyone wants to read it at the same time (i.e., would incur a huge traffic influx on that neighbourhood, so as to make their device’s wifi glow red hot).