I’m trying to make a Zome that enforces uniqueness for an entry type, like “username”. I was thinking that I could use the validation to create an entry for each username, and only allow a single connection of type “user_metadata” to that Entry, and then in validation I’d use hdk::query to see how many connections exist, and if length > 1: err("already exists").
Does that make sense, or are there issues with relying on the validation to enforce uniqueness? I could imagine this causing an issue if two different users do this at the same time, but I would imagine that it would eventually become consistent.
So query will only check your local chain, which means that this validation rule will inconsistently pass or fail depending on whether or not you have the entry in your chain. It’s not really possible to validate on the existence of an entry.
There are a few things that might be getting confused here:
“enforces uniqueness for an entry type”
All entries that are the same will have the same hash and therefore be the same entry. It’s not possible for two entries that contain the same value to be different.
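To make the "same content, same hash" point concrete, here is a minimal sketch of a content-addressed store. It is not Holochain code: `Store`, `put`, and `content_address` are made-up names, and `DefaultHasher` is only a dependency-free stand-in for the cryptographic hash a real DHT would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a content address.
/// A real DHT would use a cryptographic hash; DefaultHasher just keeps
/// this sketch free of external crates.
fn content_address(entry: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    entry.hash(&mut hasher);
    hasher.finish()
}

/// Toy content-addressed store: the key is derived from the content itself.
struct Store {
    entries: HashMap<u64, String>,
}

impl Store {
    fn new() -> Self {
        Store { entries: HashMap::new() }
    }

    /// Storing identical content twice lands on the same address,
    /// so the second call adds nothing new.
    fn put(&mut self, entry: &str) -> u64 {
        let address = content_address(entry);
        self.entries
            .entry(address)
            .or_insert_with(|| entry.to_string());
        address
    }
}
```

Calling `put("ladybug123")` twice returns the same address both times and leaves a single entry in the store, which is why "two different entries with the same value" cannot exist.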
“use the validation to create an entry”
You don’t want to create anything in validation. Validation needs to be deterministic, so that everyone gets the same result given the same entry. You can’t have side effects in this function.
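A rough sketch of the difference, in plain Rust rather than against the HDK. Both function names and the length/character rules are invented for illustration. The first check depends only on the entry, so every validator agrees; the second depends on whatever the local node happens to hold, so two nodes can disagree about the same entry.

```rust
/// Deterministic: the result depends only on the entry being validated.
/// The 3..=20 length rule and allowed characters are hypothetical.
fn validate_username(username: &str) -> Result<(), String> {
    let len = username.chars().count();
    if len < 3 || len > 20 {
        return Err("username must be 3 to 20 characters".to_string());
    }
    if !username.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        return Err("username may only contain letters, digits and '_'".to_string());
    }
    Ok(())
}

/// Not deterministic across nodes: the result depends on what this
/// particular node has in its own local chain.
fn validate_against_local_chain(username: &str, local_chain: &[String]) -> Result<(), String> {
    if local_chain.iter().any(|existing| existing == username) {
        Err("already exists".to_string())
    } else {
        Ok(())
    }
}
```

Given one node whose local chain contains `ladybug123` and another whose chain is empty, the second function fails on the first node and passes on the other, while `validate_username` gives both the same answer.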
I’m a little confused about your use case. Are you trying to enforce a unique username for each user?
I think once I understand what your goal is I might be able to help a bit better.
@freesig I’ve definitely come across this use case; in fact username registration is the classic scenario we use when we talk about problems with eventual consistency in a distributed system.
It’s true that entries in a DHT are guaranteed to be unique (it’s called deduplication), but each entry has its own mini-store of metadata. That metadata is used to store author signatures, links, and updated/deleted status. It can have as many author signatures and links as you like. And each of those records is also guaranteed to be unique (that is, if an agent commits the same entry twice, their author signature only appears once). But sometimes you only want one author signature, or one link (and you definitely only want one deleted/updated status; it doesn’t make sense for an item to be both deleted and updated).
So uniqueness can mean two different things: it means that a validator node only stores identical entries once, but the sense you’re looking for @rlkel0 is that it should only have one unique author.
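The two senses of uniqueness can be shown with a toy model. This is only a sketch: `Dht`, `publish`, and the plain string author ids are assumptions for illustration, not how Holochain actually stores metadata.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy model: each entry (keyed here by its content) carries a set of authors.
#[derive(Default)]
struct Dht {
    authors: BTreeMap<String, BTreeSet<String>>,
}

impl Dht {
    /// Publishing deduplicates twice: once on the entry, once on the author.
    fn publish(&mut self, entry: &str, author: &str) {
        self.authors
            .entry(entry.to_string())
            .or_default()
            .insert(author.to_string());
    }

    /// Sense 1: how many distinct entries are stored.
    fn entry_count(&self) -> usize {
        self.authors.len()
    }

    /// Sense 2: how many distinct authors have claimed a given entry.
    fn author_count(&self, entry: &str) -> usize {
        self.authors.get(entry).map_or(0, |set| set.len())
    }
}
```

If Alice publishes `ladybug123` twice and Carol publishes it once, `entry_count()` is 1 but `author_count("ladybug123")` is 2. The first kind of uniqueness comes for free; the second is the one a username registry needs.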
Here’s an explanation of why it’s hard to do in a decentralised system:
Alice creates a username entry containing the string ladybug123 and publishes it to Bob, the most likely of her peers to be chosen as a validator for that entry. Bob stores it and starts gossiping it to his neighbours.
At the same time, Carol creates the same username entry, but she’s on the other side of the DHT and has a different set of peers. She sends it off to Dave, who starts gossiping it to his neighbours.
Here’s a problem: Bob and Dave have a couple of neighbours between them, and they’ve validated, stored, and gossiped that entry before they get news from each other. So they’ve both validated, stored, and gossiped the same entry with different authors.
As soon as they get each other’s copy, they realise there’s a conflict. They can’t revalidate and fail the entry based on information they didn’t have at validation time, and besides, who should they have picked? Alice or Carol?
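The Alice/Carol race can be simulated in a few lines. This is a simplified sketch with invented names (`Validator`, `validate_and_store`, `receive_gossip`); each validator tracks a single username and decides using only what it has seen so far.

```rust
use std::collections::BTreeSet;

/// A validator's view of one username: the authors it has stored so far.
#[derive(Default, Clone)]
struct Validator {
    seen_authors: BTreeSet<String>,
}

impl Validator {
    /// Accept a claim if, as far as this node knows, nobody else holds the name.
    fn validate_and_store(&mut self, author: &str) -> bool {
        let ok = self.seen_authors.is_empty() || self.seen_authors.contains(author);
        if ok {
            self.seen_authors.insert(author.to_string());
        }
        ok
    }

    /// Gossip arrives later and merges in what the other validator already accepted.
    fn receive_gossip(&mut self, other: &Validator) {
        self.seen_authors.extend(other.seen_authors.iter().cloned());
    }

    /// More than one author for the same username means a conflict.
    fn has_conflict(&self) -> bool {
        self.seen_authors.len() > 1
    }
}
```

If Bob accepts Alice's claim and Dave accepts Carol's before either hears from the other, both calls return `true`. Once they exchange gossip, both report `has_conflict()`, and nothing in their local validation step could have prevented it.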
So this shows the need for conflict resolution as @freesig mentioned. This is a future feature of Holochain, and here’s the little that I understand of it: an application is able to enforce a constraint like “entries of type x can only have one author (or some other metadata field)” and then define a resolution protocol, one of:
Handle the conflict automatically with a function (e.g., oldest timestamp wins)
Send off a signal to both authors to trigger a human-directed resolution protocol (maybe an auction for the username, or a request to register a new username)
In either case, there’s a period of time when the entry hasn’t fully ‘saturated’ the DHT (that is, there aren’t enough nodes holding it for it to be considered confirmed) so it’s in an indeterminate state. From a UI point of view this could look like a message that says “your username is pending registration”.
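Since conflict resolution is described here as a future feature, there is no real API to show. Purely as an illustration of the automatic option (oldest timestamp wins), a resolution function might look something like the sketch below, where `Claim` and `resolve` are hypothetical names. The tie-break on author id is an added assumption so that every node picks the same winner even when timestamps are equal.

```rust
/// Hypothetical record of one author's claim on a username.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Claim {
    author: String,
    timestamp: u64,
}

/// Oldest timestamp wins; equal timestamps fall back to comparing author ids,
/// so the result does not depend on the order in which claims arrived.
fn resolve(claims: &[Claim]) -> Option<&Claim> {
    claims.iter().min_by(|a, b| {
        a.timestamp
            .cmp(&b.timestamp)
            .then_with(|| a.author.cmp(&b.author))
    })
}
```

Because the function only looks at the claims themselves, Bob and Dave would reach the same answer no matter which copy they received first. It does assume the timestamps can be trusted, which a real design would have to address.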
Interesting, conflict resolution would open a lot of doors. Is there a roadmap or issue tracking when this feature will be introduced or is it more of a long term goal?
My current idea is to bind the application to a blockchain, so things like the username registry could be managed by a smart contract. In a way it feels like cheating though, so I’m trying to see if there’s a way to avoid this.
One other idea is that users can choose a username, register a metadata entry for it, and then link all their posts or whatever to that metadata entry. So let’s say it’s a social network: people will link to the metadata that is linked to your username. Then the most connected metadata entry for a username would be the established one. So if I register the username “pauldaoust” and then you register it later on, but you become more popular with that username, you will become the dominant “route” from that username because you have more reputation.
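A rough sketch of the "most connected wins" rule, assuming the client has already counted the links pointing at each metadata entry for a username. The function name, the map shape, and the tie-break rule are all made up for illustration.

```rust
use std::collections::BTreeMap;

/// Given metadata id -> number of inbound links (all for the same username),
/// pick the most linked one. Ties go to the lexicographically smallest id
/// so that every client resolves the username the same way.
fn dominant_metadata(link_counts: &BTreeMap<String, usize>) -> Option<&str> {
    link_counts
        .iter()
        .max_by(|a, b| a.1.cmp(b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(id, _)| id.as_str())
}
```

With counts of 3 for one metadata entry and 10 for another, the second is the one the client would resolve the username to, regardless of which was registered first.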
So that would use hdk::get_links with a connection between the username and the user metadata, and the client could default to the one with the most likes. I think there could be a difficulty requirement as well, where you need to provide a hash with a certain number of leading zeros, to prevent users from claiming too many usernames.
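The difficulty requirement could look roughly like the following. This is a sketch under loud assumptions: `claim_hash`, `meets_difficulty`, and `mine` are invented names, difficulty is counted in leading zero bits, and `DefaultHasher` is not a cryptographic hash, so a real scheme would need something like SHA-256 instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash of a username claim plus a nonce.
/// DefaultHasher is a non-cryptographic stand-in used only for illustration.
fn claim_hash(username: &str, agent: &str, nonce: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (username, agent, nonce).hash(&mut hasher);
    hasher.finish()
}

/// True if the hash starts with at least `zero_bits` zero bits.
fn meets_difficulty(hash: u64, zero_bits: u32) -> bool {
    hash.leading_zeros() >= zero_bits
}

/// Brute-force search for a nonce that satisfies the difficulty.
/// Expected work roughly doubles with every extra zero bit.
fn mine(username: &str, agent: &str, zero_bits: u32) -> u64 {
    let mut nonce = 0u64;
    while !meets_difficulty(claim_hash(username, agent, nonce), zero_bits) {
        nonce += 1;
    }
    nonce
}
```

The claimant would run `mine` once and publish the nonce with the claim. Checking it only needs `claim_hash` and `meets_difficulty` on data contained in the claim, so unlike a uniqueness check it is the kind of rule every validator can evaluate identically.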
That is a very interesting UX idea; I’m intrigued!
An alternate model is Secure Scuttlebutt, which lets people assign any old username to themselves or others. Sure, there are lots of clashes, but the real world tolerates name clashes without a lot of problems.