Hdk::get_links returning the same address more than once in testing

When I’m testing my hApp locally, I’ll commit the same entry 3 or 4 times, and then when I call get_links, I get the same entry back 3 or 4 times. Shouldn’t get_links dedupe this? Or is there something that needs to happen to commit entries? I believe it’s because everything is in memory, but I wanted to see what I would need to do to get a more realistic test environment.

I believe this is because it’s a testAgent and storage is memory in my config, but I’d prefer to not have to add a dedupe step to get more realistic responses from my queries. Or is there something you need to do to your ZomeApiResult objects to ensure unique entries?

[[agents]]
id = 'hc-run-agent'
keystore_file = 'testAgent'
name = 'testAgent'
public_address = 'XXX'
test_agent = true

[[dnas]]
file = '/workspace/holochain/dist/holochain.dna.json'
hash = 'XXXX'
id = 'hc-run-dna'

[instances.storage]
type = 'memory'

Actually, a new header is added each time you commit an entry, even if the entry is the same. See this issue: https://github.com/holochain/holochain-rust/issues/1979
I’ll check what get links does but this could be why you’re getting multiple entries back.
Is there any reason why you need to keep committing the entry?

Well, I’m just trying to emulate user behavior. Like, if I wanted to have users and allow them to follow each other… I had assumed since it’s a hash table that following a user twice would be idempotent. But it seems to create the link twice. This means a user could click “follow” 1,000 times and create 1,000 links, which would obviously cause bloat and would be annoying for the network as a whole.

Anyways, I will check out that issue, thanks for the quick response!

I think what you could do is check whether the entry exists, commit it only if it doesn’t, and then link from the same address. That shouldn’t add a link, but I’ll have to confirm this. I’ve also been playing around with these ideas with the anchor crate. I’d like to get this super well documented and suggest a good pattern, as you’re right that it would be a common thing for an entry or link to be committed more than once.

Will get back to this when I’m at a computer.

So I think some utilities could be added to the SDK for this. For example:

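// Assumes Entry, Address, AddressableContent and ZomeApiResult are in scope.
pub fn get_or_create(entry: Entry) -> ZomeApiResult<Address> {
    // The address is derived from the entry's content.
    let address = entry.address();
    // Commit only if the entry can't already be found.
    if hdk::get_entry(&address)?.is_none() {
        hdk::commit_entry(&entry)
    } else {
        Ok(address)
    }
}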

The issue with this is that if you are connected to another node that has the entry, you will never store it locally.
I’ve been trying to come up with a solution that quickly checks our local chain before making the network call.
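To make the "check the local chain first" idea concrete, here is a toy model in plain Rust. It does not use the HDK at all: ToyAgent, local_chain and network are hypothetical stand-ins, and real addresses would be hashes rather than strings. It only sketches the decision of committing whenever the entry is missing locally, regardless of whether a peer already holds it.

```rust
use std::collections::HashSet;

// Toy model only: these types are hypothetical, not part of the HDK.
#[derive(Default)]
struct ToyAgent {
    local_chain: HashSet<String>, // addresses this agent has committed itself
    network: HashSet<String>,     // addresses held by at least one peer
}

impl ToyAgent {
    /// Commits the address locally unless it is already on the local chain.
    /// Returns true if a new local commit was made.
    fn get_or_create_local(&mut self, address: &str) -> bool {
        // Look at the local chain only; a copy held by a peer does not count.
        if self.local_chain.contains(address) {
            return false;
        }
        self.local_chain.insert(address.to_string());
        // Publishing is modelled as adding the address to the shared set.
        self.network.insert(address.to_string());
        true
    }
}
```

In this sketch an entry that a peer already holds still gets stored locally on the first call, and a second call is a no-op.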

This is surprising to me too. When you create a link, three things are happening:

  1. A LinkAdd entry is added to your source chain.
  2. The LinkAdd entry is published to the DHT, along with its header.
  3. The nodes holding the link’s base entry are instructed to add a link metadata to the entry, containing roughly the same content as the LinkAdd entry.
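The three steps above can be sketched as a toy model in plain Rust. None of these types are Holochain’s real data structures: ToyNode, LinkAdd and the field names are made up for illustration, and the link metadata is modelled as a plain list, which is an assumption chosen to match the repeated results reported in this thread.

```rust
use std::collections::HashMap;

// Toy model only: hypothetical types, not Holochain's real ones.
#[derive(Clone, Debug, PartialEq)]
struct LinkAdd {
    base: String,
    target: String,
    link_type: String,
}

#[derive(Default)]
struct ToyNode {
    source_chain: Vec<LinkAdd>,              // step 1: local, append-only
    published: Vec<LinkAdd>,                 // step 2: entries sent to the DHT
    link_meta: HashMap<String, Vec<String>>, // step 3: base -> list of targets
}

impl ToyNode {
    fn link_entries(&mut self, base: &str, target: &str, link_type: &str) {
        let link = LinkAdd {
            base: base.to_string(),
            target: target.to_string(),
            link_type: link_type.to_string(),
        };
        // 1. Record the LinkAdd on the source chain.
        self.source_chain.push(link.clone());
        // 2. Publish the LinkAdd.
        self.published.push(link);
        // 3. Attach link metadata to the base (a list here, so repeats are kept).
        self.link_meta
            .entry(base.to_string())
            .or_default()
            .push(target.to_string());
    }

    // Returns every target recorded for the base, ignoring link type for brevity.
    fn get_links(&self, base: &str) -> Vec<String> {
        self.link_meta.get(base).cloned().unwrap_or_default()
    }
}
```

Calling link_entries twice with the same arguments leaves two records at every step in this model, so get_links returns the target twice.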

@rlkel0 are you saying that, when you commit three links of the same type from the same base to the same target, then call get_links, you get the target back three times? If so, I think that might be incorrect behaviour and different from #1979 (although maybe it’s caused by #1979).

I’d have expected that the LinkAdd entry is stored multiple times on the source chain (with separate headers each time), but deduped in storage — in the user’s local storage, as well as on the DHT and in the links metadata on the base. My understanding is that the links collection on each entry is like a mini-DHT — a separate hash-addressed storage for each entry.

In other words, committing the same link twice isn’t idempotent in reality — it shows up as two identical entries on your source chain, with different headers — but in practice is almost idempotent because the link gets deduped (ignoring the fact that extra headers are created). Or at least that’s what should happen, and it sounds like it isn’t.

I can make a quick .mov tomorrow. I think it’d be easier to explain with a demonstration.

I made a .mov but this kind of shows what happens. I basically set up a few anchors based off of a zome I found online, and was working to modify it to make a throwaway DHT.

Here’s the javascript:

function create_username() {
  console.log('creating username');
  holochain_connection.then(({callZome, close}) => {
    callZome('test-instance', 'hello', 'create_username')({
      name: 'test',
    }).then(result => console.log(result)).catch(
      err => console.log(err)
    );
  })
}
function retrieve_usernames() {
  var address = "HcScjN8wBwrn3tuyg89aab3a69xsIgdzmX5P9537BqQZ5A7TEZu7qCY4Xzzjhma"
  holochain_connection.then(({callZome, close}) => {
    callZome('test-instance', 'hello', 'retrieve_usernames')({
    }).then(result => console.log(result));
  });
}

and the zome logic is basically:

#[zome_fn("hc_public")]
pub fn create_username(name: String) -> ZomeApiResult<Address> {
    let anchor_entry = Entry::App(
        USERNAME_ENTRY.into(),
        Username {
            name
        }.into()
    );
    let address = hdk::commit_entry(&anchor_entry)?;
    hdk::link_entries(&username_root().unwrap(), &address, ROOT_USERNAME_LINK, "")?;
    Ok(address)
}

and the get usernames function calls:

pub(crate) fn get_usernames() -> ZomeApiResult<Vec<Username>> {
    let root_username_entry = Entry::App(
        ROOT_USERNAME_ENTRY.into(),
        RootAnchor {anchor_type: ROOT_USERNAME_ENTRY.into()}.into()
    );
    let root_username_entry_address = root_username_entry.address();
    hdk::utils::get_links_and_load_type(
        &root_username_entry_address,
        LinkMatch::Exactly(ROOT_USERNAME_LINK),
        LinkMatch::Any
    )
}
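Until the link behaviour is clarified, one workaround is to dedupe on read. This is only a sketch in plain Rust: dedupe_preserving_order is a hypothetical helper, not an HDK function, and using it on the Vec<Username> above assumes Username derives Clone, Eq and Hash.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Keeps the first occurrence of each item and preserves the original order.
fn dedupe_preserving_order<T: Clone + Eq + Hash>(items: Vec<T>) -> Vec<T> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        // HashSet::insert returns false when the value was already present.
        .filter(|item| seen.insert(item.clone()))
        .collect()
}
```

Under those assumptions, get_usernames could map its Ok result through this helper before returning.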

Let me know if you have any more questions. It didn’t feel novel enough to set up a repo, but if you’re curious I’d be happy to.

Thanks; the screenshot and function defs are enough to give me a picture of what’s happening. So this shows me that links are indeed not getting deduped, and this surprises me. I’ll try to bring it up with core team today to get clarification on whether this is intended behaviour; maybe it fits into the related issue #1979 that @freesig mentioned.

I got some more information on this: apparently links contain timestamps, so a link commit is not as idempotent as you’d expect. This has something to do with being able to re-add a previously deleted link. This is different from how adding and deleting entries work, so I guess the conversation now is, what is the most likely expected behaviour and should we make link behaviour consistent with entry behaviour? (That is, an add is idempotent, and a deletion once is a deletion forever.)
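A small illustration of why a timestamp defeats deduplication, in plain Rust. ToyLink and its fields are hypothetical, the real link layout may differ, and DefaultHasher merely stands in for whatever hash Holochain actually uses. The point is only that if the timestamp is part of the hashed content, two otherwise identical links get different addresses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical link record; the real layout may differ.
#[derive(Hash)]
struct ToyLink {
    base: &'static str,
    target: &'static str,
    link_type: &'static str,
    timestamp: u64, // assumption: included in the hashed content
}

// Stand-in for a content address: a hash over all fields.
fn address_of(link: &ToyLink) -> u64 {
    let mut hasher = DefaultHasher::new();
    link.hash(&mut hasher);
    hasher.finish()
}

// Counts how many distinct addresses a batch of links produces.
fn distinct_addresses(links: &[ToyLink]) -> usize {
    links
        .iter()
        .map(|link| address_of(link))
        .collect::<HashSet<u64>>()
        .len()
}
```

Two links with the same base, target and type but different timestamps count as two distinct addresses here, while identical timestamps collapse to one.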

I’m not sure if this is relevant but even the exact same entry being committed multiple times produces multiple unique headers.

For sure, and that’s something for devs to be aware of… the entry won’t take up any extra space, but the headers will. (Whereas with links it seems that both the header and the link are unique every time.)
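For the entry case, a toy content-addressed store in plain Rust shows the difference. ToyChain is hypothetical and DefaultHasher is only a stand-in for the real hashing; the sketch just assumes entries are keyed by a hash of their content and that every commit appends a header.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy model only: hypothetical types, not Holochain's real ones.
#[derive(Default)]
struct ToyChain {
    entries: HashMap<u64, String>, // one slot per distinct content
    headers: Vec<(usize, u64)>,    // (sequence number, entry address) per commit
}

impl ToyChain {
    fn commit(&mut self, content: &str) -> u64 {
        // The address depends only on the content.
        let mut hasher = DefaultHasher::new();
        content.hash(&mut hasher);
        let address = hasher.finish();
        // Identical content lands in the same slot, so storage does not grow.
        self.entries
            .entry(address)
            .or_insert_with(|| content.to_string());
        // A new header is appended on every commit, even for a repeat.
        let seq = self.headers.len();
        self.headers.push((seq, address));
        address
    }
}
```

Committing the same content three times in this model leaves one stored entry and three headers.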
