Inquiry: Entry and link type namespacing

thedavidmeister · December 9, 2019, 3:48pm

there’s more than names to consider when wanting to link across zomes as link validation depends on base validation

pauldaoust · December 9, 2019, 5:42pm

heh, @freesig and I were just walking through this very issue on Thursday… to me it would seem that when you include a crate in your zome, you should have the privilege of namespacing its entries and functions however you like. Functions are easy: you can only define zome functions inside the mod marked with #[zome], and you can choose to expose as many or as few of the lib crate’s functions as zome functions as you like, with whatever names you like.
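A minimal sketch of the function-exposure idea in plain Rust, leaving out the HDK macros entirely. The names `my_zome`, `comments_lib`, `add_remark`, `create` and `purge_all` are hypothetical and only stand in for a zome module and a library crate pulled into it:

```rust
// Stand-in for the module that would carry the #[zome] attribute.
mod my_zome {
    // Hypothetical stand-in for a mixin library crate pulled into the zome.
    mod comments_lib {
        pub fn create(body: &str) -> String {
            format!("comment:{}", body)
        }

        // The zome below chooses not to expose this one.
        #[allow(dead_code)]
        pub fn purge_all() -> usize {
            0
        }
    }

    // Only functions defined here are visible from outside the module,
    // and the zome author picks the public name (`add_remark`, not `create`).
    pub fn add_remark(body: &str) -> String {
        comments_lib::create(body)
    }
}
```

Callers outside `my_zome` can reach `my_zome::add_remark` but not anything in `comments_lib`, which mirrors the "expose as many or as few as you like" point for functions. It does nothing for entry and link types, which is the harder half of the problem.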

Entry/link types are harder, especially because (a) that mixin crate’s helper functions need to know what namespace you’ve given everything, and (b) you can’t opt in/out of this or that entry type definition cuz you don’t know which ones are needed internally. That’s the risk of allowing a mixin crate to write its own data, I guess. To those with better Rust chops than I have (which is probably everyone here), can anyone think of a tidy way to do this, with or without macros? And as @freesig said, maybe allowing a mixin crate to write its own data is irregular compared to other stacks (app + relational DB) where the data layer is separate?

Good quote. I think this is a useful framing that might help guide the outcome of this discussion.

@freesig did you determine if this was actually true vs me just talking through my hat?

@thedavidmeister without knowing what the “more [things] to consider” are, this feels important to dig into. Could you elaborate? I’m picturing two scenarios:

(a) link validity depends on validity of base, doesn’t depend on knowledge of the base’s content
(b) link validity depends on validity + content of base, which would probably require some cross-zome code sharing (struct def + deserialisation at least) to get working

thedavidmeister · December 11, 2019, 11:55am

validation needs to be deterministic and reproducible by everyone. if you need to do validation across zomes, and end-users are deciding what those zomes are at runtime, then different people will have different opinions on validity

also the validation logic is happening “over there” which needs to be handled, probably at the subconscious layer, so it doesn’t just fail when the current zome fails to find the base entries

pauldaoust · December 11, 2019, 3:54pm

I thought it’s the developer who chooses which zomes go into the DNA, and the user can only make choices about whether to swap one bridged DNA for another (and then only when the developer has specified that a bridge dependency can be satisfied with a trait rather than a specific DNA hash)? To clarify, here we’re talking about entries and links that all live on the same DHT but whose types are defined in separate modules.

thedavidmeister · December 11, 2019, 4:22pm

@pauldaoust if there’s any difference at all then the validation could be different though right?

also i didn’t realise we were talking about everything in the same DHT so maybe what i’m saying is not relevant

pospi · December 14, 2019, 1:38am

@freesig @pauldaoust

“The hard part is how do you link to an entry in another zome. Currently you need to redefine it in your zome.”

I don’t think that’s accurate. AFAIK linking entries in different zomes (but same DNA) works just the same as it does when they’re in the same zome. No additional complexity involved.

@thedavidmeister @pauldaoust yes let’s scrap those last couple of comments because this entire conversation is specifically about multiple zomes inside the same DNA, as far as I know (;

thedavidmeister · December 14, 2019, 11:23am

mhmm, i was thinking of something else

pauldaoust · December 16, 2019, 10:19pm

Ah, never mind then; I told @freesig that I thought this was the case but I didn’t actually test it — it was based on my misreading of the source code that defines the link_entries action.

So if you can link to a foreign-ly defined entry type, and if we’re talking about namespacing all zomes’ entry types, as @freesig said this creates problems, because you won’t know the namespace of the other zome’s stuff at authorship time — it’s gotta be applied at compile time, I think. Either it’s based on the zome’s hash or the DNA developer (not the zome developer) gets to name it. But the dependent zome has to define its dependencies and give them internal handles that are then satisfied whenever link!() and call() are called. I see parallels with hApp bundles — the manifest file defines a bridge dependency, then assigns the handle that the dependent DNA is expecting.

Unlike bridges between DNAs, I imagine dependencies between zomes in a DNA will be static — no runtime dependency creation. Should be simple enough to put this into app.json for the time being, something like

{
  // ...
  "zomes": [
    {
      "id": "alpha",
      "location": "zomes/alpha",
      // ... here's where all the stuff generated by the zome gets mixed in --
      // name/description, code, function exports
    },
    {
      "id": "thing_that_depends_on_alpha",
      "dependencies": {
        "internal_alias_given_by_zome_author": "alpha"
      }
      // ... zome-generated stuff; in the code, you always refer to alpha
      // by the internal alias you've given it, the same way you `call()` a
      // bridged DNA not by the instance ID but by the bridge name.
    }
  ]
}

More thoughts on this:

For zome calls, the namespace resolution from caller’s internal handle to callee’s DNA-defined name can happen at dispatch time.
But we also care about entry type names in dependencies (for the sake of link definitions). I don’t know where that’s actually used — is it merely for the sake of generating the link type definitions in the zome’s block in dna.json? If that’s the case, maybe the aliases defined in app.json simply replace the link type defs in zome.json? Is there any base/target type checking that happens subconsciously at validation time, or does it just check that the base and target exist? It doesn’t look like base and target are actually passed into the validation function, despite what the documentation says.
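The alias lookup described above could be sketched roughly as follows. This is only an illustration of resolving a caller's internal handle to a DNA-defined zome name at dispatch time; `ZomeManifest` and `resolve_alias` are hypothetical names and not part of any Holochain API:

```rust
use std::collections::HashMap;

/// Hypothetical model of one zome's block in the manifest: its DNA-level id
/// plus the aliases it uses internally for the zomes it depends on.
struct ZomeManifest {
    id: String,
    // internal alias (chosen by the zome author) -> zome id (chosen by the DNA developer)
    dependencies: HashMap<String, String>,
}

/// Resolve an alias used by `caller` into the DNA-defined zome id.
/// Returns None if the caller is unknown or never declared that alias.
fn resolve_alias<'a>(zomes: &'a [ZomeManifest], caller: &str, alias: &str) -> Option<&'a str> {
    zomes
        .iter()
        .find(|z| z.id == caller)
        .and_then(|z| z.dependencies.get(alias))
        .map(|s| s.as_str())
}
```

Because the table is fixed when the DNA is assembled, the lookup is static: an undeclared alias simply fails to resolve, and nothing is created at runtime.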

thedavidmeister · December 17, 2019, 11:14am

@pauldaoust the base was definitely a dependent validation at one point (maybe this has changed, i did not check the code), in that case the problem isn’t that you need the base or target for the current validation but that you need the base to have already been validated from the perspective of the current zome before the link validation will even start

pauldaoust · December 17, 2019, 5:20pm

@thedavidmeister okay, cool, so at the subconscious layer, link validation just makes sure that the dependencies (base and target) exist and are validated, but doesn’t check any of their content against the link’s constraints.

Re: validation dependencies, I can see a future where app authors will want to either (a) check that dependencies are valid, or (b) check that they’re valid and use their content in validating the current entry on which they depend. (In the above case that would look like “base and target are both of the expected entry types”; could be handled by the HDK.)
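To make the (a)/(b) distinction concrete, here is a rough sketch. All of the types are hypothetical simplifications; this is not how the HDK or the subconscious layer actually represents entries or links:

```rust
/// Hypothetical, simplified view of an entry as seen by a validator.
struct StoredEntry {
    entry_type: String,
    valid: bool,
}

#[derive(Debug, PartialEq)]
enum LinkCheck {
    Valid,
    DependencyInvalid,
    UnexpectedEntryType,
}

/// Hypothetical constraints a link definition might place on its two ends.
struct LinkConstraints<'a> {
    base_type: &'a str,
    target_type: &'a str,
}

fn check_link(
    base: &StoredEntry,
    target: &StoredEntry,
    constraints: Option<&LinkConstraints>,
) -> LinkCheck {
    // (a) both dependencies must already have passed their own validation
    if !(base.valid && target.valid) {
        return LinkCheck::DependencyInvalid;
    }
    // (b) optionally, also inspect the dependencies' content (here: just their types)
    if let Some(c) = constraints {
        if base.entry_type != c.base_type || target.entry_type != c.target_type {
            return LinkCheck::UnexpectedEntryType;
        }
    }
    LinkCheck::Valid
}
```

Passing `None` for `constraints` gives behaviour (a) only; passing `Some(..)` adds the type check from (b).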

pospi · December 19, 2019, 4:37am

“you won’t know the namespace of the other zome’s stuff at authorship time”

Yes, I agree this is a significant barrier to this being workable. There needs to be some way to refer to entries between zomes at compile time. That’s why I advocated for zome_name:entry_type_name rather than zome_name:crate_name:entry_type_name, because the former two can be controlled predictably within a single project (read: DNA). It creates a DNA-global namespace for zome names, but I think that’s OK, and if we create some patterns for macro-driven helpers that can inject the name of the zome, we have the best of all worlds.
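As an illustration of the zome_name:entry_type_name scheme, a small sketch with two hypothetical helpers (`qualified_name` and `parse_qualified` are made-up names, and a real version would presumably be macro-driven as suggested above):

```rust
/// Build a DNA-wide name in the `zome_name:entry_type_name` form.
fn qualified_name(zome_name: &str, entry_type_name: &str) -> String {
    format!("{}:{}", zome_name, entry_type_name)
}

/// Split a qualified name back into (zome_name, entry_type_name).
/// Returns None when there is no `:` separator or either part is empty.
fn parse_qualified(name: &str) -> Option<(&str, &str)> {
    let (zome, entry) = name.split_once(':')?;
    if zome.is_empty() || entry.is_empty() {
        None
    } else {
        Some((zome, entry))
    }
}
```

This sketch assumes zome names themselves never contain `:`; the split happens at the first separator, so anything after it is treated as the entry type name.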

@pauldaoust’s proposal for defining zome aliases in the DNA manifest file gets my support as a more considered solution that effectively does the above but with a little more structure.

pauldaoust · October 1, 2020, 8:04pm

Update on this subject: The new Holochain RSM has the zome ID and entry type ID of each committed entry, as well as the zome ID of each committed link (links no longer have types). That effectively prevents namespace clashes between entry types from two zomes, which are supposed to be nicely encapsulated black boxes from a modularity perspective.
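A sketch of why tagging each entry with both IDs avoids clashes: keying definitions by the pair, rather than by the entry type name alone, lets two zomes use the same name independently. The types below are hypothetical stand-ins and may not match the real RSM types:

```rust
use std::collections::HashMap;

// Hypothetical id types for illustration; the real RSM types may differ.
type ZomeId = u8;
type EntryDefId = String;

/// Maps (zome id, entry type id) to a label for the zome that owns the definition.
#[derive(Default)]
struct EntryDefRegistry {
    defs: HashMap<(ZomeId, EntryDefId), String>,
}

impl EntryDefRegistry {
    /// Register an entry type; returns false if this exact pair already exists.
    fn register(&mut self, zome: ZomeId, entry_def: &str, owner: &str) -> bool {
        let key = (zome, entry_def.to_string());
        if self.defs.contains_key(&key) {
            return false;
        }
        self.defs.insert(key, owner.to_string());
        true
    }

    /// Look up which zome owns a given (zome id, entry type id) pair.
    fn owner(&self, zome: ZomeId, entry_def: &str) -> Option<&str> {
        self.defs
            .get(&(zome, entry_def.to_string()))
            .map(|s| s.as_str())
    }
}
```

With this keying, a "post" type in zome 0 and a "post" type in zome 1 are distinct definitions, while registering the same pair twice is rejected.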

Additionally, because you don’t have to define types for your links anymore, that means you don’t have the problem of needing to refer to another zome’s entry types using a reliable handle.

So between these two things, the problem is effectively gone.

guillemcordoba · October 1, 2020, 8:33pm

It makes my brain reaaaally happy when by removing an abstraction problems disappear, and a new axis in some weird dimensional mental model appears (more flexibility without link types). In short: yeyy!!

pauldaoust · October 7, 2020, 5:54pm

@guillemcordoba could you unpack why typeless links are so exciting to you as an app developer? I can make guesses, but I don’t think I’ve got the insight that comes from having worked with both Redux and RSM. (And TBH it surprised me that we removed link types.)

thedavidmeister · November 27, 2020, 1:34am

@pauldaoust just speaking for myself here, but Rust works very well on raw Vec<u8> binary data, trying to squeeze certain abstractions into ‘slots’ or ‘strings’ is just more difficult than a plain old vector sometimes
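For a feel of what working on raw bytes can look like, here is a small sketch. It assumes a link carries an opaque byte tag instead of a type; `Link` and `targets_with_tag_prefix` are hypothetical and much simpler than anything in the actual codebase:

```rust
/// Hypothetical, simplified link record: no type, just an opaque byte tag.
struct Link {
    target: String,
    tag: Vec<u8>,
}

/// Return the targets of all links whose tag starts with `prefix`.
fn targets_with_tag_prefix<'a>(links: &'a [Link], prefix: &[u8]) -> Vec<&'a str> {
    links
        .iter()
        .filter(|l| l.tag.starts_with(prefix))
        .map(|l| l.target.as_str())
        .collect()
}
```

An app that still wants something like link types could adopt a byte prefix convention of its own, such as `b"comment:"`, and filter on it without any type registry.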

pospi · January 4, 2021, 2:54am

Does this also mean that you can no longer manipulate entries defined in a foreign zome? Because that was an important part of my security model that will have to be reviewed…

pauldaoust · January 6, 2021, 11:44pm

@pospi Hmmm, that’s a good question. So you’re asking if zome A can write/update/delete entries of a type defined by zome B in the same DNA? Looks like it can’t; the zome ID is enforced on the host side of the create call. I think in my mental model I always pictured a zome’s public functions as the way of interacting with its data types, for encapsulation’s sake (ignoring the fact that the encapsulation leaks; any zome can read another’s data or hang links on it… and cross-zome deletes aren’t enforced either).

What’s your security model look like, and in what ways do you anticipate this revelation messing it up?

pospi · January 12, 2021, 9:06am

Just had a chat about this with @guillemcordoba and I think things are workable as before.

He tells me the zome ID is set by the crate that defines the entry def. Any foreign zome attempting to manipulate those entries is actually deferring the calculation of EntryDefId to the host, which is able to make the lookups across zomes.

Anyway, in answer to your security model question: sometimes, there are behaviours that you want to trigger in a foreign zome that you only want to trigger as the result of some logic elsewhere. This makes capability tokens between zomes non-applicable, since assigning such a token would allow the agent to manually call the foreign API method as desired (rather than being forced to call it through some other ‘controller’ method).

There are of course complexities to this when it comes to validating data held in multiple zomes, but that’s a topic for separate threads.

pauldaoust · January 20, 2021, 10:54pm

Ha ha, I love hearing things through the grapevine! So let’s see if I’m parsing this right. If zome A defines an entry type and zome B tries to write an entry of that type, it’ll still get tagged as belonging to zome A and hence validated by zome A’s validation callbacks? I still don’t know how to read the codebase but didn’t see any facility for that; the entry type ID appeared to just be a non-prefixed string rather than a tuple of (zome ID, entry type ID) like I’d expect.

So it sounds like you’re using zomes here for code organisation rather than encapsulation or information hiding?

pospi · January 25, 2021, 12:22am

I believe that’s the expectation, yes. It may be the case that it only works currently by “first come, first served” and that zomes redefining entry IDs might unexpectedly take over responsibility for executing validation rules, which would be bad. Probably worth checking that with someone in core.

It’s kinda both. Hard to explain, but manipulating foreign zome entries within a DNA allows one to write validation rules which coordinate data across multiple cells cleanly. It’s like, even if I have to write separate bits of data in some order between different application cells in order to coordinate a more complex action, I can still write validation rules which safely presume the dependent data is available. It allows one to get transactional guarantees between multiple cell calls where no such guarantees exist natively.