How to bidirectionally replicate data between DNAs?

ViktorZaunders · September 18, 2019, 1:15pm

Yeah, isn’t this risk also there when connecting say three DNAs, cyclic entry creation that could spin out lots of entries by dependency with poor code? So not a viable route anyway?

pospi · September 27, 2019, 2:42am

There’s been some really good in-depth discussion on this between @pauldaoust & myself for those who are watching along. Requirements and use-cases crystallizing nicely.

pauldaoust · September 30, 2019, 7:39pm

@ViktorZaunders yeah, that’s exactly the concern that one of our team members raised. Reasoning about two mutually interdependent DNAs is moderately easy, and so is reasoning about three if you wrote them all. But once you have an ecosystem of developers and their DNAs, you could run into circles with unintended consequences.

@pospi

you can?! Wow, I didn’t realise that — it feels like leaky encapsulation to me. Wonder if that was just an oversight and it’s gonna get closed in the future…

Back in my programming days, I used to fret and obsess about circular dependencies. In fact, they were impossible cuz I was writing in C#. Made for some convoluted ways of getting the dependency graph untangled; dependency injection was usually my go-to.

Thinking about it, ambient signal emission doesn’t prevent this either. DNA X could emit signal A that causes DNA Y to take action and emit signal B that DNA X receives, causing it to emit signal A again. It was just my proposal for a way to un-circular-ize hard DNA dependencies in a tidy way. Ambient/declarative/passive signals allow a DNA to be ignorant of its consumers, whereas direct/imperative/active calling/message-passing introduces tight coupling because it requires the caller to know how its callees work.
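To make that loop concrete, here is a minimal Python sketch. It is not Holochain code: `SignalBus`, `subscribe`, and `emit` are names made up for illustration, and the hop cap exists only so the demo terminates. It shows DNA X and DNA Y re-triggering each other even though neither knows about the other.

```python
from collections import defaultdict, deque


class SignalBus:
    """Toy stand-in for a conductor relaying signals between DNAs (hypothetical)."""

    def __init__(self, max_hops):
        self.max_hops = max_hops           # safety cap so the demo terminates
        self.handlers = defaultdict(list)  # signal name -> list of handlers
        self.log = []                      # every signal processed, in order

    def subscribe(self, signal, handler):
        self.handlers[signal].append(handler)

    def emit(self, signal):
        queue = deque([signal])
        while queue and len(self.log) < self.max_hops:
            current = queue.popleft()
            self.log.append(current)
            for handler in self.handlers[current]:
                # A handler may return a follow-up signal to emit, or None.
                follow_up = handler()
                if follow_up is not None:
                    queue.append(follow_up)


bus = SignalBus(max_hops=6)
bus.subscribe("A", lambda: "B")  # DNA Y reacts to A by emitting B
bus.subscribe("B", lambda: "A")  # DNA X reacts to B by emitting A again
bus.emit("A")
print(bus.log)
```

Without the cap, the A/B ping-pong would never stop, so loose coupling on its own doesn't rule out cycles.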

I’m chewing over your thoughts on signalling, message passing, and validation @pospi. Raises interesting thoughts. Re: validation, I see the value. It could be ensured just as easily with emit as with call, I think, because my conductor can provide assurances that the right DNA emitted that signal.

Re:

Definitely see what you’re saying there. Looking for the base abstraction, call could certainly be seen as a special case of send with a method name and tuple of arguments as its message.

And within one conductor, that’s pretty much how it works — the conductor creates a special public grant type for intra-DNA zome calls, inter-DNA zome calls, and UI calls, then gives the token to the callers. It is nice to have a special affordance for function calls layered on top of message passing though, wouldn’t ya say?
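Roughly what I mean by layering, as a toy Python sketch. `Instance`, `send`, and `call` are illustrative names only, not the HDK's API: `send` is the base primitive, and `call` is just sugar that packages a function name and argument tuple as a message.

```python
class Instance:
    """Toy running DNA instance that only understands generic messages (hypothetical)."""

    def __init__(self, functions):
        self.functions = functions  # function name -> callable

    def send(self, message):
        # The base primitive: a message goes in, a reply comes out.
        name = message["function"]
        args = message["args"]
        if name not in self.functions:
            return {"error": f"unknown function: {name}"}
        return {"ok": self.functions[name](*args)}


def call(instance, function, *args):
    """Convenience layer: package a function name and argument tuple as a message."""
    reply = instance.send({"function": function, "args": args})
    if "error" in reply:
        raise LookupError(reply["error"])
    return reply["ok"]


instance = Instance({"add": lambda a, b: a + b})
print(call(instance, "add", 2, 3))
```

The caller of `call` gets ordinary function-call ergonomics (return values, exceptions), while everything still travels as a plain message underneath.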

There’s even a pattern for doing it between agents — Alice shares the cap token with Bob, which he then passes back to her whenever he wants to “call a zome function in her running instance” (which actually just looks like him sending her a message consisting of function name, parameters, and cap token). Alice checks the function he wants to call against the conditions of the grant represented by his token, then calls the function for him if it all checks out. Eventually I think this might have its own convenience function in the HDK.
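Here is a rough Python sketch of that Alice and Bob flow. The `Agent` class, its method names, and the two example functions are invented for illustration; real capability grants in Holochain carry more detail than a set of function names.

```python
import secrets


class Agent:
    """Toy agent that guards its functions with capability grants (hypothetical)."""

    def __init__(self):
        self.functions = {
            "get_profile": lambda: "alice's profile",
            "delete_profile": lambda: "deleted",
        }
        self.grants = {}  # token -> set of function names the token allows

    def grant(self, allowed_functions):
        token = secrets.token_hex(16)
        self.grants[token] = set(allowed_functions)
        return token

    def receive(self, message):
        # Check the requested function against the grant behind the token.
        allowed = self.grants.get(message["token"], set())
        if message["function"] not in allowed:
            return {"error": "not permitted"}
        result = self.functions[message["function"]](*message["params"])
        return {"ok": result}


alice = Agent()
bobs_token = alice.grant(["get_profile"])  # Alice shares this token with Bob

# Bob "calls a zome function" by sending function name, parameters, and token.
print(alice.receive({"function": "get_profile", "params": (), "token": bobs_token}))
print(alice.receive({"function": "delete_profile", "params": (), "token": bobs_token}))
```

The first message succeeds and the second is refused, because the grant behind Bob's token only covers `get_profile`.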

pospi · October 17, 2019, 5:53am

I hope not, because I think it’s a really good time-saver for zome mixins. Think of them as stateful additions to custom business logic… it’s way easier if they can be plugged in to define all the record types, and can be driven by that business logic. I think it’s actually necessary to have that feature to be able to use zomes as functional mixins correctly- otherwise you need to expose all zome functionality over the RPC gateway, which means that external clients would be able to manipulate the zome state without restriction. Unsure if I’m explaining that properly but hope it makes sense…

pauldaoust · October 17, 2019, 2:58pm

Oh, hmmm, I see your point — you’re saying that if you want to use a zome to add functionality to another zome’s stuff (e.g., zome B could hang links on zome A’s entries, and you want to make it generic so that zome A could be anything) either they need to be able to privately talk to each other without exposing their guts to the UI (that is, some sort of exposure level other than hc_public), or they need to be able to access each other’s entries directly. Is that about right?

I’ve always seen zomes as basic units of encapsulation that shouldn’t be able to access each other’s data. This is less important when one developer creates all the zomes in a DNA, but it becomes a big deal when devs start plugging third-party zomes into their own DNAs. You’ve got this issue of zomes serving two related but distinct purposes: encapsulation and modularity.

I only half know what I’m saying; it’s hard to talk about it with concrete examples. Sounds like you’ve got some though; what sorts of mixin-style zomes have you built that depend on access to other zomes’ entry types?

Interesting that you say ‘functional’ mixins, since data hiding isn’t really a thing in functional programming; that’s more of a leftover from OOP days.

@freesig I guess this answers our question about whether entry types are namespaced by zome name or hash! Looks like all data lives in a common pool.

pospi · October 28, 2019, 9:28am

Yep, we are on the same page 100%. So here is my use-case, without which doing REA accounting on Holochain for any non-trivial app would be much more involved:

HoloREA has an “observation” DNA that holds Process, EconomicEvent and EconomicResource records (plus a few others), each in separate zomes.
Business logic in a client app wants to constrain the way that particular events are entered, such that only certain resource types are allowed. Or pick whatever use case you like here; I think it’s safe to say that businesses will want particular business processes to be carried out without allowing any random arbitrary process to be defined by their users.

I think that’s it. You can see the same pattern in operation currently between the EconomicEvent and EconomicResource zomes- resources can’t be manipulated directly, only via events. But I wanted to distinctly separate the two, to essentially make people context-switch between thinking about logging events vs thinking about querying resources.

So you could maybe combine both of those into one zome & call it ProvenanceTrail or something, ok, don’t like it but I could live with it.

But what happens when a business wants to do their custom integration over the top? That would mean they have to put both those zomes, along with their custom zome, all together in order for it to work.

Basically it feels to me like such a change would force you to build monolithic zomes rather than being able to neatly separate them.

I think this probably also has implications for zome traits- can’t do those as easily if you’re forced to combine them with other namespaces all the time.

What do you think? Am I making sense or missing something?

data hiding isn’t really a thing in functional programming

Of course it is! What do you think closures are for? In fact, they are better at data hiding than objects are… access specifiers can be bypassed by reflective capabilities a lot of the time… no “backdoor” into a closure’s state from within the language…

pauldaoust · November 4, 2019, 10:51pm

ahhhhhhh the map gets even clearer. So I was picturing the observation layer as something that simply provided an API to create the three building blocks (resources, events, agents) without any higher knowledge of why it’s being asked to create. Which is sort of true, but sort of not, in the sense that it should only create these things in response to legitimate business rules that come from the outside.

In the smaller scale example (EconomicEvent and EconomicResource), I take this to mean that only the EconomicEvent zome should be allowed to call EconomicResource's functions, because only it knows how to do it properly. And in the larger scale example, only the custom rules for a business know how to properly call any of the observation layer’s zome functions — is that right?

The question that comes up for me right away is, how does the observation DNA determine whether it’s being called legitimately or not? I don’t know. There’s no way to say hc_public(but_only_for_these_approved_UIs_and_DNAs). I’ve got thoughts involving dependency injection floating around in my head, but you couldn’t do that cross-DNA… you could do it cross-zome, though, and of course inside a zome.

Okay, fair enough; I wasn’t thinking of closures when I wrote that up. I was thinking about that immutable, category-theory style that FP encourages: data structures are just dumb objects with no internal state, and state mutation involves mapping/reducing/filtering on data structures and the monads that hold them, to produce new data structures.

pospi · November 5, 2019, 11:56pm

Broadly yes! I think we’re still understanding each other.

Well, this is what I’m solving for. By not exposing the observation API endpoints in a custom integration, and only allowing its data stores to be manipulated via some proxy zome (the “business rules”, in this case), you prevent the observation API from being able to be called incorrectly (indeed, at all).
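A toy Python model of what I mean, in case it helps. The names (`Dna`, `external_call`, `log_harvest`, the allowed resource types) are all made up, and it says nothing about how the conductor actually exposes functions. The point is only that the storage-level function has no external route, so the business-rules proxy is the sole way in.

```python
class Dna:
    """Toy DNA whose zome functions are either exposed or internal-only (hypothetical)."""

    def __init__(self):
        self.events = []   # stand-in for the observation zome's storage
        self.exposed = {}  # functions reachable from outside the DNA

    # Observation zome: deliberately NOT registered in self.exposed.
    def _create_event(self, resource_type, quantity):
        event = {"resource_type": resource_type, "quantity": quantity}
        self.events.append(event)
        return event

    # Business-rules (proxy) zome: the only externally reachable entry point.
    def log_harvest(self, resource_type, quantity):
        if resource_type not in {"apples", "pears"}:
            raise ValueError(f"resource type not allowed: {resource_type}")
        return self._create_event(resource_type, quantity)

    def external_call(self, function, *args):
        # Stand-in for the RPC gateway: only exposed functions can be reached.
        if function not in self.exposed:
            raise PermissionError(f"{function} is not exposed")
        return self.exposed[function](*args)


dna = Dna()
dna.exposed["log_harvest"] = dna.log_harvest
print(dna.external_call("log_harvest", "apples", 3))
```

An external caller can only reach `log_harvest`, which enforces the allowed resource types before touching storage; asking the gateway for `_create_event` directly fails.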

FWIW in a standard install, the observation API is precisely as you have described- it provides facilities to create building blocks without any higher knowledge of why it’s being asked to create. It’s only when people need custom constraints and logic that it becomes necessary.

Anyway, I think we have veered off topic somewhat. Can we basically agree that there is a need for handlers in one zome to be able to manipulate entries defined in another? I feel as though that is a hard requirement in order for “mixin zomes” to be able to function as intended.

pauldaoust · November 6, 2019, 4:23am

Ha ha, sorry, I just wanted to take the opportunity to understand the need more concretely. So you’re saying that the observation API endpoints are not hc_public, correct? Oh, and I forgot to ask: what’s the definition of ‘manipulating another zome’s entries’ in the context of this discussion?

pospi · November 7, 2019, 4:06am

Umm no, I don’t think that’s anything to do with it. If the API endpoints weren’t public and you were expecting capability tokens between DNAs to restrict functionality for you, you’d be out of luck. The agent could just take the cap token and use it against the RPC gateway of the “restricted” zome to pass in whatever parameters they wish. Real security of this sort is only possible if there is no way to call into the zome externally except via the proxy zome.

pauldaoust · November 26, 2019, 8:47pm

@pospi look what I just discovered when I was browsing through the Guidebook! From Emitting Signals:

Future additions will be:

Signal signature description in the DNA. ADR 13 describes signals as statically defined properties of a DNA, which would enable conductor-level binding/connecting of signals with slots (i.e. zome functions), similar to bridges but with looser coupling.

Reading ADR 13, we find:

Finally, just as you can call any function using core_api::call(), you can register a listener with core_api::listen() and you can unregister a listener with core_api::unlisten().

This suggests that your desire — to see signals emitted in one DNA to be received by all connected DNAs in the same conductor — is on the map!
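As a thought experiment only, here is how listener registration might behave, sketched in Python. The method names `listen` and `unlisten` echo the ADR, but the `Conductor` class, the signatures, and the `event_created` signal are my assumptions, not anything that exists in the HDK today.

```python
class Conductor:
    """Toy conductor that routes signals to registered listeners (hypothetical)."""

    def __init__(self):
        self.listeners = {}  # signal name -> list of callbacks

    def listen(self, signal, callback):
        self.listeners.setdefault(signal, []).append(callback)

    def unlisten(self, signal, callback):
        if callback in self.listeners.get(signal, []):
            self.listeners[signal].remove(callback)

    def emit(self, signal, payload):
        # The emitter does not know who, if anyone, is listening.
        for callback in list(self.listeners.get(signal, [])):
            callback(payload)


received = []


def on_event(payload):
    # Stand-in for a zome function in another DNA acting as the "slot".
    received.append(payload)


conductor = Conductor()
conductor.listen("event_created", on_event)
conductor.emit("event_created", {"id": 1})
conductor.unlisten("event_created", on_event)
conductor.emit("event_created", {"id": 2})
print(received)
```

Only the first payload arrives, since the listener is gone by the time the second signal is emitted.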

pospi · November 28, 2019, 8:06am

Hey I have a friend who wants to contribute some tax accounting standards stuff to HoloREA’s ecosystem… she has dived into Corda a bit so I was hoping I could share your doc with her in order to get her up to speed, provided I tell her not to circulate? How do you feel about that kind of thing per that document specifically?

pauldaoust · November 28, 2019, 9:52pm

I think that’s fine, esp for the sake of HoloREA

pospi · December 2, 2019, 10:04pm

https://miro.com/app/board/o9J_kx0H2NA=/

pospi · December 2, 2019, 10:30pm

The craziest validation I’ve seen so far- https://github.com/holo-rea/holo-rea/issues/93

pospi · December 2, 2019, 10:50pm

pospi · December 2, 2019, 10:56pm

pospi · December 2, 2019, 10:59pm

harlan · February 1, 2020, 9:03am

This is interesting! Is this only true within the same DNA (I would expect), or across DNAs as well?

Also want to say thanks @pospi and @pauldaoust for this great thread – I read with interest, followed links, and learned a lot.

pospi · February 2, 2020, 10:10am

Only within the same DNA.

@pauldaoust and I developed a pattern for zome modularisation via this process that seems to be holding up well for me in Holo-REA. One of the requirements there is to split library code and zome internals neatly into related crates, such that you can use defs inside the co-located zome itself but interact with its storage layer by using lib in either that zome, or any others local to the DNA.