How to keep drafts / working copies / session variables?

kristofer · December 30, 2019, 12:49pm

Is there a way to keep something similar to session variables between zome function calls?

I’m prototyping collaborative editing of text files and would like to have a “working copy” while editing without having to commit on each change.

Many agents edit the same working copy at the same time. (node to node messaging, signalling)

The working copy is committed as an entry when the user pushes “save/publish”.

Currently I use an external server for the working copy (and the signalling between editors), which doesn’t feel distributed or agent-centric.

Garbage collection as described by @philipbeadle could be one option, allowing many entries to be committed while editing. The working copy entries would be garbage collected when the document is published. Is garbage collection planned to be included in early releases?

Another option would be to use Throwaway DHT as described by @pauldaoust. Will a DNA instance be able to create a “child DNA instance” anytime soon?

pauldaoust · January 6, 2020, 11:49pm

Hi @kristofer ! Great question, and it’s something we’re already chewing on. The long-term intention is to have a ‘temporary storage’ bucket in each node that lets them maintain application state between zome calls without having to commit anything. Question before I go further: are you thinking about a ‘global’ working copy or a per-user working copy?

kristofer · January 7, 2020, 2:24pm

I’m thinking that local state would go a long way. The working copy is maintained by the current text owner locally. Collaborators access the local working copy through zome functions made available through node to node messaging.

Can global state be avoided as a concept? I say, let’s try. But that, I guess, would make Holochain not the right fit for all applications. If you need global state, use an offchain indexing server (for global search maybe) or rethink the architecture. Could the use case be realised in an agent-centric fashion instead? Is there really a need to access all data?

pauldaoust · January 7, 2020, 5:28pm

@kristofer ah, I see, the document always has an ‘owner’ that maintains and merges all the state changes? I’m understanding better now. I’m guessing the owner keeps the entry private on their own source chain and mediates all commits? This seems to be a good pattern for fine-grained access control.

I’m wondering about what it would take to allow edits when the owner is offline — looks like YJS is already a CRDT-based system so this could work nicely if you allowed editors to publish transforms directly to the DHT, and the owner’s ‘save’ action would merely be a reference to all the transforms that made it in. Some thought would be needed to avoid bloat and maintain good performance though.

AFAIU the garbage collection will only apply at validation time, when a validator needs to pull in dependencies before validating an entry (e.g., pulling in the base and target when validating a link). After the entry is validated, the dependencies are no longer needed (because they’re stored by other DHT nodes). It could be that GC is being considered for other situations, which would be cool, but I don’t think this is true.

However, we do eventually plan to add some sort of temporary local storage, which could be used for session data. If you’ve only got one UI, this isn’t too useful because the UI can store this state. But as soon as you’ve got multiple UIs connecting to the same DNA, it’s handy to push that stuff into the DNA so all UIs can benefit from that application logic.

kristofer · January 9, 2020, 4:17pm

Sounds great, looking forward to that!

freesig · January 10, 2020, 12:30am

Or you cache it in memory / disk outside of holochain, in your UI or middleware.

kristofer · January 10, 2020, 6:54am

That’s where I started to explore initially. The Holo.txt prototype I created uses a middleware outside the Holoverse to sync clients collaborating in real time on a text. When someone pushes “save”, changes are committed to the DHT. That works great but it would have been cool not having to run a separate server (on Heroku in this case).

freesig · January 10, 2020, 7:23am

I was more meaning caching it on the same machine that’s running the holochain node. If you need anything to survive longer than that machine is up, then you probably want to write it to the DHT.

kristofer · January 10, 2020, 7:53am

Caching on one machine / one client is of course no problem. Then, caching in the client makes most sense - window.localStorage etc.

Another situation occurs when more than one user collaborates on a resource (a shared text for instance) and changes are so frequent that writing every change to the DHT is a bad move. Saving every keystroke makes no sense, and saving (and syncing between clients) every 30-xx seconds provides a bad user experience.

The ability to store temp data from one call to another would eliminate the need for an external sync-and-cache server. A simple key/value store (session variables) would do the whole thing.

But if @pauldaoust is correct, this is on the horizon?

Edit: Came to think about it, if holoports in a future scenario also can host/run some kind of middleware, caching/sync most likely can be achieved in that context.
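
For illustration, the simple key/value store described above could look roughly like the sketch below. This is only an in-memory model written in TypeScript; `SessionStore` and its methods are hypothetical names and are not part of any Holochain API. It assumes the state only needs to live as long as the process does.

```typescript
// Hypothetical in-memory session store; NOT a Holochain API.
// State is lost when the process holding the store exits.
class SessionStore<V> {
  private entries = new Map<string, V>();

  // Store or overwrite the working copy for a key.
  set(key: string, value: V): void {
    this.entries.set(key, value);
  }

  // Read the current working copy, if any.
  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  // Remove the working copy and return it, e.g. when the user hits "save/publish".
  take(key: string): V | undefined {
    const value = this.entries.get(key);
    this.entries.delete(key);
    return value;
  }
}

// Usage: edit a draft across several calls, then publish it once.
const drafts = new SessionStore<string>();
drafts.set("doc-1", "Hello");
drafts.set("doc-1", (drafts.get("doc-1") ?? "") + ", world");
const published = drafts.take("doc-1"); // this value would be committed as an entry
```

After `take`, the store no longer holds the draft, so only the published text would end up as an entry.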

freesig · January 10, 2020, 8:11am

The temporary storage that Paul is talking about would just be memory that is accessible by a zome and that would live longer than a single zome call, but would be cleaned up when the machine running the zome shuts down or the hApp is shut down. That’s my understanding anyway.

I think you could still cache in the client each keystroke. You’d probably want to directly message these changes to the other collaborators where they would also cache them.
Then every so often you’d be writing the document state to the dht.
There would be some interesting conflict resolutions to solve.

Also I’m using the word client very loosely here, it could be an application running in node or a browser or even a full native application.

The direct messaging can be done through holochain and then emitted out to your client using a websocket. This should be very fast.
I still don’t think you’d want to send individual keystrokes, probably some smart buffering algorithm.
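
A minimal sketch of one possible buffering approach, assuming a simple size-based batch with an explicit `flush` that a timer could call. `ChangeBuffer`, `Change` and the `send` callback are made-up names; `send` stands in for whatever actually delivers the batch to the other collaborators.

```typescript
// Hypothetical shape of a single edit; a real editor would use its own format.
type Change = { position: number; text: string };

// Collects changes and hands them to `send` in batches instead of one by one.
class ChangeBuffer {
  private pending: Change[] = [];
  private maxBatch: number;
  private send: (batch: Change[]) => void;

  constructor(maxBatch: number, send: (batch: Change[]) => void) {
    this.maxBatch = maxBatch;
    this.send = send;
  }

  // Queue a change; flush automatically once the batch is full.
  push(change: Change): void {
    this.pending.push(change);
    if (this.pending.length >= this.maxBatch) {
      this.flush();
    }
  }

  // Send whatever is queued. A timer could call this every few hundred ms.
  flush(): void {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.send(batch);
  }
}

// Usage: four keystrokes with a batch size of three.
const sent: Change[][] = [];
const buffer = new ChangeBuffer(3, (batch) => sent.push(batch));
["H", "i", "!", "?"].forEach((text, position) => buffer.push({ position, text }));
buffer.flush(); // send the leftover keystroke
```

Here the first three keystrokes go out as one batch and the final `flush` sends the remaining one, so four keystrokes cost two messages instead of four.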

pauldaoust · January 13, 2020, 10:02pm

@kristofer I totally agree with you about the trade-offs you’re exploring; there are some potentially bad options in what I’m suggesting, as I’m just mapping the entire possibility space. I imagine that you could have two scenarios: the synchronous/online scenario, where the owner could be responsible for aggregating changes (and perhaps rebroadcasting them to all editors, which gives a modest measure of consistency), and the async/offline scenario, where editors call up the most recent version they can see, buffer their changes, and save them to the DHT in batches. The latter is less real-time; you would likely see a lot of latency in updates this way, unless the existing editors could coordinate with each other to determine who’s still online in the absence of the owner.
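
To make the synchronous/online scenario a bit more concrete, here is a rough TypeScript sketch in which the owner stamps each incoming edit with a sequence number and rebroadcasts it, so every online editor sees edits in the same order. `OwnerRelay`, `Edit` and the `deliver` callbacks are hypothetical; in a real app the delivery would presumably go over node-to-node messaging or signals rather than direct function calls.

```typescript
type Edit = { author: string; text: string };
type SequencedEdit = Edit & { seq: number };

// Runs on the owner's side: orders edits and rebroadcasts them to all editors.
class OwnerRelay {
  private nextSeq = 0;
  private log: SequencedEdit[] = [];
  private editors = new Map<string, (edit: SequencedEdit) => void>();

  // Register an editor together with a callback that delivers edits to them.
  join(editorId: string, deliver: (edit: SequencedEdit) => void): void {
    this.editors.set(editorId, deliver);
  }

  // Accept an edit, assign the next sequence number, and send it to everyone.
  submit(edit: Edit): SequencedEdit {
    const sequenced: SequencedEdit = { ...edit, seq: this.nextSeq };
    this.nextSeq += 1;
    this.log.push(sequenced);
    this.editors.forEach((deliver) => deliver(sequenced));
    return sequenced;
  }

  // The ordered history; this is what a later "save" could be built from.
  history(): SequencedEdit[] {
    return this.log.slice();
  }
}

// Usage: two editors receive the same edits in the same order.
const relay = new OwnerRelay();
const aliceSeen: number[] = [];
const bobSeen: number[] = [];
relay.join("alice", (edit) => aliceSeen.push(edit.seq));
relay.join("bob", (edit) => bobSeen.push(edit.seq));
relay.submit({ author: "alice", text: "Hello" });
relay.submit({ author: "bob", text: " world" });
```

This only covers the case where the owner is online; the async/offline scenario would need the batching to the DHT described above.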

kristofer · January 15, 2020, 9:19pm

Thanks for the clarifications @freesig !

I would like to have a go at developing a communications provider for the p2p shared types library y.js using Holochain signalling & node to node messaging. I think that could work really well and, as you say… be very fast. Hope to find time to do that soon.

kristofer · January 15, 2020, 9:23pm

Ha, yes! That would be a fun nut to crack! (I mean it) The working copy being passed to another client when the owner goes offline, and then saved to the DHT when the last user disconnects, unless someone hits save first of course.
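
A toy model of that handoff, as a sketch only: the names (`WorkingCopySession`, `commit`) are invented, and the rule "the longest-connected remaining participant takes over" is an assumption picked for simplicity, not something decided in this thread.

```typescript
// Tracks who holds the working copy and commits it when everyone has left.
class WorkingCopySession {
  private participants: string[] = []; // kept in join order
  private holder: string | undefined;
  private text: string;
  private commit: (text: string) => void;

  // `commit` stands in for whatever writes the document to the DHT.
  constructor(text: string, commit: (text: string) => void) {
    this.text = text;
    this.commit = commit;
  }

  currentHolder(): string | undefined {
    return this.holder;
  }

  // The first agent to join holds the working copy.
  join(agent: string): void {
    if (this.participants.indexOf(agent) === -1) {
      this.participants.push(agent);
    }
    if (this.holder === undefined) {
      this.holder = agent;
    }
  }

  // Hand the copy over if the holder leaves; commit if nobody is left.
  leave(agent: string): void {
    if (this.participants.indexOf(agent) === -1) return;
    this.participants = this.participants.filter((p) => p !== agent);
    if (this.participants.length === 0) {
      this.holder = undefined;
      this.commit(this.text);
    } else if (this.holder === agent) {
      this.holder = this.participants[0];
    }
  }
}

// Usage: the owner leaves first, then the last collaborator disconnects.
const saved: string[] = [];
const session = new WorkingCopySession("draft text", (text) => saved.push(text));
session.join("alice");
session.join("bob");
session.leave("alice");
const holderAfterOwnerLeft = session.currentHolder(); // "bob" under the join-order rule
session.leave("bob"); // last one out triggers the commit
```

An explicit "save" before the last disconnect is left out here; it would just call the same `commit` earlier.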