Proposal for a Feature Update: Chunking Entries

The-A-Man · January 19, 2021, 10:44am

I have had a proposal for a Holochain feature update in my head for quite some time now. It has to do with splitting entries into fixed-size chunks (4 KB, for instance), much like what Ethereum Swarm does with files. The conductor would automatically split an entry into these small chunks upon entry creation. The HDK would offer one function for getting the whole entry (it takes some time to collect all the chunks, and returns only once they have all been collected and assembled) and another for getting the chunks asynchronously.

Moreover, each running DNA instance should be exposed as a gRPC API using Protobuf (v3), since gRPC supports bidirectional asynchronous data streaming over HTTP/2 (don’t worry, the serialized data would still be binary, thanks to Protobuf). There should be some magic helpers in the HDK to send and receive data over the gRPC input and output streams that get created when the user makes a function call.

Even in a single-device Holochain setting (as opposed to Holo), where the user’s device is both the client and the server, the proposed approach would mean that the UI is ACTUALLY connected to the (h)app and not just talking to it. The UI/app would be free to do things that presently can’t be achieved, such as uploading insanely huge files. And if the file format is chronological in nature (such as HLS), the content could be consumed without waiting for the whole file to be fetched!
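To make the chunking part of the proposal concrete, here is a minimal Rust sketch of splitting an entry into fixed-size chunks and putting it back together. It is not HDK or conductor code: `CHUNK_SIZE`, `split_into_chunks`, `assemble_entry` and `stream_chunks` are hypothetical names invented for illustration, and the plain iterator is only a synchronous stand-in for the asynchronous chunk retrieval described above.

```rust
/// Chunk size from the post's example (4 KB); any non-zero fixed size works.
const CHUNK_SIZE: usize = 4 * 1024;

/// Split an entry's bytes into fixed-size chunks; the last one may be shorter.
/// Hypothetical helper, not part of the HDK. `chunk_size` must be non-zero.
fn split_into_chunks(entry: &[u8], chunk_size: usize) -> Vec<Vec<u8>> {
    entry.chunks(chunk_size).map(|c| c.to_vec()).collect()
}

/// "Get the whole entry": succeeds only once every chunk is present,
/// and returns None if any chunk is still missing.
fn assemble_entry(chunks: &[Option<Vec<u8>>]) -> Option<Vec<u8>> {
    let mut entry = Vec::new();
    for chunk in chunks {
        // `?` bails out with None at the first missing chunk.
        entry.extend_from_slice(chunk.as_ref()?);
    }
    Some(entry)
}

/// "Get the chunks" one at a time, so a consumer can start working on the
/// first chunk without waiting for the rest (synchronous stand-in).
fn stream_chunks<'a>(entry: &'a [u8], chunk_size: usize) -> impl Iterator<Item = &'a [u8]> + 'a {
    entry.chunks(chunk_size)
}
```

With these definitions, a 10,000-byte entry splits into three chunks (4096, 4096 and 1808 bytes), and `assemble_entry` returns the original bytes only when all three are supplied.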

I understand that Holochain wishes to be good at one thing, and one thing alone: executing code safely in a peer-to-peer setting. However, I think code execution, chunking and file storage, and input/output streaming can all come under the same umbrella!

Any opinions???

guillemcordoba · January 19, 2021, 4:04pm

Hi! Mmm so isn’t this easily achieved at the app layer? I have this example, which does chunking of entries on the frontend.

I see different chunking strategies being used in different apps, so I don’t really know how Holochain core would handle all those cases without becoming overly complicated. Do you think there is something that can’t be achieved at the DNA level?
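For a sense of what app-layer chunking can look like, here is a rough Rust sketch of one possible strategy: store each chunk under a key and keep a small manifest listing the keys in order. This is not the linked example's code and uses no Holochain API. `Manifest`, `chunk_key`, `store_file` and `load_file` are hypothetical names, the `HashMap` stands in for wherever the app actually stores chunks, and `DefaultHasher` is a non-cryptographic placeholder for a real content hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical manifest: the ordered list of chunk keys for one file.
struct Manifest {
    chunk_keys: Vec<u64>,
}

/// Placeholder content address. NOT cryptographic; a real app would use
/// whatever hash its storage layer assigns to an entry.
fn chunk_key(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Store each chunk under its key and return the manifest that orders them.
/// `chunk_size` must be non-zero.
fn store_file(store: &mut HashMap<u64, Vec<u8>>, file: &[u8], chunk_size: usize) -> Manifest {
    let chunk_keys = file
        .chunks(chunk_size)
        .map(|chunk| {
            let key = chunk_key(chunk);
            store.insert(key, chunk.to_vec());
            key
        })
        .collect();
    Manifest { chunk_keys }
}

/// Rebuild the file by following the manifest; None if any chunk is missing.
fn load_file(store: &HashMap<u64, Vec<u8>>, manifest: &Manifest) -> Option<Vec<u8>> {
    let mut file = Vec::new();
    for key in &manifest.chunk_keys {
        file.extend_from_slice(store.get(key)?);
    }
    Some(file)
}
```

Because the chunk size and the manifest layout are ordinary parameters of app code here, each app can pick the strategy that suits it, which is the flexibility a single built-in scheme would have to give up.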

The-A-Man · January 19, 2021, 6:06pm

Actually, your work on file storage is partly what got me interested in this subject… However, as you mentioned, Holochain core would quickly get complicated were this to be done at the conductor level. And yeah, I guess you’re right. Keeping the core simple and generic sounds like a reasonable argument for not fiddling with it for use cases like these, especially when they can be implemented at the frontend layer, albeit with a few hacks and some code repetition in the form of zome reuse. (Of course reusing zomes is great, but less so when practically every conceivable Holochain project would benefit from the functionality, which would warrant building it into the core itself; I can’t imagine any project that would be worse off if chunking were baked into the core.) The old argument against doing this was, if I remember correctly, something about Holochain not wanting to compete with existing distributed file-storage solutions (like IPFS and Swarm); however, as you have pointed out, the argument for simplicity suffices (at least for me)…