Holochain networking - Distributed computing - Thesis

A quick question: does Holochain depend on any external networking libraries? After a (very) quick look, it seems most of the advanced stuff is done in-house (kitsune_p2p)?

(I’m not totally certain what I’m looking for myself – something beyond just QUIC/quinn? If you want to help me out, keep reading; if you can, this might prove beneficial for Holochain too, once other internet architectures break through :sweat_smile:)

The reason I’m asking (comments and suggestions appreciated!):
Next (school) year I’m going to do a thesis around Ouroboros (very similar to RINA), and a supervisor sent me a proposal about using this network architecture in serverless environments, where TCP has too much overhead and Redis is suboptimal.

Proposal for more context:

Inter-process communication for serverless application workflows

Serverless computing, as offered by AWS Lambda, Google Cloud Functions or Microsoft
Azure Functions, enables application developers to write their application logic without
needing to worry about the type and configuration of underlying physical or virtual
infrastructure (VMs or containers). Developers can write an entire cloud application or data
processing pipeline as a combination of functions with short-lived execution time and
limited memory requirements. The underlying serverless platform will then enable the
application to seamlessly scale without requiring any additional input of the developer.
This new cloud computing model has been adopted massively due to these advantages
for developers, but also because it enables the cloud providers to better multiplex their
infrastructure. Despite the tremendous adoption, certain applications still might require
longer completion times due to the suboptimal data communication between function
execution instances. The regular TCP/IP networking stack and associated socket APIs are
not used in serverless environments because of the overhead required to set up such
connectivity for function execution environments, which are very short-lived.
Therefore, existing serverless platforms rely on data storage platforms such as Redis to
pipe the outcome of one function execution instance to the input of another. This leads to
suboptimal execution times and inefficient network usage, especially in applications which
involve one-to-many communication patterns.

Objective

The goal of this thesis is to design and evaluate alternative mechanisms for direct
communication between serverless execution environments. Ideally, network
communication should be possible between temporary endpoints (execution
environments) using identifiers which can be resolved in ultra-short times, supporting both
uni- and multicast communication patterns between them with negligible connection
setup times.

In a first phase, the student will do a preliminary study of scientific literature and network
protocol stack implementations, in addition to analysing base workflows in serverless
platforms (e.g., OpenWhisk, Kubeless, etc.).
Next, the student will design novel communication mechanisms supporting rapid setup of
network communication between such short-lived function executions, for example
building further on the concepts of Recursive InterNetwork Architecture protocol stacks.
Finally, the student will compare the resulting performance and scalability against the
common practice of intermediate storage solutions, as well as against a regular network
socket-based approach. Co-design with a serverless application scheduler able to exploit
the new communication paradigm might be part of the thesis. The local data center and
testlab infrastructure of IDLab will be used to experiment at larger scales.
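
(For readers unfamiliar with the Redis-piping pattern the proposal criticises, here is a minimal sketch of it in Rust with the `redis` crate – the key names are made up for illustration:

```rust
// Minimal sketch of the "pipe function output through Redis" pattern the
// proposal criticises (Rust `redis` crate; key names made up for illustration).
use redis::Commands;

fn main() -> redis::RedisResult<()> {
    let client = redis::Client::open("redis://127.0.0.1/")?;
    let mut con = client.get_connection()?;

    // Function A finishes and parks its result in the shared store...
    let _: () = con.set("workflow:42:stage1:out", "intermediate result")?;

    // ...and function B, spun up later (possibly on another machine),
    // fetches it as its input.
    let input: String = con.get("workflow:42:stage1:out")?;
    println!("stage2 input: {input}");
    Ok(())
}
```

Every hop pays round trips to a central store instead of going node-to-node, which is exactly the inefficiency the proposal wants to remove.)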

This made me think about distributed environments in general (first Unison, then one of its influences, Erlang), ending up at the thought: can Rust do this too?

I'm still in the middle of my research, but I've already found many interesting sources/libraries:

So, if possible, I would nudge my subject towards something in the Rust (distributed computing) ecosystem.
Now I was wondering if there are libraries/tools/… (present or future) that Holochain uses which might benefit from such a thesis (nominally 720 hours).

The subjects at the research group I’m in contact with now are mostly in the domain of networking.
Of course, in the end the supervisors decide if there is enough novelty etc. to allow a subject. (A ‘simple’ refactor of Holochain networking to support Ouroboros isn’t novel enough :yum:)

4 Likes

If anyone has insight/ideas on what a future with distributed computing and Holochain might look like, shoot!


@pauldaoust, I’ll mention you because you probably know who to ping if there is interest in this

Hi, I don’t know the exact details, but I know we’re relying heavily on QUIC (I think you’re right about the quinn library) because:

  • performance is no worse than TCP
  • performance gets better than TCP in unreliable situations (because there’s no connection to maintain, reconnects are cheap – even TLS-encrypted ones)

The question that comes up for me immediately is: is there a compelling reason to have persistent connections between nodes when function execution is so ephemeral? For instance, could multiple executions multiplex/stream their output to a worker pool on another machine using one QUIC connection? I’m not sure, because I’m not familiar with the existing research. But that’s how we do it in Holochain, although the situation is different because each node (Holochain runtime) represents a human and their agency, so we don’t care about maximising machine utilisation. Still, we’ve had to invest a lot of effort in timing – eliminating distributed deadlocks, managing backpressure for delivered messages that need to be acted on, etc. In particular, the core team has leaned heavily on tokio in ways that are really beyond my capacity to understand :smiley:
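
To make that speculation concrete: with quinn, opening a new stream on an already-established connection is nearly free, so the pattern might look roughly like the sketch below. This is just an illustration of the idea, assuming quinn 0.10-style APIs and a pre-established connection (TLS/endpoint setup omitted) – not how Holochain actually wires it up.

```rust
// Sketch: fan several ephemeral function outputs into ONE long-lived QUIC
// connection, one cheap unidirectional stream per execution, instead of a
// fresh connection (and TLS handshake) per execution.
// Assumes quinn 0.10-style APIs; `conn` is already established.
async fn ship_outputs(
    conn: &quinn::Connection,
    outputs: Vec<Vec<u8>>,
) -> Result<(), Box<dyn std::error::Error>> {
    let mut tasks = tokio::task::JoinSet::new();
    for output in outputs {
        let conn = conn.clone();
        tasks.spawn(async move {
            // Opening a stream on a live connection needs no handshake and
            // no extra TLS negotiation – it's just a new stream ID.
            let mut send = conn.open_uni().await?;
            send.write_all(&output).await?;
            send.finish().await?; // flush and signal end-of-stream
            Ok::<_, Box<dyn std::error::Error + Send + Sync>>(())
        });
    }
    // Wait for all streams to complete; streams are independent, so one
    // slow execution doesn't head-of-line-block the others.
    while let Some(res) = tasks.join_next().await {
        res??;
    }
    Ok(())
}
```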

I’ll see if anyone from the core team is able to help you out a bit more than I can.

2 Likes

Note: I provided a bit of detail about our networking situation on the other thread you posted on, which you might want to peek at so we don’t repeat it here.

Sorry… my answer may not be very exciting.

Yes… kitsune_p2p is our networking library, which we’re trying to keep fairly independent so that it can be reused by others who wish to build sharded DHT-based p2p networks.

No… there aren’t really any bleeding-edge networking libraries we’re using – basically just QUIC/quinn, because we’re trying to keep Holochain in comfortable/familiar territory for most devs & users.

A number of years ago I looked into RINA, very much liked its approach, and was thinking of using it for Ceptr. I heard of Ouroboros in its early days, but haven’t tracked it, so it’s good to see it’s made so much progress since then.

Your list of distributed networking resources is cool and probably worth us looking into a bit (although at this stage we’ll KISS and stick with QUIC and look to these for future optimization/expansion).

@th1j5 said: A ‘simple’ refactor of Holochain networking to support Ouroboros isn’t novel enough

I’m not so sure this statement is true. I think it’s possible that an implementation of an infinitely scalable and highly efficient rrDHT routing layer on top of Ouroboros comm protocols could be a real, usable, and novel contribution to the P2P space.

I would not pitch it as “a simple refactor” but as solving a rather intractable P2P problem by converging resilience, scale, and truly direct P2P communication capacities. AFAIK, this doesn’t really exist – not because it’s not needed.

Might be worth exploring further…

-art

3 Likes

What actor model is being used?

https://riker.rs/actors/

I should have used the word ‘actor’ instead of ‘agent’ :smile: I’m thinking…

Isn’t language silly… I’m still probably confused lol

Ellams is using QUIC in his Oasis protocol for pluggable APIs

For data resiliency & integrity, gossip info, DPKI management, collective governance. I agree with limiting to relevant uses on an as-needed basis. I would imagine only about 20% at the core is vital.
I’m playing with spiral dynamics and sacred mathematics.

Thanks for your topic post. Sounds like a cool thesis idea. I’d be happy to help you make it novel.

3-dimensional and intelligent

It still feels 2D to me… am I imagining it wrong?

Can source-chains transform into a 3D holonic living DAG?

It does look to have come a long way. As @pauldaoust said:

@th1j5 in Ouroboros: Message Authentication Codes (MACs) are used for comms; these are computed via SHA-256. I’m pretty sure I read in the original WP that they will also utilize the Winternitz One-Time Signature (OTS) – but don’t quote me on that! lol [personally I like XMSS]
“FRCP is only enabled when needed (based on the requested application QoS). So for a UDP-like operation where packets don’t need to be delivered in order (or at all), Ouroboros doesn’t add an FRCP header. If FRCP is enabled, Ouroboros will track sequence numbers and deliver packets in-order.”
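
If I read that right, the requested QoS decides whether the FRCP machinery gets attached at all. Here’s a toy model of that decision in Rust – my own hypothetical sketch, NOT the actual Ouroboros API (which is a C library):

```rust
// Toy model of the quoted FRCP behaviour: the requested QoS decides whether
// a reliability header is attached at all. Names are hypothetical – this is
// NOT the real Ouroboros API.
#[derive(Clone, Copy)]
enum Delivery {
    BestEffort,      // UDP-like: no ordering, no delivery guarantee, no FRCP header
    ReliableOrdered, // FRCP enabled: sequence numbers tracked, delivered in order
}

struct Packet {
    frcp_seq: Option<u64>, // present only when the flow's QoS asked for reliability
    payload: Vec<u8>,
}

fn wrap(qos: Delivery, next_seq: &mut u64, payload: Vec<u8>) -> Packet {
    match qos {
        // No QoS requirement -> no FRCP header, zero per-packet overhead.
        Delivery::BestEffort => Packet { frcp_seq: None, payload },
        // Reliable flow -> stamp a sequence number so the receiver can reorder.
        Delivery::ReliableOrdered => {
            let seq = *next_seq;
            *next_seq += 1;
            Packet { frcp_seq: Some(seq), payload }
        }
    }
}
```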

It looks to me like they’re using a ‘handshake’ methodology for asynchronous exchange and timestamped sourcing of LWE & RWE hierarchical organizing. Similar to HC using a high % of node sync for viable resiliency, using source-chain gossip in a DPKI hash table on QUIC.

It deviates a bit from the Recursive InterNetwork Architecture.
I noticed the QoS description was left a bit vague (or maybe it’s elaborated somewhere else; I haven’t searched yet).

Thesis idea: EID could be linked to biometrics, avoiding the need for KYC doc storage.
Decentralized VPN layers run as a smart VM; backend membrane governance.
Also check out: https://eprint.iacr.org/2018/1049.pdf (Tendermint)

“The Ouroboros ECN field is by default one octet wide, and its value is set to an increasing value as packets are queued deeper and deeper in a congested router’s forwarding queues. Ouroboros enforces Forward ECN (FECN).”

I don’t really understand what this means. And why a size cap?

How’s progress on the Fractal [Decentralized] Virtual Machines coming along?

HC allows for capability token management. Where can I find more details?

I see distributed associative memory using a 3D holonic hash map. (I don’t have the thread, sorry – but I remember you saying to be mindful of how much packet traffic is necessary and to limit resiliency requirements?) I’d like to shard source-chain storage, but I forget what you said about it :man_shrugging:
Also curious: what is in place to maximize resiliency in congruence with latency? How will the gossip protocol work?

Randomly store parts of encrypted memory? I guess the challenge is to discover which neighborhoods are honest – or better yet, what ‘honesty’ even means. A better way to say it, maybe: “acting in accordance with DHT health and adhering to collective DNA rules”. It makes sense to me to start with IRL-trusted hosts: self-host and have your friends/family/etc. host the hApp. Later on, as things scale, it becomes relevant to consider how not to give too much power to ‘earn trust over time’. Biometrics make issuing warrants easy; you just need a rapid immune-system response (gossip protocol) to make sure the damage done isn’t too severe too fast.
I like the concept “know what you’re doing and that it’s unique, but not necessarily who you are” – access to the node’s hash link can be capability functions.

What would HC say about flow allocations? That part went beyond me :sweat_smile:

Kademlia seems legit:
“The default implementation for the DIR component in the Ouroboros IPCP is a Distributed Hash Table (DHT) based on the Kademlia protocol.”
…using SHA-256.
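
The core trick there, for anyone following along: Kademlia measures the “distance” between two IDs (here, SHA-256 hashes) as their bitwise XOR, so nodes can keep routing tables that get denser closer to their own ID. A minimal sketch of the metric (illustration only, not the Ouroboros implementation):

```rust
// Minimal sketch of Kademlia's XOR distance metric over 256-bit IDs
// (e.g. SHA-256 hashes, as in the Ouroboros DHT). Illustration only.
type NodeId = [u8; 32];

/// XOR distance: identical IDs are at distance zero; IDs sharing a longer
/// common prefix are "closer", which is what makes O(log n) lookups work.
fn xor_distance(a: &NodeId, b: &NodeId) -> NodeId {
    let mut d = [0u8; 32];
    for i in 0..32 {
        d[i] = a[i] ^ b[i];
    }
    d
}

/// Ordering used by lookups: a routing step queries the k known nodes
/// whose IDs are closest to the target under XOR distance.
fn closer(target: &NodeId, a: &NodeId, b: &NodeId) -> std::cmp::Ordering {
    xor_distance(target, a).cmp(&xor_distance(target, b))
}
```

Sorting the peers you know about with `closer` against a target hash gives you the set a lookup would contact next.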

I like NTRUEncrypt https://csrc.nist.gov/CSRC/media/Events/Second-PQC-Standardization-Conference/documents/accepted-papers/grobschadl-lighteight-implmentation-NTRUE.pdf for now, I guess.

Please, anyone, correct possible mistakes :pray:

#learning #noob

Also, any cryptographers out there interested in diving into ring-LWE?

Bulletproof CT rings? zk-SNARKs?

Thanks all for the insight, this will certainly help to define my topic!

A particular sentence in the proposal caught my attention:

If I applied this to a distributed framework (for example constellation for Rust), this would mean the runtime scheduler that decides which closure goes to which free node.
Of course, these Rust solutions (constellation seems like a one-man project) might not be what they use in datacenters, which would actually invalidate the usability of such a solution, since the normal internet would be needed anyway (when talking about physically far-away nodes).

If I translated this to something Holochain-related, I’d say it’s probably something in kitsune_p2p that would act as an ‘application scheduler’ (the gossiping, the zome calling, the rrDHT routing? I still have to look more into kitsune) that would need optimizing.
The best translation is probably how you said it:

But I’ll still need to look more into kitsune to even grasp how big the differences would/should be between the current implementation and one built on Ouroboros.


Anyhow, even if I don’t end up doing something with Rust/Holochain and Ouroboros, I’ll help Ouroboros (and thus RINA) in other ways (for example, I think there are still interesting issues in multicast group management and in how naming works – think DNS).

And the better Ouroboros gets, the more chance that in the future this might be interesting for Holochain too! And about this:

Of course, you need something working on the internet of today, not a future version :upside_down_face:


@pqcdev, I’ll have to look into your comments more, they’re a bit scattered if I may say so :slight_smile:

1 Like

I’m not sure what you mean by this, if you still want to discuss it. If I’m not mistaken, flow allocation in Ouroboros is pretty lightweight, not unlike QUIC (but I don’t know enough about either of them, certainly not QUIC).

I think the point of this thesis proposal is to create a connection to a node with spare resources (all inside the datacenter, chosen by the runtime application scheduler), transfer all the context needed, and then deallocate the flow. This should be much faster and more lightweight than the TCP/IP stack (or QUIC/UDP/IP), and also better than going through Redis.
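
In pseudo-Rust, the lifecycle I have in mind looks roughly like this (every name here is a hypothetical stand-in – the real Ouroboros API is C, and I haven’t checked its exact signatures):

```rust
// Rough shape of the allocate -> transfer -> deallocate hand-off I mean.
// Everything here is a hypothetical stand-in, NOT real Ouroboros bindings.
use std::io;

struct QosSpec {
    reliable: bool, // whether FRCP-style ordered/reliable delivery is requested
}

struct Flow; // stand-in for an allocated flow endpoint

impl Flow {
    /// Allocate a flow to a *named* endpoint; name resolution + flow setup
    /// is exactly the part that needs to be ultra-fast for serverless.
    fn alloc(dst_name: &str, qos: QosSpec) -> io::Result<Flow> {
        let _ = (dst_name, qos);
        unimplemented!("stand-in: real allocation lives in the network stack")
    }

    fn write_all(&mut self, buf: &[u8]) -> io::Result<()> {
        let _ = buf;
        unimplemented!("stand-in: push bytes onto the flow")
    }
}

/// One function-to-function hand-off as I understand the proposal.
fn hand_off(context: &[u8]) -> io::Result<()> {
    // 1. The runtime scheduler picked a node with spare resources;
    //    allocate a short-lived flow to it by name.
    let mut flow = Flow::alloc("worker.spare", QosSpec { reliable: true })?;
    // 2. Transfer all the context the next function execution needs.
    flow.write_all(context)?;
    // 3. Deallocate immediately: no long-lived connection state to maintain,
    //    which is the whole point versus TCP (and versus parking data in Redis).
    drop(flow);
    Ok(())
}
```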

Uhh, ha ha, it’s been long enough and I’ve digested enough information in the meantime that I can’t even remember what I meant myself! :sweat_smile: And if you’re looking for networking technologies that have less overhead than QUIC/UDP, I suspect my inquiry wouldn’t have much to offer. Maybe I was wondering if persistent connections (whether lightweight as in QUIC or heavyweight as in TCP) could be utilised to reduce overhead, but I’m sure that’s probably been researched to death already. And if the scope of the thesis is explicitly to deallocate the flow once work has been passed to the free node (presuming it’s a batch of work and not just a single function call), then I’m just talking through my hat here! :rofl:

Sometimes, the things we come up with are useful, and sometimes, they’re a passing thought that seemed meaningful at the time… glad Art was able to come in with some concrete info!

2 Likes