Mailbox

pauldaoust · October 2, 2019, 7:24pm

Problem

Agents need to send and receive messages when one party is offline, but node-to-node messaging only works when both agents are online.

Solution

Write persistent messages to the DHT and link to them from an agent’s ‘mailbox’, an entry that they know to check when they come back online.

Implementation

Every agent has a well-known entry on the DHT: their agent ID entry. Just like any other entry, you can use it as a base for a link.

Create an entry type for messages and a link type for mailbox notifications. The link should have agent ID entries as its base and message entries as its target. When an agent writes a message to the DHT, she creates a link from the recipient’s agent ID to that message. When the recipient comes online, he calls a zome function that retrieves all the mailbox notification links, retrieves the linked messages, and deletes the links.
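To make the flow above concrete, here is a minimal in-memory sketch in Python. It is not HDK code: `ToyDht`, `send_message`, `check_mailbox`, and the `mailbox_notification` link type name are all hypothetical stand-ins, and a plain string stands in for an agent ID address.

```python
import hashlib
import json
from collections import defaultdict


class ToyDht:
    """In-memory stand-in for a DHT: content-addressed entries plus typed links."""

    def __init__(self):
        self.entries = {}               # address -> entry content
        self.links = defaultdict(list)  # (base address, link type) -> target addresses

    def put_entry(self, content):
        # Address an entry by the hash of its serialized content.
        serialized = json.dumps(content, sort_keys=True).encode()
        address = hashlib.sha256(serialized).hexdigest()
        self.entries[address] = content
        return address

    def add_link(self, base, link_type, target):
        self.links[(base, link_type)].append(target)

    def get_links(self, base, link_type):
        return list(self.links[(base, link_type)])

    def remove_link(self, base, link_type, target):
        self.links[(base, link_type)].remove(target)


MAILBOX = "mailbox_notification"  # hypothetical link type name


def send_message(dht, sender_id, recipient_id, body):
    """Write the message entry, then link to it from the recipient's agent ID."""
    address = dht.put_entry({"from": sender_id, "to": recipient_id, "body": body})
    dht.add_link(recipient_id, MAILBOX, address)
    return address


def check_mailbox(dht, my_id):
    """Fetch every linked message, deleting each notification link once read."""
    messages = []
    for target in dht.get_links(my_id, MAILBOX):
        messages.append(dht.entries[target])
        dht.remove_link(my_id, MAILBOX, target)
    return messages
```

Note that `check_mailbox` only removes the links. The message entries stay in `dht.entries`, which mirrors the persistence concern raised in the warnings.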

Warnings

This uses the DHT, so it creates a lot of persistent data. This might not be appropriate if messages are large or plentiful — in those cases, node-to-node messaging or a Throwaway DHT for the message content might be better.
If the popularity of recipients follows a hockey-stick distribution curve, it will create ‘hot spots’ in the DHT neighbourhood of their agent ID addresses.
Anyone can read messages on the DHT. If you want messages to be private, encrypt them with the recipient’s public key as in the Asynchronous Private Messaging pattern, but be careful of the risks of that pattern.
Curious people can find out who’s talking to whom by checking the authorship of links and messages. Don’t use this pattern if that is a problem for your use case.

freesig · October 3, 2019, 5:57am

Nice!

The link type will have to be to! because you can’t define an entry_def for the agent entry.

I’m still having issues getting links to an agent’s ID to propagate to other agents.

Do we have some suggestions / alternatives if the messages are large or plentiful?

pauldaoust · October 3, 2019, 2:43pm

Good point about not being able to define an agent ID entry type — should that be from! though? (As in, the message entry type has a link from! the agent ID entry type)

Thanks for the caution about agent ID links; I’ll put in an alternative.

I’ll also flesh out the alternatives for plentiful messages — along with how to avoid hotspots.

tats_sato · October 6, 2020, 11:17am

Hi @pauldaoust! I’ve been tinkering around with how we could implement asynchronous private messaging in E2EE fashion even when message entries are committed to the DHT!
I love this idea, and by combining it with DPKI to rotate signing and encryption keys per message sent, I think we can even assure forward secrecy in a stream of conversation between 2 nodes in a public DHT with little to no membranes.

My only question with this implementation is: is there any attack vector with encrypted Message entries being left in DHTs? This is after we remove the link to it from the AgentPubKey and even call delete_entry on the only header for that entry, which makes the entry inaccessible but not completely deleted. The receiving agent can then make an identical entry, but this time as a private entry, and commit it to her source chain. (The entry can even contain a previous_message_address field to create a sub-chain of conversation inside the source chain.) Imo, the only way to access that deleted encrypted entry is to somehow recreate the hash of the entry and simultaneously have the private key to decrypt it. The cost/reward ratio of bad actors getting just one message out of a conversation might already be enough to suppress these attack vectors, but I’m not sure, so I wanted to bring it up.

pauldaoust · October 7, 2020, 6:09pm

Hmmmmmm… Perfect forward secrecy on a Holochain app would be pretty cool indeed. One thing that comes to mind is that you wouldn’t even need to use DPKI; the keystore will eventually have all the keystore_* API functions that Redux had. Using seed generation and key derivation, all parties to a conversation could rotate keys together and stay in the loop. There are even decentralised ways to rotate keys; Matrix Chat has pioneered it and the Scuttlebutt folks are thinking of ways to incorporate it into private conversations in SSB too.
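As a rough illustration of parties rotating keys together through key derivation, here is a small symmetric hash ratchet in Python. This is a sketch under assumptions: it is not the Holochain keystore API and not Matrix’s actual scheme, the `Ratchet` class and the label strings are invented for illustration, and it assumes both parties have already agreed on a shared seed by some other means.

```python
import hashlib
import hmac
from typing import Tuple


def next_keys(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive (next chain key, message key) from the current chain key."""
    next_chain = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    return next_chain, message_key


class Ratchet:
    """Each party keeps one of these, seeded with the same shared secret."""

    def __init__(self, shared_seed: bytes):
        # Hypothetical domain-separation label; any fixed label would do.
        self.chain_key = hashlib.sha256(b"mailbox-ratchet" + shared_seed).digest()

    def rotate(self) -> bytes:
        """Advance the chain and return the key for the next message."""
        self.chain_key, message_key = next_keys(self.chain_key)
        return message_key
```

Because each step overwrites the old chain key with a one-way derivation of it, someone who captures the current state cannot recompute earlier message keys, provided the old keys really were discarded.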

Attack vectors:

Ed25519 keys are broken by Shor’s algorithm, so having all that historical data publicly available on a long-lived DHT could create risks when quantum computers become viable. People might not think about that when they post something sensitive.
You could ‘purge’ or ‘forget’ the data once it’s received – Holochain will eventually have a DHT op for that – but it’s based on good faith, so non-complying DHT nodes could keep that data around.
Metadata leak: writing as a private entry + purging the original message will still leave a trail of headers that show who’s received and stored a message, which can expose the network of social connections. Of course, my original pattern talks about posting public links to someone’s mailbox address, so that is a pretty big metadata leak too. Maybe something could be done with rolling mailbox anchors, but the publisher of the anchor would still be visible. There has been talk about being able to attach links to non-existent anchors, and in that case the only metadata leak is that the node storing the links for that anchor would be able to see who’s checking it because they’ll receive a get_links message.

tats_sato · October 15, 2020, 5:58am

Oh yeah, thanks for pointing out that we don’t need DPKI for that. I was wrong in thinking that, and after doing my homework, it seems like DPKI is needed for the initial DH handshake to happen asynchronously between parties and not when sending messages. I’ll look into how Matrix rotates its keys and find some inspiration for how we could do something similar for sending messages.

On attack vectors,

It’s nice to know that there is a plan for purging data on the DHT. It’s an essential feature for us as well because of what you mentioned about keeping encrypted data on a long-lived DHT. With regards to doing it on good faith, hmm that’s interesting… I’m wondering if holoports could be regarded as “trusted parties” who can promise to forget the data once it is confirmed to be received by the receiving party. Or maybe we could have some sort of “forget receipt” (assuming we can assure its integrity and that it’s not just some sort of garbage data created by malicious actors) that the receiving party can collect once she receives the message. Either way, this is an interesting topic and I’m excited to know what the core has in store for us for this feature.
Maybe Tom’s idea of just storing entries without headers (as some sort of backup mechanism) would be nice too. Although this feels like an anti-pattern to Holochain’s provenance property…
About metadata… thanks for pointing that out. If this is done synchronously with call_remote and the callee’s agent just commits the entry privately, then the only metadata leak is who stored some unknown thing, and when. But if this is done asynchronously (as is the case in this conversation), then yeah, links will be a bit of a problem… And it’s my first time hearing of linking to non-existent anchors. When you say non-existent, do you mean the anchor is generated dynamically when the message is sent?
And it just dawned on me after writing this that this setup of storing messages on the source chain is great when agents are running Holochain natively, since the source chain is stored locally. However, if the agents are using Holo, then I wonder if there are any additional attack vectors in having an agent’s own messages (in the source chain) stored in a remote holoport too… Or actually, in Holo, is it better to just emit_signal the message to the UI on the agent’s local device and decrypt it there? This essentially will only work synchronously, but there’s no write to the source chain here, so the uptime will be 99.99999% anyway (but maybe troublesome if the agent’s local device is turned off, which makes me think that it would be nice to have Holo keep that encrypted payload and keep trying to send the message to the client until it works). Anyway, I think this last point is out of scope of this thread so I’ll stop here.

pauldaoust · November 27, 2020, 9:47pm

@tats_sato

Yes, and it’s even possible to obscure that by having the sender and recipient store different nonces along with their copy of the message so they hash differently – then they couldn’t be correlated even if an eavesdropper had a copy of every single header in their DHT storage shard.
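A quick sketch of the nonce idea in Python, using SHA-256 over a JSON serialization as a stand-in for however entries are actually hashed (that serialization, and the `entry_hash` and `private_copy` names, are assumptions for illustration only):

```python
import hashlib
import json
import os


def entry_hash(entry: dict) -> str:
    """Stand-in for an entry address: hash of the serialized entry."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


def private_copy(ciphertext: str) -> dict:
    """Wrap the same message content with a fresh random nonce."""
    return {"ciphertext": ciphertext, "nonce": os.urandom(16).hex()}


# Sender and recipient each store their own copy of the same message.
sender_copy = private_copy("encrypted-message-bytes")
recipient_copy = private_copy("encrypted-message-bytes")
```

The two copies carry identical content, but their hashes differ because the nonces differ, so an observer comparing entry hashes in headers cannot match one party’s copy to the other’s.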

I was referring to an idea that some of the devs were talking about a long time ago – basically nobody would ever publish the anchor (even dynamically on first use), so there’d be no authorship headers. If you tried to retrieve this anchor you would get a null entry, but you would get other metadata, such as links. So the owner of the mailbox would be hidden because the mailbox anchor wouldn’t even be published. Then, the only disclosure would be who is sending messages to this mailbox (who publishes links on it), which still could potentially be used to correlate this person’s friendships with their real identity somehow.

And – every time the person checks their mailbox, they have to contact the authority nodes, which gives away their identity.
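Here is a toy Python model of that unpublished-anchor idea and of what the authority node still gets to see. It only illustrates the idea as described here, not an existing Holochain feature; `AuthorityNode`, `anchor_address`, and the request log are all hypothetical.

```python
import hashlib
from collections import defaultdict


class AuthorityNode:
    """Toy authority for an address: holds links even if no entry was published."""

    def __init__(self):
        self.entries = {}               # address -> entry (the anchor is never added)
        self.links = defaultdict(list)  # base address -> [(link author, target)]
        self.request_log = []           # what this node learns from incoming requests

    def add_link(self, author, base, target):
        # The link author is visible to the node that stores the link.
        self.links[base].append((author, target))

    def get(self, requester, address):
        self.request_log.append((requester, "get", address))
        return self.entries.get(address)  # None for an unpublished anchor

    def get_links(self, requester, base):
        self.request_log.append((requester, "get_links", base))
        return [target for _author, target in self.links[base]]


def anchor_address(secret_name: str) -> str:
    """The mailbox address is derived from a name that is never published."""
    return hashlib.sha256(secret_name.encode()).hexdigest()
```

In this model, fetching the anchor returns nothing while `get_links` still returns the message addresses. The node nevertheless records who authored each link and who asked for the links, which are the two disclosures described above.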

As you know, @thedavidmeister is exploring Signal’s x3dh, trying to figure out if it can be modified to use in a P2P environment. Then, at least, an agent could publish a string of throwaway, single-use mailboxes for senders to send to. Information would still be disclosed, but there’d be so much of it that the burden for eavesdroppers increases a huge amount.

Yes, I think there’d be a big concern with storing messages unencrypted on the Holoport. You’re right that the client would want to decrypt in the browser. Signals would prevent the message sticking around, for the synchronous use case.

tats_sato · December 3, 2020, 9:08am

Yeah! I guess then the only thing left exposed in the DHT is who (author) is writing what entry (entry_type) when (timestamp), which won’t really be useful for guessing the social graph. Also, we’re exploring how we could allow communication to happen with Holo in an E2EE fashion. This means we probably won’t store the messages themselves on the source chain or DHT, so just using signals with call_remote basically.

Hmmm, didn’t quite get this… So does this mean there is no anchor (or path) entry published to the DHT, but somehow other agents can attach entries to that non-existing path?

Interesting, yeah… right now we’re trying to make x3dh and double-ratchet work all synchronously as well. It’s really interesting to see how key exchange using a distributed cloud that is agent-centric would turn out to be different.