Where are private keys and data stored if you're using Holo Host?

tats_sato · February 13, 2020, 3:20am

Hi @pauldaoust! I’ve been wondering how Holo deals with private keys and source chains, so I’m glad I found this thread.

I did not quite get this part and have a couple of questions:

Does it mean host will encrypt the source chain of the agent with the agent’s public key and send it over to the browser of the agent so that s/he may decrypt it with his/her private key? Or are hosts going to encrypt the source chain with their own public key? In that case, I don’t quite understand the entire process of it.
What actually is the “automated process” that the host will be doing on behalf of the hosted enduser and what is it for?
If my understanding is correct, source chains are encrypted with the public key of the owner of that chain. Doesn’t that count as at-rest encryption, since the owner’s private key is needed to read the entries inside the source chain?

Really great that the private key is not stored anywhere! Just a question though: after the private key is regenerated inside the browser, what is the exact flow of data for, let’s say, Alice to access her source chain stored on the HoloPort? Is Alice’s private key sent to the host in any way? (I hope not.) If not, my hunch is that the host sends Alice her encrypted source chain so that she can decrypt it with her private key and work with it; then, once she is done manipulating her source chain, she encrypts it again with her public key and returns it to the host for storage. But I think I’m wrong, so I hope you can shed light on this matter. Thank you!

pauldaoust · February 13, 2020, 6:39pm

Hey @tats_sato great questions. The short version:

All entries are stored on the host.
If they’re encrypted at rest, they’re encrypted by the host’s private key, not the agent’s.
Alice accesses her private data by calling zome functions on the host, which have access to her source chain. The host sends the function’s return value to her in the clear (but TLS encrypted, of course).
Whenever the host wants to write an entry/header pair to Alice’s source chain, it has to ask her to sign them.
The above three points mean that the data can stay on the host, but Alice’s private key never leaves her devices.
Because the host has access to Alice’s private entries, it has to be prevented from doing bad things by law rather than by technology.
This might change in the future, although the host will always have to be able to read Alice’s headers and public entries (that means the host is assumed to be a trusted member of the DNA along with Alice).
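The signing round trip in the short version can be sketched roughly as follows. This is only an illustration, not Holo's actual protocol: the class names `BrowserAgent` and `Host`, the `commit` method, and the header layout are all hypothetical. Python's standard library has no general-purpose signature scheme, so the sketch substitutes a Lamport one-time signature built from SHA-256; a real system would use a standard scheme from a vetted library, and a Lamport key must only ever sign one message.

```python
import hashlib
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def digest_bits(message: bytes) -> list:
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = sha256(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def verify(public: list, message: bytes, signature: list) -> bool:
    """Check a Lamport signature using only the public key."""
    return len(signature) == 256 and all(
        sha256(sig) == public[i][bit]
        for i, (bit, sig) in enumerate(zip(digest_bits(message), signature))
    )


class BrowserAgent:
    """Stand-in for Alice's browser: the private key exists only in this object."""

    def __init__(self) -> None:
        # One pair of random secrets per digest bit (stand-in for a real keypair).
        self._private = [
            (secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)
        ]
        self.public = [(sha256(a), sha256(b)) for a, b in self._private]

    def sign(self, message: bytes) -> list:
        # Reveal one secret per bit of the message digest.
        return [self._private[i][bit] for i, bit in enumerate(digest_bits(message))]


class Host:
    """Stand-in for the host: it stores the chain but holds only the public key."""

    def __init__(self, agent_public: list) -> None:
        self.agent_public = agent_public
        self.chain = []

    def commit(self, entry: bytes, request_signature) -> bool:
        # Hypothetical header: hash of the previous header plus the entry hash.
        prev = self.chain[-1]["header"] if self.chain else b""
        header = sha256(prev + sha256(entry))
        signature = request_signature(header)  # round trip to the browser
        if not verify(self.agent_public, header, signature):
            return False  # unsigned or forged writes are rejected
        self.chain.append({"entry": entry, "header": header, "signature": signature})
        return True


alice = BrowserAgent()
host = Host(alice.public)
accepted = host.commit(b"hello world", alice.sign)
```

The point of the shape is that `Host.commit` can only ask for a signature through the callback; nothing on the host side ever sees `_private`, which mirrors the claim that the data stays on the host while the key stays on Alice's device.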

Now the long version:

I don’t quite understand the exact automated processes that Art is thinking the host will need to do on behalf of the user, so these will all be guesses. (Maybe if @artbrock has the time, he can set the record straight.) Here’s all I can think of:

Validating others’ entries
Sharing public entries and headers to other nodes
Responding to direct messages in ways that potentially access private entries

For the first two of those three, the host doesn’t need access to its hosted agents’ private entries, so they can be safely encrypted by the agent’s private key. The only thing that might need access to private entries is the last one, which means Art must be thinking that the agent’s instance is ‘running’ even when the user isn’t interacting with it via the UI. Personally I’d be okay if it required the user to have the UI open in order to decrypt direct messages. However, that means the private entry will be sent back to the host in clear text anyway (so they can forward it to the DHT agent that requested it), which eliminates any advantages of encrypting it with the agent’s private key at rest.

So I would guess that the host encrypts the agent’s private data with the host’s special key, which doesn’t prevent the host from seeing it but does prevent thieves from accessing it (if the host’s owner unplugs the USB drive). You can see that this is a tricky problem, which might be best governed by law rather than technology.

I can see something that might work: encrypt private entries with the agent’s private key, so that when they’re stored on the host they’re inaccessible. Then the receive callback is executed on the browser, not the host, and all calls that retrieve the user’s private entries just return the entries encrypted, and the browser decrypts them and encrypts the receive callback’s return value with the requestor’s public key. This way the host would never be able to access the private entries or data based on them. But all this is pure imagination on my part and not on any roadmap (AFAIK).
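A rough sketch of the storage half of that idea: the browser encrypts before upload, so the host only ever holds ciphertext. Everything here is an assumption made for illustration. `HostStore`, `encrypt`, and `decrypt` are hypothetical names, the key is a random symmetric secret that would in practice be derived from material only the agent holds, and the cipher is a toy HMAC-SHA256 keystream with no integrity protection. A real app would use an authenticated cipher from a vetted library. The second half of the scenario (re-encrypting the receive callback's return value for the requestor) is not shown.

```python
import hashlib
import hmac
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter), concatenated."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Runs in the browser. Returns nonce || ciphertext."""
    nonce = secrets.token_bytes(16)  # fresh nonce per entry
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Runs in the browser. Reverses encrypt()."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


class HostStore:
    """Stand-in for the host: stores and returns opaque blobs, never the key."""

    def __init__(self) -> None:
        self.blobs = {}

    def put(self, entry_id: str, blob: bytes) -> None:
        self.blobs[entry_id] = blob

    def get(self, entry_id: str) -> bytes:
        return self.blobs[entry_id]


agent_key = secrets.token_bytes(32)  # never leaves the browser in this sketch
host = HostStore()
host.put("note-1", encrypt(agent_key, b"my private entry"))
recovered = decrypt(agent_key, host.get("note-1"))
```

The trade-off Paul describes is visible here: because the host cannot decrypt, any zome logic that needs to read private entries would have to run where the key is, which is the browser.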

I don’t know what the exact roadmap plans are, but I do know that at one point the plan was to have all the Holochain and zome code executed in the browser, except for validation functions. I think this would allow the agent to encrypt all of their private entries with their private key and would permit the above scenario I dreamed up.

tats_sato · February 14, 2020, 10:00am

Hi @pauldaoust! appreciate your detailed response.

pauldaoust:

All entries are stored on the host.

If they’re encrypted at rest, they’re encrypted by the host’s private key, not the agent’s.

Alice accesses her private data by calling zome functions on the host, which have access to her source chain. The host sends the function’s return value to her in the clear (but TLS encrypted, of course).

Whenever the host wants to write an entry/header pair to Alice’s source chain, it has to ask her to sign them.

The above three points mean that the data can stay on the host, but Alice’s private key never leaves her devices.

Because the host has access to Alice’s private entries, it has to be prevented from doing bad things by law rather than by technology.

This might change in the future, although the host will always have to be able to read Alice’s headers and public entries (that means the host is assumed to be a trusted member of the DNA along with Alice).

Made an ultra-simple sequence diagram for this just so I’m sure I understood you correctly.

Several questions I have from this part.

In the developer pulse 62, it said,

Each web user is assigned to multiple redundant HoloPorts, distributed across the globe rather than concentrated in a few data centers owned by one company.

I also heard from @guillemcordoba (sorry to pull you in here, and please correct me if I’m wrong) that each source chain has a redundancy factor of 5 on the Holo network. The redundancy factor might change in the future, but this means that multiple HoloPort owners will get at least read access to an agent’s source chain, right? I just see a challenge in enforcing the law when your private data is held by 5 hosts living in 5 different countries. I’m excited to see how Holo will solve this challenge!

In the core concept 03, there was a portion that said,

When the DNA wants to create an entry for you, it first validates its content according to the rules defined for its type. This protects you from accidentally producing bad data.

It then asks your conductor to sign the entry with your private key.

Your conductor adds the signature to a header and attaches it to the entry.

Your conductor saves the entry as the next item in your source chain.

Just want to ask if step 2 happens a little differently in Holo. When the DNA asks the conductor (running on the HoloPort) to sign, will the conductor send the entry to Alice so that she can sign it and then send the signed entry back to the HoloPort?

This is also exactly my dream!! With the short version you have described, I can’t help but see the challenge of communicating this part to the users of hApps hosted on the Holo network. Convincing an end user with something like “Your data is private, but x number of other people who host your source chain can also read it! But don’t worry, the law is there for you if they do something bad!” seems like a real challenge to me. It would be a dream come true if hosts only stored private source chains encrypted with the agent’s public key, so that even if the data is hosted on another computer, we can guarantee users of Holo-hosted hApps that their private data belongs only to them and that no one else can read or write it without their private key. So my final question is:

Does the Holo team see the ability of hosts to read the private source chains of the agents they are hosting as a problem that must be solved? I just want to know if the team has any intention of preventing hosts from reading agents’ source chains down the line, regardless of when that will happen.

I believe this is a critical topic, especially if developers intend (just as we do) to create a private and secure p2p communication application on top of Holochain and the Holo network. It’s hard for me to claim that a chat application is secure and private when literally even messages you send to yourself, or the metadata of your profile, can be read by someone else. That holds even when we can say that this someone can be trusted and verified, because that doesn’t really change the fact that they can read your data.

Sorry for the long post. I’m aware that some of the questions I asked need answers directly from the Holo team, so I hope we can hear from them in some way.

premjeet · February 14, 2020, 3:51pm

Why does each source chain need to be replicated to 5 different hosts in the network? The header of each source chain is already in the network, and since each header links to the previous one, the entire source chain can be retrieved from it. So finding the latest header should be sufficient to access the source chain, and that header can be saved at the user’s end. Please clarify.
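To make the linking argument concrete, here is a minimal sketch of a hash-linked chain of headers. The header fields (`prev`, `entry_hash`) and the helper names are assumptions for illustration and do not reflect Holochain's real header format. It shows that the latest header is enough to walk the whole chain, but only while some store can still serve every header along the way: the links prove order and integrity, not availability.

```python
import hashlib
import json


def header_hash(header: dict) -> str:
    """Address of a header: SHA-256 over its canonical JSON form."""
    return hashlib.sha256(json.dumps(header, sort_keys=True).encode()).hexdigest()


def append(store: dict, head, entry: str) -> str:
    """Add a header that points at the previous head; return the new head address."""
    header = {
        "prev": head,  # None for the first header
        "entry_hash": hashlib.sha256(entry.encode()).hexdigest(),
    }
    address = header_hash(header)
    store[address] = header
    return address


def walk(store: dict, head):
    """Yield headers from newest to oldest. Raises KeyError if a link is missing."""
    while head is not None:
        header = store[head]
        yield header
        head = header["prev"]


store = {}
head = None
for text in ["first entry", "second entry", "third entry"]:
    head = append(store, head, text)

chain_length = len(list(walk(store, head)))
```

If the one machine holding `store` disappears, knowing `head` on the user's side does not bring the headers back, which is one plausible reason to keep several copies.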

tats_sato · February 15, 2020, 7:28am

I am just making a wild guess, so don’t quote me on this one, but I guess the source chain needs redundancy because otherwise the end user would have no way to access his/her source chain if that one HoloPort went offline for whatever reason (accidentally unplugged, turned off, etc). But again, I’m not entirely sure either.
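The redundancy guess can be put into numbers with a small back-of-the-envelope calculation. It assumes each host is online independently with the same probability, and the 90% figure is purely hypothetical, not a measured HoloPort uptime.

```python
def availability(p_online: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent hosts is online."""
    return 1.0 - (1.0 - p_online) ** replicas


# Hypothetical uptime of 90% per host, for 1, 3 and 5 replicas.
table = {n: round(availability(0.9, n), 5) for n in (1, 3, 5)}
```

Under those assumptions a single host leaves the chain unreachable about 10% of the time, while five replicas bring that down to roughly one in 100,000.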

nzharry · February 15, 2020, 10:01am

After reading through this thread, I had the exact same reaction to this.

@pauldaoust you mentioned a couple of times it may be necessary to rely on law rather than technology to ensure an agent’s private data is kept safe from their hosts. I can’t see how someone who has an app that deals in sensitive data would be comfortable building on Holochain if it’s not possible to ensure the security of their user’s data. As this is a distributed network, most end users won’t know their hosts, so expecting them to trust them surely isn’t an option.

@pauldaoust some clarity on this point would be great as it seems fundamental to the viability of Holo. Thank you

gjones617 · February 16, 2020, 7:50pm

This is very interesting (and helpful info/ reply; thank you). It brought to mind the idea of “Zero-knowledge” or “zk” proofs, as they call em - a concept which I find fascinating… sort of “message within a message;” with which I think the possibilities are great, especially within the realm of passwords/ keys…

Any chance something like this might be implemented into Holo/ holochain?

HoloFuture · February 18, 2020, 8:26pm

Paul seems not to know exactly what the answer is here. But going by the quote below, it appears the team was on the path of allowing everything to be encrypted/decrypted in the browser, separate from the hosts. This is very likely the path the team went down. Further clarification would obviously be nice from the team that’s actually working on this:

"I can see something that might work: encrypt private entries with the agent’s private key, so that when they’re stored on the host they’re inaccessible. Then the receive callback is executed on the browser , not the host, and all calls that retrieve the user’s private entries just return the entries encrypted, and the browser decrypts them and encrypts the receive callback’s return value with the requestor’s public key. This way the host would never be able to access the private entries or data based on them. But all this is pure imagination on my part and not on any roadmap (AFAIK).

I don’t know what the exact roadmap plans are, but I do know that at one point the plan was to have all the Holochain and zome code executed in the browser, except for validation functions. I think this would allow the agent to encrypt all of their private entries with their private key and would permit the above scenario I dreamed up."

tats_sato · February 19, 2020, 5:51am

I really do hope that this is the path we are on. This question has been in our team’s mind the past week since it’s so critical haha. Hopefully we can get an answer from the team soon despite the busy schedule

gjones617 · February 19, 2020, 2:42pm

(from above) “Because the host has access to Alice’s private entries, it has to be prevented from doing bad things by law rather than by technology.” -pauldaoust

I’m a bit confused here. I understand how a Host might be able to read an agent’s data that he or she is hosting; but, seeing as the agent is the only one with access to the private key, the Host should not, and ostensibly is not capable of writing or changing agents’ data, right??

AdriaanB · February 19, 2020, 3:43pm

“Whenever the host wants to write an entry/header pair to Alice’s source chain, it has to ask her to sign them.”

This is the answer to your question, right?

gjones617 · February 19, 2020, 7:43pm

Ah, okay, thank you !

pauldaoust · February 20, 2020, 11:35pm

I’m starting to suspect I got some facts wrong, based on a conversation @artbrock started with me that sounds like it’s referring to this very forum thread. I’ll get clarification from him and then hopefully I can explain a bit better (unless you want to chime in directly @artbrock !)

But @gjones617 yes, you are absolutely correct about:

as @AdriaanB confirms. @GraceR put it very eloquently on Twitter during a discussion on the merits of the Cryptographic Autonomy License:

Conceptually, this is fairly simple. I am saying “this”. I know I said “this” and nobody else can say “this”, pretending it was me. Theoretically if twitter holds my keys, they could change “this” and it would seem I had said “that”

Later twitter could revoke my access to “this” and I wouldn’t be able to digital have the memory or proof of having said “this”. CAL basically means that only I could have said “this” and that I have the right to always remember and show proof that I I said “this”.

BTW, this is why I say it’s “data assault” rather than “data theft”. Someone is tampering with my ability to “remember” what I said. If I know my memory is failing and I want to write them down to remember them, this becomes a kind of extension of my brain function.

So regardless of whether you’re using Holochain natively or via a Holo-hosted app, you should always have the exclusive power to sign your own entries.

pauldaoust · February 20, 2020, 11:51pm

sorry, sometimes I think I’m clarifying but I’m actually just muddying

pauldaoust · February 21, 2020, 12:06am

so just basic (and correct) facts would be best, is what you’re saying?

noone1000 · February 21, 2020, 5:48am

A lot of FUD is making its way around the internet based on this thread. I want clarification, in the simplest terms, on whether or not node operators can view the data they are hosting.

GraceR · February 21, 2020, 7:06am

Thanks for accurately reproducing all of my typos and weird grammatical errors from distractedly tweeting while riding a train and multitasking. So embarrassing.

AdriaanB · February 23, 2020, 9:49pm

In this case, yes please. This FUD has been building for a week now, and people are starting to see and communicate it as fact, which is harmful for the project and confusing for a lot of people.

The ‘big’ question (and important one) people are referring to is: “Is private data encrypted with the Holo host special key and can a Holoport owner read/access private data because of this?”.

gjones617 · February 24, 2020, 9:30am

From Dev Pulse 62:

Our approach to these concerns is to design everything in Holo with the ultimate goal of decentralizing every component, mitigating the risks involved with giving anyone gatekeeper powers. Principally, this relates to how HoloPorts connect to one another. HoloPorts are connected via a peer-to-peer VPN to form an internal network, which is why Holo Hosts do not need static IPs.
Holo’s gateways then route a web user’s traffic to their assigned hosts. In many traditional client-server architectures, encryption would be ‘terminated’ at the gateway: it would hold the SSL certificate allowing it to decrypt user data and forward it to the server. On the Holo network, encryption only terminates when data arrives or leaves the HoloPort. Each HoloPort provisions its own personal certificate without involving the Holo gateways. The gateways themselves use Server Name Indication (SNI) extensions to route data without needing to decrypt it, making it impossible for the gateways to snoop on users, since everything passing through is end-to-end encrypted. Each web user is assigned to multiple redundant HoloPorts, distributed across the globe rather than concentrated in a few data centers owned by one company. In fact, two copies of one user’s data aren’t even guaranteed to be in the same country. Each HoloPort is only responsible for a subset of users, making the network more resilient and reducing the power of any one hosting provider. Furthermore, Holo’s gateway servers run on globally distributed hosting infrastructure, which makes them resilient to high traffic spikes or denial-of-service (DoS) attacks.
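For readers wondering how a gateway can route on SNI without decrypting anything: the server name travels unencrypted in the first TLS handshake message (the ClientHello), so a router can read it and forward the raw bytes untouched. The sketch below is an illustration of that general technique, not Holo's gateway code. The routing table, hostnames, and addresses are made up, and the parser skips the bounds checking a production router would need.

```python
import ssl
import struct


def client_hello_for(hostname: str) -> bytes:
    """Produce a real ClientHello in memory (no network I/O) for demonstration."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = context.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the handshake is now waiting for a server reply
    return outgoing.read()


def extract_sni(record: bytes):
    """Return the server name from a ClientHello record, or None if absent."""
    # TLS record header: type(1) version(2) length(2); 0x16 means handshake.
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        return None  # not a handshake record carrying a ClientHello
    pos = 5 + 4            # skip record header, handshake type(1) + length(3)
    pos += 2 + 32          # client version + random
    pos += 1 + record[pos]                               # session id
    pos += 2 + struct.unpack_from("!H", record, pos)[0]  # cipher suites
    pos += 1 + record[pos]                               # compression methods
    ext_end = pos + 2 + struct.unpack_from("!H", record, pos)[0]
    pos += 2
    while pos + 4 <= ext_end:
        ext_type, ext_len = struct.unpack_from("!HH", record, pos)
        pos += 4
        if ext_type == 0:  # server_name extension
            # Layout: list length(2), name type(1), name length(2), name bytes.
            name_len = struct.unpack_from("!H", record, pos + 3)[0]
            return record[pos + 5 : pos + 5 + name_len].decode("ascii")
        pos += ext_len
    return None


# Hypothetical routing table: SNI hostname -> address of the assigned host.
ROUTES = {"alice-app.example": "10.0.0.5:443"}


def route(record: bytes):
    """Pick a backend from the SNI alone; the payload stays encrypted."""
    return ROUTES.get(extract_sni(record))


hello = client_hello_for("alice-app.example")
backend = route(hello)
```

Note that this only explains why the gateway cannot read traffic. It says nothing about what the HoloPort itself can see once TLS terminates there, which is the question the thread is actually asking.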

So, one question (for me) becomes: What is a “gateway,” and how do HoloPorts relate to them; and then also, like everyone else is asking, to what degree can Holoport operators “read” (view) users’ data.

Which brings up some very interesting, and crucial/ salient questions and issues not only within Holo, but distributed computing and the “new internet” being built in general…

First of all, let’s compare with some of the systems we are familiar with: Ethereum and Bitcoin. In both of these, “hosts” (i.e., miners and, depending on your philosophical bent, full nodes as well) can absolutely “see” all of the data running through their system - that’s part of the entire point!! This is the irony, in my opinion, of the supposed “FUD” to begin with… little do most consider that pretty much the exact same thing they are ostensibly FUDding Holo for occurs in Bitcoin and Ethereum!!

A “decentralized internet” might look very different from the one we have now… The intriguing thing which comes up, though, is: do people even really want the “freedom” and autonomy they speak of, or is it more just the idea they’re in love with? For example, prima facie, many might answer, “Oh, I don’t want ‘big brother’ sniffing around my data, controlling it…” But when faced with the alternatives, such as yourself, or even your neighbor, these bring with them a whole new slew of problems!! Funny thing being, when you think on it, maybe I would rather have Jeff Bezos “control” my data instead of my next door neighbor!! Or even myself! (As I’m lazy, not knowledgeable, etc etc.)…

So, as bizarre and “scary” a concept as “gasp! OTHERS will be seeing my data!!!” may be, lo and behold, they never cared before!!! But as the content of said data shifts from a bunch of random numbers to actual “intelligible” tweets, texts, and beyond… now it’s starting to become a bit more serious.

That being said, continuing with the BTC comparison: although not totally anonymous, BTC is “pseudo-anonymous”, i.e., I might see the data, but I don’t know to whom it belongs… This is how I envision Holo operating too. (But I am not a dev and do not know much about this.) I.e., yes, if a host had the desire to root around in their machine, maybe they could “see” data… but to them it would probably look like a jumbled mess. (But I don’t know.)

And while on the subject, perhaps someone could implement some of the “Zero-knowledge” proof schemes coming out. It’s a fascinating technology, imo, and we’re only seeing the beginning. But if that could be worked into Holo… we’d have a truly “unenclosable carrier!!”

ALSO (lotta alsos): And I know this will likely get a lot of “guff” from the more conservative in the community, but, yes, at a certain point, it is quite likely, in this day and age at least, that you WILL have to “defer to law,” etc. in order to solve some issues. In this instance, that is quite possibly the very topic we’re discussing: if someone were to tamper with or alter/ mess up/ damage or otherwise harm another’s data, they would be held just as accountable as, say, Amazon, Facebook, or the like. In fact, within the Holo(chain) ecosystem, I would venture to say that because of its very structure (decentralized, granular), not only would it likely be even easier to discover malicious acts or actors, but this fact itself would likely act as a deterrent to shenanigans. (In other words, because everything is much more agent-centric, everyone is much more accountable - I think I’m preaching to the choir on much of this, though.)

tats_sato · February 24, 2020, 10:04am

Thank you for replying to my post!

Your insights gave me so much more to consider.

But one question I have in mind is: isn’t encryption at rest with the end user’s private key already possible with current technology? And if yes, why shouldn’t we do it?

Maybe I’m missing some really important detail here regarding validation and whatnot, but I’m not really sure.