Where are private keys and data stored if you're using Holo Host?

pauldaoust · February 21, 2020, 12:06am

So just basic (and correct) facts would be best, is what you’re saying?

noone1000 · February 21, 2020, 5:48am

A lot of FUD is making its way around the internet based on this thread. I want clarification, in the simplest terms, on whether or not node operators can view the data they are hosting.

GraceR · February 21, 2020, 7:06am

Thanks for accurately reproducing all of my typos and weird grammatical errors from distractedly tweeting while riding a train and multitasking. So embarrassing.

AdriaanB · February 23, 2020, 9:49pm

In this case, yes please. This FUD has been building for a week now, and people are starting to see and communicate it as fact, which is harmful for the project and confusing for a lot of people.

The ‘big’ (and important) question people are referring to is: “Is private data encrypted with the Holo host’s special key, and can a HoloPort owner read/access private data because of this?”

gjones617 · February 24, 2020, 9:30am

From Dev Pulse 62:

"Our approach to these concerns is to design everything in Holo with the ultimate goal of decentralizing every component, mitigating the risks involved with giving anyone gatekeeper powers. Principally, this relates to how HoloPorts connect to one another. HoloPorts are connected via a peer-to-peer VPN to form an internal network, which is why Holo Hosts do not need static IPs.
Holo’s gateways then route a web user’s traffic to their assigned hosts. In many traditional client-server architectures, encryption would be ‘terminated’ at the gateway—it would hold the SSL certificate allowing it to decrypt user data and forward it to the server. On the Holo network, encryption only terminates when data arrives or leaves the HoloPort. Each HoloPort provisions its own personal certificate without involving the Holo gateways. The gateways themselves use Server Name Indication (SNI) extensions to route data without needing to decrypt it, making it impossible for the gateways to snoop on users, since everything passing through is end-to-end encrypted. Each web user is assigned to multiple redundant HoloPorts, distributed across the globe rather than concentrated in a few data centers owned by one company. In fact, two copies of one user’s data aren’t even guaranteed to be in the same country. Each HoloPort is only responsible for a subset of users, making the network more resilient and reducing the power of any one hosting provider. Furthermore, Holo’s gateway servers run on globally distributed hosting infrastructure, which makes them resilient to high traffic spikes or denial-of-service (DoS) attacks."

So, one question (for me) becomes: What is a “gateway,” and how do HoloPorts relate to it? And then also, as everyone else is asking, to what degree can HoloPort operators “read” (view) users’ data?

Which brings up some very interesting, and crucial/ salient questions and issues not only within Holo, but distributed computing and the “new internet” being built in general…

First of all, let’s compare with some of the systems we are familiar with: Ethereum and Bitcoin. In both of these, “hosts” (i.e., miners and, depending on your philosophical bent, full nodes as well) can absolutely “see” all of the data running through their system - that’s part of the entire point!! This is the irony, in my opinion, of the supposed “FUD” to begin with… little do most consider that pretty much the exact same thing they are ostensibly FUDding Holo for occurs in Bitcoin and Ethereum!!

A “decentralized internet” might look very different than the one we have now… The intriguing thing which comes up, though, is: Do people even really want the “freedom” and autonomy they speak of - or, is it more just the idea they’re in love with… For example, prima facie, many might answer that “Oh, I don’t want “big brother” sniffing around my data, ‘Controlling it…’” But when faced with the alternatives, such as: yourself, or even your neighbor- these bring with them a whole new slew of problems!! Funny thing being, when you think on it, maybe I would rather have Jeff Bezos “control” my data instead of my next door neighbor!! Or even myself ! (As I’m lazy, not knowledgeable, etc etc.)…

So, as bizarre and “scary” a concept as “gasp! OTHERS will be seeing my data!!!” may be, lo and behold, they never cared before!!! But as the content of said data shifts from a bunch of random numbers to actual “intelligible” tweets, texts, and beyond… now it’s starting to become a bit more serious.

That being said, continuing with the BTC comparison: although not totally anonymous, BTC is “pseudo-anonymous”: i.e., I might see the data, but I don’t know to whom it belongs… This is how I envision Holo to also operate. (But I am not a dev and do not know much about this.) I.e., yes, if a host had such a desire to root around in their machine, maybe they could “see” data… but to them it would probably look like a jumbled mess. (But I don’t know.)

And while on the subject, perhaps someone could implement some of the many “zero-knowledge” proof schemes coming out. It’s a fascinating technology, imo, and we’re only seeing the beginning. But if that could be worked into Holo… we’d have a truly “unenclosable carrier!!”

ALSO (lotta alsos): And I know this will likely get a lot of “guff” from the more conservative in the community, but, yes, at a certain point, it is quite likely, in this day and age at least, that you WILL have to “defer to law,” etc. in order to solve some issues: in this instance, quite possibly the very topic we’re discussing. If someone were to tamper with, alter, mess up, damage, or otherwise harm another’s data, they would be held just as accountable as, say, Amazon, Facebook, or the like. In fact, within the Holo(chain) ecosystem, I would venture to say, because of its very structure (decentralized, granular), not only would it likely be even easier to discover malicious acts or actors, but this fact itself would likely act as a deterrent to shenanigans. (In other words, because everything is much more agent-centric, everyone is much more accountable - I think I’m preaching to the choir on much of this, though.)

tats_sato · February 24, 2020, 10:04am

Thank you for replying to my post!

Your insights gave me so much more to consider.

But one question I have in mind is: isn’t encryption at rest with the private key of the end user already possible with current technology? And if yes, why shouldn’t we do it?

Maybe I’m missing some really important detail here regarding validation and whatnot, but I’m not really sure.

gjones617 · February 24, 2020, 10:27am

I’m not really sure either, but, as far as I know, for whatever the technical reason is, at the present moment, “hosts” require data to be unencrypted at least at some stage in the process - or at least that’s my understanding.

Why, I don’t know exactly, but as you alluded to, I believe it has something to do with validation.

nzharry · February 24, 2020, 8:02pm

The argument that having some anonymous Holo host read your data is less of an issue than Jeff Bezos or a rogue AWS engineer having a poke around, doesn’t really stack up.

Imagine some bored developer with alt-right tendencies sets up 20 holoports in their basement and they decide to go fishing. They look for anything confidential, embarrassing, or commercially sensitive and they harvest the data - then they go on a troublemaking spree. Sure the victims would be random, but the damage could be huge. Then let’s go one step further and introduce organised crime groups supported by a sprinkle of AI and we’ve got an episode of Black Mirror!

Allowing hosts access to private data simply isn’t an option. Hopefully someone from Holo or Holochain will confirm this for us.

pauldaoust · February 24, 2020, 9:30pm

I want to throw a bunch more educated guesses into this conversation, but that’ll probably just add more gasoline to the fire. So what I’m going to do is hold my tongue until I can talk to Art about this, cuz he’s gently poked me re: some inaccuracies I’ve apparently shared in this thread… We were supposed to talk today; let’s see if I can get a hold of him.

artbrock · February 25, 2020, 9:48pm

With regard to the FUD, I think we need to do a reality check about how hosting in general works. We cannot rely on web users to hold ANYTHING. Their expectation of a web hosted system is that they can access it from any browser on any computer. They could wipe their hard drive, install an OS and browser, and expect to reach their hosted app/data.

Given that reality, all data, whether public or private must live on the host. Period.

This is true for all hosted systems. To think otherwise is to not want to use a hosted system.

Luckily, Holochain natively provides the most secure option. You self host. The only data that is shared is the data intended to be public, and even that doesn’t go to some central surveillance corp, but is sharded to other users of your app. How public or private that app is determines how narrow or wide the destination for that data is. You can run a private app just to sync data between 5 of your own devices if you want to.

Holo hosting of Holochain apps serves a different purpose. It is NOT to serve the privacy paranoid, but to reach mainstream web users who aren’t running Holochain to host themselves. They are already in the habit of surrendering their data to unknown parties. (Since it is extremely naive to think you know who ends up with access to your data when sent to companies which make their living from customer data and advertisers.)

If you have data within the context of a Holo hosted application that is supposed to be deeply private, (you should ask if it should be a hosted app at all, but if you need 3rd party hosting,) you should encrypt those specific entries. (The encryption w/ host key that @pauldaoust referenced earlier is for the files at rest as a whole, not for individual entries within the files.)

We cannot universally encrypt web users’ “private” entries on their source chains because private entries are integral to the security enforcement functionality of Holochain, and hosts need to be able to use those for capability grants or claims to properly function as a host and enforce the permissions users have created for their data.

TL;DR;

If you don’t want anyone to have your data, host yourself on Holochain.
If you want specific app data to be private from hosts, encrypt those entries.

Also, keep in mind that hosts are under contractual agreement with Holo as a service provider and subject to the restrictions on confidential information. Individual hosts just don’t have much of a honeypot to make these nefarious spying behaviors worth their while.

But the basic physics of encryption means you can’t truly hide from a host anything they need to serve in the clear. To expect or promise otherwise is unrealistic.

AdriaanB · February 26, 2020, 12:21am

Thanks @artbrock for clearing things up. Makes a lot of sense. Personally, I think anarchy is a good description for how I sense/feel about this. Even if Holo doesn’t come into existence…and maybe it shouldn’t. Holochain + mutual credit system is making me a happy camper. So, thanks for clearing things up and stirring things up a bit.

nzharry · February 26, 2020, 1:13am

Hi @artbrock , thanks for your very thorough reply.

My main interest has been around protecting the data of end users who probably don’t even realise the site they are signing up to runs on Holochain. These people will be using Holo Host by default, but they will be placing their trust in the provider of the website to ensure their data is well looked after. I realise in the long term it will be a very different paradigm, but in this transitional period many of these ‘web 2’ assumptions still apply.

In this situation, we will either need to rely on the Holo Hosts behaving with the same, or better, integrity as AWS and other corporate hosting services (with regard to snooping), or encrypt anything sensitive.

If we use the example of simple messaging between users (e.g. the Holochain equivalent of Facebook Messenger or Signal) I feel encryption will be required. How/where would the encryption be done without introducing any UX issues?

Thanks again for your help in clarifying this.

artbrock · February 26, 2020, 5:37am

All messages over the wire are automatically encrypted, so there shouldn’t be concern about eavesdropping. If you’re talking about stored messages… there’s a fairly straightforward approach, but also a whole different problem for Holochain at this stage.

The easy answer is that you use the same pattern we use for wire encryption for messages you want to send: encrypt using the target user’s public key, and then only they can decrypt it with their private key.

The problems are:

You won’t be able to read that message, so if you want to keep a copy of what you sent, you need to save it to your source chain (which you could do, encrypting it with you as the target).
If your target user is not online, even though the host may be online, they don’t have a way to save the message to the receiver’s source chain because the host doesn’t have the private key required to do so.

Holochain doesn’t currently have an ephemeral place to hold data that is waiting for a user to come back online to receive, except possibly with the sender who could occasionally retry sending.

However, if you were an actual Holochain user (not Holo-hosted web user) where the keys reside on the device with the source chain, there are more options… like providing a capabilities grant for senders to put private messages on your chain which you could grant selectively or broadly.

But again… there aren’t easy answers to the hosted scenario. In your example of Signal, you are storing the messages locally on your phone, not accessing them via a web UI. If you use a web UI app where the messages are stored remotely they need to be encrypted as I suggested above.

When we release a light client (or mobile full client) so that you can do app processing and chain storage in the browser or on a phone, there will be more options which behave similarly to apps like Signal. But at this stage, Holo hosting functions more like apps on a web server; they’re just not centralized to any one server.

I hope that helps clarify.

-art

artbrock · February 26, 2020, 5:46am

And to clarify this much older part of the thread…

I also didn’t realize at first how Philip intended Personas (formerly known as HoloVault) to be run. It was his intention that you run your own unique DNA/DHT instance only across your own devices (e.g. phone, laptop, HoloPort) so you can update your identity data no matter which device you’re on, or which chain it was created on. That’s why the data is being published to the DHT: it’s a DHT with only your devices as members.

I like this approach, especially with the conductor services we’re working on for Holochain: your local apps could interact with your Personas app via a conductor API call without even needing to know its actual DNA hash.

-art

ldwm · September 19, 2020, 8:51am

I find this interesting. I’m guessing there is documentation somewhere dealing with the redundancy factor in Holo hosting, but I haven’t found it yet in written form.

How is this redundancy factor configured? Is it decided by the Holo hosting contract? Is it a hard-coded setting?

pauldaoust · October 7, 2020, 5:52pm

The redundancy factor for public data on the DHT is configured per-DNA and works the same on Holo-hosted cells as it does on self-hosted/native Holochain cells. The redundancy factor for source chains is something we’re hard-coding to 5 based on the requirement that all hosts have an uptime of 90%, to result in a cumulative uptime of 99.999%.
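
As a rough check on that arithmetic: if each host is up 90% of the time and hosts go down independently of each other (an assumption; the post doesn't state it), the chance that all five are down at once is 0.1^5 = 0.00001, which leaves 99.999%. A minimal sketch, where the function name `cumulative_uptime` is made up for illustration:

```python
def cumulative_uptime(host_uptime: float, replicas: int) -> float:
    """Chance that at least one of `replicas` hosts is reachable.

    Assumes hosts go down independently of each other.
    """
    if not 0.0 <= host_uptime <= 1.0:
        raise ValueError("host_uptime must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    all_down = (1.0 - host_uptime) ** replicas  # every replica offline at once
    return 1.0 - all_down


# Show how each extra replica adds roughly one more "nine" at 90% uptime.
for replicas in range(1, 6):
    print(replicas, f"{cumulative_uptime(0.90, replicas):.5%}")
```

Correlated outages (hosts on the same power grid or ISP, say) would make the real figure lower than this independent-failure estimate.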

The source chain hosts will engage in a really simple hard-consensus, the details of which escape me right now. I believe leader selection is basically “whichever host the user is currently connected to when a source chain element is written, plus an n second lock to avoid accidental concurrent writes”. Then an element isn’t considered written until three of the five hosts accept it. But don’t quote me on that; it’s been a while since I’ve talked to Alastair about it.
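
To make the "three of the five" rule concrete, here is a toy sketch of a quorum write. It is not Holo's actual protocol: the `Host` class and `write_element` function are hypothetical, and leader selection, the n-second lock, and cleanup after a failed write are all left out.

```python
from dataclasses import dataclass, field

REPLICAS = 5  # source-chain redundancy factor mentioned in the post
QUORUM = 3    # acceptances needed before an element counts as written


@dataclass
class Host:
    """Hypothetical stand-in for one host holding a copy of a source chain."""
    name: str
    online: bool = True
    chain: list = field(default_factory=list)

    def accept(self, element: str) -> bool:
        """Store the element if this host is reachable; report success."""
        if not self.online:
            return False
        self.chain.append(element)
        return True


def write_element(hosts: list, element: str, quorum: int = QUORUM) -> bool:
    """Offer the element to every host; it is written once `quorum` accept."""
    acks = 0
    for host in hosts:
        if host.accept(element):
            acks += 1
    return acks >= quorum


hosts = [Host(f"host-{i}") for i in range(REPLICAS)]
hosts[0].online = False
hosts[1].online = False
print(write_element(hosts, "element-1"))  # three hosts remain, quorum is met
```

With a third host offline the same call would report failure, even though the two remaining hosts have stored the element; a real protocol would need to deal with that partial state.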

pqcdev · December 28, 2020, 1:37am

Is there a way to shard data and randomly distribute the pieces among available hosts, then collect the pieces to form the entire puzzle (so that no host can access the whole file)? I would like to use Hierarchical Shamir Secret Sharing. Please advise, thanks.

artbrock · December 29, 2020, 5:55am

This kind of secret sharding (actually called secret sharing, so I’m going to call it sharing so we don’t confuse it with sharding the data across nodes in the DHT) is fine for rebuilding a private key – a task you do very infrequently (hopefully never), but really doesn’t work well for each entry in a database. There’s a couple of big issues with this approach for data:

Reassembling shared secrets is expensive in terms of data size, speed/responsiveness, network load, and processing load. Take the example of elemental chat and a chat message that says “Hello Friend!” You already have the Holochain overhead of adding a header with timestamp, author, signature, pointer to previous header, etc. Now if you want to add secret sharing, you would have to interact with many hosts to give them their version of the data, or to retrieve their version of the data so you are now subject to many more points of failure and much more complex orchestration.

What started out as a small phrase has now grown to include not just a header, but meta information about the list of share holders, not to mention that their versions of the data are at least the same size as the source data (or larger when you include padding), so we’ve blown up the size of the data by a significant amount.

And since users of hosted apps expect to open any browser on any device to access their data, we can’t count on the client having ANY of the data required for reassembling the shares. So either a) the hosts have to be able to reassemble the message on behalf of web user (which if they can do that means we haven’t actually hidden the data from them, so all this work has created no new privacy), or b) Holo has to centrally track the sharing lists to hide it from hosts which turns our distributed apps into centralized ones.
Not to mention the massive performance delays because of the many network requests required to reassemble the data and processing overhead to reconstruct it. Just try to imagine scrolling in a private chat when it takes 8 to 15 seconds to reassemble every message.

This is not really how Shamir Secret Sharing is intended to be used. The list of share holders is really part of the secret that only you know, but in the case of Holo where we can’t rely on anything to be stored with the web user, where can that list be stored? How can it be maintained as hosts come and go from the network over time?

Which brings us to the second issue, secret sharing is impossible to manage with ephemeral hosts (meaning we don’t know how long they will continue hosting).

Each share of the data is unique, so when a host goes offline you can’t assemble the shares to reproduce the original secret, and Holo can’t just replicate data held by one of the other hosts which is still online to recruit a new host on your behalf.

Also, previously having multiple hosts provided redundancy and data backup, now each holds unique data, so if you want 5x redundancy, you now need at least 25 hosts if you’re sharing it 5 ways.

And as mentioned above, as long as someone other than the web user knows the list of who’s holding the shares, they can just reassemble the data, so this just functions as expensive, insecure data obfuscation rather than introducing any new levels of security or privacy.
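
For readers who haven't seen secret sharing up close, here is a minimal sketch of plain (non-hierarchical) Shamir sharing in pure Python, applied to the "Hello Friend!" example. The field prime, the 3-of-5 threshold, and the names `split`, `combine`, and `hosts_needed` are illustrative assumptions, not anything from Holo or Holochain.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret: int, threshold: int, count: int) -> list:
    """Split `secret` into `count` shares; any `threshold` of them rebuild it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret does not fit in the field")
    if not 1 <= threshold <= count:
        raise ValueError("need 1 <= threshold <= count")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, count + 1)]


def combine(points: list) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def hosts_needed(redundancy: int, share_count: int) -> int:
    """Each share is unique, so every share needs its own set of replicas."""
    return redundancy * share_count


message = b"Hello Friend!"
secret = int.from_bytes(message, "big")
shares = split(secret, threshold=3, count=5)
recovered = combine(shares[:3]).to_bytes(len(message), "big")
print(recovered)
print(hosts_needed(redundancy=5, share_count=5))
```

Each share here is a number as large as the field, so no share is smaller than the 13-byte message. A lost share also can't be replaced by copying another host's share, since each one is a different point on the polynomial. Whoever holds the list of share holders and can fetch enough shares can run `combine`, which is the obfuscation problem described above.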

TL;DR; No. We can’t just shard the data among hosts.

MaidSafe does this, but it functions as an encrypted file system, not a database for running apps, so their hosts don’t have to serve any data in the clear, and you have to install the client to access the key to reassembling the pieces. Holo hosting is serving a completely different purpose and has to be able to serve to any browser, not just a client. If you want higher security, install our client (Holochain) and host the data yourself. You could still encrypt backups and push them to a file storage hApp to accomplish the same effect if you want decentralized storage of those files.

pqcdev · December 29, 2020, 6:29am

Thanks for the detailed response, all that makes a lot of sense.

Especially

and

Very much appreciate the clarification