Questions about DHT, cryptography, and security

jmday · August 27, 2019, 6:18pm

Since any author could have been issued a warrant before + there is a possibility that an author could somehow delete a previous transaction from their chain - would validation by public nodes include a mandatory step to 1st check whether the author was flagged as malicious before + check with other peer nodes that hold the author’s previous transactions? They definitely cannot check the author’s chain alone to confirm whether the current transaction is valid or not.

Deleting a previous transaction from a source chain would lead to a source chain fork. Fork detection and the resulting warrant will allow hApp developers to define how those situations should be handled. In the case of a currency, I should think the immune response actions should be rather harsh.

Sol · August 28, 2019, 5:06am

@jmday @pauldaoust

If a previous entry was deleted, what would be the process for fork detection? And would it be easily detected and known to public peers?
If a transaction is published by the author + public nodes who validated/stored the author’s past transactions are required to be updated --> will it be part of the process that other public nodes (who are selected to do validation) check not just the author’s source chain but also with the validators who validated the last transaction?
If a previous transaction was deleted by the author, and the author published the latest transaction --> does that mean the author does not need to publish to the validators who validated/stored the author’s last transaction?

Sol · August 28, 2019, 5:13am

I really like your description of how robust the holo fuel transaction/validation process will be. While it is definitely robust + not easy to deceive every validator in every neighbourhood, would that mean the “finality” of a holo fuel transaction will take longer + be more expensive? Or based on holochain’s network/agent-centric approach (already tested?), is it still very fast and cheap?

Sol · August 28, 2019, 5:15am

If there is a network partition of a DHT into, say, 2 networks, are there any potential attack vectors?
If a malicious actor is in 1 of the DHT partitions, it is not possible for him to attack the other DHT partition anyway, right?
When the 2 partitioned DHT networks regain connection, how do they sync and manage conflicting transactions, if any?

pauldaoust · August 28, 2019, 8:24pm

@Sol

We’ve been talking about doing it one of two ways:

Each validator who is storing the previous header is supposed to get a notification from the chain’s author when a new header is published. The notification is then stored as a link to the new header. If a validator receives a second notification, they know the chain has forked.
The same, but the notifications go to the validators that are holding the agent’s ID.
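
A minimal sketch of the first option, assuming each validator simply keeps a set of "next header" links per previous header. The `Validator` class, the `notify` method, and the header names are hypothetical and are not Holochain APIs; the point is only that a second, different successor for the same header reveals a fork.

```python
from dataclasses import dataclass, field


@dataclass
class Validator:
    """Stores, per previous header, the 'next header' links it was notified about."""
    next_links: dict[str, set[str]] = field(default_factory=dict)

    def notify(self, prev_header: str, new_header: str) -> bool:
        """Record a notification and return True if it reveals a fork."""
        successors = self.next_links.setdefault(prev_header, set())
        successors.add(new_header)
        # Two different headers claiming the same parent means the chain forked.
        return len(successors) > 1


validator = Validator()
first = validator.notify("header-41", "header-42a")
repeat = validator.notify("header-41", "header-42a")  # a resend, not a fork
second = validator.notify("header-41", "header-42b")  # conflicting successor
print(first, repeat, second)  # False False True
```

What happens after a fork is detected (for example, issuing a warrant) is left to the app, as discussed earlier in the thread.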

It sounds like you’re digging into the “trust the last transaction” design, but I’m not quite sure. Could you explain more?

Depends on what you’re making a comparison to. ‘Finality’ is always a probabilistic thing in eventually consistent systems, so it depends on the level of trust vs risk involved in the transaction. Finality would probably mean “a quorum of validation signatures has been collected for the entry”, where ‘quorum’ = the resilience factor. Validation can be parallelised, but you’d still need to consider the time involved in making a network connection to each validator and waiting for their response.
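
To make the latency point concrete, here is a rough sketch that compares collecting a quorum in parallel with collecting it one validator at a time. The function name and the latency figures are invented for illustration, and the sequential case optimistically assumes the fastest validators are the ones contacted.

```python
def time_to_quorum(latencies_ms: list[float], resilience_factor: int) -> tuple[float, float]:
    """Return (parallel, sequential) time to collect `resilience_factor` signatures."""
    fastest = sorted(latencies_ms)[:resilience_factor]
    if len(fastest) < resilience_factor:
        raise ValueError("not enough validators to reach quorum")
    # In parallel we wait only for the slowest of the fastest R responses;
    # sequentially we pay for each round trip in turn.
    return max(fastest), sum(fastest)


parallel_ms, sequential_ms = time_to_quorum(
    [120.0, 80.0, 300.0, 95.0, 150.0], resilience_factor=3
)
print(parallel_ms, sequential_ms)  # 120.0 295.0
```

So with parallel validation, time to finality is bounded by the slowest validator in the quorum rather than by the sum of all round trips.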

Yes, a few possible attack vectors:

Create a partition involving a bunch of malicious agents. ‘Launder’ a bad transaction so far back in the transaction history and collect a quorum of validation signatures from those malicious agents. When the partition reconnects, no honest agents bother to revalidate anything because they see all the signatures. Possible solutions:
- Have agents revalidate entire transaction histories, either randomly or for high-value transactions. This works like a security spot-check.
- Have Holochain measure the ‘health’ of validation signatures at the subconscious level. If a validator notices that the pattern of signatures is irregular and doesn’t match their view of the DHT, they can choose to revalidate, which could trigger a cascading revalidation that causes all the entries from the partition, honest or invalid, to be pulled into the honest partition.
An agent engineers two partitions, each containing one of their targets. They do a transaction with each of their targets and get the transaction validated by each of the peers. Then they allow the partitions to rejoin, and their transactions with both targets are invalidated. But they now have the goods they paid for, so they can throw away their now-blacklisted identity. This would be very hard to engineer: essentially they would have to mount an ‘eclipse attack’ against both targets, which partitions them both into their own separate networks, away from the main network. Possible solutions:
- Before accepting a payment, detect whether you’re in an eclipse by pinging ‘beacon nodes’ that are known to be part of a ‘good’ part of the DHT.

Syncing would likely happen as people on either side of the partition start interacting with each other and noticing that their counterparty’s data was previously held in a partition. The nodes will ask validators that were once part of their own partition for the data, those validators will notice that they don’t have it, and will ask their previously partitioned peers for it.

As for conflicts, that depends on the type of conflict and on how the application should handle it. A big subject for another time!

pauldaoust · September 3, 2019, 6:35pm

A post was split to a new topic: rrDHT vs Kademlia

Sol · August 31, 2019, 6:12am

What i mean is that besides checking the author’s source chain, is it also a “mandatory” that validating nodes need to check other peer’s nodes who had store the author’s last transaction? Because the author could always delete his past transaction entry.

Yes, i understand what you meant. Holo fuel transactions are meant to be robust, and therefore involve more steps - which is always good to know. You also mentioned that validation can be parallelised. Nevertheless, has any test or measurement already been done on holo fuel transaction/validation with a “certain level of resilience factor”? And if yes - how are the results so far?
Not sure if you had answered my earlier question. But i can rephrase it: If a DHT network is partitioned into 2 and a malicious node is in 1 of the partitions, that means he is unable to launch or coordinate an attack on the other partition, right? Or is an attack still possible?

pauldaoust · September 3, 2019, 8:44pm

What i mean is that besides checking the author’s source chain, is it also a “mandatory” that validating nodes need to check other peer’s nodes who had store the author’s last transaction? Because the author could always delete his past transaction entry.

Depends on the app. Each app will specify a ‘validation package’ — a bunch of entries that the new entry depends on. All of those entries must also be valid: accessible on the DHT, sufficient number of validation signatures, no warrants. If any of these criteria are not true, we don’t even bother to validate the new entry.

Currently the validation package can consist of:

nothing,
all the previous headers leading up to this entry’s header, or
all the previous headers and public entries leading up to this entry’s header.

In the future it’ll also include dependent entries — in the case of HoloFuel that would mean an acceptance requires a valid proposal, and a confirmation requires a valid acceptance (which already required a valid proposal).
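
As a rough sketch of the pre-check described above (every entry in the validation package must be accessible on the DHT, carry enough validation signatures, and have no warrants), something like the following could run before the app's own validation rules. The `DependencyStatus` type and its field names are made up for illustration and do not correspond to any Holochain data structure.

```python
from dataclasses import dataclass


@dataclass
class DependencyStatus:
    on_dht: bool                 # the entry can be fetched from the DHT
    validation_signatures: int   # signatures collected so far
    warrants: int                # warrants issued against the entry


def dependencies_ok(package: list[DependencyStatus], required_signatures: int) -> bool:
    """True only if every dependency meets all three criteria."""
    return all(
        dep.on_dht
        and dep.validation_signatures >= required_signatures
        and dep.warrants == 0
        for dep in package
    )


clean = [DependencyStatus(True, 5, 0), DependencyStatus(True, 6, 0)]
warranted = clean + [DependencyStatus(True, 5, 1)]
print(dependencies_ok(clean, 5), dependencies_ok(warranted, 5))  # True False
```

An empty package (the "nothing" option) passes trivially, which matches the idea that such apps validate the new entry on its own.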

Right now we’re working out details on how all of this should be designed, so validation is only done locally and fork detection isn’t implemented. But once the lower-level details are finalised, we’ll get the behaviour you’re talking about for free, as I understand.

I have to apologise; I don’t know all the details. I want to say that the work on HoloFuel has been on proving correctness of the algorithm itself, so that when the validating DHT stuff is finalised, HF will ‘just work’. But maybe things are further along than I know?

My idea of an attack vector for that was the second one — about an agent engineering two partitions and existing in both. Nothing to stop them from doing that and avoiding detection in both partitions, because neither of the honest agents in the two partitions can see each other. Some of Bitcoin’s double spend attacks are dependent on this sort of thing. But for an agent trying to join an already healthy network and engineer a partition, it would be a nearly impossible battle. They would almost have had to control the network from the beginning. It’d be like a troublemaker trying to get into your circle of friends and family and cause them to split into warring factions that never talk to each other — possible, but not everybody is going to be perfectly loyal to one side or the other, so information would always leak. And that’s what you want — you want some sort of partition breaker.

premjeet · September 11, 2019, 7:09pm

Is it possible to find the nodes having certain “trait” at the low level in the network? Or, you have to search it at “holovault/persona”… For example, as an app provider I want the list of all nodes who can host my data for not more than X holo.

Sol · September 18, 2019, 6:28pm

@pauldaoust some new questions

Should a holochain node/host become popular - how do they prevent ddos attacks?
any ways to prevent external hacking of a node’s chain? or that is not possible but is at least detectable?

Sol · September 25, 2019, 3:32am

@pauldaoust looking forward to your advice

pauldaoust · September 30, 2019, 9:06pm

This will be a feature of Holo Host, but not Holochain proper. About all you can tell about a Holochain node, if they’re not also a Holo hosting device, is how big a chunk of the DHT they’re willing to store.

With Holochain, that’ll be their responsibility — if they’re popular, there’s probably a good reason, such as being the node for an important individual. So it’s probably in their interest to set up good protection — like an OS-level firewall. Certain kinds of attacks, like gossip flood attacks, can probably be stopped in Holochain itself.

With Holo Host, the story changes, because hosts can’t control whether they’re hosting for an important person. They might not have the system in place to protect themselves. On the public side, we’ll be using CloudFlare’s DDoS protection. But I don’t know if the hosts themselves can be accessed on a public IP. I haven’t talked to the HoloPort devs in a while, but I understand they were going to be setting up good firewall rules. Don’t quote me on any of this, cuz I’m just guessing

Because each chain entry is signed by its author, there is only one way to hack their chain: you’ve got to steal their private key. This is a vulnerability for sure — it’s hard to protect against leaving your device on the bus, getting infected with malware, or falling victim to the $5 wrench attack — but it’s a vulnerability shared by all authentication systems, not just cryptographic ones.

So external hacking is impossible. The best that a malicious agent could do is stage an Eclipse attack or refuse to store data that they’re supposed to store — and all that would do is give an incomplete source chain, not a corrupted one.

Sol · October 20, 2019, 2:12pm

Hi @pauldaoust another question on the topic of randomness which affects security/validation.

I have been reading 2 great articles on randomness used by Ethereum and Near Protocol to choose the validators to produce blocks. The objective is the random process cannot be biasable nor predictable by malicious agent.

I understand that in holochain, public validators get chosen to validate entries based on how “similar” their public key hash is to the entry’s hash.

I would like to know how robust this random validator selection process is. Is it in any way still possible for malicious agents trying to get selected to bias or predict it?

2 reference articles below:

pauldaoust · October 21, 2019, 6:53pm

Let’s first take a look at why a malicious agent might want to influence randomness. In blockchain, if you can influence a seed such that you could be selected next as a validator, you could do all sorts of nasty things to that block. But with Holochain, validators tend to not be interested in the data they receive, because it doesn’t involve their economic interests — e.g., in a currency app, their account balance isn’t involved in the same chain of transactions as the transactions they’re validating.

But in a Holochain app you as a data producer may be interested in influencing the randomness that chooses a validator for you. If you want to get bad data to pass as good data, you need to find a validator to collude with you.

Depending on the data and the size of the DHT, this might not be very hard. If you control the entire contents of the data structure, you can totally influence the validator selection. A transaction might have ‘memo’ fields, with which you can ‘mine’ the hash into a bad neighbourhood that’s under your control. There wouldn’t be much value in introducing a verifiable-delay function here, because it’s meant to slow down a group of people, not a single person.
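
Here is a toy illustration of why a free-form field gives the author that influence. It assumes SHA-256, takes the first four bytes of the digest as a location on a 32-bit ring, and picks the agents whose key locations are closest; all of these choices and all the names are assumptions for the sketch, not Holochain's actual addressing scheme.

```python
import hashlib

RING_BITS = 32  # assumed size of the address ring


def location(data: bytes) -> int:
    """Map arbitrary bytes to a point on the ring via their SHA-256 digest."""
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two points, wrapping around the ring."""
    size = 1 << RING_BITS
    d = (a - b) % size
    return min(d, size - d)


def closest_validators(entry: bytes, agent_keys: list[bytes], count: int) -> list[bytes]:
    """Pick the `count` agents whose key locations are nearest the entry's."""
    target = location(entry)
    return sorted(agent_keys, key=lambda k: ring_distance(location(k), target))[:count]


agents = [f"agent-{i}".encode() for i in range(50)]
for memo in ("rent", "rent ", "rent!"):
    entry = f"pay 10 to bob; memo={memo}".encode()
    # A tiny change to the memo moves the entry to a different part of the ring.
    print(repr(memo), closest_validators(entry, agents, 3))
```

The author controls the entry bytes and therefore the target location, but not which agents have keys near that location.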

But there are some nice statistical things that make it kinda worthless to bother with this:

You can’t control who’s in the neighbourhood. Our next DHT design will collapse all addresses into a 32-bit space. “Wait,” you say, “that makes it even easier to mine an entry into a dishonest neighbourhood!” True, but it also makes it much more difficult to keep honest nodes out of that neighbourhood.
Most apps that require a high level of validation confidence will be transactional apps, where a number of entries are produced for each transaction (I think HoloFuel produces up to five). As mentioned in a previous forum thread (can’t find it), that exponentially increases the difficulty of engineering enough bad neighbourhoods. Both the initiator and the receiver would have to be in collusion with each other for this to even work, and by the end of a five-step transaction flow (like with HoloFuel), you have to be in control of five neighbourhoods, which is 2⁵ (32×) more difficult than controlling just one. And you have to do it for every bad transaction!
If only one party is corrupt, they don’t have any influence over the data their counterparty produces in response. This means that the best they can do is create an invalid first entry, get it validated by corrupt peers, and then have the honest counterparty immediately catch the fraud.
I don’t know the exact plans, but I believe that in the future we’ll have automatic detection and revalidation of entries that were created in a partition, so this makes it more difficult for colluding nodes to create a partition for the purpose of ensuring that all their validating peers are corrupt.

GreatDragonian · November 16, 2019, 3:42pm

Hi @pauldaoust! Have seen you around in the currency design course. Its great!! I’ve felt its like i’m learning some deep secrets of the Universe that are mostly hidden in our culture! What do you think?

I want to ask a question, because I got really excited with the Holoports announcement and started to think again about the technical parts of Holochain:

Imagine Alice wants to publish an invalid entry for certain hApp, which requires 10 validations, so she follows this procedure:

Simulate the entry creation process to obtain the hashes of both the header and the body.
Create two neighborhoods of 10 sybil nodes each, that will respectively validate the body and the header.
Submit the body and header respectively to one node in each neighborhood and get the entry validated.

Note that:

In step 2, if there exist, say, 1000 nodes in the DHT, the address space is split in 1000, so it’s easy to construct the first node of each validating neighborhood: probability 1/1000 via brute force to make it look reasonable, in the sense that at most 1 node that should contain the transaction (the nearest honest node) may possibly not contain it. Also, the probability of finding the second one is also low (1/1001 because now the address space is divided in 1001 parts) and the third one (1/1002), etc… so in total I just expect to need around 1000+1001+…+1009 = 10,045 brute force searches to find each neighborhood (20,090 in total for the 2 neighborhoods).
Also in step 2, precomputing the hashes and afterwards creating the validating neighborhoods is much easier than creating the neighborhoods first and then mining a nonce that will make the validating entry fall into both neighborhoods (in this case, the complexity is indeed multiplicative, not additive).
In step 3, the sybil nodes may have hacked their addresses tables, so they only propagate entries to the (just created) peers.
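
The arithmetic in the first note can be checked with a few lines, under the same simplifying assumption that placing the k-th Sybil takes about (existing nodes + k) attempts on average. The function name is just for illustration.

```python
def expected_searches(existing_nodes: int, sybils: int) -> int:
    """Expected brute-force attempts to place `sybils` nodes one after another."""
    return sum(existing_nodes + k for k in range(sybils))


per_neighbourhood = expected_searches(1000, 10)
total = 2 * per_neighbourhood  # one neighbourhood for the body, one for the header
print(per_neighbourhood, total)  # 10045 20090
```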

How could Alice’s attack be stopped and detected? Thanks for your answer!

GreatDragonian · November 18, 2019, 3:51pm

Update:

I believe I had my doubt solved at the Telegram group!

As far as I understand, Holochain uses the concept of storage arcs for each node, in which every node keeps up to date regarding transactions on a certain address space.

That way, if ONLY the sybil nodes validate a transaction, and there exists an honest node whose storage arc contains the transaction’s hash (so it supposedly should have validated it as well), 2 things can happen:

The transaction gets invalidated when the honest node receives it and all involved nodes are blacklisted.
The honest node never knew of the transaction (because the sybil nodes hacked their address tables). In that case, the DHT is in an inconsistent state regarding that transaction (because only some nodes know about it), so it’s easily detectable and proper measures may be taken (for example, a full recursive check of the ledgers involved).
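
The second case can be pictured with a small sketch: given each node's storage arc, list the nodes that should hold an entry but do not. The 32-bit ring, the `Node` fields, and the example positions are all assumptions made for illustration, not Holochain's real arc representation.

```python
from dataclasses import dataclass, field

RING = 1 << 32  # assumed size of the address ring


@dataclass
class Node:
    name: str
    centre: int       # where the node sits on the ring
    half_width: int   # how far its storage arc extends either side
    held: set[int] = field(default_factory=set)  # entry locations it stores

    def covers(self, loc: int) -> bool:
        """True if `loc` falls inside this node's storage arc."""
        d = (loc - self.centre) % RING
        return min(d, RING - d) <= self.half_width


def missing_holders(nodes: list[Node], loc: int) -> list[str]:
    """Nodes whose arc covers `loc` but which never received the entry."""
    return [n.name for n in nodes if n.covers(loc) and loc not in n.held]


entry_loc = 1_000
nodes = [
    Node("sybil-1", 900, 500, {entry_loc}),
    Node("sybil-2", 1_100, 500, {entry_loc}),
    Node("honest", 1_300, 500),              # covers the entry, never got it
    Node("far-away", 2_000_000_000, 500),    # not responsible for it
]
print(missing_holders(nodes, entry_loc))  # ['honest']
```

A non-empty result is the inconsistency described above and could be the trigger for a deeper check.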

I believe that’s why uptime is key to the proper functioning of high security apps on Holochain, and in the specific case of HoloFuel, this is achieved by means of the holoports.

Also, I am starting to think that holochain does not use global consensus, but a more natural, “holographic” consensus, that depends on the scale. Sounds awesome!! For example, we know time is relative in the Universe… but here on Earth we have a local consensus about it.

What do you think?

sidsthalekar · November 19, 2019, 2:24am

@pauldaoust - just curious if you guys have explored reputation backed validation circles? I think the reputation economy would help that tremendously. (I can share more)

Sol · November 19, 2019, 5:12pm

@pauldaoust another question!

Every time an entry is created, its header includes a time-stamp.

The question is, how is this time-stamp derived? From the clock of the author’s machine?
If yes, does it factor in the timezone where the author is based? As 1200 for someone based in Japan is different to 1200 for someone based in Singapore.
How do the counterparty or validating nodes determine the accuracy of the time-stamp of the entry relative to their own time?
What if the author tries to “fake” the time-stamp of the entry?
If the author tries to make 2 conflicting entries at the same time to double spend - how are the counterparty or validating nodes able to objectively know the correct sequence of the 2 entries?

Basically, my end question is - for an agent-centric network like holochain with no global ledger and no “global notion of time”, how are nodes able to objectively and easily tell the proper sequence of events? I think this is very important for security. Look forward to hearing your views.

pauldaoust · November 19, 2019, 6:17pm

@sidsthalekar that would be pretty cool, and it’s essentially what Secure Scuttlebutt does — propagation happens through friend networks, and only valid data gets propagated. So far though I’ve only heard about validation via random hash-based selection. An old article by Art, though (pre-Holochain), talks about using trusted notaries rather than random peers. You could define ‘trusted’ as ‘I trust this person’s validation result because of their reputation graph’, although for the foreseeable future it’d be an application-level validation, not something at Holochain’s subconscious layer.

@Sol

Yes, it’s derived from the machine’s clock.
I believe it does factor in the TZ; it’s an ISO 8601 timestamp, which includes TZ information. (Of course, the accuracy depends on what TZ the owner of the machine has chosen.)
Validating nodes typically don’t trust the timestamp as reliable (though I suspect they’d want to see an increasing series of timestamps). If your app’s validation rules require accurate timestamping, there are two options:
1. Set up trusted nodes that do signed timestamping. This is a centralisation point.
2. Use ‘network time’, which is the average of the timestamps of the first R validation signatures (where R = the resilience factor of the app). The expectation is that a random selection of validators will be likely to have system clocks that are within a range of correctness. This isn’t available at validation time though — only afterwards, for use in validating subsequent entries. And right now it’s just an imaginary feature.
See 3.
It depends on the implementation of the currency. There are a few scenarios:
1. A currency that finalises transactions synchronously through node-to-node messaging. The entry is created and passed to the recipient, but not written to any source chain until both the initiator and the recipient sign it. At that point it’s written to both of their source chains. I’m not an expert on this and would need to think about it more, but it would involve both parties committing to ‘lock’ their chains until the transaction were finalised, and giving the recipient time to check that the initiator hadn’t completed some alternate transaction at the same time. The part I haven’t quite figured out yet is how the recipient would discover the alternate transaction.
2. A currency in which each transaction step is written to the DHT. Holochain’s design is closer to this, and it has two advantages: you can safely initiate multiple transactions, and it’s easy to detect double spends — all the recipient needs to do is make sure the initiator’s entry exists on the DHT before confirming it.
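
Two of the timestamp points above can be sketched briefly: ISO 8601 timestamps with an offset compare as instants regardless of timezone, and the (so far imaginary) ‘network time’ would be the average of the first R validation-signature timestamps. The `network_time` function is hypothetical and only shows the averaging.

```python
from datetime import datetime, timezone


def network_time(signature_timestamps: list[str], resilience_factor: int) -> datetime:
    """Average the first R validation-signature timestamps (ISO 8601 with offset)."""
    first_r = signature_timestamps[:resilience_factor]
    if len(first_r) < resilience_factor:
        raise ValueError("fewer than R validation signatures")
    epoch_seconds = [datetime.fromisoformat(ts).timestamp() for ts in first_r]
    return datetime.fromtimestamp(sum(epoch_seconds) / len(epoch_seconds), tz=timezone.utc)


# 12:00 in Japan (UTC+9) and 11:00 in Singapore (UTC+8) are the same instant.
tokyo = datetime.fromisoformat("2019-11-19T12:00:00+09:00")
singapore = datetime.fromisoformat("2019-11-19T11:00:00+08:00")
print(tokyo == singapore)  # True

stamps = [
    "2019-11-19T12:00:00+09:00",
    "2019-11-19T11:00:10+08:00",
    "2019-11-19T03:00:20+00:00",
]
print(network_time(stamps, 3).isoformat())  # 2019-11-19T03:00:10+00:00
```

A plain mean is sensitive to one wildly wrong clock; a median would be more robust, but the mean is what the description above calls for.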

Generally a global notion of time isn’t necessary or even reliable when you’re trying to construct a correct sequence of events. It’s better (and easier) to simply prove that B happened after A. This is all blockchain does: its global block clock simply establishes happened-after relationships between transactions.

The difference with Holochain is that it recognises that you don’t need to construct a complete global order of events; all you need to do is construct an order of the events that you care about. Sometimes that means parallel trees of history:

Let’s say you’re Hector, the purple guy at the very end. Do you care when Charlie sent money to Eve in relation to when Alice and Bob sent money to Diane? No; all you care about is that each bit of coin you’re being given has a valid trail of transfers behind it, all the way back to the original money creation events. In other words, you care that Alice had money to give before she gave it to Diane, and likewise when Diane gave to Frank and so forth. Timestamps and total ordering don’t matter, but logical ordering of the stuff you care about does.
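
A toy version of that check, with no timestamps anywhere: each transfer lists the transfers that funded it, and Hector only walks the trail behind the coin he is being offered. The `Transfer` type, the `"MINT"` marker, the Frank-to-Hector step, and Mallory are assumptions added for the sketch; amounts and cycle protection are left out to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str                    # "MINT" marks a money-creation event
    recipient: str
    inputs: tuple[str, ...] = ()   # ids of the transfers that funded this one


def has_valid_trail(transfer_id: str, ledger: dict[str, Transfer]) -> bool:
    """True if every funding path leads back to a creation event."""
    transfer = ledger.get(transfer_id)
    if transfer is None:
        return False
    if transfer.sender == "MINT":
        return not transfer.inputs
    if not transfer.inputs:
        return False  # spending money that came from nowhere
    return all(
        parent_id in ledger
        and ledger[parent_id].recipient == transfer.sender  # sender received it first
        and has_valid_trail(parent_id, ledger)
        for parent_id in transfer.inputs
    )


ledger = {
    "m1": Transfer("MINT", "Alice"),
    "m2": Transfer("MINT", "Bob"),
    "t1": Transfer("Alice", "Diane", ("m1",)),
    "t2": Transfer("Bob", "Diane", ("m2",)),
    "t3": Transfer("Diane", "Frank", ("t1", "t2")),
    "t4": Transfer("Frank", "Hector", ("t3",)),
    "bad": Transfer("Mallory", "Hector", ("t1",)),  # Mallory never received t1
}
print(has_valid_trail("t4", ledger), has_valid_trail("bad", ledger))  # True False
```

Charlie's payment to Eve never appears in the walk, which is the point: only the history behind the coin in question is examined.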

Here are some more thoughts from the FAQ: How are timestamps and ordered timelines of events achieved on Holochain?

pauldaoust · November 19, 2019, 6:35pm

@GreatDragonian I missed your earlier post on this thread. I like the possibilities you’re exploring, and the consequences of those possibilities. I never thought of creating the entry first and then post-mining the Sybils to validate it. Some thoughts:

Yep, storage arcs say “I’m taking responsibility for this section of the DHT” so the evil Sybils are required to propagate the invalid data to the honest nodes in their neighbourhood. But you’re also right that the evil Sybils are under no obligation to actually share their data with the honest ones. They can hack their peer tables, or just simply be selective about what they gossip to their peers. FWIU here’s how it will look: I, an honest node, look at evil Alice’s source chain and say “hm, I wonder what my peers think of each of Alice’s entries.” So I’ll ask the DHT for the validation certificates of the entries on Alice’s chain. I’m going to go to the peers that I assume to be trustworthy, which Alice and her Sybils have no control over. The ones in the neighbourhoods of the invalid entry and its header will say “sorry, ain’t got that”. I can decide to either reject the entry, or do a deep validation of Alice’s chain and the people she’s transacted with, etc.
Unlike with PoW/PoS blockchains, Holochain doesn’t have a built-in mechanism for reducing the impact of Sybils. For things like currencies, app creators should probably design membership validation rules that connect a public key to a real human ID somehow, and limit the number of agents that a human can generate.