Questions about DHT, cryptography, and security

pauldaoust · December 2, 2019, 8:13pm

@raphisee Your example (restoring to an earlier backup that doesn’t contain the latest commit) is a perfect illustration of what I’m worried about. I can’t think of any more examples, just a general unease about things that are supposed to be atomic but can get screwed up by babies unplugging power cables or pouring juice into keyboards.

pauldaoust · December 2, 2019, 9:28pm

@PekkaNikander a bit of disambiguation:

The case in question is about the source chain, which is a data structure with precise validation constraints: for each application, each agent keeps their own journal of writes in a hash chain (effectively a very simple Merkle tree). The only two constraints are that it be unbranched, and that each entry is signed with the private key whose public component is found in entry #2 of the chain. From an application perspective, it’s not a very interesting structure; it’s about as sexy as the write-ahead log (WAL) in a relational database.
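To make the structure concrete, here is a minimal Python sketch of an unbranched, signed hash chain. It is illustrative only and not Holochain's implementation: `append`, `is_valid`, and the entry layout are hypothetical names, and an HMAC with a secret key stands in for a real public-key signature (an assumption made to keep the example within the standard library).

```python
import hashlib
import hmac
import json
from typing import Optional


def entry_hash(entry: dict) -> str:
    """Content address of an entry: SHA-256 over its canonical JSON form."""
    payload = json.dumps(entry, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def sign(key: bytes, prev: Optional[str], content: str) -> str:
    # ASSUMPTION: HMAC is a stand-in for a real public-key signature.
    message = f"{prev}|{content}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append(chain: list, key: bytes, content: str) -> None:
    """Add an entry that points at the hash of the current chain head."""
    prev = entry_hash(chain[-1]) if chain else None
    chain.append({"prev": prev, "content": content, "sig": sign(key, prev, content)})


def is_valid(chain: list, key: bytes) -> bool:
    """Check both constraints: every entry links to its predecessor and is signed."""
    prev = None
    for entry in chain:
        if entry["prev"] != prev:
            return False  # broken link, so the list is not one unbranched chain
        expected = sign(key, entry["prev"], entry["content"])
        if not hmac.compare_digest(entry["sig"], expected):
            return False  # entry was not signed by the chain's key
        prev = entry_hash(entry)
    return True
```

Because each entry commits to the hash of the one before it, changing any earlier entry invalidates everything after it, which is the same property that makes a write-ahead log trustworthy as a record of what happened.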

Why does it need to be unbranched? My guess is that it has something to do with the fact that some applications (currencies, etc) pretty much demand a linear history, and it’d be too hard to reason about whether an agent is in a consistent state if you had to worry about branches. Why isn’t this an application-level constraint then? Not sure, but probably because it doesn’t really hurt anyone to enforce that constraint on every application.

Holochain relies on the CALM theorem for coming to consistency — Merkle trees and DHTs can both be modelled CALMly. However, AFAICT proving the non-existence of something (e.g., an alternate branch on a source chain) is not something that can be done CALMly. Holochain does the next best thing – it makes it hard to keep secrets within a group of agents, thereby increasing the chance that someone will discover the conflicting branches. When this happens, I guess you could say it’s like a CRDT that ‘tombstones’ both branches (marks them both as deleted) and ‘seals’ the source chain as it existed before that point.
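As a rough illustration of why discovering a fork is easy but proving its absence is not, here is a small Python sketch. The function name `find_forks` and the `(header_hash, prev_hash)` pair format are assumptions for the example, not Holochain data structures.

```python
from collections import defaultdict


def find_forks(headers):
    """Return {prev_hash: [competing header hashes]} for every branch point.

    `headers` is an iterable of (header_hash, prev_hash) pairs that this
    node happens to have seen so far, e.g. via gossip.
    """
    children = defaultdict(set)
    for header_hash, prev_hash in headers:
        children[prev_hash].add(header_hash)
    # A fork exists wherever two different headers claim the same parent.
    return {prev: sorted(kids) for prev, kids in children.items() if len(kids) > 1}
```

The result can only grow as more headers are observed: once a fork has been seen it stays seen, but an empty result only means that no fork has been discovered yet.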

So if the individual agents’ source chains are equivalent to a WAL, where does the materialised state (equivalent to SQL’s rows/tables) live? Uhh, not exactly anywhere, because the implementation would be dependent on the application’s needs.

So to make a short story long @PekkaNikander the application author would be responsible for creating a CRDT algorithm (or importing a lib) that would get the semantics they’re looking for. Holochain has built-in support for updating and deleting an entry, and I think their CRDT-ish semantics are “delete wins, otherwise first write wins”. I suspect that, as an un-branch-able history, they’re too basic for most collaborative work and are subject to the problem you bring up. Instead, they seem most useful in cases when an agent wants to update/delete their own entries and therefore doesn’t have to worry about conflicts/coordination. I’m excited about seeing libraries that implement useful CRDTs for collaborative editing applications. So there has been lots of thought on how to implement this functionality on Holochain, but the only code written has been in support of this most basic use case.

PekkaNikander · December 3, 2019, 1:30pm

Just to clarify: Having the full history is also the case with (state-based) CRDTs. At the semantic level, you never edit the history. When you “edit,” you insert into the history a new change that “deletes” or “modifies,” whatever that means at the application level.

Hence, adding (at least state-based) CRDT algorithms even on the top of the source chain should not be that outlandish, I presume.

What I was (probably foolishly) envisioning and talking about is a case where there is a CRDT-based app on top of the source chain, and then someone goes offline for a longish time (hours, days) and wants to do app-level edits. With CRDTs, the edits end up as new entries in the history. Consequently, the data structure will be (for a while) more like a bush, or tangle, lattice, or a DAG if you wish. It won’t be a linear history. There will be events that are semantically parallel.

The tricky part is when that someone comes back online. Then those app-level edits — i.e. the parallel history — need to be folded back to the rest of the history. That is where the CRDT “magic” helps. At the CRDT level you can always merge such “parallel” histories. By definition, the CRDT just guarantees you that. Otherwise it is not a CRDT.
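A minimal Python sketch of that merge guarantee, using the simplest state-based CRDT (a grow-only set of history events, merged by set union). The event names and the `merge` helper are made up for illustration.

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """State-based merge for a grow-only set: the union of both histories."""
    return a | b


# Two replicas share events e1 and e2, then diverge while one is offline.
online = frozenset({"e1", "e2", "e4"})
offline = frozenset({"e1", "e2", "e3"})

# Folding the parallel history back in never fails and never loses an event.
merged = merge(online, offline)
```

Union is commutative, associative, and idempotent, so it does not matter in which order, or how many times, the replicas exchange state. Whether `e3` and `e4` conflict in any meaningful sense is a question the CRDT leaves to the application.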

The difficulty that I spoke about is at the app level. There, thinking of git merge conflicts helps. When there are app-level conflicts on top of a CRDT, you need to do the equivalent of such a merge. Sometimes you have to rely on humans for that.

But, in most cases you can define app-level semantics to take care of such merging. For example, you can compare a git merge, where a person has to do the resolution, to an editor that displays the conflicting edits as alternative versions of the text.

PekkaNikander · December 3, 2019, 1:50pm

I do understand this, I think. But this is also what I was (perhaps foolishly) questioning. Given the existence of CRDTs, why couldn’t branching in a CRDT-DAG manner be supported, even in the source chain, whenever allowed by an app?

Maybe there are good reasons for the current design. Or perhaps nobody has properly thought about this.

There are no such “delete wins” semantics in state-based CRDTs. The app level has to decide. At the CRDT level, the data structure simply preserves the whole branching history, leaving it to the app level to decide when two edits “conflict.” At the CRDT level, there simply aren’t such conflicts. In the worst case, such “conflicts” just create histories that are “complex.” (Or, that is at least how people defined the thing in 2015. Maybe the acknowledged wisdom has changed since then.)

Digression on currencies

I may digress, but when it comes to currencies, it is BTW not at all clear that all currencies need linear histories. If you have an IoU-based currency with well-known credit line caps, the worst that can happen is someone going into debt N times their credit line, where N is the number of parallel histories one can create in practice. As long as the credit line caps are small enough and the parties are either known to each other, bound to official juridically binding identities, or have reputation that they consider valuable, the risks can be managed.

pauldaoust · December 3, 2019, 4:55pm

I do wonder if this requirement of unbranching source chains is a consequence of Holochain’s initial use case of modelling currencies, which do admittedly benefit from a nice simple linear history even if they don’t absolutely require it. I’ve often thought about suggesting that it could be an app-level constraint rather than system-level.

If you remove some of the constructs we have right now (links, update history, source chains + headers) you reveal some underlying primitives that are really quite expressive and interesting. Essentially it boils down to content-addressable storage + metadata (which itself is just another content-addressable storage, albeit attached to an individual entry). Some metadata should be allowed to cancel out other metadata (e.g., apps that need unbranching headers should never allow two pieces of “next header” metadata on a header entry); in those cases you want to manage conflicts with either a coordination protocol or CRDT-like semantics, opting for the latter whenever possible.

With these primitives, you could pretty much build any of the higher level stuff, plus more. The ‘unbranched source chain’ case could just be a drop-in library.
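A rough Python sketch of those two primitives, purely to illustrate the idea. The `Store` class, its method names, and the `exclusive` flag are hypothetical and not part of any Holochain API; a real system would resolve a conflict with a coordination protocol or CRDT-like semantics rather than a simple rejection.

```python
import hashlib


class Store:
    """Content-addressable entries plus per-entry metadata (illustrative only)."""

    def __init__(self):
        self.entries = {}   # address -> content
        self.metadata = {}  # address -> {key -> set of values}

    def put(self, content: bytes) -> str:
        """Store content under the hash of the content itself."""
        address = hashlib.sha256(content).hexdigest()
        self.entries[address] = content
        return address

    def attach(self, address: str, key: str, value: str, exclusive: bool = False) -> bool:
        """Attach metadata to an entry.

        With exclusive=True, a second, different value for the same key is
        refused, modelling "never allow two pieces of 'next header' metadata".
        """
        values = self.metadata.setdefault(address, {}).setdefault(key, set())
        if exclusive and values and value not in values:
            return False
        values.add(value)
        return True
```

With `exclusive=True` on a "next" key you get an unbranched chain; without it, the same store happily represents links, update histories, or a branching DAG.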

I’d love to bring these things up with the core team, but I feel like I ought to wait — I think they’ve settled on a ‘minimum usable/comfortable toolset’ for app devs and are pushing to support the alpha release of the flagship apps.

Ah, well I obviously have a lot to learn… Hafta confess I don’t understand state-based CRDTs as well as I understand operation-based CRDTs. I was picturing a slightly more complex version of a 2P-set or ‘tombstone’ set, which combines two G-sets in which the latter set accumulates removals and takes precedence over additions.
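For reference, the 2P-set described here can be sketched in a few lines of Python. This is the textbook structure, not anything Holochain ships; the class and method names are just for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TwoPhaseSet:
    """Two grow-only sets: additions, and removals (tombstones) that win."""

    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)

    def add(self, item) -> None:
        self.added.add(item)

    def remove(self, item) -> None:
        # Only items that have been added can be tombstoned.
        if item in self.added:
            self.removed.add(item)

    def __contains__(self, item) -> bool:
        return item in self.added and item not in self.removed

    def merge(self, other: "TwoPhaseSet") -> "TwoPhaseSet":
        # Merging is a union of each grow-only set, so it always succeeds.
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)
```

Once an item is in `removed` it can never come back, which is the "removal takes precedence" behaviour: a concurrent re-add on another replica is overridden after the merge.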

Thanks for your digression on currencies. I like the idea of having a per-branch credit cap that’s the actual credit cap ÷ the practical number of branches. Another place where branches may not matter is a LETS-style mutual credit currency which has no cap. (Although in practice, people often make decisions about whether to transact with someone based on their current balance, so you’d still want some way of discovering all the branches.) Any ideas on how to determine N in practice?
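The arithmetic behind the per-branch cap is simple enough to write down. In this small Python sketch the function names and the numbers are illustrative, and N (`branches`) is treated as a known input even though determining it in practice is exactly the open question.

```python
def worst_case_debt(credit_cap: float, branches: int) -> float:
    """If each of N parallel histories can be run up to the cap independently."""
    return credit_cap * branches


def per_branch_cap(credit_cap: float, branches: int) -> float:
    """Cap to enforce on each branch so the total stays within the real cap."""
    return credit_cap / branches
```

For example, with a cap of 100 and 4 practical branches, an unprotected system risks a debt of 400, while a per-branch cap of 25 keeps the worst case at the intended 100.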

pauldaoust · December 3, 2019, 5:12pm

Oh @PekkaNikander I missed your first response. I’ve responded to it on the local-first software thread.

Wladoo · December 8, 2019, 9:17am

@pauldaoust, @artbrock

Could this algorithm https://github.com/Kkevsterrr/geneva be used?

For good: to improve network throughput, as protection against blocking?
For evil: with the opposite logic, to “enhance” the reputation of nodes for malicious acts?
Excuse my English.

pauldaoust · December 9, 2019, 6:18pm

Interesting initiative @Wladoo ! Sounds like Geneva tests a state agent’s censorship by doing censor-able things and seeing what gets through to the public Internet? Sounds risky but also quite adaptive.

This topic is getting really long and I can’t recall what part of the conversation this suggestion is in relation to. Could you remind us? Thanks!

Sol · December 22, 2019, 7:57pm

@pauldaoust questions time

What if malicious actors want to sabotage a holochain network by deliberately deploying lots of nodes to store data until a significant load is accumulated, and then all of the malicious nodes simultaneously go offline? Would this deal a huge blow to the network’s data availability? Are there any ways to mitigate this?

Sol · December 22, 2019, 8:55pm

@pauldaoust @artbrock
2) I can see that holochain’s agent-centric approach + no global state + no consensus + global discovery of local (and correct) states with redundancy makes sense for a lot of use cases. But how about defi use cases (which are proven and growing)? I can see that for a use case like simple payment (holo fuel), holochain should have no problem handling that. But I am not sure about more complicated defi use cases (which make sense as a natural fit for blockchain’s global ledger + smart contracts).

I can see the point of a global ledger (held by all nodes) that tracks/stores info on, say, all ETH staked with Maker’s smart contract / all DAI that is minted / the liquidation ratio of each and every CDP / outflow of DAI to other defi apps / inflow of DAI to be staked and lent out, etc.

Is holochain, with its agent-centric design, able to handle more complex defi use cases efficiently?

Going back to holofuel: with its agent-centric approach (where every node stores first and foremost its own HF balance + a subset of other nodes’ HF info BUT not the entire HF record), is it easy for a “central authority” (perhaps the app provider) to accurately keep track of the “global status” of the holofuel total supply (which could go up and down because of burning and minting) at any one point in time? I would imagine that the HF app would need to “talk to/query” EVERY NODE to get “individual data” and collate it in order to get the “global data”?
The concept of staking/collateralizing assets in a smart contract is very common in blockchain systems. Is this something that is doable in holochain’s agent-centric approach? If yes, can you describe how it can be done with results similar to what blockchain achieves, but without sacrificing security + performance?

4.2) I am just trying to think through the process. Let’s say, for the holochain equivalent of Maker DAO, I want to stake my HF to open a “CDP account” and create a “holochain stablecoin”. Where do I send my HF: to “a smart contract address” or, say, to a “counterparty” to stake it? Or do I just interact with the application logic that resides on my node, and it recognizes that part of the HF in my balance is now “locked” to mint a “stablecoin”? And then I send this info to a subset of nodes to validate my HF balance, and also my “stablecoin” balance?

4.3) “How” and “who” will ensure that if my “staked HF” goes below the collateralization ratio, my CDP will get liquidated and pay a penalty fee? If, let’s say, a subset of nodes stores part of my CDP info, will they be responsible for executing the liquidation process on my CDP account?

Or is it my “own application logic” on my node that does that? If yes, will there be other nodes that ensure I follow the process correctly?

I think it would be a great topic if holochain could write a blog post on the defi use cases that holochain can enable and why it is even better than blockchain for this purpose. That is, IF holochain is suitable for defi… Looking forward to your thoughts!

artbrock · December 22, 2019, 10:19pm

TL;DR: Not likely a problem, but there are a couple of sticking points.

Sounds like a strange approach to attacking a system: “Provide lots of free reliable hosting on lots of machines, and then run away hoping to leave some gaping holes in the system.”

As always, a disclaimer: We recommend that publicly joinable apps implement some membranes/barriers to rampant Sybil production. With those in place you don’t even start to have this kind of threat scenario.

Keep in mind, it doesn’t matter if you try to focus your Sybil node addresses in a particular neighborhood to take over ALL data storage for that neighborhood. All the honest nodes will still be randomly distributed throughout the hash space, and it only takes one honest node to be holding the data for it to remain intact after you take your nodes offline, and then it can gossip it to other remaining nodes. When you take into account that some honest nodes may even volunteer for larger coverage spans in the DHT than the minimum required, it may be very hard to find any segments of the address space with no honest nodes storing data.

But there’s an honest node with the original data – the author’s node still has it in their source chain. I don’t think we’ve implemented this yet, but we intend to have some kind of periodic (weekly?) process where a node goes through its chain and tries to retrieve those entries from the DHT. If the DHT fails to find them, then the node simply republishes them.
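A minimal Python sketch of what such a self-healing pass might look like. Since the post says this is not implemented yet, everything here is hypothetical: the function name, the `(address, entry)` chain format, and the plain dict standing in for the DHT.

```python
def republish_missing(source_chain, dht):
    """Walk the author's own chain and re-publish anything the DHT has lost.

    source_chain: list of (address, entry) pairs held locally by the author.
    dht: a dict-like stand-in for the DHT's get/publish operations.
    Returns the addresses that had to be republished.
    """
    republished = []
    for address, entry in source_chain:
        if dht.get(address) is None:   # the DHT failed to find the entry
            dht[address] = entry       # so the author publishes it again
            republished.append(address)
    return republished
```

Run periodically, a pass like this means the data is only truly lost if the author's own node disappears as well.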

So let’s also consider how big your attack scenario would need to be to work. If an app has a minimal DHT redundancy factor requiring 5 copies of all data online at all times (and take into account that’s 5 ONLINE, meaning if nodes in a neighborhood have only 50% average uptime, then there will be 10 copies of data in that neighborhood), you’d have to control over 80% of the nodes in the network to control 4 of 5 nodes for a DHT entry. And given the regularity of nodes coming on and offline, and therefore even more nodes holding it than the minimum number, realistically you’d have to control over 90% of the network to get down to a single honest node holding the DHT data. Because until you replace ALL honest nodes with the data, your withdrawal from the network isn’t a threat to data integrity.

Sticking point: Of course this relies on the statistical randomness of the hashing algorithm. It is possible that by fluke you might control some pieces of data at a lower percentage, and that you couldn’t control others until having over 99% of the network. But again, the self-healing of republishing from the author and re-propagating from cached data reduces your ability to wreak any major damage.

With a higher redundancy setting, the percentage of control required rises asymptotically toward 100%. So you’d have to spend a whole lot of computing power to implement an attack that would likely result in no real loss, since authors can always republish their data.
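To get a feel for the numbers, here is a back-of-the-envelope Python sketch. It uses a deliberately simplified model that is an assumption of this example and not the calculation above: each of the `redundancy` holders of an entry is independently an attacker node with probability equal to the attacker's share of the network.

```python
def p_all_replicas_controlled(attacker_fraction: float, redundancy: int) -> float:
    """Probability that every holder of a given entry is an attacker node."""
    return attacker_fraction ** redundancy


def fraction_needed(target_probability: float, redundancy: int) -> float:
    """Attacker share needed to fully capture an entry with the given probability."""
    return target_probability ** (1.0 / redundancy)
```

Under this toy model, an attacker with 80% of the nodes captures every one of 5 replicas of a given entry only about a third of the time (0.8 to the power 5 is roughly 0.33), and the share needed for even odds climbs from about 87% at 5 replicas to about 93% at 10, approaching 100% as redundancy grows.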

Also, you frequently seem to be talking about currencies on Holochain. A currency transaction has TWO countersigning authors, both of whom have the original data in their source chain and can republish it. So there’s native redundancy before transactions are even published to the DHT.

Of course, PoW Blockchains have even bigger problems with this kind of attack. If a huge mining pool goes offline, then another huge mining pool might suddenly have over 50% of hashing power and full control of the chain.

Sol · December 23, 2019, 6:20am

Thanks @artbrock, I mostly have no more questions regarding security on holochain. This thread has a very comprehensive discussion of it.

I would like your thoughts on the other defi questions above. Thanks in advance; your insights (along with @pauldaoust’s) are always educational!

PekkaNikander · January 6, 2020, 3:06pm

I may be missing something, but AFAICS many current financing use cases are implicitly based on the assumption that the used monetary units (the “money”) must necessarily have their value based on artificial scarcity, the same way that the official fiat currencies need artificial scarcity.

However, if one uses direct debt, instead of (artificial scarcity based) money, as the medium of exchange, the situation is different.

When creating a debt relationship, one can use anything as the unit of account. We can agree that I owe you 10$, 10€, 10 beers, or 10 hours of work. The value of such debt relationship depends on trust between us. It may also depend on some external enforcement, if there is covering jurisdiction and suitably managed, reliable IDs.

The trick is to extend such direct debt relationships into something more liquid. There are several attempts to create community currencies based on such systems, e.g. Trustlines and Transitive, perhaps Basis (now defunct) and Xank, and, of course, the original Ripple idea (before their pivot). Historically, many monetary systems were created in this way, rather than having bullion as collateral.

For such a system, the unit of account could be similar to Lietaer’s Terra, using a basket of commodities.

With such pure, debt-based IOU money, where the unit of account is made clearly distinct from the medium of exchange, the so-called “money supply” problem no longer exists. The amount of outstanding debt in circulation depends on the need, trust, and the system’s ability to cancel debts, and does not affect the value of the money, i.e. the book value of the debt relationships.

Now, w.r.t. this discussion, the important point is that for such a money, you don’t need to have a global state, as “double spending” is a much smaller problem. If someone “spends twice” their credit line, they just end up being more indebted. As long as they cannot exit the system cheaply (e.g. due to their real life relationships being tied to their credit lines, or some other mechanisms), that is not nice but still manageable.

Hence, AFAIK, the ability to handle more complex decentralised finance use cases mainly depends on what kind of media of exchange you assume.

As may be clear from the discussion above, in an IOU-based monetary system the equivalent of such “collateralisation” is bonding your trustline balances. This is similar to surety bonds in traditional lending.

As far as I can see, such IOU based systems are fundamentally more reliable than anything based on artificial scarcity. While algorithmic scarcity is better than fiat in the sense that it is less vulnerable to political manoeuvring, it is still based on faith. Any such money has value only as long as people think it has value. In a way, it is brittle. It may lose its value very fast, similar to bank runs.

An IOU based currency is more likely to lose its liquidity than its value. The outstanding debt relationships are still there even if people refuse to start new ones. As long as they are based on real life trusted relationships (and/or are backed up by a working juridical system) they still have value even if the currency itself stopped working. Furthermore, if their value is bound to a commodity-based unit of account instead of something intrinsic or something fiat, the system ceasing to work doesn’t crash their value.

But I digress. And I’m not holochain. I don’t even know if IOU currencies apply to holochain.

pauldaoust · January 6, 2020, 11:16pm

@PekkaNikander I think you’re right on the money here (ha ha, didn’t even realise my own pun). Holochain’s primary use case was never artificially scarce assets; it’s always been intended for that other sort of money that goes by names like IOU-based, promise, direct debt, barter, or mutual credit.

@Sol Mutual credit can be thought of as “When I spend money, I go into debt to the market at large, not to any one person”. But perhaps a more useful description is “When I spend money, I am actually creating my own money, backed by my promise to deliver value back to the economy in the future.” I got this perspective from currency designer @mwl who has a lot of experience with this sort of money. He can correct me if he likes, but my understanding is that my promises become a liquid currency because you might not want the value I provide (maybe I sell eggs and you’re a vegan) but you can hand off my promises to someone else who might want eggs. By this time my credits are decoupled from me, but eventually things should balance each other out if people want the services/products that others are offering. And people can use my history of spending/income to gauge how ‘valuable’ my promises really are and decide to not accept my money in the first place.

This isn’t such an unusual form of money. It’s how banks create money up to the limit defined by the central bank; it’s just that they have a legal monopoly on debt-based money issuance which is not great for the rest of us. It’s also how businesses that extend credit to customers work, although they operate with closed books so customers can’t trade their debts among each other. Closest in practice is business-to-business barter networks.

The money in this sort of system is still scarce, but it’s not artificially scarce — it’s created at time of need and it’s limited by real things like a person’s ability to deliver on all the promises they’ve made, as evidenced by things like their current balance, reputation, spending/earning history, etc. But because you are the one creating your money, only you need to keep track of the money you create. And when you can keep manufacturing new money, there’s less of an incentive to ‘fork’ your source chain and forge new money.
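A toy Python sketch of that idea: money comes into existence at the moment of spending, as a negative balance for the spender and a matching positive balance for the receiver. The single shared `MutualCreditLedger` object is a simplification for illustration only (on Holochain each agent would hold their own records), and the class name, method names, and credit limit are all made up.

```python
from collections import defaultdict


class MutualCreditLedger:
    """Balances start at zero; paying creates credit rather than moving a stock."""

    def __init__(self, credit_limit: int):
        self.credit_limit = credit_limit
        self.balances = defaultdict(int)

    def pay(self, payer: str, payee: str, amount: int) -> bool:
        if amount <= 0:
            raise ValueError("amount must be positive")
        # The payer may go negative, but only as far as the credit limit.
        if self.balances[payer] - amount < -self.credit_limit:
            return False
        self.balances[payer] -= amount
        self.balances[payee] += amount
        return True
```

Every payment adds and subtracts the same amount, so all balances always sum to zero: there is no money supply to track, only each participant's outstanding promises.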

However, even in a no-credit-limit system you could hide your history and make it look like you’ve spent less than you really have. That’s what peer validation is for.

I don’t understand finance well enough, especially DeFi, to comment on things like staking/collateralisation/bonding, but it sounds like @PekkaNikander knows what he’s talking about.

mwl · January 7, 2020, 3:52am

Thanks for the nudge here Paul - I’ve been away from this conversation for a few months. By my read, you and @PekkaNikander are thinking much as I do, with a few slight diversions and missing some parts I consider essential. Some quick comments:

Both “debt” and “credit” are words that assert that payment has not yet been made, that there is something outstanding and still owing between agents. And IOU has no application in this context. There’s no value in this confusion, nor is it necessary. I’m increasingly drawn to “mutual commitment” and “common issue”.

Paradox - everyone can issue and use their own money when/if they realise they can’t have their own. An issuer, in a negative balance, has commitment to serve, NOT expectation of or claim on value. Others can have your money, and you can have others’ - but you can’t have your own. See printing money and links therein for more on what any business can do if they do it right.

Agency (and REA accounting) may give rise to currency systems, but systems that collect agents are supposing carts before horses. https://www.abelard.org/e-f-russell.php shows the elements of what’s necessary and sufficient. But we need to think fractal to prosper.

Pattern is all. What patterns of agent actions are supported by any particular currency? Patterns that persist (in mutual commitment systems) do so because they enable the transactions that are already latent in that community of agents, the money enables what is already there, if it is. But who sees patterns?

If there is hope it lies in the business model, and if there is result it will come by business plans that make sense. Ethical considerations cannot be evaded.

https://kumu.io/mwl/50-slides-of-beer - is a map of some possibilities in progress.

I’m happy to work with anyone who sees something of what this is about and wants to see more, but I’m very reluctant to use this or any other forum to explain (and/or) argue with anyone else. But if you want more information, and more outcome, please do contact me.

Sol · January 7, 2020, 4:44am

Hey @PekkaNikander @pauldaoust @mwl thanks for all your insights. The concept of mutual credits is still something new to me. I need to digest it.

Also, it would be great if holochain could start a series of blog posts on their alternative currency approach. I think it deserves wider explanation to the holochain community. Not many truly understand it compared to traditional finance.

pauldaoust · January 7, 2020, 5:21am

Agreed. Art, Zippy, Jean, and Ferananda have already written a lot about mutual credit pre- and post-ICO but there is more to write, I’m sure. Much paradigm shifting to be done.

sidsthalekar · January 8, 2020, 1:40am

Great conversation @Sol @pauldaoust @mwl @PekkaNikander
For those interested we’ve been pushing conversations on ‘reputation backed issuances’, where credit issuance limits fluctuate on the basis of reputation held by the agent.
This has interesting implications, because people can derive a sense of sufficiency in the reputation itself, without always having to ‘monetise’ their worth.

Sol · January 8, 2020, 2:34am

On first reading, I feel issuance by reputation can be intangible and subjective. Are there any objective measures and processes to “grade” a person’s reputation in the context of interaction in holochain apps? Do you have any articles detailing more comprehensive information on this?

sidsthalekar · January 8, 2020, 2:43am

Yes, reputation is highly subjective, which is why it needs agent-centric environments. Our libraries are being built in a way that fosters highly contextual reputation designs.

Basically, each app/collective will have the freedom to design reputation as per their context, and also decide what monetary potential it holds.

So if you create a community of tech-entrepreneurs and reward them with reputation, it may have a high monetary footprint (based on your judgement). Someone else may create a book-readers-club, which may have reputation scores with zero monetary footprint.

Existing models for reputation based money supply exist in South Asia and the Middle East in traditional entrepreneurial communities. But to be blatantly honest, we understand this is a whole new frontier, so we’re taking things slowly.

We’re holding these discussions as part of Reputation Labs, working with apps that are using our libraries. Some of them are supply-chain communities, crowdfunding platforms (Investor Engine), incubator networks, government-led projects, social networks, etc. We may consider making some of these discussions public.