Best Practices for Validation

jakintosh · May 8, 2021, 8:52pm

As I’ve gotten deeper into learning how to actually build a Holochain app and consuming all of the information on this forum, one of my main takeaways is that “validation is tricky”. Being new to the agent-centric paradigm but not new to software development, I see this as a flag that reads “Warning: turbulence ahead.” I’m hoping this can be a thread where the community can discuss the best practices for validation such as:

How to think about validation in agent-centric systems
Traps beginners may fall into when building a hApp
Real-world examples of validation problems (and solutions)

The goal is really to help get developers up to speed on knowing what they don’t know, to paint a rich picture of the complexity of the problem, and to give them some tools to navigate it. Though most of this language is beginner-friendly, I hope it doesn’t prevent more specific technical gotchas from being discussed as well.

Connoropolous · May 9, 2021, 3:37pm

Good point @jakintosh… I’ve been reflecting on similar things. I recently wrote a fairly vast set of validation rules for Acorn… (just go here and search for validate https://github.com/h-be/acorn/find/main)

Before writing them, I wrote an abstract version to guide myself:

github.com: h-be/acorn/blob/main/dna/zomes/projects/validation_rules.txt

goal
- create: `user_hash` must be the agent committing,
          `user_edit_hash` should be None
- update: `user_hash` should be the same as author of create (meaning must resolve the original_header_address),
          `user_edit_hash` should be Some, and match the agent committing
- delete: allowed by anyone

edge
- create: `parent_address` can't equal `child_address`, no self-referential edges
          `parent_address` and `child_address` depends on the `Goal`s at those addresses
- update: not allowed
- delete: allowed by anyone

entry_point (structurally same as goal_member)
- create: depends on goal at `goal_address` existing,
          `creator_address` must match the agent committing
- update: not allowed
- delete: allowed by anyone

This file has been truncated.
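
To make the `goal` rules above concrete, here is a minimal sketch in plain Rust. It is not the Acorn code and does not use the HDK: `Outcome` and the string agent keys are stand-ins invented for this example so that it compiles on its own. Only `user_hash` and `user_edit_hash` come from the rules file.

```rust
// Stand-in for the HDK's validation result type (assumption for this sketch).
#[derive(Debug, PartialEq)]
enum Outcome {
    Valid,
    Invalid(String),
}

// Simplified Goal: agent keys are plain strings here, not real AgentPubKeys.
struct Goal {
    user_hash: String,
    user_edit_hash: Option<String>,
}

/// create: `user_hash` must be the committing agent, `user_edit_hash` must be None.
fn validate_create_goal(goal: &Goal, committing_agent: &str) -> Outcome {
    if goal.user_hash.as_str() != committing_agent {
        return Outcome::Invalid("user_hash must be the agent committing".to_string());
    }
    if goal.user_edit_hash.is_some() {
        return Outcome::Invalid("user_edit_hash must be None on create".to_string());
    }
    Outcome::Valid
}

/// update: `user_hash` must equal the author of the original create,
/// and `user_edit_hash` must be Some(committing agent).
fn validate_update_goal(goal: &Goal, original_author: &str, committing_agent: &str) -> Outcome {
    if goal.user_hash.as_str() != original_author {
        return Outcome::Invalid("user_hash must match the author of the create".to_string());
    }
    match &goal.user_edit_hash {
        Some(editor) if editor.as_str() == committing_agent => Outcome::Valid,
        _ => Outcome::Invalid("user_edit_hash must be the agent committing".to_string()),
    }
}
```

In a real zome, `original_author` would have to be resolved from the header at `original_header_address`, which is where the dependency handling discussed further down comes in.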

One thing I regret is putting agent addresses in the entry bodies themselves. I should have stuck to pulling those AgentPubKey addresses back out of the associated headers, which store them anyway. But the UI held assumptions about those AgentPubKeys being on the objects themselves. I would rather have changed the backend so that it pulled the AgentPubKeys back out of the headers, and also added them during ‘writes’, so that I didn’t have to write all the validation rules around them. Without those rules, people could abuse other people’s AgentPubKey values, e.g. “I said you said something you didn’t” or “I said you edited this and you didn’t”.
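
A rough sketch of that alternative, again in plain Rust rather than real HDK types: the stored entry carries no author field, and the object handed to the UI is assembled from the entry plus its header. `Header`, `Comment`, `CommentView`, and `to_view` are hypothetical names made up for this illustration.

```rust
// Stand-in for a header: the author is recorded (and signed) here already.
struct Header {
    author: String,
}

// The stored entry has no author field, so there is nothing to forge or validate.
struct Comment {
    content: String,
}

// What the UI receives: the author is filled in from the header on the way out.
#[derive(Debug, PartialEq)]
struct CommentView {
    content: String,
    author: String,
}

fn to_view(header: &Header, comment: &Comment) -> CommentView {
    CommentView {
        content: comment.content.clone(),
        author: header.author.clone(),
    }
}
```

With this shape, the “I said you said something you didn’t” case cannot be expressed in the entry at all, so no validation rule is needed for it.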

I wrote an article about it (and am still updating it): Unit testing Zomes (especially validation) with MockHDK

Being able to unit test in Rust, instead of trying to do it via tryorama, also helped. That’s because all the Rust types are right there to use, instead of having to construct everything in JS.

The other main thing in my validation rules is ‘dependencies’. We want to express “this Comment on a Goal can’t exist unless it references a Goal that exists”. This type of dependent validation leaves an interesting situation, because in a decentralized system you can’t be sure that something absolutely doesn’t exist, just that it may not be visible to you at this time. For that, Holochain has a system for repeatedly checking for the dependency using a strategy called “exponential backoff”, where it retries at intervals that grow exponentially between checks. This keeps the system from over-spending computing resources on looking for a dependency that really isn’t “out there”. Anyway, it is nice that in validation all you have to do is respond to Holochain with ValidateCallbackResult::UnresolvedDependencies and provide a list of unresolved hashes, and Holochain will take care of the rest. It will even retry your validation function.
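
Here is a minimal sketch of that dependency pattern, with the DHT modeled as a plain `HashSet` of visible hashes so the example runs on its own. `Outcome` only mirrors the shape of the HDK result type, `validate_create_comment` and `backoff_delay_secs` are hypothetical names, and the doubling schedule illustrates exponential backoff in general, not Holochain’s actual retry intervals.

```rust
use std::collections::HashSet;

// Stand-in for the HDK's validation result (assumption for this sketch).
#[derive(Debug, PartialEq)]
enum Outcome {
    Valid,
    UnresolvedDependencies(Vec<String>),
}

struct Comment {
    goal_address: String,
}

/// A Comment is only valid if the Goal it references is visible right now.
/// If it is not visible, we do NOT say "invalid": we report the missing hash
/// and leave it to the caller to retry later.
fn validate_create_comment(comment: &Comment, visible: &HashSet<String>) -> Outcome {
    if visible.contains(&comment.goal_address) {
        Outcome::Valid
    } else {
        Outcome::UnresolvedDependencies(vec![comment.goal_address.clone()])
    }
}

/// Illustrative exponential backoff: the wait doubles with each attempt.
/// The base and the doubling factor are assumptions, not Holochain's values.
fn backoff_delay_secs(base_secs: u64, attempt: u32) -> u64 {
    base_secs.saturating_mul(2u64.saturating_pow(attempt))
}
```

The key design point is the third outcome: “not found yet” is kept distinct from “invalid”, so an entry is not rejected just because its dependency has not reached you.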

So to summarize:

Use of other hashes in entry contents can be tricky. If you reference an AgentPubKey address, it’s good to check that the usage is valid in your app’s context. If you reference another entry address, maybe you just want a simple yes/no on whether it exists, or maybe you want something more. In one case, for example, I wanted to validate that certain fields didn’t change from their original values during an ‘Update’ operation, while allowing others to be changed.
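
The “some fields may change, others may not” check can be sketched like this. It assumes the original entry has already been resolved and is passed in. Apart from `user_hash`, the field names (`created_at`, `content`) are made up for the example.

```rust
#[derive(Clone, Debug)]
struct Goal {
    user_hash: String, // immutable after create
    created_at: u64,   // immutable after create (hypothetical field)
    content: String,   // free to change (hypothetical field)
}

/// Compare the updated entry against the original and reject the update
/// if any field that is meant to be immutable has changed.
fn validate_update_goal(original: &Goal, updated: &Goal) -> Result<(), String> {
    if updated.user_hash != original.user_hash {
        return Err("user_hash must not change on update".to_string());
    }
    if updated.created_at != original.created_at {
        return Err("created_at must not change on update".to_string());
    }
    // `content` is deliberately not compared: it is allowed to change.
    Ok(())
}
```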

I’m sure there are many other primary considerations for app devs; these are just the ones from my recent experience.