Unit testing Zomes (especially validation) with MockHDK

[Update: I have not finished this post. Instead, I’ve set up an example of this in a repository on holochain-open-dev.
I’ve added additional explanations there. It also shows how to run the tests, and how to set up CI to run them.]

Almost any Holochain application, and certainly any that deals in sensitive or shared data, needs peer-to-peer “validation” of data that arrives over the network and that peers are being asked to “hold”. The validation rules in our Holochain “Zomes” are among the most important parts of the code that we write in our applications. This post is meant as a guide to writing validation rules, and to using the relatively new “MockHDK” to “unit test” them so that you know they function as expected.

What is unit testing?

Unit testing is where we call each function that we define from a “test” of that code, passing it a set of arguments and asserting on the expected result. This works because our functions should be written in a way that makes them “deterministic”. They will be “deterministic” if they are “pure functions”, meaning their result depends only on their arguments and they have no “side effects” outside of the function, e.g. altering global variables. We may also have to define multiple tests for a given function if it has “logical branches” in it (any “if” statement, or something logically equivalent to one), to test that each logical branch happens correctly given a set of inputs that should result in that pathway being taken. If we test all these logical branches, in all of our defined functions, then we’ve done what is called “unit testing”, and we have complete “code coverage” or “test coverage”. If we only test some of the functions, or some of the logical branches in those functions, then we have what is considered “partial code coverage”.
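As a minimal sketch of the idea, here is a pure Rust function with three logical branches and one test per branch. The function title_problem and its rules are made up for illustration; they are not taken from any Holochain code.

```rust
/// Hypothetical pure function: returns None when the title is acceptable,
/// or Some(reason) when it is not. No global state is read or written.
pub fn title_problem(title: &str) -> Option<String> {
    if title.trim().is_empty() {
        // Branch 1: empty or whitespace-only titles are rejected.
        return Some("title must not be empty".to_string());
    }
    if title.chars().count() > 50 {
        // Branch 2: over-long titles are rejected.
        return Some("title must be at most 50 characters".to_string());
    }
    // Branch 3: everything else is accepted.
    None
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn rejects_empty_title() {
        assert!(title_problem("   ").is_some());
    }

    #[test]
    fn rejects_long_title() {
        assert!(title_problem(&"x".repeat(51)).is_some());
    }

    #[test]
    fn accepts_ordinary_title() {
        assert_eq!(title_problem("Plant the garden"), None);
    }
}
```

Running cargo test exercises all three branches, which for this one function is complete coverage. Dropping any one of the tests would leave it at partial coverage.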

What are validation functions in Holochain and how do they work?

Validation functions are defined in your Rust/wasm code, and exist as “hooks” that Holochain calls into whenever any source chain “write” or “DhtOp” is being attempted. A “DhtOp” is a “Distributed Hash Table Operation”. It means that you, or someone else, are attempting to make a change to either the “source chain” or “DHT” data stores, which is why that is usually a good time to validate the data. Alongside the built-in ways that Holochain validates operations, the application developer can define customized validation. For different types of operations, there are specific hooks which can be triggered. Since the application developer can also define different “entry types”, these “entry types” are likely to have their own validation rules that differ from each other. We can set up a hook by following the pattern:

validate[_${cudOperation}][_entry][_${entryType}]

We interpret this as a hook which has various definable levels of specificity. We can target validation of ALL operations of ALL entry types, with just validate as our hook name.
We can target validation of any cudOperation, which stands for Create-Update-Delete Operation, by adding it to the name of our validation hook, such as validate_create, validate_update or validate_delete. These will be called specifically for any create operation, update operation, or delete operation, respectively. Thirdly, we can limit our validation to only entries that are of the “App” entry type, meaning that they are defined by you, the application developer, by using validate_${create|update|delete}_entry. Lastly, we can target a specific entry type by adding its “entry id” to the end, such as validate_create_entry_edge.

Interestingly, we can define many or multiple of these hooks, and they will ALL have to pass in order for the operation to pass validation. If a hook is omitted from application code, it will be assumed to be a valid “pass”.
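To make the naming pattern and the “all defined hooks must pass” rule concrete, here is a small model in plain Rust. It is only a sketch of the behaviour described above, not Holochain’s implementation: Outcome, hook_names and run_hooks are names I made up, and the order in which the hooks are tried is an assumption.

```rust
use std::collections::HashMap;

/// Stand-in for the outcome of one hook; not the HDK's ValidateCallbackResult.
#[derive(Debug, Clone, PartialEq)]
pub enum Outcome {
    Valid,
    Invalid(String),
}

/// Hook names that apply to an app entry, from least to most specific.
/// `operation` is "create", "update" or "delete"; `entry_id` is e.g. "edge".
pub fn hook_names(operation: &str, entry_id: &str) -> Vec<String> {
    vec![
        "validate".to_string(),
        format!("validate_{}", operation),
        format!("validate_{}_entry", operation),
        format!("validate_{}_entry_{}", operation, entry_id),
    ]
}

/// Runs every hook that is defined; a hook that is not defined counts as a pass.
/// Returns the first failure, or Valid if nothing failed.
pub fn run_hooks(
    defined: &HashMap<String, fn() -> Outcome>,
    operation: &str,
    entry_id: &str,
) -> Outcome {
    for name in hook_names(operation, entry_id) {
        if let Some(hook) = defined.get(&name) {
            if let Outcome::Invalid(reason) = hook() {
                return Outcome::Invalid(reason);
            }
        }
    }
    Outcome::Valid
}
```

With an empty map every operation passes. Registering a failing validate_update makes every update fail, while creates still pass.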

Example
If I have an entry type defined like this:

#[hdk_entry(id = "edge")]
#[derive(Clone, PartialEq)]
pub struct Edge {
    pub parent_address: WrappedHeaderHash,
    pub child_address: WrappedHeaderHash,
}

Its “entry id” is edge, defined by (id = "edge").

Note that Holochain will only know about your entry type if you pass it in the entry_defs! macro, like:

entry_defs!(
    Edge::entry_def(),
);

Then, I might have a validation hook that has the following function signature:

#[hdk_extern]
fn validate_create_entry_edge(validate_data: ValidateData) -> ExternResult<ValidateCallbackResult>

Notice how we have validate, then _create (so it will only be called during a “create” operation, not update/delete), then _entry_edge which limits this hook to operating on an edge app entry type.

It is necessary to include #[hdk_extern] in order for Holochain to be able to have access to the exposed function.

For a more general validation pattern, you could have this:

#[hdk_extern]
fn validate_create_entry(validate_data: ValidateData) -> ExternResult<ValidateCallbackResult>

ValidateData and ValidateCallbackResult

Regardless of which validation function name you use, and its specificity, the function signature will be the same. It is the contents of the data passed that will differ, as they will be more or less constrained by the contexts in which Holochain calls the hook.

The signature will be:

(validate_data: ValidateData) -> ExternResult<ValidateCallbackResult>

First of all, you can view these two ValidateData and ValidateCallbackResult structs in the HDK documentation here and here, but I will explain them myself.

First we can mention ExternResult and get it out of the way. ExternResult is a classic Rust Result type with Ok and Err. The most important thing to note here is that even if the data you are validating is to “fail” validation, you should not return an Err result for the ExternResult. You will use a variant of the inner ValidateCallbackResult instead. Only produce an Err if your code really experienced something faulty during its execution.

ValidateData consists of two sub-structures, both of which are used for, you guessed it, providing data to validate against. There is the .element property, where an Element is a combination of a Header and an optional Entry. There is also the .validation_package property, an optional additional bundle of data that can be passed from the operation author to the operation validator, which I am not going to talk more about right now. Let’s assume a typical, basic use case: you want to verify some contents of the entry. You might want to convert back from the raw “app entry bytes” that are (maybe) in the .element to the entry type struct of your definition. It can seem a bit hard. My code for performing a conversion from the “Element” (reference) back into an “Edge” looks like this (note this isn’t in the validation callback):

impl TryFrom<&Element> for Edge {
    type Error = Error;
    fn try_from(element: &Element) -> Result<Self, Self::Error> {
        match element.header() {
            // Only creates are allowed for an Edge.
            Header::Create(_) => Ok(match element.entry() {
                ElementEntry::Present(serialized_edge) => match Edge::try_from(serialized_edge) {
                    Ok(edge) => edge,
                    Err(e) => return Err(Error::Wasm(e)),
                },
                _ => return Err(Error::EntryMissing),
            }),
            _ => Err(Error::WrongHeader),
        }
    }
}

Now, in my validation callback, we see this:

#[hdk_extern]
fn validate_create_entry_edge(validate_data: ValidateData) -> ExternResult<ValidateCallbackResult> {
    let proposed_edge = match Edge::try_from(&validate_data.element) {
        Ok(edge) => edge,
        Err(e) => return Ok(ValidateCallbackResult::Invalid(e.to_string())),
    };
    ...

Notice two things:

  1. If Edge::try_from fails, we do not return from the function with an error, but with Ok(ValidateCallbackResult::Invalid(e.to_string())). This matches my earlier comment: if we deem an operation to be invalid, we don’t return Err, but ValidateCallbackResult::Invalid, which should include a human-readable string explaining why this entry or operation is being flagged.
  2. If deserialization from the “SerializedBytes” back to the Edge entry type succeeds (via the inner Edge::try_from(serialized_edge)), then proposed_edge is at least a potentially valid Edge candidate for writing and storing to our slice of the DHT. Think of this stage as just a “sanity check”, although a necessary one, before we get on to the real substance of validation. We don’t return just yet; we proceed with proposed_edge to other parts of the validation.

In summary, there is most likely an “entry” of interest nested under the element, unless you are handling a delete operation rather than an update or create. It is to be found, if it is present (and that’s an important IF), under validate_data.element.entry(), which is itself an enum, and you have to match on ElementEntry::Present to get at the inner entry, which still needs to be deserialized at that point! Yikes…

The other important part of the element is of course the Header. Unlike Entry there is guaranteed to be a Header (phew). There are a lot of different kinds of checks and matching we might wish to do with the Header, but let’s just look at a quick one, for example.

If we had an Entry type like:

#[hdk_entry(id = "entry_point")]
#[derive(Clone, PartialEq)]
pub struct EntryPoint {
    pub creator_address: AgentPubKey,
}

Suppose creator_address can only be validly written with a value equal to the AgentPubKey of the person sharing the data (that is, agents can only write this data about themselves, not one another).
Here is some validation to check the header:

#[hdk_extern]
fn validate_create_entry_entry_point(
    validate_data: ValidateData,
) -> ExternResult<ValidateCallbackResult> {
    // this is like the first example, to check presence and deserialize
    let proposed_entry = match EntryPoint::try_from(&validate_data.element) {
        Ok(entry) => entry,
        Err(e) => return Ok(ValidateCallbackResult::Invalid(e.to_string())),
    };

    // creator_address must match header author
    if proposed_entry.creator_address != validate_data.element.header().author().as_hash().clone() {
        return Ok(ValidateCallbackResult::Invalid("Should not use an AgentPubKey other than your own for creator_address".to_string()));
    }
    Ok(ValidateCallbackResult::Valid)
}

Here we see how relevant it can be to know the AgentPubKey (“public key”) of the operation’s “author”: the actor/agent who triggered this event. It can be accessed as an AgentPubKey via validate_data.element.header().author().as_hash().

Through all this, you’ve also seen how ValidateCallbackResult can be used. We will discuss it specifically now.

When declaring an operation as “valid” we don’t need to give a reason. It has passed any and all validation rules we’ve defined; that’s why it’s valid. We just return Ok(ValidateCallbackResult::Valid) from the function. Chances are this is the last thing you will write, or the last logical branch, since you have to eliminate all the options that make it invalid first.
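One way to make a rule like the creator_address check easy to unit test is to pull the decision out into a small pure helper that the hook can call. The sketch below uses stand-in types so that it runs without the HDK: RuleResult stands in for ValidateCallbackResult, agents are plain strings instead of AgentPubKey, and check_creator is a hypothetical name.

```rust
/// Stand-in for the HDK's ValidateCallbackResult (third variant omitted here).
#[derive(Debug, PartialEq)]
pub enum RuleResult {
    Valid,
    Invalid(String),
}

/// Hypothetical helper: agents are plain strings here instead of AgentPubKey.
pub fn check_creator(creator_address: &str, header_author: &str) -> RuleResult {
    // Rule out the invalid case first...
    if creator_address != header_author {
        return RuleResult::Invalid(
            "Should not use an AgentPubKey other than your own for creator_address".to_string(),
        );
    }
    // ...so that Valid is the last branch left.
    RuleResult::Valid
}
```

Both branches can then be asserted on directly in a test, for example assert_eq!(check_creator("alice", "alice"), RuleResult::Valid), and the same call with two different agents for the Invalid case.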

There is a third variant of ValidateCallbackResult to cover. It’s called ValidateCallbackResult::UnresolvedDependencies. It’s likely the most complex of the three variants.
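As I understand it, this variant is for the case where a rule depends on other data that the validator cannot get hold of yet, so it can say neither “valid” nor “invalid”. That is also where mocking earns its keep, because the test has to control what a lookup returns. The sketch below shows the pattern with stand-in types only: DepResult, the Lookup trait, MockLookup and check_parent are all hypothetical, hashes are plain strings, and the “parent must be a goal” rule is invented. In a real Zome the lookup would be an HDK call, and the mock would be the MockHDK rather than this hand-written one.

```rust
use std::collections::HashMap;

/// Stand-in for ValidateCallbackResult, with hashes modelled as Strings.
#[derive(Debug, PartialEq)]
pub enum DepResult {
    Valid,
    Invalid(String),
    UnresolvedDependencies(Vec<String>),
}

/// Hypothetical seam: the one host call this rule needs.
pub trait Lookup {
    /// Returns the kind of record stored at `hash`, if it can be found.
    fn entry_kind(&self, hash: &str) -> Option<String>;
}

/// Hand-written mock backed by a map of hash -> kind.
pub struct MockLookup(pub HashMap<String, String>);

impl Lookup for MockLookup {
    fn entry_kind(&self, hash: &str) -> Option<String> {
        self.0.get(hash).cloned()
    }
}

/// Invented rule: an edge is valid only if its parent exists and is a "goal".
pub fn check_parent(host: &dyn Lookup, parent_address: &str) -> DepResult {
    match host.entry_kind(parent_address) {
        // We cannot decide yet: report which hash we are waiting for.
        None => DepResult::UnresolvedDependencies(vec![parent_address.to_string()]),
        Some(kind) if kind == "goal" => DepResult::Valid,
        Some(kind) => DepResult::Invalid(format!("parent must be a goal, found {}", kind)),
    }
}
```

A test fills the map with whatever the scenario needs and asserts on each of the three outcomes, without any network or conductor involved.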

View Complete Source Code

There is some work-in-progress source code, with validation rules and unit tests that use mocking, that you can view for a comprehensive look at different rules.

Special For VS Code

I had a bit of a hard time configuring VS Code to fully use the Rust analyzer’s convenient access to the test runners (where it says “Run Test” in the screenshots).

You might see these issues, if you are using VS Code, and Rust analyzer extension.

There is a Rust “feature” that must be enabled.

This was visible to me in the bash script he had set up for executing the tests, but I was struggling to figure out where else to set it to get my VS Code playing nicely.

This was the script:

cargo test -j 2 --manifest-path happ/zomes/points/Cargo.toml --lib --features="mock"

You will need this in your Cargo.toml for one thing:

[features]
default = []
mock = ["hdk/mock"]

But there is also a somewhat hidden setting in Rust analyzer’s preferences where you can toggle on Rust features, so we can use this. Click Add Item.

Type in mock and hit OK.

Then reload your VS Code.

No more problem with MockHdkT not found.
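If you would rather edit the settings file than click through the preferences UI, the equivalent entry, as far as I know, is the rust-analyzer.cargo.features setting. A sketch, assuming a workspace-level .vscode/settings.json (check the setting name against your version of the extension):

```json
{
    "rust-analyzer.cargo.features": ["mock"]
}
```

This only affects what Rust analyzer sees in the editor. The --features="mock" flag is still needed when running cargo test from the command line.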

Extra Reading


@Connoropolous thanks once again for smoothing over the cracks in my work :sweat_smile:

hope you enjoy mocking your validation rules!


@thedavidmeister I’ve gone and done some pretty big updates, more like writing an article on validation and (I’m getting there) how to use the MockHDK and do fixtures and stuff. So if anyone bugs you about it you can point them here. And also, if you can, give it a read and correct me if I’m wrong anywhere!


@pauldaoust have you seen this? It’s not 100% complete: the parts about ‘how to unit test’ still aren’t in, but the parts about validation are, as the first half of this.
If you want to use this in the developer documentation, you can, if there’s a way to have credits?


