WASMI vs. WASMER

thedavidmeister · January 23, 2020, 6:21pm

I started a branch that replaces wasmi with wasmer

github.com/holochain/holochain-rust

WIP: 2020 01 19 wasmer

holochain:develop ← holochain:2020-01-19-wasmer

opened 09:15PM - 22 Jan 20 UTC

thedavidmeister

+4426 -9445

## PR summary

### HDK progress

- [x] globals
  - [x] `can_use_globals` ru…ns
- [ ] call
  - shows that there is a bug in the way serialization works for results, requires weird hacking, not sure about ramifications of this
  - ideally i'd like to include a validity check on JsonString that doesn't accept invalid json so i don't have to manually audit this
- [ ] capability
  - **no tests in the hdk tests that i can see**
- [x] commit_entry
  - [x] `can_commit_entry_macro` passes
- [x] debug
  - [x] debug works
- [ ] decrypt
  - **there are no tests for decrypt in hdk**
- [x] emit_signal
  - [x] test_signal passes
- [ ] encrypt
  - **there are no tests for encrypt in the hdk**
- [ ] entry_address
  - [ ] `can_check_app_entry_address`
  - [ ] `can_check_sys_entry_address`
- [x] entry_type_properties
  - [x] `test_get_entry_properties` passes
- [x] get_entry
  - [x] `can_get_entry_ok`
  - [ ] `can_get_entry_bad`
- [ ] get_links
  - [ ] `test_links_with_load`
- [ ] keystore
  - **there are no tests for keystore in the hdk**
- [ ] link_entries
  - [ ] `can_roundtrip_links`
- [ ] property
  - **there are no tests for property in the hdk**
- [x] query
  - `can_check_query`
- [ ] remove_link
  - **there are no tests for remove_link in the hdk**
- [ ] send
  - **there are no tests for send in the hdk (there is one elided by the compiler as broken-tests)**
- [x] sign
  - [x] `test_signing` passes
- [x] sleep
  - [x] smoke test seems to work, returns ok
- [ ] update_entry
  - **there are no tests in the hdk for update_entry**
- [ ] remove_entry
  - **there are no tests in the hdk for remove_entry**
- [ ] show_env
  - [x] `show_env` passes

## testing/benchmarking notes

( if any manual testing or benchmarking was/should be done, add notes and/or screenshots here )

## followups

( any new tickets/concerns that were discovered or created during this work but aren't in scope for review here )

## changelog

- [ ] if this is a code change that effects some consumer (e.g. zome developers) of holochain core, then it has been added to [our between-release changelog](https://github.com/holochain/holochain-rust/blob/develop/CHANGELOG-UNRELEASED.md) with the format

```markdown
- summary of change [PR#1234](https://github.com/holochain/holochain-rust/pull/1234)
```

## documentation

- [ ] this code has been documented according to our [docs checklist](https://hackmd.io/@freesig/Hk9AmKJNS)

i’ll keep trucking through the technicals until i get all the tests passing but i thought i’d open the meta process up for visibility and discussion

pros

starting with the motivation for why we do want to move towards wasmer…

better performance

wasmer promises ~2 orders of magnitude faster execution of wasm logic

The wasmi runtime is interpreted and we measured it to perform between 150–190x native speeds. It was therefore omitted from the charts as the large difference made the graphics difficult to view the smaller runtime values.

how much of this translates to what we are doing (a lot of logic sits in the rust core, outside wasm anyway) remains to be seen, but i’m looking forward to running the profiler once it is executing wasm correctly

simpler codebase

wasmer offers the use of native rust closures and better macros for “importing” rust functions and access to core more generally into wasm functions that happ developers can use

importing functions is what makes it possible for a happ developer to call a “commit entry” function inside wasm and have that run rust functions outside wasm

with wasmer we should be able to more easily capture the execution context that holochain is running at the point that we import all the wasm functions simply by using lexical scoping - wasmi on the other hand is based more on structs and trait implementations.
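as a rough illustration of what "capturing context by lexical scoping" means, here is a sketch in plain rust. it deliberately makes no wasmer or wasmi calls; `Context`, `HostFn` and `build_imports` are made-up names for this example only, and the real import machinery will look different:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for whatever execution context core holds.
struct Context {
    agent_id: String,
}

// Hypothetical host function type: bytes in, bytes out.
type HostFn = Box<dyn Fn(&[u8]) -> Vec<u8>>;

// Build the table of functions a guest may call. Each closure captures its
// own clone of the Arc, so no wrapper struct or trait impl is needed just to
// carry the context around.
fn build_imports(context: Arc<Context>) -> HashMap<&'static str, HostFn> {
    let mut imports: HashMap<&'static str, HostFn> = HashMap::new();

    let ctx = Arc::clone(&context);
    imports.insert(
        "debug",
        Box::new(move |input: &[u8]| {
            let msg = String::from_utf8_lossy(input);
            format!("[{}] {}", ctx.agent_id, msg).into_bytes()
        }),
    );

    let ctx = Arc::clone(&context);
    imports.insert(
        "agent_id",
        Box::new(move |_input: &[u8]| ctx.agent_id.clone().into_bytes()),
    );

    imports
}
```

the point of the sketch is only that each imported function is a closure that already "knows" the context it was created in, rather than a trait method that has to be handed that context through a struct.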

the wasmi code works OK but it pushes us to wrap several abstractions around what we are doing (let a happ call a core function), that make it much harder for new developers who want to contribute to understand what is happening in the code (i expect to delete several thousand lines of code and tests by the time i finish porting to wasmer)

to be fair, some of the simplifications could have been done anyway but it all seems more straightforward with the wasmer tooling

more standard

wasmer is maintained by a dedicated organisation that exists to push wasm into many ecosystems in a standardised, high performance and ergonomic way

they maintain runtimes for go, c, java, rust and c# and a dedicated wasm package repository for wasm-native reusable code

wasmi is maintained by parity, a company dedicated to building blockchain technology (e.g. on ethereum and polkadot)

the focus, funding and ongoing development of wasmi are ultimately driven by the target to have wasm running “on blockchain” (whatever that means long term)

cons

looking at what we would be losing (and if/how this might be relevant)

determinism

the main reason wasmi exists at all is because of “non-determinism” in all other wasm implementations

i haven’t found a clear spec or outline explaining comprehensively all the non determinism wasmi has identified and explicitly addresses that other wasm engines do not, but my research has identified a few high level areas.

deterministic wasm execution return values

this is the most obvious type of non-determinism that people think of, and the wasm specification outlines known sources of non-determinism in any implementation that follows the spec (as far as i know, wasmer follows the spec)

intuitively we can think of this as “1 + 1 always equals 2, right?”

and luckily “1 + 1 = 2” is indeed always true, but the spec does list some sources of nondeterminism:

different features in different wasm versions: this essentially boils down to the same problem as any other breaking API change in the conductor or HDK…
threading (future feature): generally i don’t think it makes sense to include code that leads to the type of non-determinism/concurrency that threads introduce into the places (e.g. validation logic) that non-determinism is most dangerous in holochain, also this is a future problem as sane threading models in wasm aren’t really “a thing” yet
NaN handling is non-deterministic in how the bits of the NaN are handled, which also means that doing things like if maybe_nan > 0 { ... } is probably non-deterministic too: this really needs to be made clear to happ developers and some best practices established, but it seems (probably) entirely manageable with native rust functions like f32::is_sign_positive() or our own equivalents in the HDK
fixed width SIMD has nondeterminism: at this stage i don’t think this impacts anybody, may be a longer term consideration somehow
environment resources can run out: e.g. memory could be used up on one machine where it would not on another… i don’t see wasmi solving this either as the wasm spec allows for up to 4GB of ram to be allocated per linear memory, memory usage is a combination of core and happ planning
any other non-determinism in the language that compiles down to wasm: e.g. something in Rust that is not deterministic is not going to be fixed by wasmi OR wasmer
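to make the NaN point concrete, here is a minimal sketch of the kind of "our own equivalent" helper the HDK could offer. `checked_sign_positive` is a hypothetical name, not an existing HDK function. the two constants are both valid NaN bit patterns that differ only in the sign bit, which is exactly the part an engine is allowed to vary:

```rust
// Two bit patterns that are both (quiet) NaN but differ in the sign bit.
const POSITIVE_NAN_BITS: u32 = 0x7fc0_0000;
const NEGATIVE_NAN_BITS: u32 = 0xffc0_0000;

// Hypothetical helper: report the sign of a float, but refuse to answer for
// NaN, so the sign bit of a NaN can never influence a branch.
fn checked_sign_positive(value: f32) -> Option<bool> {
    if value.is_nan() {
        None
    } else {
        Some(value.is_sign_positive())
    }
}
```

`f32::is_sign_positive()` on its own just reads the sign bit, so calling it directly on a NaN gives `true` for the first pattern and `false` for the second. forcing the caller to handle `None` keeps that difference out of validation logic.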

so it’s not clear to me how much of this wasmi actually solves…

wasmi can’t fix higher level language concerns, it can’t prevent resource exhaustion, changing wasm features, concurrency concerns from threading…

potentially it could define some NaN and SIMD behaviour that is deterministic but it’s not a silver bullet and both of these cases should be manageable in the happ zome layer

deterministic VM etc.

This post explains at length the thinking behind wasmi.

A lot of it boils down to what is needed to safely superimpose WASM on top of a blockchain.

The determinism discussed here talks about complexities from targeting different architectures and the ad-hoc optimisations introduced by JIT compilation.

All of this is a problem because on a blockchain nodes are not able to opt-out of executing malicious code.

So this means that a single “compiler bomb” could bring an entire blockchain to its knees in one nasty black swan event.

V8 and SpiderMonkey are not just theoretically nonlinear: real-world “compiler bombs” (pieces of code that cause the compiler to take an exponentially long amount of time) have been found and there is no reason to believe that even if they are fixed that more will not be found in the future.

In the holochain world I don’t see this as an issue as every happ has an isolated/dedicated DHT/network and every user is free to participate or not participate in running every WASM (zome).

A compiler bomb could certainly be coded into a WASM and exist in the world but it seems impossible to cause users to suddenly start running it, almost by definition.

There may be a concern for delegated node execution (a.k.a. holoports) if users can simply force ports to run things arbitrarily, but this general problem is neither introduced nor exacerbated by the potential for compiler bombs as a happ developer can much more easily write malicious code straight into the WASM and deploy that.

Deterministic execution (gas) cost

Blockchains charge fees for their usage, the only blockchain widely used that seriously considers WASM is the one wasmi was designed for: Ethereum.

Ethereum has the concept of “gas cost” that MUST line up 1:1 with the actual execution cost forwarded to the end-user, at risk of potentially critical security vulnerabilities or scalability problems.

Holo might want metering to be as close as possible to real execution costs for obvious reasons (raising holofuel invoices) but it’s not so critical as in the ETH world (e.g. there are no time-bound global blocks with gas limits to be managed). In practice it seems to me like allowing 100x performance optimisations is far more significant than bean counting CPU cycles, even in the Holo context - Holoports should get faster/better as code and hardware improves over time, not treat CPU as a shared/scarce resource.

If i’m wrong about this, it would seem to be an argument for supporting multiple WASM backends rather than enforcing the lowest common denominator. IMO there is no reason or justification to force holochain conductors run natively by end-users to run 100x slower simply because holoports have some domain-specific metering concern.

cost of change

cost to happ devs

it’s important to understand whether we expect changes to the behaviour of existing WASM code

in short, other than the specific determinism issues outlined above there should not be any differences in WASM execution

both wasmi and wasmer are implementing the same WASM spec

swapping out one for the other 1:1 without changing any underlying core workflows that handle the imported API functions should not change anything from the perspective of the WASM

the main thing that could cause some change that might actually impact an existing WASM is the handling of floats (because integers don’t support NaN there is only an opening for mistakes when floats are used)

one thing to note is that we don’t actually support floats as arguments to/from WASM functions natively because there is no From implementation between floats and JsonString (see json.rs on the develop branch of holochain/holochain-serialization on GitHub)

that automatically mitigates a lot of potential problems as happ developers right now cannot accept or return floats without implementing custom serialization logic for it, so i’d expect that most people are just working with integers at the moment

cost to core devs

the other cost of change is the work of doing the refactor to convert between wasmi/wasmer

realistically though, if the simplifications to the code that i’m expecting/hoping for do materialise, the conversion should pay for itself relatively quickly by making the code easier to work with

the exception to this is if we want to try and support wasmi and wasmer in parallel right now

i don’t feel confident providing an API wrapper (like we have for networking and persistence) that adequately wraps both wasmer and wasmi, while still achieving the goal of code simplification - at least not right now, not in a reasonable timeframe in context of everything else that needs doing

for that reason i’m presenting this as an either/or scenario, if we want to move forward with wasmer i think we will need to drop wasmi for the short-mid term and only re-introduce it if we can show it is critical to resolve a well defined problem (e.g. if we find we need better CPU metering on holoports or something…)

thedavidmeister · January 23, 2020, 6:22pm

CC: @lucksus @artbrock @zippy @maackle @jmday

thedavidmeister · January 24, 2020, 2:32pm

update: wasmer doesn’t seem to support being compiled into wasm itself - https://github.com/wasmerio/wasmer/issues/217

this is something that wasmi does support, and we used this in our implementation of memory management (to calculate pages per bytes etc.)

this means that memory handling will probably change, this could impact any happ developers bypassing the HDK and doing their own memory management with wasmi

the workaround for happ developers would be to copy or import just the wasmi page/bytes calculations into their wasm directly
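for reference, the page/bytes arithmetic itself is small. below is a sketch of what a happ developer could copy into their own wasm, assuming only the facts from the wasm spec that a page is 64 KiB and a 32-bit linear memory tops out at 65,536 pages (4 GiB). the function and constant names are made up for this example and are not the wasmi types:

```rust
// Size of one wasm linear-memory page in bytes (64 KiB, fixed by the spec).
const WASM_PAGE_SIZE: u64 = 65_536;
// Maximum number of pages in a 32-bit linear memory (65,536 * 64 KiB = 4 GiB).
const WASM_MAX_PAGES: u64 = 65_536;

// Number of whole pages needed to hold `bytes` bytes, rounding up.
// Returns None if the request would exceed the 4 GiB limit.
fn pages_for_bytes(bytes: u64) -> Option<u64> {
    let pages = bytes / WASM_PAGE_SIZE + u64::from(bytes % WASM_PAGE_SIZE != 0);
    if pages <= WASM_MAX_PAGES {
        Some(pages)
    } else {
        None
    }
}

// Number of bytes covered by `pages` pages, or None past the 4 GiB limit.
fn bytes_for_pages(pages: u64) -> Option<u64> {
    if pages <= WASM_MAX_PAGES {
        Some(pages * WASM_PAGE_SIZE)
    } else {
        None
    }
}
```

the rounding is written as a division plus a remainder check rather than `(bytes + PAGE - 1) / PAGE` so it cannot overflow for very large inputs.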

pauldaoust · January 24, 2020, 10:01pm

I would suspect that, at this stage, the number of hApp devs this will affect is hovering somewhere around zero

thedavidmeister · January 24, 2020, 11:35pm

for sure, just being clear as i can be as i work through it

marcus · February 20, 2020, 4:53am

I didn’t even know I wasn’t supposed to be using floats.

thedavidmeister · February 21, 2020, 6:55am

@marcus it’s not that you can’t use floats, it’s that you need to be aware that NaN exists when you use floats and that deciding whether NaN is positive or negative is non-deterministic

thedavidmeister · March 2, 2020, 2:08pm

update here, i got the wasmer closures compiling today, so next step is to get the tests working

thedavidmeister · March 8, 2020, 9:17pm

update: tests are generally running (mostly passing, but also a bunch failing due to serialization differences), but i’ve introduced breaking changes to the HDK that will need to be phased in carefully to get this into production

main thing in the wasmer approach is that everything going into and out of wasm must be a data type that implements standard serialization logic, e.g. no String values to be thrown over the fence and expecting the other side to know what to do with it (should it literally be a string? is it serialized data? something from websockets? base64 binary? who knows?)

that means that (for example) validation logic cannot simply return a non-empty string and have that imply failed validation with the string being the message, it needs to return a ValidationResult::Pass or ValidationResult::Fail(String) style enum, that both the host and the guest can agree on at the compiler level (there’s a few other places, like callbacks that get similar treatment)
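a minimal sketch of that shape, leaving out the serialization derive. the enum mirrors the `ValidationResult::Pass` / `ValidationResult::Fail(String)` idea above, while `from_legacy_string` and `describe` are hypothetical helpers for illustration. the "empty string means valid" rule in the conversion is an assumption about the old convention, not something taken from the codebase:

```rust
// Hypothetical shared type that both host and guest compile against.
#[derive(Debug, Clone, PartialEq)]
enum ValidationResult {
    Pass,
    Fail(String),
}

// Assumed old convention: an empty string meant "valid", anything else was
// the failure message. Converting makes that implicit rule explicit in one place.
fn from_legacy_string(legacy: &str) -> ValidationResult {
    if legacy.is_empty() {
        ValidationResult::Pass
    } else {
        ValidationResult::Fail(legacy.to_string())
    }
}

// Host side: the match is exhaustive, so adding a new variant later makes the
// compiler point at every place that needs updating.
fn describe(result: &ValidationResult) -> String {
    match result {
        ValidationResult::Pass => "valid".to_string(),
        ValidationResult::Fail(reason) => format!("invalid: {}", reason),
    }
}
```

with a type like this neither side has to guess what a bare string means, because the only things that can cross the boundary are the variants both sides were compiled against.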

i did some basic benchmarking on how long it takes to “boot up” a wasmer module and run it for a function call. Using the HDK test wasm, which is a 39 MB .wasm file (fairly large), i see 2 seconds for a cold call (needs to compile the .wasm); using the default file cache from wasmer brings it down to 500ms, then adding in a memory cache for the modules (which can be re-used across DNAs that share the same wasm) brings the per-function overhead of loading modules down to about 1ms

given that our conductors are relatively long-running, and that we never change the wasm once we hash it for a hApp, our cache-hit percentage should be really high, like well over 90%, so the 1ms turnaround for a function boot is a reasonable target i think (most of this 1ms is wasmer creating import functions and setting up memory and whatever else it does, so i do see this as some kind of theoretical lower limit unless we try to re-use instances as well, but that could be hard on memory since it is not possible to de-allocate wasm memory pages according to the wasm spec)
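the in-memory layer of that caching can be sketched roughly as below. this is not the actual implementation: `CompiledModule` is a stand-in for whatever the wasm engine returns from compilation, `ModuleCache` and its methods are hypothetical names, and the string key assumes the wasm is already identified by its hash:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

// Stand-in for a compiled module (the expensive thing we want to reuse).
struct CompiledModule {
    byte_len: usize,
}

// Hypothetical cache keyed by the wasm's hash, so DNAs that share the same
// wasm also share one compiled module.
struct ModuleCache {
    modules: Mutex<HashMap<String, Arc<CompiledModule>>>,
    compiles: AtomicUsize,
}

impl ModuleCache {
    fn new() -> Self {
        ModuleCache {
            modules: Mutex::new(HashMap::new()),
            compiles: AtomicUsize::new(0),
        }
    }

    // Return the cached module for this hash, compiling only on a miss.
    fn get_or_compile(&self, wasm_hash: &str, wasm: &[u8]) -> Arc<CompiledModule> {
        let mut modules = self.modules.lock().unwrap();
        if let Some(module) = modules.get(wasm_hash) {
            return Arc::clone(module);
        }
        // Cache miss: this is where the slow compile would happen, once.
        self.compiles.fetch_add(1, Ordering::SeqCst);
        let module = Arc::new(CompiledModule { byte_len: wasm.len() });
        modules.insert(wasm_hash.to_string(), Arc::clone(&module));
        module
    }

    // How many times a compile was actually needed.
    fn compile_count(&self) -> usize {
        self.compiles.load(Ordering::SeqCst)
    }
}
```

because the wasm never changes once it is hashed, entries never need invalidating, which is what makes the high hit rate plausible for a long-running conductor.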

i don’t have numbers on the wasmi version, but with 1ms to boot an instance, wasmer is likely much faster to both boot and run

i also did some basic benchmarking by throwing an old test showing data moving between the guest and host, that wasmi handled very poorly (took several minutes to copy about 1mb of data) at the wasmer setup and it really flew, i see several GB of bytes moving back and forward between the host and guest in a few seconds, and i can put several GB (up to roughly the 4GB limit from the spec) both as input args and output values from wasm function calls (where wasmi tends to die after just a few hundred kb of data at times)

i have been running into some issues around the current setup using strings in some places, custom json in others and default json handling in others… just makes it confusing to get the tests all passing when my wasmer setup basically forces everything to be bytes

i might need to look at updating the serialization layer to be byte-oriented rather than string-oriented, and then the wasmer stuff should be very easy to layer on top of that

also, subjectively the new setup should be much easier to manage going forward, i collapsed the whole thing down into some macros, so adding and maintaining things in the HDK and on the host side is just a matter of lining up the rust types and using the right macros

cc: @pauldaoust @marcus

thedavidmeister · March 8, 2020, 9:24pm

cc: @Connoropolous

thedavidmeister · March 8, 2020, 9:38pm

cc: @guillemcordoba

thedavidmeister · March 26, 2020, 3:54pm

ok so this is basically ready to go as a standalone crate (built on top of the new ‘serialized bytes’ crate)

the readme there explains how it all fits together

the only question is how to roll it out into an HDK as it’s incompatible with the current HDK