Problem
Not all historical data is interesting, but a Holochain DHT never deletes an entry. This can cause the DHT’s storage requirements to increase over time and overburden people’s devices.
Solution
Create child DHTs for temporary content storage, and allow them to be created and destroyed ad hoc. Offload short-lived content from the more persistent DHT to these child DHTs, and reference that content by address. When people lose interest in the data, they leave the DHT. When the last person leaves, the data disappears.
Implementation
Create a DNA whose only purpose is to store ephemeral data of the kinds that your more persistent DHT needs. This DNA can be used as a template for as many short-lived DHTs as are needed. Each separate DHT is created by making a new DNA from this template and changing one insignificant parameter, such as the UUID or a special value in the properties section of the DNA. The persistent DHT understands how to reference data in DHTs created from this DNA, using a scheme such as (dna_uuid, address_hash) for foreign addresses.
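As an illustration of the (dna_uuid, address_hash) scheme, here is a minimal sketch of a foreign address type. The ForeignAddress name and the "uuid/hash" string encoding are assumptions made for this example; they are not part of Holochain.

```python
from typing import NamedTuple


class ForeignAddress(NamedTuple):
    """Points at an entry that lives in a different DHT from the reference itself."""

    dna_uuid: str      # identifies which throwaway DHT holds the entry
    address_hash: str  # the entry's address inside that DHT

    def encode(self) -> str:
        # Hypothetical string form "<dna_uuid>/<address_hash>", chosen for this sketch.
        return f"{self.dna_uuid}/{self.address_hash}"

    @classmethod
    def decode(cls, text: str) -> "ForeignAddress":
        # Split on the first "/" and reject strings with a missing half.
        dna_uuid, sep, address_hash = text.partition("/")
        if not sep or not dna_uuid or not address_hash:
            raise ValueError(f"not a foreign address: {text!r}")
        return cls(dna_uuid, address_hash)
```

The persistent DHT would store only the encoded pair, so the content itself never has to live there.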
Currently (as of holochain v0.0.30-alpha6) there’s no way for an agent’s persistent DNA instance to talk directly to their throwaway DNA instances. This is because:
- A DNA instance can’t bridge to multiple child DNA instances unless it knows exactly how many it needs and what their names are, which thwarts our need for ad-hoc DHT creation.
- A DNA instance can’t add bridges after it’s been created.
This means the front-end has to take responsibility for storing and retrieving data in the throwaway DHTs on behalf of the persistent DNA instance.
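The front-end's role as go-between could look roughly like the sketch below. Everything named here is hypothetical: the call function stands in for whatever mechanism your front-end uses to call zome functions, and create_content and create_reference are invented zome function names. A small in-memory stand-in replaces the conductor so the flow can be followed end to end.

```python
from typing import Callable, Dict, Tuple

# Hypothetical shape of "call a zome function on a DNA instance":
# (instance_id, function_name, payload) -> address of the created entry
ZomeCall = Callable[[str, str, dict], str]


def store_ephemeral(call: ZomeCall, persistent_id: str, throwaway_id: str,
                    dna_uuid: str, content: dict) -> Tuple[str, str]:
    """Store content in a throwaway DHT, then record a reference in the persistent one."""
    # 1. The bulky content goes to the throwaway DHT, which returns its address.
    address_hash = call(throwaway_id, "create_content", content)
    # 2. The persistent DHT stores only the small (dna_uuid, address_hash) pair.
    reference = {"dna_uuid": dna_uuid, "address_hash": address_hash}
    reference_address = call(persistent_id, "create_reference", reference)
    return address_hash, reference_address


def make_fake_conductor() -> Tuple[ZomeCall, Dict[str, tuple]]:
    """In-memory stand-in for a conductor; it only records what was called."""
    store: Dict[str, tuple] = {}

    def call(instance_id: str, fn_name: str, payload: dict) -> str:
        address = f"addr{len(store)}"
        store[address] = (instance_id, fn_name, payload)
        return address

    return call, store
```

Because both calls pass through the front-end, nothing stops a buggy or malicious client from writing a reference to content that was never stored.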
Introduce a mechanism for recognising when a DHT can get thrown away. For instance, if an entity is marked as deleted in the persistent DHT, it can tell the throwaway DHT to mark it deleted as well. If you keep track of what data exists and what data has been deleted in the throwaway DHT, you can ‘garbage collect’ the DHT (instruct the conductor to remove the instance and delete the DNA) once all tracked entries are marked as deleted.
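The bookkeeping for this kind of garbage collection can be quite small. The sketch below assumes the front-end (or some other tracker) records each entry it writes to a throwaway DHT; the ThrowawayDhtTracker class and its method names are invented for illustration, and actually removing the instance is left to the conductor.

```python
class ThrowawayDhtTracker:
    """Tracks which entries in one throwaway DHT are still live."""

    def __init__(self) -> None:
        self._live = set()      # addresses written and not yet marked deleted
        self._deleted = set()   # addresses that have been marked deleted

    def track(self, address: str) -> None:
        """Record that an entry was written to the throwaway DHT."""
        self._live.add(address)

    def mark_deleted(self, address: str) -> None:
        """Record that an entry was marked deleted; unknown addresses are an error."""
        if address not in self._live and address not in self._deleted:
            raise KeyError(f"untracked address: {address}")
        self._live.discard(address)
        self._deleted.add(address)

    def can_collect(self) -> bool:
        """True once at least one entry was tracked and every tracked entry is deleted."""
        return bool(self._deleted) and not self._live
```

A DHT that has never held any tracked entries is deliberately not reported as collectable here, so a freshly created DHT is not removed before it is used.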
It’s easy to create a new DHT with the same rules as an existing DHT by changing an insignificant detail in the DNA, such as the UUID or a value in the properties section. This ‘forks’ the DNA. You can do this in three ways:
- If you’re a developer, you can pass a new properties JSON object to hc package using the -p flag.
- If you’re an end-user, you can set the uuid property in the dnas config section for a DNA, creating many DNAs from the same file.
- The reference conductor’s admin API function admin/dna/create_from_file lets you install a DNA file many times over, every time with a new UUID and/or set of properties. The front-end can call this function behind the scenes whenever it needs a new DHT for temporary data.
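For the third option, a front-end might build its admin request along these lines. This is only a sketch: it assumes the admin interface accepts JSON-RPC 2.0 requests, and the parameter names (id, path, uuid, properties) are assumptions that should be checked against your conductor version. Only the function name admin/dna/create_from_file comes from the text above.

```python
import uuid
from typing import Optional


def build_fork_request(dna_path: str, id_prefix: str,
                       properties: Optional[dict] = None,
                       request_id: int = 1) -> dict:
    """Build a request that installs the template DNA again under a fresh UUID."""
    new_uuid = str(uuid.uuid4())  # a fresh UUID is what makes this a separate DHT
    params = {
        "id": f"{id_prefix}-{new_uuid}",  # assumed parameter: local name for the new DNA
        "path": dna_path,                 # assumed parameter: path to the template DNA file
        "uuid": new_uuid,                 # assumed parameter: the 'insignificant detail'
    }
    if properties is not None:
        params["properties"] = properties  # assumed parameter: optional properties override
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "admin/dna/create_from_file",
        "params": params,
    }
```

Two calls with the same file path produce two different UUIDs, and therefore two independent DHTs that follow the same rules.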
Warnings
- Because you need to move data through the front-end rather than a bridge, you lose some guarantees over data integrity because the front-end can’t provide the same assurance that the conductor can.
- Decoupling connected concerns into separate spaces forces you to think hard about your dependency graph. It’s easy to introduce a tight two-way coupling between the two DNAs; is there a way to design it such that one DNA uses signals to broadcast information so it can be ignorant of the DNAs that depend on it?
- Decoupling also introduces the opportunity for data to get out of sync. Your app will have to manually manage referential integrity for related data across both DHTs.
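One way to manage referential integrity by hand is to check periodically for references whose target no longer exists. The sketch below assumes the app can list its (dna_uuid, address_hash) references and the live addresses in each throwaway DHT; the find_dangling function and both input shapes are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Reference = Tuple[str, str]  # (dna_uuid, address_hash)


def find_dangling(references: Iterable[Reference],
                  live: Dict[str, Set[str]]) -> List[Reference]:
    """Return references whose DHT is gone or whose entry is no longer live."""
    dangling = []
    for dna_uuid, address_hash in references:
        # A missing DHT is treated the same as a DHT with no live entries.
        if address_hash not in live.get(dna_uuid, set()):
            dangling.append((dna_uuid, address_hash))
    return dangling
```

What to do with a dangling reference (hide it, mark it deleted in the persistent DHT, or report it) is an application decision.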