
Node containers and lightweight, zero-copy node provisioning

We are working on the distributed programming API implementation. After finishing the guts of it, some questions lingered about how to tie everything together. I went off and did some pondering, and now several things are clearer. I’m not sure how much sense this post will make to other people, but I wanted to do some kind of writeup.

Let’s start with the basics. What is a Node? As in, what is the runtime representation of a Node value in Unison? Well, let’s look at its API:

at : Node -> a -> Remote a
here : Remote Node

instance Monad Remote
-- some other effects can be lifted into `Remote`

Note: In the past I’ve made a distinction between Remote! and Remote (no ‘!’) but I’m going to ignore that here for clarity.

Using the Monad Remote, a computation can bounce around from Node to Node. We can think of Node as a location where computation occurs, and Unison takes care of serializing / syncing up any missing dependencies when moving a computation between nodes.
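As a sketch of what this looks like in practice, here is a small `Remote` computation that hops between two nodes. This is illustrative pseudocode, not tested Unison; `alice` and `bob` are assumed to be in-scope `Node` values:

```
example : Remote Number
example = do
  x <- at alice (1 + 1)  -- evaluate `1 + 1` on `alice`
  y <- at bob (x * 10)   -- hop to `bob`; Unison syncs `x` over as needed
  n <- here              -- `here` tells us which node we are now on
  at n (x + y)
```

Each `at` moves the continuation to the named node, with any missing dependencies serialized and synced transparently.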

A very simple representation of a Node might be a host name and/or IP address, plus a public key or hash of a public key. This is all the information you need to contact the node via an encrypted channel. (Unison does not deal directly with the question of tying that public key to some particular entity, though that might be handled by a separate layer written in pure Unison code.) For easy sharing, we might encode this as a URL in which the fingerprint of the public key appears after the hostname.
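A minimal sketch of that representation (the type and field names here are illustrative, not the actual runtime types):

```
data Node = Node Host Fingerprint

type Host = String        -- host name and/or IP address
type Fingerprint = Hash   -- hash of the node's public key
```

This is just enough information to dial the host and establish an encrypted channel keyed to that fingerprint.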

When contacting a node, we assume that at the node’s host a server process is running, say HTTP for now. I’ll call this server the node container, and it will administer multiple Unison nodes. So far, we haven’t talked about how to create or destroy Node values. Here is a proposal:

spawn : Lifetime -> Sandbox -> AccessControl -> Budget -> Remote Node

data Budget
data Sandbox
type AccessControl = List Node

data Lifetime
  = Ephemeral -- node resources get deleted on garbage collection of node reference
  | Linked -- node resources get deleted when parent node goes away
  | Resetting Seconds -- node resources get reset to zero after period of inactivity
  | Leased Seconds (Remote Bool) -- node invokes the Remote Bool periodically and shuts down if false
  | Root -- sticks around forever, unless explicitly destroyed
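Before going through the parameters one by one, here is a hedged sketch of what a call might look like. (`stillNeeded`, `someSandbox`, and `someBudget` are hypothetical values, not part of the API above.)

```
-- spawn a node holding a 60-second lease, renewed by evaluating
-- `stillNeeded : Remote Bool`; only the current node may contact it
do
  parent <- here
  spawn (Leased (Seconds 60) stillNeeded) someSandbox [parent] someBudget
```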

What are all those parameters passed to spawn?

With this model, spawning nodes is meant to be extremely cheap and to involve no copying of data: basically just generating a key pair, and that’s it. You can generate hundreds of thousands of nodes if you wish. We also get rid of any per-connection notion of sandboxing; the Node is the sandbox. Any node in the access control list that can contact the Node gets access to the full set of capabilities determined by the Sandbox. This might seem like a limitation (what if I want Alice to get a different sandbox than Bob?), but it’s not: just create a separate node for each of the client nodes you wish to sandbox differently.
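For instance, the per-client sandboxing trick just described might be sketched like this (`readOnly`, `readWrite`, `budget`, `alice`, and `bob` are all hypothetical values):

```
-- rather than one node with per-connection sandboxes,
-- spawn one node per client, each with its own capabilities
do
  forAlice <- spawn Ephemeral readOnly  [alice] budget
  forBob   <- spawn Ephemeral readWrite [bob]   budget
  pure (forAlice, forBob)
```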

Notice that any capabilities granted to the Node spawned must be granted quite explicitly.

Now here are some interesting common use cases:


For the implementation of the node container:

The container has some ephemeral state. Here’s a description of that state and how it gets updated:

The container also has some persistent state:

For each node, N, we have some persistent state:

Nodes respond to a few messages:

data Message e = Destroy Proof | Eval e

`Eval e` evaluates some expression on the node. This may trigger messages being sent to other nodes, but note that these messages are all logically routed through the container server.

A node that receives a `Destroy` message checks the `Proof` for evidence of knowledge of the node’s private key (perhaps a hash of that private key, or even the key itself). Assuming it checks out, the node shuts down, which terminates the process and deletes any persistent state the node allocated. As its last action, it notifies the container that it has shut down.
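That check might be sketched as follows (`hash`, `privateKey`, `shutDown`, and `evaluate` are hypothetical helpers, not part of the API above):

```
handle msg = case msg of
  Destroy proof ->
    -- accept only if the sender proves knowledge of this node's private key
    if proof == hash privateKey
    then shutDown       -- deletes persistent state, then notifies the container
    else pure ()
  Eval e -> evaluate e  -- may send further messages, routed via the container
```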

Some notes:

The idea here is that we don’t ever really have to worry about explicitly destroying nodes. They go to sleep if inactive, and get destroyed automatically under the conditions we specify.

I’ll be implementing this next week. Stay tuned.
