next-generation programming platform, currently in development
about
help fund the project
swag

Twitter . GitHub . RSS

About



Last updated: March 2016

Unison is a new programming platform that rethinks just about every aspect of the programming experience. This page shows a little bit of what’s possible (or will be possible soon) with Unison. We’ll start with the editor, which has to work differently given Unison’s simplified approach to building distributed systems and dealing with persistence. (More on that later!)

Unison programs are edited in a semantic editor. Interactions in this editor are highly structured—we get rich guidance from the typechecker and are prevented from making various silly mistakes that are possible when plain-text editing. Let’s have a look. Here we have an editor session, displaying just a single expression, a bit of text, which we’ll edit:

In this video, we replaced the selection with a new piece of text. Notice the editor knows the type of what we’ve selected for editing (Text in this example). And we don’t have to remember to close the quote; the editor does that for us. It’s the same with any other delimiter—the editor constrains us to producing programs that are syntactically valid.

Pretty simple. Let’s see what happens if we open the selection and try to type some garbage:

We’re still allowed to type whatever we want, but if what you type is invalid (because it is syntactically invalid, as in this case, or ill-typed, which we’ll see later), it won’t be selectable. Thus, Unison programs are syntactically valid (well-formed) and well-typed by construction. There’s also no separate compilation phase (learn more) and code is always immediately available for execution, as soon as you’ve finished editing it. The programmer should never have to wait around for their code to compile.

By the way, this thing that pops up when opening an expression for editing is called the explorer. Let’s look at the explorer a bit more closely now—it has a lot of useful information in it.

First, when popping open the explorer, notice again it tells us the type of the expression we’ve opened. No surprise, that expression (the ‘target’) has type ‘Text’. It also gives us a list of valid replacements for the target. At the moment, there aren’t any constraints on the type of any replacements, so anything goes. The list shown isn’t necessarily complete, and we can refine the results just by typing more characters. In this video, we select the + function, applied to two arguments (the option _ + _), and advance to editing the first argument. Notice that the editor now gives us a goal type of Number (inferred from the type of +), which is in bold at the top of the explorer.

Aside: In Unison, there are no ‘import statements’. Everything in the Unison code store is available and searchable via the explorer; if there are too many results, just refine by providing more information. We can imagine having some notion of ‘imports’, but this would be a property of the editing session, it’s not attached to the code in any way. Two people could open up the same piece of code and apply different imports to their editing session!

Let’s look at what happens if we try to choose something ill-typed in the explorer:

We still learn the type of what we’ve entered (helpful for understanding where we’ve gone wrong), but since it doesn’t match the goal type of Number, the editor won’t let us select it! This experience of not being allowed to select a search result in the explorer is the closest we get to a type error. This seems limiting but it’s not really; the programmer can insert blanks anywhere in their program (deferred bits of code), build up fragments of code ‘off to the side’, and generally work in sketches, gradually bringing together bits of code. (FAQ: how do we make large-scale edits to a codebase without breaking things? See this post for details)

Aside: Type errors are a usability catastrophe, for both beginners and experts, and language implementators invest huge amounts of time and engineering effort figuring out how to report errors ‘well’ (for some definition of ‘well’). But when you think about it, it’s a bit odd that the default way of editing programs is totally unstructured raw text editing, which allows all kinds of edits that make no sense whatsoever. Why should we accept this assumption, which dates back to the punchcard era of programming when no other options existed?

Okay, let’s build something a little more interesting which shows off some more features. Things happen fast, you may want to pause to see the info shown in the explorer at various points:

Several new things are shown here:

All right, so this editor is pretty neat, and we can imagine adding lots of bells and whistles to it (documentation and usage examples that slide out from the current explorer selection, for instance). But the editor is just a tiny piece of the overall project, and it’s actually more like a byproduct of a much bigger, and more important design goal—we want it to be completely trivial to persist arbitrary Unison values, in a database or some persistent store, and we want it to be completely trivial for Unison nodes anywhere on the planet to exchange data and computations.

Why is this so important? Well, let’s look at the web as it exists today. All over the world, millions of software systems are sitting atop the web, and even within a single company (like say Google), there might be a fleet of different servers that communicate in some way to present some higher-level service (like a search engine, for instance). It is probably not an exaggeration to say that 90% of the code programmers write is just dealing with the particulars of communication—how to send values between nodes in these distributed systems, details of encoding and decoding, networking, dealing with persistence, etc. And then sandwiched in between all that is a tiny bit of ‘interesting’ computation or business logic.

A good way to see how much overhead there is to think about how much code it would take to implement something like a search engine if the search index and all the documents being indexed fit in memory and existed as some in-memory data structures (might be a couple hundred LOC for something simple). And now compare that to how much code you need to represent conceptually the same data structure and algorithm, but distributed on 10,000 machines. Suddenly we seem to need 100x or even 1000x the amount of code, and an army of programmers. Is that really the best we can do?

It’s true that writing multi-machine / distributed systems has some additional essential complexity, but in Unison, we’re trying to simplify things as much as possible and let programmers focus on the essence of what they’re trying to do. We envision a world in which the set of Unison nodes connected to the internet form a global web, and any of the nodes in this web may trivially exchange absolutely any values in the Unison language, including functions. Doing this in a scalable way requires that we do something really radical.

Before we get to the ‘how’, let’s look at a piece of the Unison API for building distributed systems:

at : Node -> a -> Remote a

That is, we can transport any value at all to another Unison node, obtaining a handle to the computation so we can extend it further if we like. A Remote a is a description of a computation that may proceed on one or more nodes, and it comes equipped with a rich set of operations. Let’s look at map:

map : (a -> b) -> Remote a -> Remote b

Let’s see an example of this in action. We’ll transport a value to another server, then continue the computation from there, using map:

In this video, we create a computation which runs on server1. The first part of the computation produces the list ["Alice","Bob"], and we continue the computation on server1 using the function onserver1, declared in the let block. There are no restrictions on the function we pass to map or the values we pass to at; the values might have a million transitive dependencies; any missing dependencies will be synchronized to server1 when the computation is evaluated and cached for later.

Aside: For more details on this API, as well as discussion of how it can be made secure, check out this post which walks through the design in detail.

With this API we get to be very precise about where computations take place, so in that sense it’s very explicit, but we’re also totally freed from the usual boilerplate and plumbing code we’d have to write when moving values and computations around in our distributed system. No JSON parsing or encoding, no low-level networking code, etc, we simply declare that we want a value transported to some other node, and that’s it! We can continue our computation there, hop to any other node, and so on! Building distributed systems does have more sources of essential complexity… but the idea is that Unison lets us focus on the essence of the problem, not the mundane details.

Aside: The exact same principles apply to dealing with persistent state. We should be able to trivially persist arbitrary values (including functions), rather than having to deal with boilerplate converting to and from SQL or to and from JSON or whatever ad-hoc format is consumed by some NoSQL store.

All right, now how is this achieved? The idea is simple, but it has far-reaching consequences. In Unison, values are identified not by name, but by a cryptographic hash of their definition. That is, suppose I write a function: sum, for adding up a vector of numbers. In Unison, we identify that function by a hash of its implementation—the name sum is ‘just’ human-readable metadata associated with that hash (which the editor of course uses for display purposes and lookup). This has a number of very positive benefits:

But it also raises new challenges and questions, which Unison has needed to address:

Modern software is built on layer after layer of foundational technologies and assumptions. In foundational work, the details matter. The Unison platform is tossing out some of these layers (many of which haven’t been seriously questioned or revisited in ages) and growing something new, atop a very different set of foundational assumptions.

What’s next?