So, listening a bit to “The Civil Wars” this afternoon cooked up the rest of eVenti. This blog post is brought to you by Maurice Ravel, and Morcheeba however.
The eVenti system is a simple Erlang-implementation of the venti(8) system famous from Plan9. The idea is very simple: We can store data in Venti and the data becomes addressed by its SHA1 checksum.footenote:[I should probably change this later, but for compatibility reasons we stick to it…] This is called a Content Adressable Storage (CAS) because Content is—you guessed it—addressed by the checksum of the content.
Storing the same data twice has no effect, deduplication is automatic. You can’t accidentally destroy data in a Venti-store, because writes are idempotent and happens at most once.
Data storage is forever. And it is excellent for archival storage and backups.
Security is peculiar. I can give almost everyone in the world read access—and yet they will be unable to guess a key (called a score in venti-speak). This makes it possible to use the store as a dead-drop for messages and have others retrieve them at a later stage. Write access is more problematic. While nobody can destroy my data in venti, I can’t protect against an enemy adversary filling up my store with junk. Yet, the properties of implicit integrity is quite nice to have.
The Erlang implementation is very simple. First, we employ the ranch acceptor pool to handle many simultaneous TCP connections. The protocol it binary, but we can use Erlangs ability to match on binary patterns to handle that easily. To handle the backend storage, we employ eleveldb which is LevelDB bindings to Erlang.
Whenever a client connects, we spawn a process and loop over standard
request/reply patterns for that client. A single
runs the calls to and from the database. All code we wrote, including
source, empty lines and comments are 435 lines of code. Erlang tends to
be wonderfully succinct.
An interesting notion is that all commands in the protocol have tags which identifies the message. A tag in a request is reflected in the response. It is the same as correlation id’s in the AMQP protocol for instance. And it allows for aggressive pipelining of requests. If only more protocols had this nice construction built in.
What I love about the model is that venti(8) is a very simple storage endpoint—yet it plugs so well into other subsystems. There is a unix-feeling to Plan9 tools, except they are one-level-up in the distribution chain. The key insight is that with venti, you have a self-contained system you can use to build other systems on top of. And by picking a system that already exists, you leverage the advantage that there are already tooling out there which can read and write from the store.
There are some obvious extensions to eVenti. First, we can trivially exchange leveldb for riak. This will provide proper secure backup of data by running a cluster of machines and by copying data out to multiple machines. Note that the fact that Riak is a AP store is no problem to us: We store immutable data with idempotent writes. Conflict resolution is simply “pick one at random since they are equivalent by construction”.
My intention is to use eVenti as a building block. I have a weakness for Merkle-tree constructions and immutability in general. Plugs to systems like Datomic, Dropbox, and Bitcoin. Note that you can run an event-log as a Merkle tree, in the same sense as a Bitcoin block-chain.
Suppose I and a friend run eVenti servers. If I have write access to his server, I can make encrypted backups at his place, not entirely unlike Colin Percivals tarsnap project. Co-incidentally, it will also provide a nice foundation for a secure encrypted chat service with long-term storage. It will be so simple to link attachments into the chat. Or Chess games. If I know the SHA1 of our game, I can extend it with a move and send the new SHA1 to you. Nobody else knows our game. But if I send you a SHA1 of another game, you can walk the chain and get at all moves. Over time the DAG in venti will form a nice opening book as well.
A system like venti is a necessary tool nowadays. The pendulum has swung totally to the point where every service we use is centralized. Ten years ago, we were running lots of P2P systems: Bittorrent for instance. It is time we—the internet—push the pendulum in the other direction. This means we need to be the arbiters of our own data again. Not push the data to some irritating insane third party company who will like to sell the data for money.
No company will build the decentralized structure, unless they have a business plan for it. And it seems very few companies in this space have any other business plan than to sell your data back to you for money.
Next, I want to see how I can leverage the Plan9 9p protocol and FUSE to turn data into a filesystem I can mount. Imagine having a chat conversation where you can cd into a directory in the filesystem and automatically have access to all the attachments of that chat.
But I do think I need more than 435 lines to do that. Perhaps 1000. Embrace Erlang :)