Skip to main content
  1. All Posts/

earthstar

Tools TypeScript

Earthstar

Sync stuff you care about with people you know. A specification and Javascript
library for building online tools you can truly call your own.Build
offline-first, decentralised, and private network applications in the browser,
server, or command-line.
Clearly there’s a lot to unpack, so check out
Earthstar’s website for an introduction to
Earthstar’s concepts, an API tour, technical documentation, videos, and more!

Usage

To use in Deno, add the following:

import * as Earthstar from "https://deno.land/x/earthstar/mod.ts";

To use with Node or apps built with NPM dependencies:

npm i earthstar

And then import in your code:

import * as Earthstar from "earthstar";

Development

Setup

You will need Deno installed.
Instructions for installation can be found here.
You may also want type-checking and linting from Deno for your IDE, which you
can get with extensions
like this one for VSCode.
To check that you’ve got everything set up correctly:
make example
This will run the example script at example-app.ts, and you will see a lot of
colourful log messages from the app.

Scripts

Scripts are run with the make command.

  • make test – Run all tests
  • make test-watch – Run all tests in watch mode
  • make fmt – Format all code in the codebase
  • make npm – Create a NPM package in npm and run tests against it (requires
    Node v14 or v16 to be installed).
  • make bundle – Create a bundled browser script at earthstar.bundle.js
  • make depchart – Regenerate the dependency chart images
  • make coverage – Generate code test coverage statistics
  • make clean – Delete generated files

Orientation

  • The entry for the package can be found at mod.ts.
  • Most external dependencies can be found in deps.ts. All other files import
    external dependencies from this file.
  • Script definitions can be found in Makefile.
  • Tests are all in src/test/
  • The script for building the NPM package can be found in scripts/build_npm.ts

Uint8Arrays and Buffers

We use Uint8Arrays throughout the code to maximize platform support. Some of the
node-specific drivers use Buffers internally but the Buffers are converted to
Uint8Arrays before leaving those drivers.
For convenience, variables that hold Uint8Arrays are called “bytes”, like
bytesToHash instead of uint8ArrayToHash.
util/bytes.ts has a bunch of helper code to do common operations on
Uint8Arrays and to convert them back and forth to strings and to Buffers.

Platform-specific tests

Drivers are tested against the runtimes they’re intended for. When tests are
run, they pull the correct scenarios from ‘src/test/test-scenarios.ts’, where
the current runtime is inferred during runtime.

Classes

The Replica is the main star of the show. Classes to the right are used
internally for its implementation. Classes to the left stack on top of an
Replica to do extra things to it (subscribe to changes, cache data, etc).
Each Replica holds the Docs for one Share.

Names starting with I are interfaces; there are one or multiple actual classes
that implement those interfaces.
The orange classes are “drivers” which have multiple implementations to choose
from, for different platforms (node, browser, etc).
Blue arrows show which functions call each other.
Thick black arrows show which classes have pointers to other classes when
they’re running.

Source code dependency chart

A –> B means “file A imports file B”.
For readability this hides /test/ and /util/ and *-types.ts.

And again with 3rd party dependencies as brown boxes with dotted lines, and
including *-types.ts

Run yarn depchart to regenerate this. You’ll need graphviz installed.

Platform-specific drivers

There are two parts of stone-soup which are swappable to support different
platforms or backends: IReplica and IReplicaDriver. Everything else should
work on all platforms.
Crypto drivers:

  • ReplicaDriverChloride – only in browser, Node
  • ReplicaDriverNode – only in Node
  • ReplicaDriverNoble – universal

Storage drivers:

  • ReplicaDriverMemory – univeral
  • ReplicaDriverLocalStorage – browser
  • ReplicaDriverIndexedDB – browser
  • ReplicaDriverSqlite – Node, Deno

Users of this library have to decide which of these drivers to import and use in
their app.

Documentation

We use JSDoc for user documentation. You can view docs for the whole codebase at
https://doc.deno.land/https://deno.land/x/stone_soup@v8.0.0/mod.ts, or by
running the following from the root of the project:

deno doc mod.ts

JSDocs are intended for end-users of the library. Comments for contributors
working with the codebase — e.g. notes on how something is implemented — are
better as standard JS comments.
If possible, use a single line for the JSDoc. Example:

/** Does something great */
export function doSomething() {
  // ...
}

You can use markdown inside of JSDoc block. While markdown supports HTML tags,
it is forbidden in JSDoc blocks.
Code string literals should be braced with the back-tick (`) instead of quotes.
For example:

/** Import something from the `earthstar` module. */

It’s not necessary to document function arguments unless an extra explanation is
warranted. Therefore @param should generally not be used. If @param is used,
it should not include the type as TypeScript is already strongly typed.

/**
 * Function with non obvious param.
 * @param nonObvious Description of non obvious parameter.
 */

Code examples should utilize markdown format, like so:

/** A straight forward comment and an example:
 * ```ts
 * import { Crypto } from "stone-soup";
 * const keypair = Crypto.generateAuthorKeypair("suzy");
 * ```
 */

Code examples should not contain additional comments and must not be indented.
It is already inside a comment. If it needs further comments it is not a good
example.
Exported functions should use the function keyword, and not be defined as
inline functions assigned to variables. The main reason for this being that they
are then correctly categorised as functions.

Publishing to NPM

  1. Run make VERSION="version.number.here" npm, where version.number.here is
    the desired version number for the package.
  2. cd npm
  3. npm publish

Changes from Earthstar v1

Splitting Storage into Replica and ReplicaDriver classes

Think of this as IStorageNiceAPIFullOfComplexity and
IStorageSimpleLowLevelDriver.
I want to make it easier to add new kinds of storage so I’m splitting IStorage
into two parts:
The Storage does:

  • the complex annoying stuff we only want to write once
  • set(): sign and add a document
  • ingest(): validate and accept a document from the outside
  • user-friendly helper functions, getters, setters
  • an event bus that other things can subscribe to, like QueryFollowers

The StorageDriver does:

  • simple stuff, so we can make lots of drivers
  • query for documents (this is actually pretty complicated)
  • maintain indexes for querying (hopefully provided by the underlying storage
    technology)
  • simple upsert of a document with no smartness

Possibly even you can have multiple Storages for one Driver, for example when
you’re using multiple tabs with indexedDb or localStorage.

“Reliable indexing / streaming”

This shows an implementation of the “reliable indexing” idea discussed in
this issue.

The problem

We have livestreaming now, over the network and also to local subscribers, all
based on onWrite events.
If you miss some events, you can’t recover — you have to do a full batch
download of every document.
Events also don’t tell you what was overwritten, which you might need to know to
update a Layer or index.

The solution: localIndex

Each Storage keeps track of the order that it receives documents, and assignes
each doc a localIndex value which starts at 1 and increments from there with
every newly written doc.
This puts the documents in a nice stable linear order that can be used to
reliably stream, and resume streaming, from the Storage.
When we get a new version of a document, it gets a new localIndex and goes at
the end of the sequence, and the old version vanishes, leaving a gap in the
sequence. It’s ok that there are gaps.
The localIndex is particular to a certain IStorage. It’s kept in the Doc
object but it’s not really part of it; it’s not included in the signature. It’s
this IStorage’s metadata about that document. When syncing, it’s sent as one of
the “extra fields”
(newly added to the specification),
observed by the receiving peer, then discarded and overwritten with the
receiving peer’s own latest localIndex number.

Use cases

  1. Streaming sync between peers, which can be interrupted and resumed
  2. Layers and indexes that use a QueryFollower to subscribe to changes in a
    Storage. These might store their indexes in localStorage, for example, and
    would therefore want to resume indexing instead of starting over from
    scratch.
  3. React components that need to know when to re-render

For use cases where the listener will never have any downtime, they don’t really
need to be able to resume, they can just listen for events from the Storage
instead and it may be more efficient. For example a React component could listen
for events about a particular document instead of making a whole QueryFollower
that has to go through every single change to find changes to that particular
document.

Properties of the localIndex sequence

The docs, sorted by localIndex on a particular peer, have these properties:
Properties

  1. The docs are in a stable order that does not change, except:
  2. Newly added or changed docs go at the end, increasing the highest
    localIndex by 1.
  3. When a doc is updated (same author and same path, but newer timestamp), we
    discard the old version. So we leave a gap in the sequence where the old
    verison used to be, and the new version goes on the end of the sequence.
  4. The first doc has a localIndex of zero, unless it was later changed, in
    which case there will be a gap at zero.

Why do all this? The goal is to be able to catch up to changes since the last
time we looked at a Storage. We can do that by remembering the highest
localIndex we saw last time, and now getting all the docs later than that in
the sequence. We always…