kappa-core
Comparing version 3.0.2 to 4.0.0
@@ -5,3 +5,3 @@ { | ||
"author": "Stephen Whitmore <sww@eight.net>", | ||
- "version": "3.0.2", | ||
+ "version": "4.0.0", | ||
"repository": { | ||
@@ -19,6 +19,6 @@ "url": "git://github.com/noffle/kappa-core.git" | ||
"dependencies": { | ||
- "hypercore": "^7.2.0", | ||
- "inherits": "^2.0.3", | ||
- "multifeed": "^3.0.6", | ||
- "multifeed-index": "^3.2.2" | ||
+ "hypercore": "^7.4.0", | ||
+ "inherits": "^2.0.4", | ||
+ "multifeed": "^4.0.0", | ||
+ "multifeed-index": "^3.3.2" | ||
}, | ||
@@ -28,5 +28,5 @@ "devDependencies": { | ||
"standard": "~12.0.1", | ||
- "tape": "^4.10.1" | ||
+ "tape": "^4.11.0" | ||
}, | ||
"license": "ISC" | ||
} |
README.md
# kappa-core | ||
- > Minimal peer-to-peer database, based on kappa architecture. | ||
+ > kappa-core is a minimal peer-to-peer database, based on append-only logs and materialized views. | ||
## Introduction | ||
kappa-core is built on an abstraction called a [kappa architecture][kappa], or | ||
"event sourcing". This differs from the traditional approach to databases, which | ||
is centered on storing the latest value for each key in the database. You might | ||
have a *table* like this: | ||
|id|key|value| | ||
|--|--|--| | ||
|51387|soup|cold| | ||
|82303|sandwich|warm| | ||
|23092|berries|room temp| | ||
If you wanted to change the value of `soup` to `warm`, you would *modify* the | ||
entry with `id=51387` so that the table was now | ||
|id|key|value| | ||
|--|--|--| | ||
|51387|soup|warm| | ||
|82303|sandwich|warm| | ||
|23092|berries|room temp| | ||
This table now, once again, represents the current state of the data. | ||
There are some consequences to this style of data representation: | ||
1. historic data is lost | ||
2. there is exactly one global truth for any datum | ||
3. no verifiable authorship information | ||
4. data is represented in a fixed way (changing this requires "table migrations") | ||
In contrast, kappa architecture centers on a primitive called the "append-only | ||
log" as its single source of truth. | ||
An append-only log is a data structure that can only be added to. Each entry in | ||
a log is addressable by its "sequence number" (starting at 0, then 1, 2, 3, | ||
...). In the case of kappa-core, which uses [hypercore][hypercore] underneath, | ||
each log is also identified by a cryptographic *public key*, which allows each | ||
log entry to be digitally signed with that log's *private key*, certifying that | ||
each entry in the log was indeed authored by the same person or device. A | ||
single kappa-core database can have one, ten, or hundreds of append-only logs | ||
comprising it. | ||
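
As a rough illustration of the paragraph above, here is a minimal in-memory sketch of an append-only log addressed by sequence number. The `AppendOnlyLog` class and its methods are hypothetical names made up for this example; they are not the hypercore or kappa-core API, and the public-key signing that hypercore performs is left out entirely.

```javascript
// Hypothetical in-memory append-only log. This is NOT the hypercore or
// kappa-core API, and it does no cryptographic signing.
class AppendOnlyLog {
  constructor () {
    this.entries = []
  }

  // Add an entry to the end of the log and return its sequence number.
  append (value) {
    // Store a frozen shallow copy so the stored entry cannot be modified later.
    this.entries.push(Object.freeze({ ...value }))
    return this.entries.length - 1
  }

  // Look up an entry by its sequence number (0, 1, 2, ...).
  get (seq) {
    return this.entries[seq]
  }

  get length () {
    return this.entries.length
  }
}

const exampleLog = new AppendOnlyLog()
const firstSeq = exampleLog.append({ key: 'soup', value: 'cold' })
const secondSeq = exampleLog.append({ key: 'soup', value: 'warm' })
```

Note that changing `soup` to `warm` is a second entry rather than an edit of the first one, so the earlier value stays available at sequence number 0.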
kappa-core still uses tables like the above, though. However, instead of being | ||
the source of truth, these tables are generated (or *materialized*) from the | ||
log data, providing a *view* of the log data in a new or optimized context. | ||
These are called *materialized views*. | ||
The twin concepts of *append-only logs* and *materialized views* are the key | ||
concepts of kappa-core. Any kappa-core database does only a few things: | ||
1. define various materialized views that it finds useful | ||
2. write data to append-only logs | ||
3. query those views to retrieve useful information | ||
Let's look at an example of how the traditional table from the beginning of | ||
this section could be represented as a kappa architecture. The three initial | ||
rows would begin as log entries first: | ||
``` | ||
[ | ||
{ | ||
id: 51387, | ||
key: 'soup', | ||
value: 'cold' | ||
}, | ||
{ | ||
id: 82303, | ||
key: 'sandwich', | ||
value: 'warm' | ||
}, | ||
{ | ||
id: 23092, | ||
key: 'berries', | ||
value: 'room temp' | ||
} | ||
] | ||
``` | ||
These might be written to one log, or perhaps spread across several. They all | ||
get fed into materialized views in a nondeterministic order anyway, so it | ||
doesn't matter. | ||
To produce a look-up table like before, a view might be defined like this: | ||
``` | ||
when new log entry E: | ||
table.put(E.key, E.value) | ||
``` | ||
This would map each `key` from the full set of log entries to its `value`, | ||
producing this table: | ||
|key|value| | ||
|--|--| | ||
|soup|cold| | ||
|sandwich|warm| | ||
|berries|room temp| | ||
Notice `id` isn't present. We didn't need it, so we didn't bother writing it to | ||
the view. It's still stored in each log entry it came from though. | ||
Now let's say an entry like `{ id: 51387, key: 'soup', value: 'warm' }` is | ||
written to a log. The view logic above dictates that the `key` is mapped to | ||
the `value` for this view, so the following table would be produced: | ||
|key|value| | ||
|--|--| | ||
|soup|warm| | ||
|sandwich|warm| | ||
|berries|room temp| | ||
Like the traditional database, the table is mutated in-place to produce the new | ||
current state. The difference is that this table was *derived* from immutable | ||
log data, instead of being the truth source itself. | ||
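
The walk-through above can be sketched in plain JavaScript. This is only an illustration under simplifying assumptions: the log is a plain array, the view's table is a `Map`, and `updateView` is a hypothetical helper invented here rather than part of kappa-core.

```javascript
// Hypothetical sketch: the log is a plain array and the view's table is a Map.
// None of these names come from kappa-core.
const logEntries = [
  { id: 51387, key: 'soup', value: 'cold' },
  { id: 82303, key: 'sandwich', value: 'warm' },
  { id: 23092, key: 'berries', value: 'room temp' }
]

const table = new Map()
let processed = 0 // number of log entries the view has already seen

// Feed any unprocessed log entries into the view: table.put(E.key, E.value)
function updateView () {
  while (processed < logEntries.length) {
    const entry = logEntries[processed++]
    table.set(entry.key, entry.value)
  }
}

updateView()
const soupBefore = table.get('soup')

// A new entry is appended; the three earlier entries are left untouched.
logEntries.push({ id: 51387, key: 'soup', value: 'warm' })
updateView()
const soupAfter = table.get('soup')
```

The table changes in place, but the log still holds both the `cold` and the `warm` entry for `soup`.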
This is all very useful: | ||
1. log entries are way easier to replicate over a network or USB keys than tables | ||
2. the log entries are immutable, so they can be cached indefinitely | ||
3. the log entries are digitally signed, so their authenticity can be trusted | ||
4. views are derived, so they can be regenerated | ||
\#4 is really powerful and worth examination: *views can be regenerated*. In | ||
kappa-core, views are *versioned*: the view we just generated was version 1, | ||
and was defined by the logic | ||
``` | ||
when new log entry E: | ||
table.put(E.key, E.value) | ||
``` | ||
What if we wanted to change this view at some point, to instead map the entry's | ||
`id` to its `value`? Maybe like this: | ||
``` | ||
when new log entry E: | ||
table.put(E.id, E.value) | ||
``` | ||
With kappa-core, this would mean bumping the view's *version* to `2`. | ||
kappa-core will purge the existing table, and regenerate it from scratch by | ||
processing all of the entries in all of the logs all over again. This makes | ||
views cheap, and also means *no table migrations*! Your data structures can | ||
evolve as your program evolves, and peers won't need to worry about migrating to | ||
new formats. | ||
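
To make the regeneration step concrete, here is a small sketch of a versioned view that is purged and rebuilt when its version number changes. The `materialize` function and the `state` object are hypothetical and exist only for this example; kappa-core's own bookkeeping is not shown here.

```javascript
// Hypothetical sketch of a versioned view; not kappa-core's implementation.
// `state` remembers which view version built the table and how far it got.
function materialize (state, log, version, mapFn) {
  if (state.version !== version) {
    // The view logic changed: purge the table and start over from entry 0.
    state.table = new Map()
    state.processed = 0
    state.version = version
  }
  while (state.processed < log.length) {
    mapFn(state.table, log[state.processed++])
  }
  return state.table
}

const foodLog = [
  { id: 51387, key: 'soup', value: 'cold' },
  { id: 82303, key: 'sandwich', value: 'warm' },
  { id: 23092, key: 'berries', value: 'room temp' }
]
const state = { version: null, table: new Map(), processed: 0 }

// Version 1 maps key -> value.
const v1Keys = [...materialize(state, foodLog, 1, (t, e) => t.set(e.key, e.value)).keys()]

// Version 2 maps id -> value; the old table is discarded and rebuilt.
const v2Keys = [...materialize(state, foodLog, 2, (t, e) => t.set(e.id, e.value)).keys()]
```

Because the log is never modified, the second call can rebuild the table from the same entries without any migration step.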
Lastly, a kappa-core database is able to *replicate* itself to another | ||
kappa-core database. The `replicate` API (below) returns a Node `Duplex` | ||
stream. This stream can operate over any stream-compatible transport medium, | ||
such as TCP, UTP, Bluetooth, a Unix pipe, or even audio waves sent over the | ||
air! When two kappa-core databases replicate, they exchange the logs and the | ||
entries in the logs, so that both sides end up with the same full set of log | ||
entries. This will trigger your database's materialized views to process these | ||
new entries to update themselves and reflect the latest state. | ||
Because this is all built on [hypercore][hypercore], replication can be done | ||
over an encrypted channel. | ||
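
The outcome of replication described above can be modelled without any networking. In this sketch each peer is a `Map` from a log identifier to an array of entries, and `sync` and `replicate` are hypothetical helpers written for this example. The real `replicate` API works over a `Duplex` stream and verifies signatures, neither of which is modelled here.

```javascript
// Hypothetical model of replication: each peer is a Map from a log id to an
// array of entries. No streams, signatures, or encryption are modelled.
function sync (from, to) {
  for (const [logId, entries] of from) {
    if (!to.has(logId)) to.set(logId, [])
    const local = to.get(logId)
    // Logs are append-only, so the receiver only needs the tail it is missing.
    for (let seq = local.length; seq < entries.length; seq++) {
      local.push(entries[seq])
    }
  }
}

// Exchange logs in both directions so both peers end up with the same set.
function replicate (a, b) {
  sync(a, b)
  sync(b, a)
}

const alice = new Map([
  ['alice-log', [{ key: 'soup', value: 'cold' }, { key: 'soup', value: 'warm' }]]
])
const bob = new Map([
  ['bob-log', [{ key: 'berries', value: 'room temp' }]]
])

replicate(alice, bob)

const countEntries = (peer) =>
  [...peer.values()].reduce((total, log) => total + log.length, 0)
```

After `replicate(alice, bob)` both peers hold the same logs and entries, and each side's views would then process whatever entries are new to it.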
Thanks for reading! You can also try the [kappa-core | ||
workshop](https://github.com/kappa-db/workshop) to use kappa-core yourself, or | ||
get support and/or chat about development on | ||
- IRC: #kappa-db on Freenode | ||
- [Cabal](https://cabal.chat): #kappa-db on `cabal://0201400f1aa2e3076a3f17f4521b2cc41e258c446cdaa44742afe6e1b9fd5f82` | ||
## Example | ||
@@ -255,1 +419,2 @@ | ||
[git-shallow]: https://www.git-scm.com/docs/git-clone#git-clone---depthltdepthgt | ||
[kappa]: http://kappa-architecture.com |
+ Added is-options@1.0.2 (transitive)
+ Added multifeed@4.3.0 (transitive)
+ Added once@1.4.0 (transitive)
+ Added random-access-memory@3.1.4 (transitive)
+ Added wrappy@1.0.2 (transitive)
- Removed multifeed@3.0.8 (transitive)
Updated hypercore@^7.4.0
Updated inherits@^2.0.4
Updated multifeed@^4.0.0
Updated multifeed-index@^3.3.2