
kademlia.js


A complete Javascript implementation of the Kademlia distributed hash table for Node.js.



An implementation of the Kademlia distributed hash table, written in Javascript for Node.js.


Overview

Kademlia.js is a JavaScript implementation of the Kademlia distributed hash table, originally designed in 2002 by Petar Maymounkov and David Mazières. A distributed hash table (DHT) is a key-value data store that is distributed across multiple nodes (or computers) on a network. The Kademlia DHT is a peer-to-peer network and is completely decentralized. A Kademlia network in which anyone can participate is a public network; a Kademlia network in which only certain people can participate is a private network.

Nodes on the network share the network-wide constants k, α and B. Each node also has an ID of length B bits. k defines the number of nodes a Kademlia instance can keep in each bucket in its routing table, and the number of nodes each piece of data should be replicated across when setting it on the network. α defines the number of simultaneous queries a node performs in the lookup stage. B defines the length in bits of node IDs and data keys. As requiring keys to be exactly B bits long is inconvenient, data keys are hashed with a hashing algorithm with a digest size of B bits. The original paper and many implementations of Kademlia use SHA-1 as the hashing algorithm and a B value of 160; by default this implementation uses SHA-3-256 with a B value of 256, to increase the key space and address security concerns with the SHA-1 hashing algorithm. The hash function must be the same across all nodes in the network.
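As a rough illustration of how a data key maps into the ID space, the sketch below hashes a key with SHA-3-256 using Node's built-in crypto module and generates a random 256-bit ID. The helper names hashKey and randomNodeId are hypothetical and are not part of the kademlia.js API; the library may derive keys and IDs differently internally.

```javascript
const crypto = require("crypto");

// Assumed network-wide settings, matching the defaults described above.
const B = 256; // bit length of node IDs and data keys
const HASH = "sha3-256"; // hash algorithm with a digest size of B bits

// Hypothetical helper: hash an arbitrary data key to a B-bit hex string.
function hashKey(key) {
  return crypto.createHash(HASH).update(key).digest("hex");
}

// Hypothetical helper: generate a random B-bit node ID as a hex string.
function randomNodeId() {
  return crypto.randomBytes(B / 8).toString("hex");
}

const keyId = hashKey("test-key");
console.log(keyId.length * 4); // 256 (each hex character encodes 4 bits)
console.log(randomNodeId().length * 4); // 256
```

Because every node hashes keys the same way, any node can compute where a key belongs without coordination, which is why the hash function must match across the whole network.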

Kademlia can efficiently set data on and fetch data from the network, with set and get operations scaling with O(log n), where n is the number of nodes connected to the network. Kademlia uses a recursive algorithm, with a maximum of α concurrent queries, to traverse the network and find the nodes with the smallest distance between the key of the data and each node's ID (where the distance between two IDs is defined as their XOR). It then uses those nodes to either get data from or set data onto the network. After fetching data from the network, by default we also store the data onto the closest queried node that didn't return a value (caching).
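The XOR distance metric can be sketched in a few lines. The example below is only an illustration using hex-string IDs and BigInt; the function names xorDistance and closestNodes are hypothetical and do not reflect how kademlia.js structures its routing table.

```javascript
// XOR distance between two hex-encoded IDs, returned as a BigInt.
function xorDistance(idA, idB) {
  return BigInt("0x" + idA) ^ BigInt("0x" + idB);
}

// Return the `count` node IDs closest to `key` under the XOR metric.
function closestNodes(nodeIds, key, count) {
  return [...nodeIds]
    .sort((a, b) => {
      const da = xorDistance(a, key);
      const db = xorDistance(b, key);
      return da < db ? -1 : da > db ? 1 : 0;
    })
    .slice(0, count);
}

// Toy 8-bit IDs for readability; real IDs are B bits long.
const nodes = ["0f", "f0", "a5", "a4"];
console.log(closestNodes(nodes, "a7", 2)); // [ 'a5', 'a4' ]
```

In a real lookup the candidate list is refined round by round: the closest known nodes are queried, they return nodes even closer to the key, and the process repeats until no closer nodes are found.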

Every piece of data stored on each node by default has an expire time in milliseconds of min(ttl, f(c, k)), where the min function returns the smallest of the parameters passed, ttl is by default 24 hours, and f is some function that returns a value exponentially inversely proportional to the number of nodes in the storer's routing table closer to the key of the data to store than the storer (c). The number returned by the function f will be rounded. By default the function is:

which can alternatively be written:

This behaviour can be disabled, so that keys simply expire after the value of ttl; however, if caching is enabled, this may cause over-caching.
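To make the idea of ttl scaling concrete, here is a minimal sketch of a custom scaling function with the shape the scalettlFunction option is documented to accept (number of closer nodes and the k-value in, a ttl in milliseconds out). The halving rule used here is an assumption chosen purely for illustration; it is not the library's default function, and the names halvingTtl and expireTime are hypothetical.

```javascript
const TTL = 86400000; // default ttl: 24 hours in milliseconds

// Hypothetical scaling rule (NOT the library default): a storer that is
// among the k closest nodes it knows of keeps the full ttl; every extra
// closer node beyond that halves the ttl.
function halvingTtl(closerNodes, k) {
  const excess = Math.max(0, closerNodes - k + 1);
  return Math.round(TTL / Math.pow(2, excess));
}

// Expire time as described above: the smaller of ttl and the scaled value.
function expireTime(closerNodes, k) {
  return Math.min(TTL, halvingTtl(closerNodes, k));
}

console.log(expireTime(0, 20)); // 86400000
console.log(expireTime(20, 20)); // 43200000
console.log(expireTime(22, 20)); // 10800000
```

A function like this could be passed as the scalettlFunction option when creating a node. The effect is that copies cached far from the key expire quickly, while the nodes closest to the key hold the data for the full ttl.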

Security Considerations

Kademlia is a great solution for storing data on decentralized networks; however, users have some security considerations to take into account. First, the integrity of data retrieved from a Kademlia network in which adversarial nodes could participate (a public network) is not guaranteed. Therefore all important data entered into the network should be authenticatable, for example with a cryptographic signature.
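One possible way to make stored values authenticatable is to sign them before setting them and verify the signature after fetching them. The sketch below uses Ed25519 keys from Node's built-in crypto module. The record layout and the helper names signValue and verifyValue are assumptions for illustration and are not part of kademlia.js; readers would need to obtain the publisher's public key through some trusted channel.

```javascript
const crypto = require("crypto");

// Key pair of the publisher; the public key must be shared out of band.
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");

// Hypothetical helper: wrap a string value together with its signature.
// The result is a plain object, so it survives JSON serialization.
function signValue(value, key) {
  const signature = crypto.sign(null, Buffer.from(value), key);
  return { value, signature: signature.toString("base64") };
}

// Hypothetical helper: check a fetched record against a public key.
function verifyValue(record, key) {
  if (
    record === null ||
    typeof record !== "object" ||
    typeof record.value !== "string" ||
    typeof record.signature !== "string"
  ) {
    return false;
  }
  const signature = Buffer.from(record.signature, "base64");
  return crypto.verify(null, Buffer.from(record.value), key, signature);
}

const record = signValue("test-data", privateKey);
console.log(verifyValue(record, publicKey)); // true
console.log(verifyValue({ ...record, value: "tampered" }, publicKey)); // false
```

The signed record would be what gets passed to node.set, and verifyValue would be run on whatever node.get returns before the value is trusted.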

By default Kademlia nodes communicate unencrypted over UDP. However, in private Kademlia networks it may be desirable to encrypt communications between nodes. In this implementation this is possible by setting encrypted to true and passing custom encrypt and decrypt functions.

Kademlia lookups are also vulnerable to manipulation by adversaries. If an adversary is encountered during a lookup, they can manipulate it so that either the wrong data is returned or no data is set on the network. Adversaries could also attempt eclipse attacks or Sybil attacks to manipulate network operations.

Kademlia nodes also must be bootstrapped with a non-adversarial node, otherwise every node on the network could easily be controlled by an adversary.

These problems can largely be offset by using authenticatable data, bootstrapping to a non-adversarial node, and S/Kademlia.

Installation

You can install kademlia.js through NPM, with the command:

$ npm i kademlia.js

Example

Example: Create a Kademlia node, set data on the network, and then fetch it again.

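const assert = require("assert");
const Kademlia = require("kademlia.js");

// placeholder details of a node already in the network; replace with real values
const otherNodeIp = "127.0.0.1";
const otherNodePort = 5534;

async function main() {
  // create a Kademlia node
  let node = new Kademlia(5533);

  // bootstrap the new node onto the network using the details of a node already in the network
  await node.bootstrap({
    ip: otherNodeIp,
    port: otherNodePort,
  });

  // set data onto the network
  await node.set("test-key", "test-data");

  // fetch the data we just set back off the network (get is asynchronous, so await it)
  let fetchedData = await node.get("test-key");
  assert(fetchedData === "test-data");
}

main().catch(console.error);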

Documentation

You can import the library after installing it like so:

const Kademlia = require("kademlia.js");

To create a new Kademlia node, provide the port for the node and, optionally, a dictionary containing options for the node:

/**
 * Create a new Kademlia node.
 * @param {Integer} port - Port to bind Kademlia node to. Must be valid, unbound port number.
 * @param {Object} [options] - Options dictionary.
 *
 * @param {Function} [options.storeFunction] - Function which determines how values are stored locally. Receives the new data to be stored under a key and the data currently stored under that key, and must return what should be stored under the key. By default the new data replaces any existing data.
 * @param {Function} [options.serializeData=JSON.stringify] - Function which serializes data for transmission over the network. Receives unserialized data and must return serialized data.
 * @param {Function} [options.deserializeData=JSON.parse] - Function which deserializes data received over the network. Receives serialized data and must return deserialized data.
 * 
 * @param {String} [options.id] - Hex string of node ID to use. If none provided, defaults to random hex string.
 * @param {Integer} [options.k=20] - K-value to use for the node. Must be an integer greater than or equal to 1.
 * @param {Integer} [options.alpha=3] - Alpha value to use for the node. Must be an integer greater than or equal to 1.
 * @param {Integer} [options.B=256] - Bit size of keys and IDs on the network. Must be a positive integer and a multiple of 8.
 * @param {Boolean} [options.cache=true] - Whether or not, after a value lookup, to cache the value on the closest node we queried which did not return the value.
 * @param {String} [options.hash=sha3-256] - Hash function to use. Must have a digest size of B bits and be a valid algorithm compatible with the algorithm parameter for https://nodejs.org/api/crypto.html#crypto_crypto_createhash_algorithm_options
 *
 * @param {Boolean} [options.encrypted=false] - Whether communications between nodes are encrypted. If true, options.encrypt and options.decrypt must also be passed.
 * @param {Function} [options.encrypt] - Sync function to encrypt outgoing data (only used if options.encrypted==true).
 * @param {Function} [options.decrypt] - Sync function to decrypt incoming data (only used if options.encrypted==true). Should return null if data couldn't be decrypted.
 *
 * @param {Integer} [options.ttl=86400000] - The maximum time (in milliseconds) a value has to live before expiring.
 * @param {Boolean} [options.scalettl=true] - If true, the ttl of keys is inversely proportional to the number of nodes in the storer's routing table closer to the key than the storer.
 * @param {Function} [options.scalettlFunction] - Alternative function to scale ttl. Must accept the number of nodes in the storer's routing table closer to the key than the storer and the k-value of the storer, and return a ttl in milliseconds.
 *
 * @param {Boolean} [options.republish=true] - Should nodes republish data they're storing.
 * @param {Integer} [options.republishInterval=3600000] - Interval (in milliseconds) at which nodes republish the key-value pairs they are storing to the network.
 * @param {Integer} [options.timeout=5000] - Time (in milliseconds) an RPC request waits for a response before expiring.
 */
let node = new Kademlia(5533, {});

To bootstrap a node onto the network, provide the details of a node already in the network:

await node.bootstrap({
  ip: otherNodeIp,
  port: otherNodePort,
});

You can then use the node you created to set data to the network:

await node.set("key", "value");

You can also retrieve data from the network for a specific key. If that key isn't on the network, null is returned.

await node.get("key");

If you wanted to use a different hash function, e.g. reverting to the original paper's SHA-1, it could be done when creating a node like so. Note SHA-1 has a digest size of 160 bits, so B is also set as 160, which is okay as it is a multiple of 8.

let node = new Kademlia(5533, {
  hash: "sha1", // algorithm name as accepted by crypto.createHash
  B: 160,
});

The data transmitted over UDP by Kademlia must be in string form; however, it may be useful to store more complex objects. The serializeData and deserializeData options handle this. serializeData is called whenever a Kademlia node needs to transmit data; it is passed the data to be transmitted and must return the serialized form. deserializeData is called whenever data is received by Kademlia; it is passed the received data and must return the deserialized data. By default serializeData is JSON.stringify and deserializeData is JSON.parse. These functions can be changed like so:

let node = new Kademlia(5533, {
  serializeData: (dataToSerialize) => {
    ...
    return serializedData;
  },
  deserializeData: (dataToDeserialize) => {
    ...
    return deserializedData;
  },
});
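As one concrete possibility, the pair of functions below extends the JSON defaults so that Date objects survive the round trip, which plain JSON.stringify and JSON.parse would turn into strings. This is only a sketch; the $date tag is an arbitrary convention chosen for this example and is not something kademlia.js defines.

```javascript
// Serialize to a JSON string, tagging Date objects so they can be restored.
function serializeData(data) {
  return JSON.stringify(data, function (key, value) {
    // `value` has already been converted by Date.prototype.toJSON,
    // so look at the raw value on the holder object instead.
    const raw = this[key];
    if (raw instanceof Date) {
      return { $date: raw.getTime() };
    }
    return value;
  });
}

// Parse the JSON string, turning tagged objects back into Date objects.
function deserializeData(serialized) {
  return JSON.parse(serialized, (key, value) => {
    const isTaggedDate =
      value !== null &&
      typeof value === "object" &&
      Object.keys(value).length === 1 &&
      typeof value.$date === "number";
    return isTaggedDate ? new Date(value.$date) : value;
  });
}

const roundTripped = deserializeData(serializeData({ at: new Date(0), n: 1 }));
console.log(roundTripped.at instanceof Date); // true
console.log(roundTripped.at.getTime()); // 0
```

These two functions could be passed as the serializeData and deserializeData options. Every node on the network needs to use a compatible pair, since data serialized by one node is deserialized by another.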

The way Kademlia stores values locally can also be changed. By default when Kademlia receives a store request, the data sent to be stored is stored under the specified key in the database, replacing any data already under that key. If a different functionality is required, such as adding the new value to a list, that can be done with the storeFunction. When Kademlia receives a store request the storeFunction is passed the new data to be stored under the key and the data currently stored in the database under the key, and must return the value that should be stored in the database.

The following example maintains a list under the key, and appends new values to the list:

let node = new Kademlia(5533, {
  storeFunction: function (newData, oldData) {
    if (oldData === undefined) {
      if (Array.isArray(newData)) {
        return newData;
      } else {
        return [newData];
      }
    } else {
      if (Array.isArray(newData)) {
        return newData;
      } else {
        oldData.push(newData);
        return oldData;
      }
    }
  },
})

Lookups are generally fast; however, if the timeout value is large and many offline nodes are encountered during a lookup, the lookup may slow down greatly. To decrease the effect of offline nodes on lookup speed, the value of timeout can be lowered. If timeout is too low, though, queries may expire before online nodes get a chance to respond to them.

let node = new Kademlia(5533, {
  timeout: 1000, // 1 second in milliseconds
});

It may be desirable for data to be flushed out of the system if it is not being republished by a user. To achieve this, you should stop nodes from republishing the values they're storing and disable caching. In this scenario it may also be useful to disable ttl scaling.

let node = new Kademlia(5533, {
  cache: false,
  republish: false,
  scalettl: false,
});

On private networks you may want to encrypt communications between Kademlia nodes. You can achieve this like so:

let node = new Kademlia(5533, {
  encrypted: true,
  encrypt: (data) => encryptionFunction(data),
  decrypt: (data) => decryptionFunction(data),
});
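The encryptionFunction and decryptionFunction above are left to the user. A minimal sketch of what they might look like, using AES-256-GCM from Node's built-in crypto module with a pre-shared key, is shown below. It assumes the data passed to encrypt is a string and that returning a base64 string is acceptable to the library, neither of which is confirmed by the documentation above; the sharedKey variable is hypothetical and would in practice be distributed to all nodes out of band.

```javascript
const crypto = require("crypto");

// Hypothetical pre-shared 32-byte key; every node in the private network
// must hold the same key. Generated randomly here only for demonstration.
const sharedKey = crypto.randomBytes(32);

// Sync function to encrypt outgoing data (assumed to be a string).
// Output layout: 12-byte IV | 16-byte auth tag | ciphertext, base64-encoded.
function encrypt(data) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", sharedKey, iv);
  const ciphertext = Buffer.concat([cipher.update(data, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Sync function to decrypt incoming data. Returns null if the data
// cannot be decrypted or fails authentication.
function decrypt(data) {
  try {
    const raw = Buffer.from(data, "base64");
    const iv = raw.subarray(0, 12);
    const tag = raw.subarray(12, 28);
    const ciphertext = raw.subarray(28);
    const decipher = crypto.createDecipheriv("aes-256-gcm", sharedKey, iv);
    decipher.setAuthTag(tag);
    return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
  } catch (err) {
    return null;
  }
}

console.log(decrypt(encrypt("test-data"))); // test-data
console.log(decrypt("not-valid-ciphertext")); // null
```

An authenticated mode such as GCM is used here so that messages from nodes without the key, or messages modified in transit, fail decryption and are reported as null rather than being parsed as garbage.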

Credit

Implementation Author: Tom.

Original Kademlia Designers: Petar Maymounkov and David Mazières.

License

MIT


Last updated on 15 May 2021
