Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

trie-louds

Package Overview
Dependencies
Maintainers
1
Versions
3
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

trie-louds

Readonly but memory-sufficient data structure for dictionaries

  • 1.0.0
  • Source
  • npm
  • Socket score

Version published
Weekly downloads
0
decreased by-100%
Maintainers
1
Weekly downloads
 
Created
Source

Trie-LOUDS

npm package Node.js CI codecov

Readonly but memory-sufficient data structure for dictionaries by utilizing LOUDS.

Install

$ npm install --save trie-louds

Usage

const {ReadonlyTrieTree} = require("trie-louds");
const fs = require("fs");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

console.log(tree.contains("She")); // true
console.log(tree.contains("she")); // false
console.log(tree.getWords("sea").words); // [ 'seashells', 'seashore' ] (search the words with given prefix)
console.log(tree.getValue("seashells")); // 2 (index of keywords)
console.log(tree.getValue("sell")); // null (not found)

console.log(tree.getWords("").words); // [ 'She', 'by', 'seashells', 'seashore', 'sells', 'the' ] (searched words are sorted)
const limited = tree.getWords("", 3);
console.log(limited.words); // [ 'She', 'by', 'seashells' ] (you can limit the number of searched words (default is 1000))
console.log(limited.hasMore); // true (if there are unsearched words due to limit, hasMore will be true)
// (and you can continue searching by calling getMoreWords with temporaryInfo)
console.log(tree.getMoreWords(limited.temporaryInfo).words); // [ 'seashore', 'sells', 'the' ]

fs.writeFileSync("tree.dat", tree.dump()); // You can dump the tree data.
const loadedTree = ReadonlyTrieTree.load(fs.readFileSync("tree.dat"));
console.log(loadedTree.getWords("sea").words); // [ 'seashells', 'seashore' ]

Command

You can dump the tree data by command.

example

  1. run trie-dump --input examples/keyword.txt --output examples/trie.dat
  2. then you have trie.dat in examples/ folder.
  3. execute:
const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.loadFileSync("examples/trie.dat");
console.log(tree.getWords(""));

enwiki trie tree

You can create the trie tree of wikipedia-en keywords.

> cat enwiki-20210220-pages-articles-multistream-index.txt | sed -e 's/.*://g' > enwiki-keywords.txt
> trie-dump --input ..\loudstest\enwiki-keywords.txt --output enwiki.dat

In this case, we can store 20993072 words in this trie tree and dump it. The size of enwiki-keywords.txt is about 495MiB and the size of enwiki.dat is about 565MiB.

const {ReadonlyTrieTree} = require("trie-louds");
const {readFileSync} = require("fs");
const tree = ReadonlyTrieTree.load(readFileSync("./enwiki.dat"));
console.log(process.memoryUsage());
console.log(tree.getWords("Undertale"));

--- output ---
{ rss: 784404480,
  heapTotal: 10731520,
  heapUsed: 5436512,
  external: 761201137 }

{ words:
   [ 'Undertale',
     'Undertale (game)',
     'Undertale (video game)',
     'Undertale - Hopes and Dreams.ogg',
     'Undertale 2',
     'Undertale Combat Example.png',
     'Undertale Kickstarter Promotional Art.png',
     'Undertale character redirects to lists',
     'Undertale fandom',
     'Undertale soundtrack' ],
  hasMore: false }

And it takes about 750MiB when you load this trie tree on memory.

FAQs

Package last updated on 07 Mar 2021

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc