Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

trie-louds

Package Overview
Dependencies
Maintainers
1
Versions
3
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

trie-louds

Readonly but memory-sufficient data structure for dictionaries

  • 1.1.0
  • latest
  • Source
  • npm
  • Socket score

Version published
Maintainers
1
Created
Source

Trie-LOUDS

npm package Node.js CI codecov

Readonly but memory-sufficient data structure for dictionaries by utilizing LOUDS.

Install

$ npm install --save trie-louds

API

static fromKeywordList(keys, verbose=false)

Build a tree from list of keywords.

Parameters

  • keys: string[] List of keywords.
  • verbose: boolean If true, some debug logs will be printed.

Returns

  • tree: ReadonlyTrieTree

Example

const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

contains(word)

Returns true if it contains the given word. This is case-sensitive.

Parameters

  • word: string

Returns

  • answer: boolean

Example

const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

console.log(tree.contains("She")); // true
console.log(tree.contains("she")); // false

getValue(word)

Returns word's index in the keyword list. If it doesn't contain word, this returns null.

Parameters

  • word: string

Returns

  • index: number|null

Example

const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

console.log(tree.getValue("She")); // 0
console.log(tree.getValue("sells")); // 1
console.log(tree.getValue("seashells")); // 2
console.log(tree.getValue("sell")); // null (not found)

getWords(prefix, limit=1000)

Search the words which have the given prefix. If more than limit words are found, property hasMore become true.

Parameters

  • prefix: string
  • limit: number The maximum number of words in the result.

Returns

  • result: SearchResult
    • words: string[] Found words.
    • values: number[] Indices of found words.
    • hasMore: boolean If true, there are unsearched words.
    • temporaryInfo?: TempInfo This exists iff hasMore is true. See getMoreWords().

Example

const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

console.log(tree.getWords("").words); // [ 'She', 'by', 'seashells', 'seashore', 'sells', 'the' ] (searched words are sorted)

const limited = tree.getWords("", 3);
console.log(limited.words); // [ 'She', 'by', 'seashells' ]
console.log(limited.hasMore); // true

getWords(prefix, setting)

You can set more detailed search settings.

Parameters

  • prefix: string
  • setting: OptSearchSetting
    • limit?: number Default is 1000. Same as limit in getWords(prefix, limit).
    • maxLength?: number If this exists, words longer than this will be excluded.
    • minLength?: number If this exists, words shorter than this will be excluded.

Returns

  • result: SearchResult Same as the output of getWords(prefix, limit).

getMoreWords(temporaryInfo, limit=1000)

You can continue searching by calling this with temporaryInfo returned by getWords function.

Parameters

  • temporaryInfo: TempInfo
  • limit: number The maximum number of words in the result.

Returns

  • result: SearchResult
    • words: string[] Found words.
    • values: number[] Indices of found words.
    • hasMore: boolean If true, there are unsearched words.
    • temporaryInfo?: TempInfo This exists iff hasMore is true.

Example

const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

const limited = tree.getWords("", 3);
console.log(limited.words); // [ 'She', 'by', 'seashells' ]
console.log(limited.hasMore); // true
console.log(tree.getMoreWords(limited.temporaryInfo).words); // [ 'seashore', 'sells', 'the' ]

getMoreWords(temporaryInfo, setting)

You can set more detailed search settings.

Parameters

  • temporaryInfo: TempInfo
  • setting: OptSearchSetting

Returns

  • result: SearchResult

countWords(prefix, setting)

Count the words which meets the given setting and prefix. It will take the same computational cost as getWords function. If you don't need detailed settings, countWordsFaster is suitable.

Parameters

  • prefix: string
  • setting: OptSearchSetting
    • limit?: number This option doesn't work in this function. It will search all.
    • maxLength?: number If this exists, words longer than this will be excluded.
    • minLength?: number If this exists, words shorter than this will be excluded.

Returns

  • count: number

Example

const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.fromKeywordList(["She", "sells", "seashells", "by", "the", "seashore"]);

console.log(tree.countWords("")); // 6
console.log(tree.countWords("s")); // 3

countWordsFaster(prefix)

Counts the words which have the given prefix. This is much faster than countWords().

Parameters

  • prefix: string

Returns

  • count: number

dump()

You can dump the tree to Buffer.

Returns

  • buf: Buffer

Example

const fs = require("fs");
fs.writeFileSync("tree.dat", tree.dump());

static load(buffer)

You can load the tree from dumped data.

Parameters

  • buffer: Buffer

Returns

  • tree: ReadonlyTrieTree

Example

const loadedTree = ReadonlyTrieTree.load(fs.readFileSync("tree.dat"));
console.log(loadedTree.getWords("sea").words); // [ 'seashells', 'seashore' ]

Command

You can dump the tree data by command.

example

  1. run trie-dump --input examples/keyword.txt --output examples/trie.dat
  2. then you have trie.dat in examples/ folder.
  3. execute:
const {ReadonlyTrieTree} = require("trie-louds");
const tree = ReadonlyTrieTree.loadFileSync("examples/trie.dat");
console.log(tree.getWords(""));

enwiki trie tree

You can create the trie tree of wikipedia-en keywords.

> cat enwiki-20210220-pages-articles-multistream-index.txt | sed -e 's/.*://g' > enwiki-keywords.txt
> trie-dump --input ..\loudstest\enwiki-keywords.txt --output enwiki.dat

In this case, we can store 20993072 words in this trie tree and dump it. The size of enwiki-keywords.txt is about 495MiB and the size of enwiki.dat is about 512MiB.

const {ReadonlyTrieTree} = require("trie-louds");
const {readFileSync} = require("fs");
const tree = ReadonlyTrieTree.load(readFileSync("./enwiki.dat"));
console.log(process.memoryUsage());
console.log(tree.getWords("Undertale"));

--- output ---
{ rss: 682528768,
  heapTotal: 10731520,
  heapUsed: 5398648,
  external: 658583438 }

{ words:
   [ 'Undertale',
     'Undertale (game)',
     'Undertale (video game)',
     'Undertale - Hopes and Dreams.ogg',
     'Undertale 2',
     'Undertale Combat Example.png',
     'Undertale Kickstarter Promotional Art.png',
     'Undertale character redirects to lists',
     'Undertale fandom',
     'Undertale soundtrack' ],
  values:
   [ 19574893,
     15834775,
     16217668,
     15772089,
     18495783,
     15198496,
     19989624,
     17975151,
     20721313,
     18488239 ],
  hasMore: false }

And it takes about 651MiB when you load this trie tree on memory.

Keywords

FAQs

Package last updated on 28 Mar 2021

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc