entity-predictor

Version: 1.3.1 (latest, npm)
Maintainers: 1

Entity Predictor

A lightweight, zero-dependency Node.js library for entity name prediction and normalization.

It uses fuzzy matching to identify entities from messy input, supporting:

  • Aliases & Acronyms (e.g., "SBI" -> "STATE BANK OF INDIA")
  • Confidence Scoring ("Trustable", "High Confidence", etc.)
  • Top-N Matches (Get the top 3 best guesses)
  • Configurable Stop Words (Ignore "The", "Inc", etc.)

Features

  • Fuzzy Matching: Matches inputs to entities even with typos or partial names.
  • Alias Support: Handles acronyms (e.g., "SBI" -> "STATE BANK OF INDIA") and alternative names.
  • Confidence Scoring: Returns a numeric confidence score and a human-readable trust level ("Trustable", "High Confidence", "Moderate Confidence").
  • Normalization: Automatically normalizes input to ignore case and special characters.
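To make the ideas of normalization and fuzzy matching concrete, here is a minimal sketch in plain JavaScript. It is not the library's implementation: the function names normalize, levenshtein, and similarity are hypothetical, and the library's actual scoring may use a different algorithm and produce different numbers.

```javascript
// Illustrative sketch only; entity-predictor may normalize and score differently.

// Lowercase, drop special characters, and collapse whitespace.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Levenshtein edit distance computed with a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i - 1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Map edit distance to a score between 0 (nothing shared) and 1 (identical).
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}
```

With a scheme like this, "S.B.I." and "sbi" normalize to the same string, which is why an alias can match exactly despite punctuation and case differences.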

Installation

npm install entity-predictor

Usage

1. Import and Initialize

You can initialize the predictor with a list of entities. Entities can be simple strings or objects defining aliases.

import { EntityPredictor } from "entity-predictor";

const entities = [
  // Simple string entity
  "ICICI BANK",
  "AXIS BANK",

  // Entity with aliases
  {
    name: "STATE BANK OF INDIA",
    aliases: ["SBI", "State Bank", "S.B.I."],
  },
  {
    name: "HDFC BANK",
    aliases: ["HDFC", "Housing Development Finance Corporation"],
  },
];

const predictor = new EntityPredictor(entities);

2. Predict Entities

Use the predict() method to find the best match for an input string.

const result = predictor.predict("sbi");

console.log(result);
/*
Output:
{
  entity: "STATE BANK OF INDIA",
  confidence: 1,
  confidenceLevel: "Trustable",
  input: "sbi"
}
*/

Handling Typos

const result = predictor.predict("icici bk");

console.log(result);
/*
Output:
{
  entity: "ICICI BANK",
  confidence: 0.71,
  confidenceLevel: "Moderate Confidence"
}
*/
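Because typo matches come back with lower confidence, callers often decide what to do based on the score. The sketch below shows one possible policy. The cutoffs AUTO_ACCEPT and NEEDS_REVIEW and the helper routeByConfidence are assumptions made for illustration; they are not part of entity-predictor.

```javascript
// Application-level cutoffs: assumptions for this sketch, not library values.
const AUTO_ACCEPT = 0.9;
const NEEDS_REVIEW = 0.6;

// Decide how to treat a match object based on its `confidence` field.
function routeByConfidence(result) {
  if (result.confidence >= AUTO_ACCEPT) return "accept";
  if (result.confidence >= NEEDS_REVIEW) return "review";
  return "reject";
}
```

Under these assumed cutoffs, the "sbi" result (confidence 1) would be accepted automatically, while the "icici bk" result (confidence 0.71) would be sent for review.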

3. Top-N Matches

Get a list of best matches instead of just one.

const results = predictor.predictTop("Apple", 3);
// Returns array of matches: [{ entity: "Apple Inc", ... }, ...]
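Conceptually, a top-N query is a filter, a sort, and a slice over scored candidates. The following sketch shows that idea on plain objects. The helper topMatches is hypothetical, and the 0.6 default threshold is borrowed from predict() as an assumption; the library may order or filter results differently.

```javascript
// Hypothetical helper: keep candidates at or above `threshold`,
// order them best-first, and return at most `limit` of them.
function topMatches(scored, limit = 5, threshold = 0.6) {
  return scored
    .filter((candidate) => candidate.confidence >= threshold) // filter() copies, so the input is not mutated
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, limit);
}
```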

4. Handling Ambiguity (isAmbiguous)

Sometimes, an input matches multiple entities with the exact same confidence score. For example, "UCO" could match "UCO Bank", "Union Commercial Bank", etc.

The result object includes an isAmbiguous flag to warn you.

const result = predictor.predict("uco");

if (result.isAmbiguous) {
  console.warn("Ambiguous input! Found multiple candidates.");
  // Use predictTop to show options to the user
  const options = predictor.predictTop("uco", 5);
  console.log(options);
} else {
  console.log("Found:", result.entity);
}
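One way to think about the flag is as a tie check between the two best candidates. The sketch below illustrates that reading on a list of scored objects. The helper hasTie and its epsilon parameter are hypothetical, and the library's own rule for setting isAmbiguous may differ.

```javascript
// Hypothetical tie check: true when the two best scores are effectively equal.
function hasTie(scored, epsilon = 1e-9) {
  if (scored.length < 2) return false;
  const sorted = [...scored].sort((a, b) => b.confidence - a.confidence);
  return Math.abs(sorted[0].confidence - sorted[1].confidence) <= epsilon;
}
```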

5. Stop Words Filtering

Automatically remove noise words like "The", "Inc", "Ltd". Disabled by default.

// Enable with the default list
const predictorWithDefaults = new EntityPredictor(entities, { ignoreStopWords: true });

// Enable with a custom list
const predictorWithCustomList = new EntityPredictor(entities, {
  ignoreStopWords: true,
  stopWords: ["inc", "co", "corp"],
});
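To show what stop word filtering does to an input before matching, here is a small standalone sketch. The helper removeStopWords is hypothetical, and the fallback for inputs made up entirely of stop words is a design assumption of this sketch rather than documented library behavior.

```javascript
// Hypothetical helper: drop tokens that appear in the stop word list (case-insensitive).
function removeStopWords(text, stopWords) {
  const blocked = new Set(stopWords.map((word) => word.toLowerCase()));
  const kept = text
    .split(/\s+/)
    .filter((token) => token && !blocked.has(token.toLowerCase()));
  // If every token was a stop word, keep the original text so the input is not erased.
  return kept.length > 0 ? kept.join(" ") : text.trim();
}
```

For example, with the list ["the", "inc"], the input "The Apple Inc" reduces to "Apple", so it is compared against entity names without the noise words.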

6. Custom Normalization

Pass a custom normalizer to clean data your way.

const predictor = new EntityPredictor(entities, {
  normalizer: (text) => text.toUpperCase(),
});
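A custom normalizer is just a function from string to string, so it can be written and tested on its own. The sketch below is one possible normalizer that also strips accents; the name asciiNormalizer is hypothetical, and it relies only on the standard String.prototype.normalize method.

```javascript
// Hypothetical normalizer: strip accents, lowercase, drop punctuation, collapse spaces.
function asciiNormalizer(text) {
  return text
    .normalize("NFD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // remove the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}
```

A function like this could then be supplied through the normalizer option, for example { normalizer: asciiNormalizer }.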

7. Redis Datasets Support

Load entities directly from Redis. The library does not bundle a Redis client, so you supply your own.

import Redis from "ioredis"; // or any redis client
import { EntityPredictor } from "entity-predictor";

const redis = new Redis();
const predictor = new EntityPredictor([]); // Start empty, or pass some local entities

// Load from a Redis String (JSON)
// Key content: '["Apple", {"name": "Google", "aliases": ["Alphabet"]}]'
await predictor.loadFromRedis(redis, { key: "my_entities", type: "json" });

// Load from a Redis Set
// Key content: SMEMBERS -> ["Tesla", "SpaceX"]
await predictor.loadFromRedis(redis, { key: "my_set_key", type: "set" });

// Load from a Redis Hash
// Key content: HGETALL -> { "Amazon": '["AWS"]', "Netflix": "FLIX" }
await predictor.loadFromRedis(redis, { key: "my_hash_key", type: "hash" });
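The hash example above suggests that each field is an entity name and each value is either a JSON array of aliases or a single alias string. The sketch below shows how such a reply could be turned into the entity objects the constructor accepts. The helper hashToEntities is hypothetical, and this interpretation of the hash format is an assumption based only on that example, not on the library's source.

```javascript
// Hypothetical conversion of an HGETALL-style object into entity objects.
// Assumes values are either a JSON array of aliases or one plain alias string.
function hashToEntities(hash) {
  return Object.entries(hash).map(([name, raw]) => {
    let aliases;
    try {
      const parsed = JSON.parse(raw);
      aliases = Array.isArray(parsed) ? parsed.map(String) : [String(raw)];
    } catch {
      aliases = [raw]; // not valid JSON, so treat the value as a single alias
    }
    return { name, aliases };
  });
}
```

This can be handy for checking what a hash will contribute before loading it, or for building the entities array yourself when you already have the data in memory.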

8. Add Entities Dynamically

You can add new entities to an existing predictor instance.

predictor.addEntity("PUNJAB NATIONAL BANK", ["PNB"]);

API Reference

new EntityPredictor(entities, options)

  • entities: Array of strings or objects { name: string, aliases: string[] }.
  • options: (Optional)
    • ignoreStopWords: boolean (default false)
    • stopWords: string[] (optional, defaults to internal list)
    • normalizer: (text: string) => string
  • Throws: TypeError if entities is not an array.

predict(input, threshold)

  • input: String to search for.
  • threshold: (Optional) Minimum confidence score (default 0.6).
  • Returns: Best match object { entity: string, confidence: number, input: string, isAmbiguous: boolean, ... }, { entity: "UNKNOWN", ... } if no match found, or null if input is invalid.
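Since predict() can return three different shapes (a match, an "UNKNOWN" result, or null), calling code usually needs to branch on all of them. The sketch below shows one way to do that using only the fields documented above. The helper describeResult and its return strings are hypothetical.

```javascript
// Hypothetical helper: classify the three documented return shapes of predict().
function describeResult(result) {
  if (result === null) return "invalid input";
  if (result.entity === "UNKNOWN") return "no match";
  return result.isAmbiguous
    ? `ambiguous: ${result.entity}`
    : `match: ${result.entity}`;
}
```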

predictTop(input, limit, threshold)

  • input: String to search for.
  • limit: (Optional) Maximum number of results (default 5).
  • threshold: (Optional) Minimum confidence score.
  • Returns: Array of match objects.

TypeScript Support

Includes index.d.ts for full TypeScript support.

Keywords

nlp

Package last updated on 05 Jan 2026
