text-keyword-extractor

Version: 1.0.1 (latest)
Source: npm
Weekly downloads: 1
Maintainers: 0

Extract keywords from text content with various processing options.

# Text Keyword Extractor

A Node.js package for extracting keywords from text content. This package identifies proper nouns, high-frequency words, and contextual keywords from both content and titles while filtering out common stop words.

## Features
- Proper noun extraction (including compound names and terms with numbers)
- High-frequency keyword identification
- Context extraction from titles
- Stop words filtering
- Support for multi-word phrases
- Customizable frequency threshold

## Installation

```bash
npm install text-keyword-extractor
```

## Usage

### Basic Usage

```javascript
const { KeywordExtractor } = require('text-keyword-extractor');

// Initialize with content and optional title
const content = `Google and Microsoft announced new AI features.
                 OpenAI's ChatGPT continues to evolve.
                 Apple and Amazon are also investing in AI technology.`;
const title = "Tech Giants Announce AI Features";

const extractor = new KeywordExtractor(content, title);
const keywords = extractor.extractKeywords();
console.log(keywords);
// Output: ["OpenAI ChatGPT", "Google", "Microsoft", "Apple", "Amazon", "AI", "Tech Giants"]
```

### Individual Methods

#### 1. Extract Proper Nouns

```javascript
const { KeywordExtractor } = require('text-keyword-extractor');

const content = "Microsoft and Google are working with OpenAI.";
const extractor = new KeywordExtractor(content);
const properNouns = extractor.findProperNouns();
console.log(properNouns);
// Output: ["Microsoft", "Google", "OpenAI"]
```

#### 2. Find High-Frequency Keywords

```javascript
// `content` is the text to analyze, e.g. the sample from Basic Usage
const extractor = new KeywordExtractor(content);
const frequentWords = extractor.findHighFrequencyKeywords(5); // Get top 5 keywords
console.log(frequentWords);
// Output: [
//   { word: "AI", frequency: 3 },
//   { word: "technology", frequency: 2 }
// ]
```

#### 3. Extract Keywords from Title

```javascript
// The second constructor argument is the title
const extractor = new KeywordExtractor(content, "Breaking: ChatGPT Launches New Features");
const titleContext = extractor.findContextFromTitle();
console.log(titleContext);
// Output: ["ChatGPT", "Features"]
```

### Utility Functions

You can also use the individual utility functions without creating an instance:

```javascript
const { utilities } = require('text-keyword-extractor');

// Remove stop words from an array
const cleaned = utilities.removeStopWords(["The", "quick", "brown", "fox"]);
console.log(cleaned); // ["quick", "brown", "fox"]

// Find proper nouns in text
const properNouns = utilities.findProperNouns("Google and Microsoft announced new features");
console.log(properNouns); // ["Google", "Microsoft"]

// Get frequent keywords from any text string
const content = "Google and Microsoft announced new AI features.";
const frequent = utilities.findHighFrequencyKeywords(content, 5);
console.log(frequent); // Returns up to 5 frequent words with their counts
```

## API Reference

### Class: KeywordExtractor

#### Constructor

```javascript
const extractor = new KeywordExtractor(content, title);
```

- `content` (string): The text content to analyze
- `title` (string, optional): Additional title for context

#### Methods

##### extractKeywords()

Returns an array of extracted keywords after processing all available methods.
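
The README does not say how `extractKeywords()` combines the results of the individual methods. One plausible approach, shown here purely as an illustration, is to concatenate the lists and drop case-insensitive duplicates while keeping first-seen order. `mergeKeywords` is a hypothetical helper and not part of the package's API.

```javascript
// Hypothetical helper (not from the package): merges several keyword
// lists into one array, keeping first-seen order and dropping
// case-insensitive duplicates.
function mergeKeywords(...lists) {
  const seen = new Set();
  const merged = [];
  for (const list of lists) {
    for (const keyword of list) {
      const key = keyword.toLowerCase();
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(keyword);
      }
    }
  }
  return merged;
}

// Sample inputs standing in for the results of the individual methods.
const properNouns = ["Google", "Microsoft"];
const frequentWords = ["AI", "google"];
const titleContext = ["Tech Giants", "AI"];

console.log(mergeKeywords(properNouns, frequentWords, titleContext));
// → ["Google", "Microsoft", "AI", "Tech Giants"]
```

Putting proper nouns first in the argument list means their original casing wins when the same word also shows up in another list.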

##### findProperNouns()

Extracts proper nouns from the content. Identifies:

- Single capitalized words (e.g., Google)
- Compound names (e.g., MacBook)
- Terms with numbers (e.g., iPhone14)
- Multi-word proper nouns (e.g., Saudi Arabia)
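
The four categories above can be approximated with a single regular expression. The sketch below is an assumption about how such matching might work, not the package's actual implementation; `findProperNounCandidates` and the pattern are hypothetical. It also treats any capitalized sentence-initial word as a candidate, which a real extractor would need to filter (for example with a stop-word list).

```javascript
// One "name-like" token: either starts with an uppercase letter
// (Google, MacBook, OpenAI) or has a lowercase prefix followed by an
// uppercase letter (iPhone14). Digits are allowed after the first letter.
const TOKEN = "(?:[A-Z][A-Za-z0-9]*|[a-z]+[A-Z][A-Za-z0-9]*)";

// One or more name-like tokens separated by whitespace, so that
// "Saudi Arabia" is captured as a single multi-word candidate.
const PROPER_NOUN = new RegExp(`\\b${TOKEN}(?:\\s+${TOKEN})*\\b`, "g");

// Hypothetical helper: returns unique candidates in order of appearance.
function findProperNounCandidates(text) {
  return [...new Set(text.match(PROPER_NOUN) ?? [])];
}

console.log(
  findProperNounCandidates("Apple sells the MacBook and iPhone14 in Saudi Arabia.")
);
// → ["Apple", "MacBook", "iPhone14", "Saudi Arabia"]
```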

##### findHighFrequencyKeywords(N)

Returns the top N most frequent keywords with their frequency counts.

- `N` (number, default: 7): Number of keywords to return
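
As a rough illustration of top-N frequency counting, the sketch below tokenizes text, skips stop words, counts occurrences, and returns `{ word, frequency }` objects in the shape the README shows. `topFrequentWords`, the tiny stop-word list, and the choice to lowercase every token are assumptions made for this example; the package's own tokenization and casing may differ.

```javascript
// Small illustrative stop-word list (an assumption; the package's list is not documented here).
const SAMPLE_STOP_WORDS = new Set(["a", "an", "and", "the", "of", "in", "to", "is", "are"]);

// Hypothetical helper: returns the n most frequent non-stop words.
function topFrequentWords(text, n = 7, stopWords = SAMPLE_STOP_WORDS) {
  const counts = new Map();
  // Lowercase first so "AI" and "ai" are counted together.
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    if (stopWords.has(word)) continue;
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1]) // highest count first; ties keep first-seen order
    .slice(0, n)
    .map(([word, frequency]) => ({ word, frequency }));
}

console.log(topFrequentWords("AI tools and AI models use data; data drives AI.", 2));
// → [{ word: "ai", frequency: 3 }, { word: "data", frequency: 2 }]
```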

##### findContextFromTitle()

Extracts relevant keywords from the title after removing stop words.

##### removeStopWords(tokens)

Removes common stop words from an array of tokens.

- `tokens` (string[]): Array of words to process
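
A stop-word filter of this kind can be sketched as a case-insensitive set lookup. The function name `filterStopWords` and the short word list below are hypothetical and stand in for whatever list the package actually ships.

```javascript
// Illustrative stop-word list (an assumption, far shorter than a real one).
const STOP_WORDS = new Set(["a", "an", "and", "the", "of", "in", "on", "to", "is", "are"]);

// Hypothetical helper: keeps only tokens whose lowercase form is not a stop word.
// The original casing of the remaining tokens is preserved.
function filterStopWords(tokens, stopWords = STOP_WORDS) {
  return tokens.filter((token) => !stopWords.has(token.toLowerCase()));
}

console.log(filterStopWords(["The", "quick", "brown", "fox"]));
// → ["quick", "brown", "fox"]
```

Comparing on the lowercase form is what lets a capitalized "The" be removed without altering the casing of the words that remain.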

## License

MIT

Keywords

keywords

Package last updated on 10 Nov 2024
