Extract keywords from text content with various processing options.
A Node.js package for extracting keywords from text content. This package identifies proper nouns, high-frequency words, and contextual keywords from both content and titles while filtering out common stop words.
Features
- Proper noun extraction (including compound names and terms with numbers)
- High-frequency keyword identification
- Context extraction from titles
- Stop word filtering
- Support for multi-word phrases
- Customizable frequency threshold
Installation
```bash
npm install text-keyword-extractor
```
Usage
Basic Usage
```javascript
const { KeywordExtractor } = require('text-keyword-extractor');

const content = `Google and Microsoft announced new AI features.
OpenAI's ChatGPT continues to evolve.
Apple and Amazon are also investing in AI technology.`;
const title = "Tech Giants Announce AI Features";

// Run every extraction method on the content and title
const extractor = new KeywordExtractor(content, title);
const keywords = extractor.extractKeywords();
console.log(keywords);
```
Individual Methods
1. Find Proper Nouns
```javascript
const { KeywordExtractor } = require('text-keyword-extractor');

const content = "Microsoft and Google are working with OpenAI.";
const extractor = new KeywordExtractor(content);
const properNouns = extractor.findProperNouns();
console.log(properNouns);
```
2. Find High-Frequency Keywords
```javascript
// Reuses KeywordExtractor and content from the first example
const extractor = new KeywordExtractor(content);
const frequentWords = extractor.findHighFrequencyKeywords(5); // top 5 keywords
console.log(frequentWords);
```
3. Find Context from Title
```javascript
// Reuses KeywordExtractor and content from the first example
const extractor = new KeywordExtractor(content, "Breaking: ChatGPT Launches New Features");
const titleContext = extractor.findContextFromTitle();
console.log(titleContext);
```
Utility Functions
You can also use individual utility functions without creating an instance:
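```javascript
const { utilities } = require('text-keyword-extractor');

// Remove stop words from an array of tokens
const cleaned = utilities.removeStopWords(["The", "quick", "brown", "fox"]);
console.log(cleaned);

// Find proper nouns in a string
const properNouns = utilities.findProperNouns("Google and Microsoft announced new features");
console.log(properNouns);

// Find the 5 most frequent keywords in a string
const content = "Microsoft and Google are working with OpenAI.";
const frequent = utilities.findHighFrequencyKeywords(content, 5);
console.log(frequent);
```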
API Reference
Constructor
```javascript
const extractor = new KeywordExtractor(content, title);
```
- content (string): The text content to analyze
- title (string, optional): Additional title for context
Methods
extractKeywords()
Runs all available extraction methods and returns the resulting keywords as an array.
findProperNouns()
Extracts proper nouns from the content. Identifies:
- Single capitalized words (e.g., Google)
- Compound names (e.g., MacBook)
- Terms with numbers (e.g., iPhone14)
- Multi-word proper nouns (e.g., Saudi Arabia)
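The four cases above can be illustrated with a single regular expression. This is a minimal sketch only, not the package's actual implementation: the helper name `sketchFindProperNouns` and the pattern are assumptions made for illustration, and the real matching rules may differ.

```javascript
// Hypothetical helper for illustration; not part of text-keyword-extractor.
function sketchFindProperNouns(text) {
  // One "word": optional lowercase prefix (iPhone14), a capital letter,
  // then any mix of letters and digits (Google, MacBook).
  const word = "[a-z]*[A-Z][A-Za-z0-9]*";
  // One or more such words separated by single spaces (Saudi Arabia).
  const pattern = new RegExp(`\\b${word}(?: ${word})*\\b`, "g");
  const matches = text.match(pattern) || [];
  // Deduplicate while keeping first-seen order.
  return [...new Set(matches)];
}

console.log(
  sketchFindProperNouns(
    "Google opened an office in Saudi Arabia and showed the iPhone14 next to a MacBook."
  )
);
// [ 'Google', 'Saudi Arabia', 'iPhone14', 'MacBook' ]
```

A pattern this simple also matches any capitalized word at the start of a sentence (for example "The"), so a real extractor would likely combine it with stop word filtering.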
findHighFrequencyKeywords(N)
Returns the top N most frequent keywords with their frequency counts.
- N (number, default: 7): Number of keywords to return
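To make the idea of "top N by frequency" concrete, here is a rough sketch of one way such a count could work. The function name `sketchTopKeywords`, the small stop word list, the tie-breaking rule, and the `{ word, count }` return shape are all assumptions; the package's real return format is not specified here and may differ.

```javascript
// Hypothetical sketch; not the package's implementation or return format.
const FREQ_STOP_WORDS = new Set(["a", "an", "and", "the", "of", "to", "in"]);

function sketchTopKeywords(text, n = 7) {
  const counts = new Map();
  // Lowercase the text and split it into alphanumeric tokens.
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (const token of tokens) {
    if (FREQ_STOP_WORDS.has(token)) continue; // skip stop words
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return [...counts.entries()]
    // Highest count first; ties broken alphabetically for stable output.
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n)
    .map(([word, count]) => ({ word, count }));
}

console.log(sketchTopKeywords("AI tools and AI models: the AI race and the tools race", 2));
// [ { word: 'ai', count: 3 }, { word: 'race', count: 2 } ]
```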
findContextFromTitle()
Extracts relevant keywords from the title after removing stop words.
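As an illustration of what title processing might involve, the sketch below splits a title on whitespace, trims surrounding punctuation, and drops stop words. The helper `sketchContextFromTitle` and its stop word list are hypothetical; the package may tokenize and filter differently.

```javascript
// Hypothetical sketch; the stop word list here is a small sample, not the package's list.
const TITLE_STOP_WORDS = new Set(["a", "an", "and", "the", "of", "to", "in", "for", "on"]);

function sketchContextFromTitle(title) {
  return title
    .split(/\s+/)
    // Trim leading and trailing punctuation such as ":" or quotes.
    .map((word) => word.replace(/^[^A-Za-z0-9]+|[^A-Za-z0-9]+$/g, ""))
    // Drop empty strings and stop words (compared case-insensitively).
    .filter((word) => word && !TITLE_STOP_WORDS.has(word.toLowerCase()));
}

console.log(sketchContextFromTitle("Breaking: The Rise of AI in the Enterprise"));
// [ 'Breaking', 'Rise', 'AI', 'Enterprise' ]
```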
removeStopWords(tokens)
Removes common stop words from an array of tokens.
- tokens (string[]): Array of words to process
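Conceptually, stop word removal is a set lookup per token. The following is a minimal sketch under the assumption that matching is case-insensitive; `sketchRemoveStopWords` and its word list are illustrative stand-ins, not the package's own function or list.

```javascript
// Hypothetical sketch; the real stop word list is presumably much longer.
const SKETCH_STOP_WORDS = new Set(["a", "an", "and", "the", "is", "are", "of", "to"]);

function sketchRemoveStopWords(tokens) {
  // Keep only tokens whose lowercase form is not in the stop word set.
  return tokens.filter((token) => !SKETCH_STOP_WORDS.has(token.toLowerCase()));
}

console.log(sketchRemoveStopWords(["The", "quick", "brown", "fox"]));
// [ 'quick', 'brown', 'fox' ]
```

Using a Set keeps each lookup constant-time, which matters when the stop word list grows to a few hundred entries.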
License
MIT