What is natural?
The 'natural' npm package is a general natural language processing (NLP) library for Node.js. It provides a variety of tools and algorithms for text processing, including tokenization, stemming, classification, and more.
What are natural's main functionalities?
Tokenization
Tokenization is the process of breaking down text into individual words or tokens. The 'natural' package provides several tokenizers, including word and sentence tokenizers.
const natural = require('natural');
const tokenizer = new natural.WordTokenizer();
console.log(tokenizer.tokenize('This is a sample sentence.'));
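To make the idea concrete, here is a minimal sketch of word and sentence splitting with regular expressions. The helpers `tokenizeWords` and `tokenizeSentences` are hypothetical names used only for illustration; they are not part of the 'natural' API, and the package's own tokenizers may treat punctuation and edge cases differently.

```javascript
// Hypothetical helpers for illustration; not the 'natural' implementation.

function tokenizeWords(text) {
  // Split on runs of characters that are not letters, digits, or underscores,
  // then drop the empty strings left at the edges.
  return text.split(/[^A-Za-z0-9_]+/).filter((token) => token.length > 0);
}

function tokenizeSentences(text) {
  // Treat '.', '!' or '?' followed by whitespace (or end of input) as a boundary.
  // Abbreviations such as "Dr." would be split incorrectly by this simple rule.
  const matches = text.match(/[^.!?]+[.!?]+(?=\s|$)/g) || [];
  return matches.map((sentence) => sentence.trim());
}

console.log(tokenizeWords('This is a sample sentence.'));
console.log(tokenizeSentences('It works. Does it scale? Yes!'));
```

The word splitter discards punctuation entirely, which is usually what a bag-of-words pipeline wants; a sentence splitter has to keep it, because the punctuation is the boundary signal.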
Stemming
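Stemming reduces words to a common stem by stripping suffixes, so that 'running' and 'runs' both map to 'run'; the stem is not guaranteed to be a dictionary word. The 'natural' package includes several stemmers, such as the Porter Stemmer, which is used in this example.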
const natural = require('natural');
const stemmer = natural.PorterStemmer;
console.log(stemmer.stem('running'));
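Stemmers of this kind work as a sequence of suffix-rewriting rules. As a rough illustration, the sketch below implements only step 1a of the Porter algorithm (plural endings). The function name `porterStep1a` is hypothetical, and this is not the code used inside 'natural'; the full algorithm has several further steps, which is why a word like 'running' is not handled here.

```javascript
// Hypothetical helper: step 1a of the Porter algorithm only (plural endings).
// Rules are tried in order and the first matching suffix wins.
function porterStep1a(word) {
  if (word.endsWith('sses')) return word.slice(0, -2); // sses -> ss
  if (word.endsWith('ies')) return word.slice(0, -2);  // ies  -> i
  if (word.endsWith('ss')) return word;                // ss   -> ss (unchanged)
  if (word.endsWith('s')) return word.slice(0, -1);    // s    -> (removed)
  return word;
}

for (const word of ['caresses', 'ponies', 'caress', 'cats']) {
  console.log(word, '->', porterStep1a(word));
}
```

Note that 'ponies' becomes 'poni' rather than 'pony': stems are matching keys, not dictionary forms.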
Classification
Classification involves categorizing text into predefined classes. The 'natural' package supports several classifiers, including the Naive Bayes classifier demonstrated here.
const natural = require('natural');
const classifier = new natural.BayesClassifier();
classifier.addDocument('I love programming.', 'positive');
classifier.addDocument('I hate bugs.', 'negative');
classifier.train();
console.log(classifier.classify('I love bugs.'));
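For readers who want to see what a Naive Bayes text classifier does internally, the following is a minimal sketch using word counts and add-one smoothing. `TinyNaiveBayes` is a hypothetical class written for this illustration; it is not the 'natural' BayesClassifier, which has its own preprocessing and may therefore produce different results.

```javascript
// Hypothetical class for illustration; not the 'natural' implementation.
class TinyNaiveBayes {
  constructor() {
    this.docCounts = new Map();  // label -> number of training documents
    this.wordCounts = new Map(); // label -> Map(word -> occurrences)
    this.vocabulary = new Set(); // every word seen during training
    this.totalDocs = 0;
  }

  static tokenize(text) {
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  }

  addDocument(text, label) {
    if (!this.wordCounts.has(label)) {
      this.wordCounts.set(label, new Map());
      this.docCounts.set(label, 0);
    }
    this.docCounts.set(label, this.docCounts.get(label) + 1);
    this.totalDocs += 1;
    const counts = this.wordCounts.get(label);
    for (const word of TinyNaiveBayes.tokenize(text)) {
      counts.set(word, (counts.get(word) || 0) + 1);
      this.vocabulary.add(word);
    }
  }

  // Log of P(label) * product of P(word | label), with add-one smoothing
  // so that unseen words do not drive the probability to zero.
  score(text, label) {
    const counts = this.wordCounts.get(label);
    let totalWords = 0;
    for (const n of counts.values()) totalWords += n;
    let logProb = Math.log(this.docCounts.get(label) / this.totalDocs);
    for (const word of TinyNaiveBayes.tokenize(text)) {
      const count = counts.get(word) || 0;
      logProb += Math.log((count + 1) / (totalWords + this.vocabulary.size));
    }
    return logProb;
  }

  classify(text) {
    let best = null;
    let bestScore = -Infinity;
    for (const label of this.docCounts.keys()) {
      const s = this.score(text, label);
      if (s > bestScore) {
        best = label;
        bestScore = s;
      }
    }
    return best;
  }
}

const nb = new TinyNaiveBayes();
nb.addDocument('I love programming.', 'positive');
nb.addDocument('I hate bugs.', 'negative');
console.log(nb.classify('I love clean code'));
console.log(nb.classify('bugs everywhere'));
```

With only two training sentences, an input such as 'I love bugs.' contains one strong word from each class, so in this sketch the two labels score essentially the same and the outcome depends on tie-breaking. Real use needs far more training data per class.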
Phonetic Matching
Phonetic matching is used to find words that sound similar. The 'natural' package includes algorithms like SoundEx for this purpose.
const natural = require('natural');
const soundEx = natural.SoundEx;
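// Note: SoundEx codes keep the first letter, so 'phonetic' (P...) and 'fonetic' (F...) are not expected to match.
console.log(soundEx.compare('phonetic', 'fonetic'));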
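The sketch below shows how a classic four-character Soundex code is built, which helps explain which word pairs it can and cannot match. The function `soundex` and the table `SOUNDEX_CODES` are hypothetical names for this illustration and are not taken from 'natural'; the package's implementation may differ in details such as its handling of 'h' and 'w'.

```javascript
// Hypothetical helper implementing the classic four-character Soundex code.
const SOUNDEX_CODES = {
  b: '1', f: '1', p: '1', v: '1',
  c: '2', g: '2', j: '2', k: '2', q: '2', s: '2', x: '2', z: '2',
  d: '3', t: '3',
  l: '4',
  m: '5', n: '5',
  r: '6',
};

function soundex(word) {
  const letters = word.toLowerCase().replace(/[^a-z]/g, '');
  if (letters.length === 0) return '';
  // The first letter is kept as-is; only the following consonants are encoded.
  let code = letters[0].toUpperCase();
  let previous = SOUNDEX_CODES[letters[0]] || '';
  for (const letter of letters.slice(1)) {
    const digit = SOUNDEX_CODES[letter] || '';
    // Skip vowels and collapse adjacent letters that share a code.
    if (digit && digit !== previous) code += digit;
    // 'h' and 'w' do not break a run of identical codes; vowels do.
    if (letter !== 'h' && letter !== 'w') previous = digit;
  }
  // Pad with zeros and truncate to one letter plus three digits.
  return (code + '000').slice(0, 4);
}

console.log(soundex('Robert'), soundex('Rupert'));
console.log(soundex('phonetic'), soundex('fonetic'));
```

Because the first letter is preserved, 'Robert' and 'Rupert' share a code while 'phonetic' and 'fonetic' do not, even though the latter pair sounds alike. Algorithms such as Metaphone, which encode the leading sound rather than the leading letter, are better suited to that kind of pair.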
String Distance
String distance quantifies how similar two strings are. The 'natural' package provides several algorithms, including Jaro-Winkler, which despite being called a distance returns a similarity score between 0 (no similarity) and 1 (exact match).
const natural = require('natural');
const distance = natural.JaroWinklerDistance;
console.log(distance('dixon', 'dicksonx'));
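As a sketch of what such a score is made of, here is a textbook Jaro-Winkler computation. The functions `jaro` and `jaroWinkler` and the default `prefixScale` of 0.1 are assumptions of this illustration, not the 'natural' implementation, so the values printed here may not equal what the package returns for the same inputs.

```javascript
// Hypothetical helpers: a textbook Jaro-Winkler similarity, for illustration only.

function jaro(a, b) {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  // Characters match only if they are equal and at most `range` positions apart.
  const range = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array(a.length).fill(false);
  const bMatched = new Array(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const start = Math.max(0, i - range);
    const end = Math.min(b.length - 1, i + range);
    for (let j = start; j <= end; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  // Count matched characters that appear in a different order in the two strings.
  let outOfOrder = 0;
  let j = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[j]) j++;
    if (a[i] !== b[j]) outOfOrder++;
    j++;
  }
  const transpositions = outOfOrder / 2;
  return (matches / a.length + matches / b.length +
          (matches - transpositions) / matches) / 3;
}

function jaroWinkler(a, b, prefixScale = 0.1) {
  const base = jaro(a, b);
  // Reward a shared prefix of up to four characters.
  let prefix = 0;
  while (prefix < 4 && prefix < a.length && prefix < b.length &&
         a[prefix] === b[prefix]) {
    prefix++;
  }
  return base + prefix * prefixScale * (1 - base);
}

console.log(jaroWinkler('dixon', 'dicksonx').toFixed(4));
console.log(jaroWinkler('martha', 'marhta').toFixed(4));
```

The prefix bonus is what distinguishes Jaro-Winkler from plain Jaro: strings that agree at the start, as names with a misspelled ending often do, score higher than strings with the same number of matching characters elsewhere.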
Other packages similar to natural
compromise
Compromise is a lightweight NLP library for Node.js. It offers similar functionalities to 'natural' such as tokenization, tagging, and parsing, but is designed to be more user-friendly and faster for common NLP tasks.
wink-nlp
Wink-NLP is a fast and accurate NLP library for Node.js. It focuses on performance and accuracy, offering features like tokenization, POS tagging, and named entity recognition. It is designed to be more efficient than 'natural' for large-scale applications.
natural
"Natural" is a general natural language facility for Node.js. It offers a broad range of functionality for natural language processing. Documentation is available on GitHub Pages.
Open source licenses
Natural: MIT License
Copyright (c) 2011, 2012 Chris Umbel, Rob Ellis, Russell Mull, Hugo W.L. ter Doest
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
WordNet License
This license is available as the file LICENSE in any downloaded version of WordNet.
WordNet 3.0 license:
WordNet Release 3.0 This software and database is being provided to you, the
LICENSEE, by Princeton University under the following license. By obtaining,
using and/or copying this software and database, you agree that you have read,
understood, and will comply with these terms and conditions.: Permission to use,
copy, modify and distribute this software and database and its documentation for
any purpose and without fee or royalty is hereby granted, provided that you
agree to comply with the following copyright notice and statements, including
the disclaimer, and that the same appear on ALL copies of the software, database
and documentation, including modifications that you make for internal use or for
distribution. WordNet 3.0 Copyright 2006 by Princeton University. All rights
reserved. THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON
UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF
EXAMPLE, BUT NOT LIMITATION, PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS OR
WARRANTIES OF MERCHANT- ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT
THE USE OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT INFRINGE
ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. The name of
Princeton University or Princeton may not be used in advertising or publicity
pertaining to distribution of the software and/or database. Title to copyright
in this software, database and any associated documentation shall at all times
remain with Princeton University and LICENSEE agrees to preserve same.
Porter stemmer German: BSD License
The Porter stemmer for German is licensed under a BSD license. The source code states "Standard BSD License", which is interpreted here as the original four-clause BSD license.