Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

github.com/vench/nlp

Package Overview
Dependencies
Alerts
File Explorer
Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

github.com/vench/nlp

  • v0.0.2
  • Source
  • Go
  • Socket score

Version published
Created
Source

Natural Language Processing

License: MIT GoDoc Build Status Go Report Card codecov Sourcegraph

nlp

An implementation of selected machine learning algorithms for basic natural language processing in golang. The initial focus for this project is Latent Semantic Analysis to allow retrieval/searching, clustering and classification of text documents based upon semantic content.

Built upon the Gonum library for linear algebra and scientific computing with some inspiration taken from Python's scikit-learn.

Check out the companion blog post or the go documentation page for full usage and examples.


Features

Planned

  • Ability to persist trained vectorisers
  • LDA (Latent Dirichlet Allocation) implementation for topic extraction
  • Stemming to treat words with common root as the same e.g. "go" and "going"
  • Clustering algorithms e.g. Heirachical, K-means, etc.
  • Classification algorithms e.g. SVM, random forest, etc.

References

  1. Rosario, Barbara. Latent Semantic Indexing: An overview. INFOSYS 240 Spring 2000
  2. Latent Semantic Analysis, a scholarpedia article on LSA written by Tom Landauer, one of the creators of LSA.
  3. Thomo, Alex. Latent Semantic Analysis (Tutorial).
  4. Latent Semantic Indexing. Standford NLP Course
  5. Charikar, Moses S. Similarity Estimation Techniques from Rounding Algorithms
  6. Kanerva, Pentti, Kristoferson, Jan and Holst, Anders (2000). Random Indexing of Text Samples for Latent Semantic Analysis
  7. Rangan, Venkat. Discovery of Related Terms in a corpus using Reflective Random Indexing
  8. Vasuki, Vidya and Cohen, Trevor. Reflective random indexing for semi-automatic indexing of the biomedical literature

FAQs

Package last updated on 21 May 2020

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc