NLP

gopkg.in/vench/nlp.v2

Package nlp provides implementations of selected machine learning algorithms for natural language processing of text corpora. The initial primary focus being on the implementation of algorithms supporting LSA (Latent Semantic Analysis), often referred to as Latent Semantic Indexing in the context of information retrieval. The algorithms in the package typically support document input as text strings which are then encoded as a matrix of numerical feature vectors called a `term document matrix`. Columns in this matrix represent the documents in the corpus and the rows represent terms occurring in the documents. The individual elements within the matrix contains counts of the number of occurrences of each term in the associated document. This matrix can be manipulated through the application of additional transformations for weighting features, identifying relationships or optimising the data for analysis, information retrieval and/or predictions. A common transformation is for the purpose of weighting features to remove natural biases which would skew results e.g. commonly occurring words like `the`, `of`, `and`, etc. which should carry lower weight than unusual words. Term Document matrices typically have a very large number of dimensions and so transformations are often applied to reduce the dimensionality using techniques such as Locality Sensitive Hashing or Latent Semantic Analysis (typically performed using matrix SVD - `Singular Value Decomposition`) which approximates the original term document matrix with a new matrix of much lower rank (typically around 100 rather than 1000s). Truncated SVD is a fundamental part of LSA (Latent Semantic Analysis aka Latent Semantic Indexing) and serves a number of purposes: 1. The reduced dimensionality of the data theoretically requires less memory. 2. As less significant dimensions are removed, there is less `noise` in the data which could have artificially skewed results. 3. Perhaps most importantly, the SVD effectively encodes the co-occurrence of terms within the documents to capture semantic meaning rather than simply the presence (or lack of presence) of words. This combats the problem of synonymy (a common challenge in NLP) where different words in the English language can be used to mean the same thing (synonyms). In LSA, documents can have a high degree of semantic similarity with very few words in common. The post SVD matrix (with each column being a feature vector representing a document within the corpus) can be compared for similarity with each other (for clustering) or with a query (also represented as a feature vector projected into the same dimensional space). Similarity is measured by the angle between the two feature vectors being considered.

v2.0.0-20180914075149-2c226c0ca754 • 7 years ago

github.com/kitagry/go-nlp-100-knock/ex39

github.com/ieee0824/nlp

github.com/kitagry/go-nlp-100-knock/ex28

github.com/kitagry/go-nlp-100-knock/ex07

github.com/royalrick/gse

github.com/jcksnvllxr80/go-tuts/nlp-bot

github.com/menuser/youtube-demo-nlp

github.com/oblivia-simplex/nlp

github.com/duexcoast/nlp

github.com/project-flogo/catalystml-flogo/operations/nlp

github.com/iamduyang/golang_nlp

github.com/paruliansaragi/go-nlp

github.com/agniveshchaubey/slack-chat-bot-nlp

github.com/madgeniusblink/nlp

github.com/northya/another-nlp

github.com/leksyking/nlp

git.wxl.best/shixzie/nlp

github.com/sesky4/tgo/tencentcloud/nlp

gopkg.in/vench/nlp.v2

github.com/masterhafid/nlp-indo-to-english-go

github.com/nymiun/nlp

github.com/tracesless/tencentcloud-sdk-go/tencentcloud/nlp

github.com/cogcomp/cogcomp-nlp

github.com/konstantin8105/nlp

github.com/semiringinc/gojsonnlp

github.com/andreajegher/nlp

github.com/jackyzha0/reflect-nlp/ingress

github.com/muhfajar/nlp

github.com/cdpierse/hacker_news_NLP_score_prediction/etl

github.com/Shixzie/nlp

github.com/chriscasola/nlp

github.com/daniilperestoronin/nlp

github.com/kamildrazkiewicz/go-stanford-nlp

github.com/edwindvinas/nlp

github.com/northya/time-nlp-go

github.com/nuance/go-nlp

github.com/garystafford/nlp-client

github.com/CogComp/cogcomp-nlp

github.com/go-dockly/nlp

github.com/recoilme/nlp

github.com/sensorbee/nlp

github.com/alibabacloud-go/nlp-20180408

github.com/alibabacloud-go/nlp-automl-20191111

github.com/Ryan3435/nlp

github.com/go-nlp/bpe

github.com/SemiringInc/GoJSONNLP

github.com/timurgarif/nlpgo

github.com/go-nlp/bm25

github.com/rocco337/nlp

github.com/itrabbit/nlp