
GluonNLP: Your Choice of Deep Learning for NLP
GluonNLP is a toolkit for text preprocessing, dataset loading, and neural model building that helps you speed up your Natural Language Processing (NLP) research.
- `Quick Start Guide <https://github.com/dmlc/gluon-nlp#quick-start-guide>`__
- `Resources <https://github.com/dmlc/gluon-nlp#resources>`__

Tutorial proposal for GluonNLP is accepted at `EMNLP 2019 <https://www.emnlp-ijcnlp2019.org>`__, Hong Kong.

GluonNLP was featured in:

- `From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond <http://kdd19.mxnet.io>`__.
- `details <https://www.portal.reinvent.awsevents.com/connect/sessionDetail.ww?SESSION_ID=88736>`_.
- `awesome talk <https://pydata.org/nyc2018/schedule/presentation/76/>`__ by Sneha Jha.

Make sure you have Python 3.5 or newer and a recent version of MXNet (our CI server runs the test suite with Python 3.5).
You can install MXNet and GluonNLP using pip.
GluonNLP is based on the most recent version of MXNet.

In particular, if you want to install the most recent MXNet release:

::

    pip install --upgrade "mxnet>=1.6.0"

Otherwise, if you want to install the most recent MXNet nightly build:

::

    pip install --pre --upgrade mxnet

Then, you can install GluonNLP:

::

    pip install gluonnlp
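To confirm that the installed packages satisfy the minimum version used in the commands above, a small check along the following lines may help. This is only a sketch: the helper names ``parse_version``, ``meets_minimum``, and ``installed_version`` are hypothetical, the version parsing is deliberately simplistic (it ignores pre-release suffixes), and ``importlib.metadata`` assumes Python 3.8 or newer.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.6.0' into (1, 6, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # e.g. 'post0': no leading digits, stop here
        parts.append(int(digits))
        if digits != piece:
            break  # e.g. '0b20200101': keep the 0, drop the suffix
    # Pad to three components so that '1.6' compares equal to '1.6.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report on the two packages named in the installation steps.
for name, minimum in [("mxnet", "1.6.0"), ("gluonnlp", None)]:
    found = installed_version(name)
    if found is None:
        print(name, "is not installed")
    elif minimum is not None and not meets_minimum(found, minimum):
        print(name, found, "is older than", minimum)
    else:
        print(name, found, "looks fine")
```

For anything beyond a quick sanity check, a dedicated version parser is likely a better choice than this hand-rolled comparison.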
Please check more `installation details <https://github.com/dmlc/gluon-nlp/blob/master/docs/install.rst>`_.

GluonNLP documentation is available at `our website <http://gluon-nlp.mxnet.io/master/index.html>`__.
GluonNLP is a community that believes in sharing.

For questions, comments, and bug reports, `GitHub issues <https://github.com/dmlc/gluon-nlp/issues>`__ are the best way to reach us.

We now have a new Slack channel `here <https://apache-mxnet.slack.com/messages/CCCDM10V9>`__ (`register <https://join.slack.com/t/apache-mxnet/shared_invite/enQtNDQyMjAxMjQzMTI3LTkzMzY3ZmRlNzNjNGQxODg0N2Y5NmExMjEwOTZlYmIwYTU2ZTY4ZjNlMmEzOWY5MGQ5N2QxYjhlZTFhZTVmYTc>`__).
The GluonNLP community welcomes contributions from anyone!

There are lots of opportunities for you to become one of our `contributors <https://github.com/dmlc/gluon-nlp/graphs/contributors>`__:

- `GitHub issues <https://github.com/dmlc/gluon-nlp/issues>`__.
- `documentation <http://gluon-nlp.mxnet.io/master/index.html>`__.
- `scripts <https://github.com/dmlc/gluon-nlp/tree/master/scripts>`__ to reproduce state-of-the-art results.
- `examples <https://github.com/dmlc/gluon-nlp/tree/master/docs/examples>`__ to explain key ideas in NLP methods and models.
- `public datasets <https://github.com/dmlc/gluon-nlp/tree/master/gluonnlp/data>`__ (license permitting).

For a list of open starter tasks, check `good first issues <https://github.com/dmlc/gluon-nlp/labels/good%20first%20issue>`__.
Also see our `contributing guide <http://gluon-nlp.mxnet.io/master/how_to/contribute.html>`__ on simple how-tos, contribution guidelines and more.
Check out how to use GluonNLP for your own research or projects.

If you are new to Gluon, please check out our `60-minute crash course <http://gluon-crash-course.mxnet.io/>`__.

To get started quickly, refer to the runnable notebook examples at `Examples <http://gluon-nlp.mxnet.io/master/examples/index.html>`__.

For advanced examples, check out our `Scripts <http://gluon-nlp.mxnet.io/master/scripts/index.html>`__.

For experienced users, check out our `API Notes <http://gluon-nlp.mxnet.io/master/api/index.html>`__.
`Dataset Loading <http://gluon-nlp.mxnet.io/master/api/notes/data_api.html>`__

Load the Wikitext-2 dataset, for example:

.. code:: python

    >>> import gluonnlp as nlp
    >>> train = nlp.data.WikiText2(segment='train')
    >>> train[0:5]
    ['=', 'Valkyria', 'Chronicles', 'III', '=']
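The slice above suggests that the dataset behaves like one flat sequence of tokens. The following pure-Python sketch illustrates that idea without requiring GluonNLP or a download. It assumes whitespace tokenization and an end-of-sentence marker appended after each non-empty line; the function name ``flatten_corpus`` and the sample lines are hypothetical, and this is not the library's implementation.

```python
def flatten_corpus(lines, eos="<eos>"):
    """Split each non-empty line on whitespace and join everything
    into one flat token list, marking line ends with `eos`."""
    tokens = []
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        tokens.extend(words)
        tokens.append(eos)
    return tokens


# Illustrative input only; not the actual contents of the dataset files.
sample = [" = Valkyria Chronicles III = ", "", "A tactical role playing game ."]
flat = flatten_corpus(sample)
print(flat[0:5])
```

Slicing ``flat`` works the same way as slicing ``train`` in the example above, because both are indexed token by token.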
`Vocabulary Construction <http://gluon-nlp.mxnet.io/master/api/modules/vocab.html>`__

Build vocabulary based on the above dataset, for example:

.. code:: python

    >>> vocab = nlp.Vocab(counter=nlp.data.Counter(train))
    >>> vocab
    Vocab(size=33280, unk="<unk>", reserved="['<pad>', '<bos>', '<eos>']")
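To make the idea of a frequency-based vocabulary concrete, here is a minimal stand-alone sketch. It assumes the unknown token takes index 0, the reserved tokens follow, and the remaining tokens are ordered by descending frequency with alphabetical tie-breaking. The names ``build_vocab`` and ``lookup`` are hypothetical, and the ordering rules are an assumption for illustration, not a statement about how ``nlp.Vocab`` is implemented.

```python
from collections import Counter


def build_vocab(tokens, unk="<unk>", reserved=("<pad>", "<bos>", "<eos>")):
    """Return (idx_to_token, token_to_idx) built from token frequencies."""
    counts = Counter(tokens)
    idx_to_token = [unk] + list(reserved)
    special = set(idx_to_token)
    # Most frequent first; ties broken alphabetically for determinism.
    for token, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if token not in special:
            idx_to_token.append(token)
    token_to_idx = {tok: i for i, tok in enumerate(idx_to_token)}
    return idx_to_token, token_to_idx


def lookup(token_to_idx, token, unk="<unk>"):
    """Map a token to its index, falling back to the unknown token."""
    return token_to_idx.get(token, token_to_idx[unk])


idx_to_token, token_to_idx = build_vocab("the cat sat on the mat the end".split())
print(len(idx_to_token))
print(lookup(token_to_idx, "the"), lookup(token_to_idx, "dog"))
```

Tokens that never appeared in the counter, such as ``dog`` here, map to the index of the unknown token.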
`Neural Models Building <http://gluon-nlp.mxnet.io/master/api/modules/model.html>`__

From the models package, apply a standard RNN language model to the above dataset:

.. code:: python

    >>> # Positional arguments: mode, vocab_size, embed_size, hidden_size,
    >>> # num_layers, dropout, tie_weights
    >>> model = nlp.model.language_model.StandardRNN('lstm', len(vocab),
    ...                                              200, 200, 2, 0.5, True)
    >>> model
    StandardRNN(
      (embedding): HybridSequential(
        (0): Embedding(33280 -> 200, float32)
        (1): Dropout(p = 0.5, axes=())
      )
      (encoder): LSTM(200 -> 200.0, TNC, num_layers=2, dropout=0.5)
      (decoder): HybridSequential(
        (0): Dense(200 -> 33280, linear)
      )
    )
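The printed summary shows layer shapes but not parameter counts. As a rough back-of-the-envelope estimate, the sketch below counts the parameters of the embedding and the LSTM encoder from the shapes shown above. It assumes a textbook LSTM with four gates and two bias vectors per layer; the helper names are hypothetical, and the decoder is left out because its count depends on whether it shares weights with the embedding.

```python
def embedding_params(vocab_size, embed_size):
    """One embedding vector of length embed_size per vocabulary entry."""
    return vocab_size * embed_size


def lstm_layer_params(input_size, hidden_size):
    """Parameter count of one LSTM layer with four gates.

    Assumption: separate input-to-hidden and hidden-to-hidden bias
    vectors, i.e. two bias vectors of length 4 * hidden_size.
    """
    gates = 4
    weights = gates * hidden_size * (input_size + hidden_size)
    biases = 2 * gates * hidden_size
    return weights + biases


# Shapes taken from the printed model summary.
vocab_size, embed_size, hidden_size, num_layers = 33280, 200, 200, 2

embed_total = embedding_params(vocab_size, embed_size)
lstm_total = sum(
    # The first layer reads embeddings; later layers read hidden states.
    lstm_layer_params(embed_size if layer == 0 else hidden_size, hidden_size)
    for layer in range(num_layers)
)
print(embed_total, lstm_total)
```

Under these assumptions the embedding dominates the count, which is typical when the vocabulary is much larger than the hidden size.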
`Word Embeddings Loading <http://gluon-nlp.mxnet.io/master/api/modules/embedding.html>`__

For example, load a GloVe word embedding, one of the state-of-the-art English word embeddings:

.. code:: python

    >>> glove = nlp.embedding.create('glove', source='glove.6B.50d')
    # Obtain vectors for 'baby' in the GloVe word embedding
    >>> type(glove['baby'])
    <class 'mxnet.ndarray.ndarray.NDArray'>
    >>> glove['baby'].shape
    (50,)
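A common use of such vectors is comparing words by cosine similarity. The sketch below shows the computation on plain Python lists so that it runs without MXNet. The three-dimensional vectors are made-up toy values, not actual GloVe entries, and ``cosine_similarity`` is a hypothetical helper. With real embeddings, one option would be to convert each ``NDArray`` to NumPy first via ``.asnumpy()``.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for zero vectors
    return dot / (norm_u * norm_v)


# Toy stand-ins for embedding vectors (illustrative values only).
toy = {
    "baby": [1.0, 2.0, 0.0],
    "infant": [2.0, 4.0, 0.0],
    "car": [0.0, 0.0, 3.0],
}

print(cosine_similarity(toy["baby"], toy["infant"]))
print(cosine_similarity(toy["baby"], toy["car"]))
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0.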
The BibTeX entry for the `reference paper <https://arxiv.org/abs/1907.04433>`__ of GluonNLP is:

.. code::

   @article{gluoncvnlp2020,
     author  = {Jian Guo and He He and Tong He and Leonard Lausen and Mu Li and Haibin Lin and Xingjian Shi and Chenguang Wang and Junyuan Xie and Sheng Zha and Aston Zhang and Hang Zhang and Zhi Zhang and Zhongyue Zhang and Shuai Zheng and Yi Zhu},
     title   = {GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing},
     journal = {Journal of Machine Learning Research},
     year    = {2020},
     volume  = {21},
     number  = {23},
     pages   = {1-7},
     url     = {http://jmlr.org/papers/v21/19-429.html}
   }
For background knowledge of deep learning or NLP, please refer to the open source book `Dive into Deep Learning <http://en.diveintodeeplearning.org/>`__.