This gem is intended to contain tools for Arabic Natural Language Processing. As of version 0.1, this toolkit gem allows you to:
Clean a text using a stop list. This stop list was generated using the tf-idf score calculated on words from over 900 articles. The selected words were then checked and validated by hand, resulting in a stop list of over 270 words.
Stem a word or a text. The stemming algorithm used is the ISRI Arabic stemmer. It is described in the following research paper:
Arabic Stemming without a root dictionary
This root-extraction stemmer is similar to the Khoja stemmer but does not use a root dictionary, which can be laborious to maintain. Also, when the root cannot be found, the ISRI stemmer returns a normalized form rather than the original unmodified word. Overall, the ISRI stemmer has been shown to perform as well as, if not better than, the Khoja stemmer.
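To make the idea of a "normalized form" concrete, here is a minimal sketch of one plausible normalization step: stripping short-vowel diacritics and reducing an initial hamza-carrying alef to a bare alef. This is an assumption about what normalization could look like, not code taken from the gem; the names DIACRITICS, INITIAL_HAMZA and normalize_sketch are hypothetical.

```ruby
# Illustrative sketch only; not the gem's implementation.

# Arabic diacritic marks from fathatan (U+064B) through sukun (U+0652).
DIACRITICS = /[\u064B-\u0652]/

# Alef with madda (U+0622), hamza above (U+0623) or hamza below (U+0625)
# at the start of a word.
INITIAL_HAMZA = /\A[\u0622\u0623\u0625]/

# Remove diacritics, then replace an initial hamza-carrying alef
# with a bare alef (U+0627).
def normalize_sketch(word)
  word.gsub(DIACRITICS, '').sub(INITIAL_HAMZA, "\u0627")
end

vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E" # kaf, ta, ba, each followed by a fatha
puts normalize_sketch(vocalized)
```

A stemmer that falls back to a form like this still groups differently vocalized spellings of a word together, even when no root is extracted.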
Add this line to your application's Gemfile:
gem 'nlp_arabic'
And then execute:
$ bundle
Or install it yourself as:
$ gem install nlp_arabic
Once installed, you can use it like this:
NlpArabic.clean(text) will return the text without the stop words.
NlpArabic.stem(word) will return the word stemmed.
NlpArabic.stem_text(text) will stem an entire text.
NlpArabic.clean_and_stem(text) will do both.
NlpArabic.wash_and_stem(text) will stem the text removing stop words and delimiters from it.
NlpArabic.tokenize_text(text) will break the text into an array of words and delimiters.
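The tokenizing and washing steps described above can be pictured with a small self-contained sketch. It does not call the gem and is not its implementation: the three-word stop list, the regular expressions and the names tokenize_sketch and wash_sketch are all assumptions made for illustration.

```ruby
require 'set'

# Hypothetical three-word stop list for illustration only; the gem ships
# its own hand-validated list of over 270 words.
STOP_WORDS = Set.new(%w[في من على]).freeze

# A word is a run of letters, combining marks and digits.
WORD = /[\p{L}\p{M}\p{N}]+/

# Split a text into words and single delimiter characters,
# discarding whitespace.
def tokenize_sketch(text)
  text.scan(/[\p{L}\p{M}\p{N}]+|[^\p{L}\p{M}\p{N}\s]/)
end

# Keep only word tokens that are not in the stop list.
def wash_sketch(text)
  tokenize_sketch(text)
    .select { |token| token.match?(/\A#{WORD}\z/) }
    .reject { |token| STOP_WORDS.include?(token) }
end

text = 'في البيت، هنا.'
p tokenize_sketch(text)
p wash_sketch(text)
```

In this sketch the Arabic comma and the full stop come back as separate tokens from tokenize_sketch, while wash_sketch drops them along with the stop word.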
Each step of the ISRI algorithm is coded in a separate function so you should be able to find the helper function you may be looking for just by browsing the code.
After checking out the repo, run bin/console for an interactive prompt that will allow you to experiment. For now the gem doesn't use any dependencies, so you don't need to run bin/setup.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release to create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
You are more than welcome to contribute to this project :) Please try to respect the Ruby style guidelines described here. The default encoding used is UTF-8.
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)