
github.com/wchunt/naive-bayes-spam-filter
This is a Naive Bayes classifier for spam detection, written in Go. It uses a simple bag-of-words model to train on a dataset of spam and non-spam (ham) emails and then classifies new emails as either spam or ham. The algorithm assumes that, given the class, each word in an email is independent of the other words. That assumption is clearly not true, hence the "naive" in the name of the algorithm.
This project began as a school assignment, written in C++, that I enjoyed. I later ported it to Go to practice writing Go code and to experiment with parallelization.
The project solves the problem of classifying a given text file as either spam or ham. The model is trained on a dataset of known spam and real (ham) text files and then applied to a new text file to determine its classification.
The classifier works as follows:
Read in a dataset of text files labeled as real and spam.
Tokenize the text files by splitting them into individual words and storing the frequency of each word in the dataset.
Calculate the prior probabilities of each class (spam and real) and the conditional probabilities of each word given each class.
To classify a new text file, tokenize it, calculate the log-probability of it belonging to each class using the prior and conditional probabilities, and choose the class with the highest probability.
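The steps above can be sketched in Go roughly as follows. This is a minimal illustration, not the repository's actual code: the names Model, NewModel, Tokenize, Train, LogScore, and Classify are hypothetical, training takes strings rather than files, and add-one (Laplace) smoothing is an assumption added here so that words unseen in a class do not produce a log of zero.

```go
package main

import (
	"math"
	"strings"
	"unicode"
)

// Model holds the counts gathered during training.
// All names in this sketch are illustrative, not taken from the repository.
type Model struct {
	docCount  map[string]int            // documents seen per class
	wordCount map[string]map[string]int // class -> word -> occurrences
	total     map[string]int            // total word occurrences per class
	vocab     map[string]struct{}       // distinct words across all classes
}

// NewModel returns an empty model ready for training.
func NewModel() *Model {
	return &Model{
		docCount:  map[string]int{},
		wordCount: map[string]map[string]int{},
		total:     map[string]int{},
		vocab:     map[string]struct{}{},
	}
}

// Tokenize lowercases the text and splits it on any rune that is
// neither a letter nor a digit.
func Tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// Train adds one labeled document to the word and document counts.
func (m *Model) Train(class, text string) {
	m.docCount[class]++
	if m.wordCount[class] == nil {
		m.wordCount[class] = map[string]int{}
	}
	for _, w := range Tokenize(text) {
		m.wordCount[class][w]++
		m.total[class]++
		m.vocab[w] = struct{}{}
	}
}

// LogScore returns log P(class) plus the sum of log P(word | class)
// over the words in text. Add-one smoothing is an assumption of this sketch.
func (m *Model) LogScore(class, text string) float64 {
	docs := 0
	for _, n := range m.docCount {
		docs += n
	}
	score := math.Log(float64(m.docCount[class]) / float64(docs))
	denom := float64(m.total[class] + len(m.vocab))
	for _, w := range Tokenize(text) {
		score += math.Log(float64(m.wordCount[class][w]+1) / denom)
	}
	return score
}

// Classify returns the trained class with the highest log score.
// Ties are broken by class name so the result is deterministic.
func (m *Model) Classify(text string) string {
	best, bestScore := "", math.Inf(-1)
	for class := range m.docCount {
		s := m.LogScore(class, text)
		if best == "" || s > bestScore || (s == bestScore && class < best) {
			best, bestScore = class, s
		}
	}
	return best
}
```

With a model built by NewModel, calling Train once per labeled document and then Classify on new text follows the same read, tokenize, count, and score sequence. Summing log-probabilities instead of multiplying raw probabilities avoids floating-point underflow on long documents.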
To use the classifier, clone the repository and follow the instructions in the README.md file.