content-based-recommender
This is a simple content-based recommender implemented in JavaScript to illustrate the concept of content-based recommendation. Content-based recommendation is a popular technique for showing similar items to users, and is especially useful for e-commerce sites, news sites, and similar websites.
After the recommender is trained on an array of documents, it can return the list of documents that are most similar to a given input document.
The training process involves 3 main steps:
Special thanks to the natural library, which helps a lot by providing many NLP functionalities, such as tf-idf and word stemming.
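To make the idea concrete, here is a minimal sketch of tf-idf vectors compared by cosine similarity, written in plain JavaScript without the natural library. It is not the package's actual implementation: the helper names (`tokenize`, `buildTfIdfVectors`, `cosineSimilarity`), the smoothed idf formula, and the choice of cosine similarity are assumptions made for illustration, and stemming is skipped entirely.

```javascript
// Illustrative sketch only; these helpers are hypothetical and not part of the package.

// Lowercase the text and split on anything that is not a letter or digit.
// No stemming or stopword removal is done here.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Build one tf-idf vector (a Map of term -> weight) per document.
function buildTfIdfVectors(documents) {
  const tokenized = documents.map((doc) => tokenize(doc.content));

  // Document frequency: in how many documents does each term occur?
  const df = new Map();
  for (const tokens of tokenized) {
    for (const term of new Set(tokens)) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }

  const n = documents.length;
  return tokenized.map((tokens) => {
    const vector = new Map();
    for (const term of tokens) {
      vector.set(term, (vector.get(term) || 0) + 1); // raw term frequency
    }
    for (const [term, tf] of vector) {
      // Assumed smoothed idf: rarer terms get a larger weight.
      vector.set(term, tf * Math.log(1 + n / df.get(term)));
    }
    return vector;
  });
}

// Cosine similarity between two sparse vectors; returns 0 for empty vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [term, weight] of a) {
    if (b.has(term)) dot += weight * b.get(term);
  }
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, w) => sum + w * w, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Made-up sample documents in the same { id, content } shape the README uses.
const sampleDocs = [
  { id: 'a', content: 'javascript machine learning' },
  { id: 'b', content: 'machine learning basics' },
  { id: 'c', content: 'bitcoin technology' },
];
const sampleVectors = buildTfIdfVectors(sampleDocs);
console.log(cosineSimilarity(sampleVectors[0], sampleVectors[1])); // shares two terms
console.log(cosineSimilarity(sampleVectors[0], sampleVectors[2])); // shares no terms
```

Documents that share terms get a positive score, and documents with no terms in common score 0.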
⚠️ Note:
I haven't tested how this recommender performs with a large dataset. I will share more results after further testing.
npm install content-based-recommender
And then import the ContentBasedRecommender class
const ContentBasedRecommender = require('content-based-recommender')
Add trainBidirectional(collectionA, collectionB) to allow recommendations between two different datasets
Upgrade dependencies to fix security alerts
Introduce the use of unigrams, bigrams and trigrams when constructing the word vector
Simplify the implementation by not using a sorted set data structure to store the similar documents data. Also support the maxSimilarDocuments and minScore options to reduce the memory used by the recommender.
Update to newer version of vector-object
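As an illustration of the unigram, bigram and trigram change noted above, the following sketch shows one way such terms can be generated from a token list. The helper names `ngrams` and `unigramsToTrigrams` are hypothetical and are not part of the package; how the package builds its word vector internally may differ.

```javascript
// Hypothetical helpers for illustration; not part of the package's API.

// Return all contiguous runs of n tokens, joined by a space.
function ngrams(tokens, n) {
  const result = [];
  for (let i = 0; i + n <= tokens.length; i++) {
    result.push(tokens.slice(i, i + n).join(' '));
  }
  return result;
}

// Combine unigrams, bigrams and trigrams into one list of terms.
function unigramsToTrigrams(tokens) {
  return [...ngrams(tokens, 1), ...ngrams(tokens, 2), ...ngrams(tokens, 3)];
}

const terms = unigramsToTrigrams(['machine', 'learning', 'application']);
console.log(terms);
// [ 'machine', 'learning', 'application',
//   'machine learning', 'learning application',
//   'machine learning application' ]
```

Including bigrams and trigrams lets a phrase such as "machine learning" count as a term of its own rather than only as two unrelated words.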
const ContentBasedRecommender = require('content-based-recommender')
const recommender = new ContentBasedRecommender({
minScore: 0.1,
maxSimilarDocuments: 100
});
// prepare documents data
const documents = [
{ id: '1000001', content: 'Why studying javascript is fun?' },
{ id: '1000002', content: 'The trend for javascript in machine learning' },
{ id: '1000003', content: 'The most insightful stories about JavaScript' },
{ id: '1000004', content: 'Introduction to Machine Learning' },
{ id: '1000005', content: 'Machine learning and its application' },
{ id: '1000006', content: 'Python vs Javascript, which is better?' },
{ id: '1000007', content: 'How Python saved my life?' },
{ id: '1000008', content: 'The future of Bitcoin technology' },
{ id: '1000009', content: 'Is it possible to use javascript for machine learning?' }
];
// start training
recommender.train(documents);
// get the top 10 items most similar to document 1000002
const similarDocuments = recommender.getSimilarDocuments('1000002', 0, 10);
console.log(similarDocuments);
/*
the higher the score, the more similar the item is
documents with score < 0.1 are filtered because options minScore is set to 0.1
[
{ id: '1000004', score: 0.5114304586412038 },
{ id: '1000009', score: 0.45056313558918837 },
{ id: '1000005', score: 0.37039308109283564 },
{ id: '1000003', score: 0.10896767690747626 }
]
*/
This example shows how to automatically match posts with related tags
const ContentBasedRecommender = require('content-based-recommender')
const posts = [
{
id: '1000001',
content: 'Why studying javascript is fun?',
},
{
id: '1000002',
content: 'The trend for javascript in machine learning',
},
{
id: '1000003',
content: 'The most insightful stories about JavaScript',
},
{
id: '1000004',
content: 'Introduction to Machine Learning',
},
{
id: '1000005',
content: 'Machine learning and its application',
},
{
id: '1000006',
content: 'Python vs Javascript, which is better?',
},
{
id: '1000007',
content: 'How Python saved my life?',
},
{
id: '1000008',
content: 'The future of Bitcoin technology',
},
{
id: '1000009',
content: 'Is it possible to use javascript for machine learning?',
},
];
const tags = [
{
id: '1',
content: 'Javascript',
},
{
id: '2',
content: 'machine learning',
},
{
id: '3',
content: 'application',
},
{
id: '4',
content: 'introduction',
},
{
id: '5',
content: 'future',
},
{
id: '6',
content: 'Python',
},
{
id: '7',
content: 'Bitcoin',
},
];
const tagMap = tags.reduce((acc, tag) => {
acc[tag.id] = tag;
return acc;
}, {});
const recommender = new ContentBasedRecommender();
recommender.trainBidirectional(posts, tags);
for (const post of posts) {
const relatedTags = recommender.getSimilarDocuments(post.id);
// use a distinct name so the outer `tags` array is not shadowed
const tagNames = relatedTags.map(t => tagMap[t.id].content);
console.log(post.content, 'related tags:', tagNames);
}
/*
Why studying javascript is fun? related tags: [ 'Javascript' ]
The trend for javascript in machine learning related tags: [ 'machine learning', 'Javascript' ]
The most insightful stories about JavaScript related tags: [ 'Javascript' ]
Introduction to Machine Learning related tags: [ 'machine learning', 'introduction' ]
Machine learning and its application related tags: [ 'machine learning', 'application' ]
Python vs Javascript, which is better? related tags: [ 'Python', 'Javascript' ]
How Python saved my life? related tags: [ 'Python' ]
The future of Bitcoin technology related tags: [ 'future', 'Bitcoin' ]
Is it possible to use javascript for machine learning? related tags: [ 'machine learning', 'Javascript' ]
*/
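To create the recommender instance, call new ContentBasedRecommender(options). The options object is optional.
Supported options include minScore and maxSimilarDocuments, both shown in the usage example above.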
Call train(documents) to tell the recommender about your documents; it then starts training itself.
trainBidirectional(collectionA, collectionB) works like the normal train function, but it creates recommendations between two different collections instead of within one collection.
Call getSimilarDocuments with a document id to get an array of similar items.
It returns an array of objects with the fields id and score (ranging from 0 to 1).
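The sketch below illustrates how a list of scored items could be filtered, capped and paged, mirroring the minScore and maxSimilarDocuments options and the `getSimilarDocuments('1000002', 0, 10)` call shown earlier. `rankSimilar` is a hypothetical function, and reading the second and third arguments as a start offset and a page size is an assumption based on that example, not a statement about the package's internals.

```javascript
// Hypothetical sketch; rankSimilar is not a function of the package.
function rankSimilar(scored, options = {}, start = 0, size) {
  const { minScore = 0, maxSimilarDocuments = Infinity } = options;

  const kept = scored
    .filter((item) => item.score >= minScore) // drop items scoring below minScore
    .sort((a, b) => b.score - a.score)        // highest score first
    .slice(0, maxSimilarDocuments);           // cap how many items are kept

  // Without a size, return everything from `start` onwards.
  const end = size === undefined ? undefined : start + size;
  return kept.slice(start, end);
}

// Made-up scores for illustration.
const sampleScores = [
  { id: 'a', score: 0.5 },
  { id: 'b', score: 0.05 },
  { id: 'c', score: 0.45 },
  { id: 'd', score: 0.37 },
];

console.log(rankSimilar(sampleScores, { minScore: 0.1 }, 0, 2));
// [ { id: 'a', score: 0.5 }, { id: 'c', score: 0.45 } ]
```

Filtering before sorting keeps the sort small, and capping the list is what saves memory when many documents are trained.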
To export the recommender as a JSON object, call export().
const recommender = new ContentBasedRecommender();
recommender.train(documents);
const object = recommender.export();
// the object can be saved to disk, a database, or elsewhere
To update the recommender from a JSON object previously produced by the export() method, call import(object).
const recommender = new ContentBasedRecommender();
recommender.import(object); // object can be loaded from disk, database or otherwise
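A minimal sketch of persisting a trained recommender to disk with Node's built-in fs module follows. The helpers `saveRecommender` and `loadRecommender` are hypothetical; only export() and import() come from this README, and the sketch assumes the exported object survives a JSON.stringify and JSON.parse round trip.

```javascript
const fs = require('fs');

// Hypothetical helper: write the exported state to a JSON file.
function saveRecommender(recommender, filePath) {
  fs.writeFileSync(filePath, JSON.stringify(recommender.export()));
}

// Hypothetical helper: read a JSON file and import it into the given instance.
function loadRecommender(recommender, filePath) {
  recommender.import(JSON.parse(fs.readFileSync(filePath, 'utf8')));
  return recommender;
}
```

With the package installed, usage could look like `saveRecommender(recommender, 'model.json')` after training, and `loadRecommender(new ContentBasedRecommender(), 'model.json')` in a later process, which avoids retraining on every start.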
npm install
npm run test