FindSimilar
User-friendly library to find similar objects
You can find Full Project Documentation here
Mission
The mission of the FindSimilar project is to provide a powerful and versatile open source library that empowers
developers to efficiently find similar objects and perform comparisons across a variety of data types.
Whether dealing with texts, images, audio, or more,
our project aims to simplify the process of identifying similarities and enhancing decision-making.
Open Source Project
This is an open source project under the MIT license.
Feel free to use, fork, clone, and contribute.
Features
Find similar texts
- in different languages
- with or without stopwords
- using dictionary (or not)
- using keywords (or not)
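To make the stopwords option concrete, here is a minimal standalone sketch of what comparing texts "with or without stopwords" can mean at the token level. It does not use the FindSimilar API: the `STOPWORDS` set, the `tokens` helper, and its `remove_stopwords` parameter are hypothetical names chosen for illustration, and the library's own tokenizer and word lists may differ.

```python
# Hypothetical stopword list for illustration only; not the library's list.
STOPWORDS = {"the", "a", "an", "of", "and"}


def tokens(text, remove_stopwords=True):
    """Return the set of lowercase words in text, optionally dropping stopwords."""
    words = text.lower().split()
    if remove_stopwords:
        words = [word for word in words if word not in STOPWORDS]
    return set(words)


with_stopwords = tokens("The cat and the hat", remove_stopwords=False)
without_stopwords = tokens("The cat and the hat", remove_stopwords=True)

print(sorted(with_stopwords))
print(sorted(without_stopwords))
```

Dropping stopwords leaves only the content words, so two texts that share nothing but filler words no longer look similar.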
Install
with pip
pip install find-similar
See more in Full Documentation
Quickstart
from find_similar import find_similar
texts = ['one two', 'two three', 'three four']
text_to_compare = 'one four'
find_similar(text_to_compare, texts, count=10)
[TokenText(text="one two", len(tokens)=2, cos=0.5), TokenText(text="three four", len(tokens)=2, cos=0.5), TokenText(text="two three", len(tokens)=2, cos=0)]
- The result is a list of TokenText instances ordered by cos, from highest to lowest
- cos is the similarity score between the compared text and each candidate text
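The cos values in the Quickstart output are consistent with cosine similarity computed over sets of whitespace-separated tokens. The sketch below reproduces those numbers under that assumption. It is not the library's implementation: `tokenize` and `cosine_similarity` are hypothetical helpers, and the real tokenizer may normalize text differently.

```python
import math


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for the library's tokenizer."""
    return set(text.lower().split())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts treated as binary token vectors."""
    tokens_a, tokens_b = tokenize(text_a), tokenize(text_b)
    if not tokens_a or not tokens_b:
        return 0.0
    shared = len(tokens_a & tokens_b)
    return shared / math.sqrt(len(tokens_a) * len(tokens_b))


texts = ['one two', 'two three', 'three four']
text_to_compare = 'one four'

# Score every candidate, then rank from most to least similar.
scores = {text: cosine_similarity(text_to_compare, text) for text in texts}
ranked = sorted(texts, key=lambda text: scores[text], reverse=True)

for text in ranked:
    print(text, scores[text])
```

Here 'one four' shares one of two tokens with both 'one two' and 'three four', which gives 1 / sqrt(2 * 2) = 0.5, and shares nothing with 'two three', which gives 0.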
See the demonstration and mini tutorial in the Demo project
Contributing
Contributions are welcome! To get started easily, please check: