csv-reconcile-fingerprint

Version 0.1.8 (PyPI)


A scoring plugin for csv-reconcile using fingerprint clustering. It generates a fingerprint of the input string by normalizing it, removing punctuation, and sorting the unique tokens. The approach is based on the OpenRefine clustering implementation (https://openrefine.org/docs/technical-reference/clustering-in-depth) and on code from this gist by @pietz.

The resulting strings are compared with Jaccard distance to output a score between 0 and 100.
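
The idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the plugin's actual code: the function names fingerprint and jaccard_score are hypothetical, and it assumes that the comparison is made on token sets and that the score is the Jaccard similarity (1 minus the Jaccard distance) scaled to 0-100.

```python
import string
import unicodedata


def fingerprint(value: str) -> str:
    """Build an OpenRefine-style fingerprint key (hypothetical helper)."""
    # Trim, lowercase, and decompose accented characters (NFKD),
    # then drop anything that is not plain ASCII (e.g. combining accents).
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, keep unique tokens, sort them, and rejoin.
    return " ".join(sorted(set(text.split())))


def jaccard_score(a: str, b: str) -> float:
    """Score two strings from 0 to 100 (assumed: similarity of token sets)."""
    tokens_a = set(fingerprint(a).split())
    tokens_b = set(fingerprint(b).split())
    if not tokens_a and not tokens_b:
        # Two empty fingerprints are treated as identical in this sketch.
        return 100.0
    return 100.0 * len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


print(fingerprint("Cruise, Tom"))                  # cruise tom
print(jaccard_score("Tom Cruise", "Cruise, Tom"))  # 100.0
```

Under these assumptions, word order and punctuation do not affect the score, while strings that share only some tokens get a proportionally lower score.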

Installation and Usage

Install this library using pip:

pip install csv-reconcile-fingerprint

This is a plugin for the csv-reconcile package. Once the plugin is installed alongside csv-reconcile, specify the scorer with '--scorer fingerprint' when starting the reconciliation service.

Development

To contribute to this library, first check out the code. Then create a new virtual environment:

cd csv-reconcile-fingerprint
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

python -m pip install -e '.[test]'

To run the tests:

python -m pytest
