Winnowing
=========

A Python implementation of Winnowing (local algorithms for document fingerprinting).

Original Work
-------------

The original research paper can be found at http://dl.acm.org/citation.cfm?id=872770.

Installation
------------

You may install the winnowing package via pip as follows:

::

    pip install winnowing

Alternatively, you may install the package by cloning this repository.

::

    git clone https://github.com/suminb/winnowing.git
    cd winnowing && python setup.py install

Usage
-----

.. code:: python

    >>> from winnowing import winnow

    >>> winnow('A do run run run, a do run run')
    set([(5, 23942), (14, 2887), (2, 1966), (9, 23942), (20, 1966)])

    >>> winnow('run run')
    set([(0, 23942)]) # match found!
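
For readers curious about what ``winnow`` computes, here is a minimal, self-contained sketch of the technique described in the paper: hash every k-gram, slide a window of ``w`` consecutive hashes, and keep the minimum hash of each window (the rightmost one on ties). The name ``winnow_sketch``, the defaults ``k=5`` and ``w=4``, and the sanitization step are assumptions made for illustration, not necessarily what this package does internally.

```python
import hashlib


def kgram_hashes(text, k=5):
    """Return (original_position, hash) pairs for every k-gram of the text."""
    # Assumption: ignore case and non-alphanumeric characters, but remember
    # where each kept character sat in the original text.
    chars = [(i, c.lower()) for i, c in enumerate(text) if c.isalnum()]
    pairs = []
    for start in range(len(chars) - k + 1):
        gram = "".join(c for _, c in chars[start:start + k])
        digest = hashlib.sha1(gram.encode("utf-8")).hexdigest()
        # Keep only the last 16 bits (4 hex digits) of the SHA-1 digest.
        pairs.append((chars[start][0], int(digest[-4:], 16)))
    return pairs


def winnow_sketch(text, k=5, w=4):
    """Hypothetical re-implementation: select one fingerprint per window."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    fingerprints = set()
    # Texts with fewer than w hashes are treated as a single short window.
    for start in range(max(len(hashes) - w + 1, 1)):
        window = hashes[start:start + w]
        # min() returns the first minimum it sees, so scanning the window
        # in reverse selects the rightmost minimal hash.
        fingerprints.add(min(reversed(window), key=lambda pair: pair[1]))
    return fingerprints


a = winnow_sketch("the quick brown fox jumps")
b = winnow_sketch("a quick brown fox sleeps")
# Compare hash values only; positions differ between documents.
shared = {h for _, h in a} & {h for _, h in b}
```

Two texts that share a run of at least ``w + k - 1`` alphanumeric characters contain an identical window of hashes, so they are guaranteed to share at least one fingerprint hash. That is the property document comparison relies on.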

Default Hash Function


Quite honestly, I did not know what hash function to use. The paper did
not talk about it. So I decided to use a part of SHA-1; more precisely,
the last 16 bits of the digest.
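
One way to take the last 16 bits of a SHA-1 digest is sketched below. The helper name ``sha1_low16`` is hypothetical and the UTF-8 encoding is an assumption; the package's actual default may differ in detail.

```python
import hashlib


def sha1_low16(text):
    """Hypothetical helper: return the last 16 bits of SHA-1(text) as an int."""
    # Assumption: text is a str and is encoded as UTF-8 before hashing.
    digest = hashlib.sha1(text.encode("utf-8")).digest()  # 20 raw bytes
    # The final two bytes of the digest are its last 16 bits.
    return int.from_bytes(digest[-2:], "big")


value = sha1_low16("runru")  # always falls in the range 0..65535
```

Truncating to 16 bits keeps fingerprints small, but two unrelated k-grams then collide with a probability of roughly 1 in 65,536, so a matching hash is a hint of shared text rather than proof.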

Custom Hash Function
~~~~~~~~~~~~~~~~~~~~

You may use your own hash function as demonstrated below.

.. code:: python

    import hashlib

    def hash_md5(text):
        # hashlib operates on bytes, so encode the text before hashing
        hs = hashlib.md5(text.encode('utf-8'))
        hs = hs.hexdigest()
        hs = int(hs, 16)

        return hs

    # Override the hash function
    winnow.hash_function = hash_md5

    winnow('The cake was a lie')

Lower Bound of Fingerprint Density
----------------------------------

(TODO: Write this section)
