fiddup

Utility to find similar files based on filename or hash.

Fiddup

Version 3.0.0 | MIT License | Flake8 | Tests | Stable Build

File DeDuplicator

A small tool to quickly scan a directory for files with similar names. Useful for going through archives of books, documents, downloads, movies, music, and so on.

Two modes are available: Assistant (name-based comparison) and Hashmode (hash-based comparison).

Fiddup is non-destructive. It will report similarities and duplicates, but it will not remove them.

To keep things fast and memory use bounded, hashmode hashes only parts of each file. If you get false positives, first try increasing the --chunk_count flag (default: 5).
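To illustrate why a higher chunk count reduces false positives, here is a minimal sketch of partial hashing. It is not Fiddup's actual implementation: the helper name `partial_hash`, the `chunk_size` parameter, the evenly spaced offsets, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import os


def partial_hash(path, chunk_count=5, chunk_size=4096):
    """Hash up to `chunk_count` evenly spaced chunks of a file (hypothetical helper)."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Mix in the file size so files of different lengths get different digests.
    digest.update(str(size).encode())
    with open(path, "rb") as handle:
        if size <= chunk_count * chunk_size:
            # Small file: reading everything is cheaper than seeking around.
            digest.update(handle.read())
        else:
            # Spread the chunk offsets evenly from the start to the last full chunk.
            last_offset = size - chunk_size
            for i in range(chunk_count):
                offset = (last_offset * i) // max(chunk_count - 1, 1)
                handle.seek(offset)
                digest.update(handle.read(chunk_size))
    return digest.hexdigest()
```

Two files of equal size that differ only in a region no chunk covers produce the same digest in this sketch. Sampling more chunks covers more of the file, which is the trade-off the --chunk_count help text describes as "Higher = more accuracy = Slower".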

Installation

From PyPI

pip3 install fiddup

From Source

  • git clone https://github.com/jarviscodes/fiddup

  • cd fiddup

  • python setup.py install

Usage

(env) E:\Users\Jarvis\PycharmProjects\fiddup>python -m fiddup --help
Usage: python -m fiddup [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  assistant
  hashmode

Fiddup v3.0.0
Usage: python -m fiddup assistant [OPTIONS]

Options:
  -i, --in_path TEXT     Path to scan for duplicates.  [required]
  -t, --threshold FLOAT  Similarity threshold. Assistant will only show
                         similarities > this.
  -e, --extensions TEXT  List of extensions to scan for. Specify multiple with
                         e.g.: -e zip -e txt -e pdf.  [required]
  -d, --directory        Include directories in comparison. Only available in
                         assistant mode.
  -v, --verbose          Show verbose output.
  --help                 Show this message and exit.

Fiddup v3.0.0
Usage: python -m fiddup hashmode [OPTIONS]

Options:
  -i, --in_path TEXT     Path to scan for duplicates.  [required]
  -e, --extensions TEXT  List of extensions to scan for. Specify multiple with
                         e.g.: -e zip -e txt -e pdf.  [required]
  -v, --verbose          Show verbose output.
  --chunk_count INTEGER  Number of chunks to read from files while hashing.
                         Higher = more accuracy = Slower.
  --help                 Show this message and exit.

Assistant

Outputs a table with the columns filename1, filename2, and name similarity. Useful when sorting things out manually by name.
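The idea behind a name-similarity table can be sketched as follows. This is not Fiddup's own code: the README does not say which similarity metric or threshold scale the tool uses, so the sketch assumes the standard library's `difflib.SequenceMatcher` and a 0 to 1 scale, and `similar_names` is a hypothetical helper.

```python
import difflib
import itertools
import os


def similar_names(filenames, threshold=0.7):
    """Return (name1, name2, ratio) rows for name pairs scoring above `threshold`."""
    rows = []
    for first, second in itertools.combinations(filenames, 2):
        # Compare lowercased stems so the extension and casing do not dominate the score.
        stem_a = os.path.splitext(first)[0].lower()
        stem_b = os.path.splitext(second)[0].lower()
        ratio = difflib.SequenceMatcher(None, stem_a, stem_b).ratio()
        if ratio > threshold:
            rows.append((first, second, round(ratio, 2)))
    # Most similar pairs first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


for row in similar_names(["holiday_2020.zip", "Holiday_2020 (1).zip", "taxes.pdf"]):
    print(row)
```

Every pair of names is compared, so the work grows quadratically with the number of files; that is acceptable for a manual review of one directory but worth keeping in mind for very large archives.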

Hashmode

Hashes the files and compares them by content rather than by name.
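Content-based comparison boils down to grouping files by digest. The sketch below shows that grouping step under some assumptions: `find_duplicates` is a hypothetical helper, SHA-256 is an arbitrary choice, and for simplicity it hashes whole files, whereas Fiddup is described as hashing only parts of each file.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(in_path, extensions):
    """Group files under `in_path` by content hash; return groups with 2+ files."""
    suffixes = tuple("." + ext.lower().lstrip(".") for ext in extensions)
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(in_path):
        for name in files:
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(root, name)
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                # Read in blocks so large files are never loaded into memory at once.
                for block in iter(lambda: handle.read(65536), b""):
                    digest.update(block)
            groups[digest.hexdigest()].append(path)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Like Fiddup, this sketch only reports duplicates and never deletes anything; deciding what to remove is left to the user.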

Testing

python -m unittest discover -s tests

or

python -m pytest
