uhferret

  • 1.3.7
  • Rubygems

= UHFerret

homepage:: https://peterlane.netlify.org/ferret/
source:: https://notabug.org/peterlane/uhferret-gem/releases

== Description

UHFerret is a copy-detection tool. It analyses large sets of documents to find pairs of documents that share substantial amounts of lexically copied text. Documents may contain either natural language (e.g. English) or computer programs (in C-family languages).
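As a rough illustration of the trigram idea credited in the Acknowledgements, the sketch below scores two texts by the overlap of their word trigrams. It is not uhferret's own code: the tokenising rule, the Jaccard-style ratio and the method names word_trigrams and trigram_similarity are assumptions made for this example only.

```ruby
require 'set'

# Hypothetical helper: collect the set of word trigrams in a text.
# The tokenising rule here (lowercase letters, digits, apostrophes) is an
# assumption; uhferret's real tokeniser may behave differently.
def word_trigrams(text)
  words = text.downcase.scan(/[a-z0-9']+/)
  words.each_cons(3).map { |trigram| trigram.join(' ') }.to_set
end

# Hypothetical helper: Jaccard-style overlap, i.e. the number of shared
# trigrams divided by the number of distinct trigrams in either text.
def trigram_similarity(text_a, text_b)
  a = word_trigrams(text_a)
  b = word_trigrams(text_b)
  union = a | b
  return 0.0 if union.empty?

  (a & b).size.to_f / union.size
end

puts trigram_similarity('the quick brown fox jumps',
                        'the quick brown fox sleeps')
```

Texts that share long runs of words share many trigrams and score close to 1, while independently written texts share few and score close to 0.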

This library provides a Ruby wrapper around uhferret suitable for scripting, a command-line executable, 'uhferret', and a simple server version, 'uhferret-server'.

NB: to install uhferret, Ruby must be able to compile and build C extensions.

== Use

=== Command Line

Usage: uhferret [options] file1 file2 ...
    -h, --help                       help message
    -c, --code                       process documents as code
    -t, --text                       process documents as text (default)
    -d, --data-table                 output similarity table (default)
    -l, --list-trigrams              output trigram list
    -a, --all-comparisons            output list of all comparisons
    -x, --xml-report FILE            generate xml report from two documents
    -f, --definition-file FILE       read document names from file

To compute the similarities of a set of files, use:

$ uhferret file1.txt file2.txt ...

An XML report can be generated for a pair of files using:

$ uhferret -x outfile.xml file1.txt file2.txt

The XML report can be displayed in a browser using the style sheet 'uhferret.xsl' in the examples folder, and then printed from the browser.
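When driving the command line from a script, it can help to build the argument list in one place. The following is a minimal sketch that uses only the -c and -x options from the usage text above; the method name uhferret_command and its keyword arguments are hypothetical and not part of the gem.

```ruby
# Hypothetical helper: build an argument list for the 'uhferret' executable.
# Only the documented -c and -x options are used.
def uhferret_command(files, code: false, xml_report: nil)
  # The usage text says the xml report is generated from two documents.
  if xml_report && files.size != 2
    raise ArgumentError, 'an xml report needs exactly two files'
  end

  args = ['uhferret']
  args << '-c' if code
  args += ['-x', xml_report] if xml_report
  args + files
end

command = uhferret_command(%w[file1.txt file2.txt], xml_report: 'outfile.xml')
puts command.join(' ')

# To actually run it (requires uhferret to be installed and on the PATH):
#   system(*command)
```

Passing the arguments to system as separate strings avoids shell quoting problems with file names that contain spaces.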

=== Program

Ferret can also be used as a library, and called from within a program. For example:

  ferret = Ferret.new
  ferret.add 'filename1.txt'
  ferret.add 'filename2.txt'
  ferret.run
  ferret.output_similarity_table

This creates a new instance of Ferret, adds two documents, runs the comparison, and then outputs the similarity between the two.

=== Server

Usage: uhferret-server [options]
    -h, --help                       help message
    -p, --port n                     port number
    -f, --folder FOLDER              base folder

The folder used to store the processed files defaults to 'FerretFiles', and the port defaults to 2000. Initial address: http://localhost:2000/ferret/home

NB: The server uses some *nix commands, and so currently does not work under Windows.
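The server options and the initial address can be tied together in a small script. This is a sketch only: the method names are hypothetical, and it assumes that the '/ferret/home' path stays the same when a port other than the default is chosen.

```ruby
require 'uri'

# Hypothetical helper: argument list for 'uhferret-server', using the
# documented -p and -f options and the documented default values.
def uhferret_server_command(port: 2000, folder: 'FerretFiles')
  ['uhferret-server', '-p', port.to_s, '-f', folder]
end

# Hypothetical helper: the address to open once the server is running.
# Assumes the '/ferret/home' path does not depend on the port.
def uhferret_home_url(port: 2000)
  URI::HTTP.build(host: 'localhost', port: port, path: '/ferret/home').to_s
end

puts uhferret_server_command(port: 2000).join(' ')
puts uhferret_home_url(port: 2000)
```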

== Acknowledgements

UHFerret has been developed at the University of Hertfordshire by members of the Plagiarism Detection Group. The original concept of using trigrams for measuring copying was developed by Caroline Lyon and James Malcolm. JunPeng Bao, Ruth Barrett and Bob Dickerson also contributed to the development of earlier versions of Ferret.

Package last updated on 10 Nov 2020