
pl-copyfind
Ideas are based on those found in CopyFind/WCopyFind at plagiarism.bloomfieldmedia.com.
A plagiarism comparison function
This project was inspired by Dr Lou Bloomfield's CopyFind/WCopyFind Windows programs (http://plagiarism.bloomfieldmedia.com/z-wordpress/software/wcopyfind/). The algorithm is equivalent, but there are marked differences.
pl-copyfind does not:
- download and extract the 'text' from various file formats. Depending upon the platform (either nodejs or a browser), there are a few solutions to this:
  - textract. This is a 'one stop shop' for reading in a plethora of file formats. Note that this cannot be used in a browser solution, as it requires external binaries to be installed (although a server-side solution can be used for this).
- generate output files, although an optional html output is available.
pl-copyfind does:
PhraseLength: 6, // Shortest Phrase to Match
WordThreshold: 100, // Fewest Matches to Report
SkipLength: 20, // Words this long are skipped. Needs bSkipLongWords
MismatchTolerance: 2, // Most Imperfections to Allow
MismatchPercentage: 80, // Minimum % of Matching Words
bIgnoreCase: false, // Ignore Letter Case
bIgnoreNumbers: false, // Ignore Numbers
bIgnoreOuterPunctuation: false, // Ignore Outer Punctuation
bIgnorePunctuation: false, // Ignore Punctuation
bSkipLongWords: false, // Skip Long Words
bSkipNonwords: false, // Skip Non-Words
bBuildReport: true, // Generate html output
bBriefReport: true, // Show an html report of matches with lead-in and lead-out text for context (otherwise shows the full source text). Needs bBuildReport
bTerseReport: false // Show ONLY the matching text. Needs bBuildReport
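The option list above can be held in a plain object and merged with per-call overrides. The following is only a sketch: it assumes the listed values are the defaults, and `withDefaults` and `dependencyWarnings` are hypothetical helpers (not part of pl-copyfind) that illustrate the "Needs ..." notes in the comments.

```javascript
// Values copied from the option list above (assumed to be the defaults).
var DEFAULTS = {
    PhraseLength: 6,
    WordThreshold: 100,
    SkipLength: 20,
    MismatchTolerance: 2,
    MismatchPercentage: 80,
    bIgnoreCase: false,
    bIgnoreNumbers: false,
    bIgnoreOuterPunctuation: false,
    bIgnorePunctuation: false,
    bSkipLongWords: false,
    bSkipNonwords: false,
    bBuildReport: true,
    bBriefReport: true,
    bTerseReport: false
};

// Hypothetical helper: returns a new object and leaves DEFAULTS untouched.
function withDefaults(overrides) {
    return Object.assign({}, DEFAULTS, overrides || {});
}

// Hypothetical helper: lists overrides that would have no effect because
// the option they depend on is switched off.
function dependencyWarnings(overrides) {
    overrides = overrides || {};
    var opts = withDefaults(overrides);
    var warnings = [];
    if ('SkipLength' in overrides && !opts.bSkipLongWords)
        warnings.push('SkipLength needs bSkipLongWords');
    if (!opts.bBuildReport) {
        if (overrides.bBriefReport)
            warnings.push('bBriefReport needs bBuildReport');
        if (overrides.bTerseReport)
            warnings.push('bTerseReport needs bBuildReport');
    }
    return warnings;
}

console.log(withDefaults({ PhraseLength: 3, WordThreshold: 3 }));
console.log(dependencyWarnings({ SkipLength: 10 }));
```

Copying into a fresh object keeps the shared defaults from being mutated when the same options object is later extended, for example with cached hash data.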
See the demos folder for a complete working example that does not require a web server to execute (just open index.html from your local file system to try it out).
// Compare one source text against one candidate text
var copyfind = require('pl-copyfind');
// ...
var options = { PhraseLength: 3, WordThreshold: 3, bIgnoreCase: true };
var src_text = "original text is here. lorem ipsum dolorem est";
var check_text = "I plagiarised lorem ipsum DOLOREM est and I reckon I can get away with it";
copyfind(src_text, check_text, options, function(err, data) {
    if (err)
        throw new Error("Failed to compare: " + err.toString());
    if (!data.matches.length)
        return false; // no matches found
    console.log("Found " + data.matches.length + " matches"); // expect 1
    for (var i = 0; i < data.matches.length; i++) {
        var match = data.matches[i];
        // textL refers to the source text, textR to the checked text
        var orig_text = src_text.substr(match.textL.pos, match.textL.length);
        var copied_text = check_text.substr(match.textR.pos, match.textR.length);
        console.log("Match found: \n" + orig_text + "\nvs. \n" + copied_text + "\nat position : " + match.textR.pos);
    }
});
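The extraction step in the loop above can be pulled into a small function. This is a sketch only: `describeMatches` is a hypothetical helper, and it assumes each match carries `textL` and `textR` objects with numeric `pos` and `length` fields, as the example suggests. It uses `slice` instead of `substr`, and the match data is hand-built rather than real copyfind output.

```javascript
// Hypothetical helper, not part of pl-copyfind.
// Assumes match.textL / match.textR have numeric `pos` and `length`.
function describeMatches(srcText, checkText, matches) {
    return matches.map(function (match) {
        var l = match.textL;
        var r = match.textR;
        return {
            original: srcText.slice(l.pos, l.pos + l.length),
            copied: checkText.slice(r.pos, r.pos + r.length),
            position: r.pos
        };
    });
}

// Hand-built match standing in for real copyfind output.
var sampleSrc = "original text is here. lorem ipsum dolorem est";
var sampleCheck = "I plagiarised lorem ipsum DOLOREM est and I reckon I can get away with it";
var sampleMatches = [
    { textL: { pos: 23, length: 23 }, textR: { pos: 14, length: 23 } }
];

console.log(describeMatches(sampleSrc, sampleCheck, sampleMatches));
```

Returning plain objects keeps the reporting logic separate from the comparison call, so the same function can feed a console log, a test, or a UI.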
// Compare several source texts against several candidate texts
var copyfind = require('pl-copyfind');
// ...
var options = { PhraseLength: 3, WordThreshold: 3 };
var src_texts = ["original text is here. lorem ipsum dolorem est","This is another original text that is also dolorem est"];
var check_texts = ["I plagiarised lorem ipsum dolorem est and I reckon I can get away with it","I didn't do lorem est this time"];
copyfind(src_texts, check_texts, options, function(err, data) {
    if (err)
        throw new Error("Failed to compare: " + err.toString());
    if (!data.matches.length)
        return false; // no matches found on any text
    // data.matches[l][r] holds the matches between src_texts[l] and check_texts[r]
    for (var l = 0; l < src_texts.length; l++) {
        for (var r = 0; r < check_texts.length; r++) {
            for (var i = 0; i < data.matches[l][r].length; i++) {
                var match = data.matches[l][r][i];
                var orig_text = src_texts[l].substr(match.textL.pos, match.textL.length);
                var copied_text = check_texts[r].substr(match.textR.pos, match.textR.length);
                console.log("Match found: #["+l+"]\n" + orig_text + "\nvs. #["+r+"]\n" + copied_text + "\nat position : " + match.textR.pos);
            }
        }
    }
});
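When arrays are passed in, the example above indexes matches as `data.matches[l][r][i]`. A flat list is sometimes easier to sort or filter than three nested loops. The sketch below assumes exactly that nesting; `flattenMatches` is a hypothetical helper, and the sample matrix is hand-built rather than real copyfind output.

```javascript
// Hypothetical helper, not part of pl-copyfind.
// Assumes matrix[l][r] is an array of matches between source l and check r.
function flattenMatches(matrix) {
    var flat = [];
    (matrix || []).forEach(function (row, l) {
        (row || []).forEach(function (cell, r) {
            (cell || []).forEach(function (match) {
                flat.push({ srcIndex: l, checkIndex: r, match: match });
            });
        });
    });
    return flat;
}

// Hand-built 2 x 2 matrix: one match for pair (0,0), two for pair (1,0).
var sampleMatrix = [
    [[{ id: 'a' }], []],
    [[{ id: 'b' }, { id: 'c' }], []]
];

flattenMatches(sampleMatrix).forEach(function (entry) {
    console.log("source #" + entry.srcIndex + " vs. check #" + entry.checkIndex, entry.match);
});
```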
// Build an html report of the comparison
var copyfind = require('pl-copyfind');
// ...
var options = { bBuildReport: true };
var src_text = "original text is here. lorem ipsum dolorem est";
var check_text = "I plagiarised lorem ipsum dolorem est and I reckon I can get away with it";
copyfind(src_text, check_text, options, function(err, data) {
    if (err)
        throw new Error("Failed to compare: " + err.toString());
    alert(data.html);
});
// Re-use the source text's hash data across comparisons
var copyfind = require('pl-copyfind');
// ...
var options = { };
var src_text = "original text is here. lorem ipsum dolorem est";
var check_text1 = "I plagiarised lorem ipsum dolorem est and I reckon I can get away with it";
var check_text2 = "Another plagiarised lorem ipsum dolorem est and I reckon I can get away with it";
copyfind(src_text, check_text1, options, function(err, data) {
    if (err)
        throw new Error("Failed to compare: " + err.toString());
    alert("execution took " + data.executionTime + " ms");
    options.hashesL = data.hashesL; // save the hashdata. You *could* store this in a file cache too

    // re-uses hashesL for better performance; started from inside the first
    // callback so that hashesL is set before the second comparison begins
    copyfind(src_text, check_text2, options, function(err, data) {
        if (err)
            throw new Error("Failed to compare: " + err.toString());
        alert("execution took " + data.executionTime + " ms");
    });
});
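The comment above mentions that the hash data could be stored in a file cache. One possible shape for that under nodejs is sketched below. It assumes `data.hashesL` survives a JSON round trip, which the source does not confirm, and `saveHashes` and `loadHashes` are hypothetical helpers rather than part of pl-copyfind.

```javascript
var fs = require('fs');

// Hypothetical helper: writes the hash data to disk as JSON.
// Assumption: hashesL contains only JSON-serializable values.
function saveHashes(cacheFile, hashesL) {
    fs.writeFileSync(cacheFile, JSON.stringify(hashesL), 'utf8');
}

// Hypothetical helper: returns the cached hash data, or undefined when the
// file is missing or unreadable, in which case the hashes must be rebuilt.
function loadHashes(cacheFile) {
    try {
        return JSON.parse(fs.readFileSync(cacheFile, 'utf8'));
    } catch (e) {
        return undefined;
    }
}
```

A caller might set `options.hashesL = loadHashes(file)` before the comparison and call `saveHashes(file, data.hashesL)` in the callback. The cache is only valid while the source text and the comparison options stay the same, so a real cache would need a key that reflects both.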
This module and all its source are licensed under the GPL, which is the original licensing of the WCopyFind/CopyFind source. The license file can be found at [https://github.com/cmroanirgo/pl-copyfind/blob/master/LICENSE.md].
Please note that if you use this library as-is, your project need not be subject to what is commonly called 'GPL cancer'. Only if you embrace and extend the module must you also release your source code under a GPL license. Either way, it would be appreciated if the work done in this project were acknowledged in your source and information pages.