
A Node.js micro-module that scans a file and identifies the positions at which it can be split cleanly into multiple chunks, based on the line delimiter character.
npm install cleancut
var cleancut = require('cleancut');

var filename = './mockaroo_mockdata.csv';
var opts = {
  maxChunks : 10,     // default 2
  minSize : 1048576,  // default 1048576 bytes = 1 MB
  scanSize : 10240,   // default 10240 bytes = 10 KB
  linebreak : '\n'    // default '\n'
};

// Synchronous: no third argument
var results = cleancut(filename, opts);
console.log(results.splitAt);

// Promise: pass true as the third argument
cleancut(filename, opts, true)
  .then(function (results) {
    console.log(results.splitAt);
  });

// Callback: pass a function(err, results) as the third argument
cleancut(filename, opts, function (err, results) {
  if (err) { return console.error(err); }
  console.log(results.splitAt);
});
results.splitAt
[ { _id: 0, start: 0, end: 6358 },
{ _id: 1, start: 6359, end: 12725 },
{ _id: 2, start: 12726, end: 19124 },
{ _id: 3, start: 19125, end: 25433 },
{ _id: 4, start: 25434, end: 31815 },
{ _id: 5, start: 31816, end: 38130 },
{ _id: 6, start: 38131, end: 44506 },
{ _id: 7, start: 44507, end: 50845 },
{ _id: 8, start: 50846, end: 57229 },
{ _id: 9, start: 57230, end: 63533 } ]
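Once the cut points are known, each chunk can be read independently by byte range. The following is a minimal sketch and is not part of cleancut: `readChunk` is a hypothetical helper, and it assumes that `end` is an inclusive byte offset, which is what the sample output suggests (each `start` is the previous `end` plus one). The demo uses a small temporary file and hand-written cut points rather than real cleancut output.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read one chunk, treating `end` as an inclusive byte
// offset (an assumption based on the sample output above).
function readChunk(filename, chunk) {
  const length = chunk.end - chunk.start + 1;
  const buffer = Buffer.alloc(length);
  const fd = fs.openSync(filename, 'r');
  try {
    const bytesRead = fs.readSync(fd, buffer, 0, length, chunk.start);
    return buffer.toString('utf8', 0, bytesRead);
  } finally {
    fs.closeSync(fd);
  }
}

// Demo: a tiny temporary CSV file and hand-written cut points
const demoFile = path.join(os.tmpdir(), 'cleancut_demo_' + process.pid + '.csv');
fs.writeFileSync(demoFile, 'id,name\n1,ann\n2,bob\n');

const splitAt = [
  { _id: 0, start: 0, end: 7 },  // "id,name\n"
  { _id: 1, start: 8, end: 19 }  // "1,ann\n2,bob\n"
];

const chunks = splitAt.map(function (chunk) {
  return readChunk(demoFile, chunk);
});
fs.unlinkSync(demoFile);
console.log(chunks);
```

Because every chunk ends on a line delimiter, each one can be handed to a separate parser or worker without any line being split across two chunks.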
filename: the source file to cut cleanly (e.g. a very big CSV file)
opts: configuration object that defines how to cut cleanly
  maxChunks: the maximum number of chunks to cut the file into (default: 2)
  minSize: the minimum size of each chunk in bytes (default: 1048576 bytes)
  scanSize: the number of bytes to sample at each cut point (default: 10240 bytes)
  linebreak: the line delimiter (default: '\n')
callback(err, results) (optional): callback function with err and results arguments
  err: error message, if any
  results: result object
    srcfile: the source file to be cut
    linebreak: the line delimiter for the cut
    splitAt: array of objects specifying the cut points
      _id: chunk id
      start: start position in bytes
      end: end position in bytes
return: either a Promise<results> or results, with the same structure as in the callback

If callback is not defined, cleancut is a synchronous function that returns results.
If callback is defined or true, a Promise is returned.
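To illustrate how the maxChunks, minSize, scanSize and linebreak options could interact, here is one possible way to locate clean cut points. This is a sketch under stated assumptions, not cleancut's actual implementation: `findCutPoints` is a hypothetical function that jumps to an approximate target offset, samples scanSize bytes there, and ends the chunk at the first line delimiter found in that sample. It stops cutting if no delimiter appears in the sample.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical function, not part of cleancut. Returns [{ _id, start, end }]
// with inclusive byte offsets, each chunk ending on a line delimiter.
function findCutPoints(filename, opts) {
  const maxChunks = opts.maxChunks || 2;
  const minSize = opts.minSize || 1048576;
  const scanSize = opts.scanSize || 10240;
  const linebreak = Buffer.from(opts.linebreak || '\n');

  const fileSize = fs.statSync(filename).size;
  // Aim for equal-sized chunks, but never smaller than minSize
  const chunkSize = Math.max(minSize, Math.ceil(fileSize / maxChunks));
  const sample = Buffer.alloc(scanSize);
  const cuts = [];
  let start = 0;

  const fd = fs.openSync(filename, 'r');
  try {
    while (cuts.length < maxChunks - 1) {
      const target = start + chunkSize;
      if (target >= fileSize) break;
      // Sample scanSize bytes at the target and look for the delimiter
      const bytesRead = fs.readSync(fd, sample, 0, scanSize, target);
      const idx = sample.subarray(0, bytesRead).indexOf(linebreak);
      if (idx === -1) break; // no delimiter in the sample: stop cutting
      const end = target + idx + linebreak.length - 1;
      if (end >= fileSize - 1) break; // the rest belongs to the final chunk
      cuts.push({ _id: cuts.length, start: start, end: end });
      start = end + 1;
    }
  } finally {
    fs.closeSync(fd);
  }

  // The final chunk runs to the end of the file
  if (start < fileSize) {
    cuts.push({ _id: cuts.length, start: start, end: fileSize - 1 });
  }
  return cuts;
}

// Demo on a small temporary file (22 bytes, four lines of uneven length)
const sampleFile = path.join(os.tmpdir(), 'cutpoints_demo_' + process.pid + '.csv');
fs.writeFileSync(sampleFile, 'a,1\nbb,22\nccc,333\nd,4\n');
const cuts = findCutPoints(sampleFile, {
  maxChunks: 2, minSize: 1, scanSize: 8, linebreak: '\n'
});
fs.unlinkSync(sampleFile);
console.log(cuts);
```

Sampling only scanSize bytes per cut point means the whole file never has to be read, which is the likely reason such an option exists. The trade-off in this sketch is that a line longer than scanSize can prevent a cut from being found.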