
atok-parser
Writing parsers is quite a common but sometimes lengthy task. To ease this process atok-parser leverages the atok tokenizer and performs the basic steps to set up a streaming parser, such as:
track([Boolean])
: keep track of the line and column positions to be used when building errors. Note that when set, tracking incurs a performance penalty.
write(), end(), pause() and resume()
: predefined stream methods
Helpers:
whitespace()
number()
float()
word()
string()
utf8()
chunk()
stringList()
match()
noop()
wait()
It is published on node package manager (npm). To install, do:
npm install atok-parser
A silly example to illustrate the various predefined variables and the parser definition. It parses a float number and returns the value via its #parse() method.
function myParser (options) {
  function handler (num) {
    // The options are set from the myParser function parameters
    // self is already set to the Parser instance
    if ( options.check && !isFinite(num) )
      return self.emit('error', new Error('Invalid float: ' + num))
    self.emit('data', num)
  }
  // the float() and whitespace() helpers are provided by atok-parser
  atok.float(handler)
  atok.whitespace()
}
var Parser = require('..').createParser(myParser)
// Add the #parse() method to the Parser
Parser.prototype.parse = function (data) {
  var res
  // One (silly) way to make parse() look synchronous...
  this.once('data', function (data) {
    res = data
  })
  this.write(data)
  // ...write() is synchronous
  return res
}
// Instantiate a parser
var p = new Parser({ check: true })
// Parse a valid float
var validfloat = p.parse('123.456 ')
console.log('parsed data is of type', typeof validfloat, 'value', validfloat)
// The following data will produce an invalid float and an error
p.on('error', console.error)
var invalidfloat = p.parse('123.456e1234 ')
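The parse() trick above relies on event listeners running before write() returns. A minimal sketch of that pattern, using only Node's built-in EventEmitter and a hypothetical ToyParser class (not part of atok-parser) whose write() emits 'data' synchronously:

```javascript
// Sketch only: ToyParser is a hypothetical stand-in, not the atok-parser class.
const { EventEmitter } = require('events')

class ToyParser extends EventEmitter {
  // Assumed behaviour: 'data' is emitted before write() returns
  write (data) {
    this.emit('data', parseFloat(data))
  }

  parse (data) {
    let res
    // The listener is registered first, then fired from inside write()
    this.once('data', function (value) { res = value })
    this.write(data)
    return res
  }
}

const toy = new ToyParser()
const parsed = toy.parse('123.456 ')
console.log(typeof parsed, parsed)
```

If no 'data' event fires during write(), for example because an 'error' was emitted instead, res stays undefined and the once() listener remains registered for the next 'data' event.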
createParserFromFile(file[, parserOptions, parserEvents, atokOptions])
: return a parser class (Function) based on the input file.
The following variables are made available to the parser javascript code:
atok {Object}
: atok tokenizer instantiated with the provided options. Also set as this.atok. DO NOT DELETE.
self {Object}
: reference to this
Predefined methods:
write(data)
end([data])
pause()
resume()
debug([logger (Function)])
track(flag (Boolean))
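To give an idea of what position tracking involves, here is a small sketch of a line and column tracker fed with streamed chunks. The createTracker() function and its feed() and error() methods are hypothetical names for illustration; atok's own tracking may work differently.

```javascript
// Hypothetical position tracker; not the atok implementation.
function createTracker () {
  let line = 1
  let column = 1
  return {
    // Advance the position over one chunk of input
    feed (chunk) {
      for (const ch of chunk) {
        if (ch === '\n') {
          line++
          column = 1
        } else {
          column++
        }
      }
    },
    // Build an error annotated with the current position
    error (message) {
      const err = new Error(message + ' at line ' + line + ', column ' + column)
      err.line = line
      err.column = column
      return err
    }
  }
}

const tracker = createTracker()
tracker.feed('1.5\n2.')
tracker.feed('5\nxyz')
const trackedError = tracker.error('Invalid float')
console.log(trackedError.message)
```

Scanning every character for line breaks is extra work on each write, which is one plausible reason tracking carries a performance penalty.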
Events automatically forwarded from tokenizer to parser:
drain
debug
createParser(data[, parserOptions, parserEvents, atokOptions])
: same as createParserFromFile() but with supplied content instead of a file name
Helpers are a set of standard Atok rules organized to match a specific type of data. If the data is encountered, the handler is fired with the results. If not, the rule is ignored. The behaviour of a single helper is the same as a single Atok rule:
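on failure, go to the next rule, unless continue(jump, jumpOnFail) was applied to the helper
on success, go back to the first rule of the rule set, unless continue(jump) was applied to the helper
on success, switch to the given rule set if next(ruleSetId) was applied to the helper
rules can be jumped over with continue(jump, jumpOnFail). A helper has exactly the size of a single rule, which greatly helps when defining complex rules.
// Parse a whitespace separated list of floats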
var myParser = [
'atok.float(function (n) { self.emit("data", n) })'
, 'atok.continue(-1, -2)'
, 'atok.whitespace()'
]
var Parser = require('atok-parser').createParser(myParser)
var p = new Parser
p.on('data', function (num) {
console.log(typeof num, num)
})
p.end('0.133 0.255')
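The jumps in the example above can be easier to follow with a toy model of rule stepping. The sketch below is not atok's implementation: runRules() and the rule object shape are hypothetical, and the semantics are assumptions (a match restarts at the first rule, a failure moves to the next rule, and continue(jump, jumpOnFail) replaces those moves with "current index + 1 + jump", so -1 re-runs the same rule).

```javascript
// Toy model of rule stepping under the assumptions stated above.
function runRules (rules, input) {
  const out = []
  let pos = 0
  let i = 0
  let fails = 0 // consecutive failures, used to stop on unmatched input
  while (pos < input.length && i >= 0 && i < rules.length) {
    const rule = rules[i]
    const m = rule.pattern.exec(input.slice(pos))
    if (m) {
      fails = 0
      pos += m[0].length
      if (rule.handler) out.push(rule.handler(m[0]))
      // Default on success: back to the first rule
      i = rule.jump === undefined ? 0 : i + 1 + rule.jump
    } else {
      if (++fails >= rules.length) break
      // Default on failure: try the next rule
      i = rule.jumpOnFail === undefined ? i + 1 : i + 1 + rule.jumpOnFail
    }
  }
  return out
}

const floats = runRules([
  { pattern: /^\d+(\.\d+)?/, handler: Number },
  // Mirrors continue(-1, -2) applied to whitespace():
  // on success re-run this rule, on failure go back to the float rule
  { pattern: /^\s+/, jump: -1, jumpOnFail: -2 }
], '0.133 0.255')
console.log(floats)
```

In this model the float rule consumes a number, fails on the space and falls through to the whitespace rule, which loops on itself until it fails and hands control back to the float rule.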
Arguments are not required. If no handler is specified, the [data] event will be emitted with the corresponding data.
whitespace(handler)
: ignore consecutive spaces, tabs, line breaks.
handler(whitespace)
number(handler)
: process positive integers
handler(num)
float(handler)
: process float numbers. NB. the result can be an invalid float (NaN or Infinity).
handler(floatNumber)
word(handler)
: process a word containing letters, digits and underscores
handler(word)
string([start, end, esc,] handler)
: process a delimited string. If end is not supplied, it is set to start.
handler(string)
utf8([start, end,] handler)
: process a delimited string containing UTF-8 encoded characters. If end is not supplied, it is set to start.
handler(UTF-8String)
chunk(charSet, handler)
:
handler(chunk)
stringList([start, end, separator,] handler)
: process a delimited list of strings
handler(listOfStrings)
match(start, end, stringQuotes, handler)
: find a matching pattern (e.g. bracket matching), skipping string content if required
handler(token)
noop(next)
: passthrough. Does not do anything except apply the given properties (useful to branch rules without having to use atok#saveRuleSet() and atok#loadRuleSet())
wait(atokPattern[...atokPattern], handler)
: wait for the given pattern. Nothing happens until data is received that triggers the pattern. Must be preceded by continue() to work properly. Typical usage: when expecting a string, the starting quote has been received but not the ending one, so wait until it arrives and then resume the rules workflow.
nvp([nameCharSet, separator, endPattern,] handler)
: parse a named value pair (defaults: nameCharSet={ start: 'aA0_', end: 'zZ9_' }, separator='=', endPattern={ firstOf: ' \t\n\r' }). Disable endPattern by setting it to '' or [].
handler(name, value)
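To make the nvp() defaults concrete, here is a plain JavaScript sketch of what they describe: a name made of letters, digits and underscores, an '=' separator, and a value ending at the first space, tab, newline or carriage return. The parsePairs() function is hypothetical and regex-based; it is not how atok-parser implements the helper, and passing values as strings is an assumption.

```javascript
// Sketch of the nvp() defaults; not the atok-parser implementation.
function parsePairs (input, handler) {
  // name: [A-Za-z0-9_]+, separator: '=', value: up to the first of ' \t\n\r'
  const pair = /([A-Za-z0-9_]+)=([^ \t\n\r]*)[ \t\n\r]/g
  let m
  while ((m = pair.exec(input)) !== null) {
    handler(m[1], m[2])
  }
}

const pairs = []
parsePairs('host=example.org port=8080\n', function (name, value) {
  pairs.push([name, value])
})
console.log(pairs)
```

Note that this sketch requires the end character to be present, so a trailing pair without a terminating whitespace character is not reported.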
A set of examples is located under the examples/ directory.