
Ohm is a parser generator for building languages and interpreters. It provides a way to define grammars and parse text according to those grammars. Ohm is particularly useful for creating domain-specific languages, interpreters, and compilers.
Defining Grammars
This feature allows you to define a grammar using Ohm's syntax. The example defines a simple arithmetic grammar that can parse expressions involving addition, subtraction, multiplication, and division.
const ohm = require('ohm-js');
const grammar = ohm.grammar(`
  Arithmetic {
    Exp = AddExp
    AddExp = AddExp "+" MulExp  -- plus
           | AddExp "-" MulExp  -- minus
           | MulExp
    MulExp = MulExp "*" PriExp  -- times
           | MulExp "/" PriExp  -- divide
           | PriExp
    PriExp = "(" Exp ")"  -- paren
           | number
    number = digit+
  }
`);
Parsing Input
Once a grammar is defined, you can use it to parse input strings. This example parses an arithmetic expression and checks if the parsing succeeded.
const input = '3 + 5 * (10 - 4)';
const matchResult = grammar.match(input);
if (matchResult.succeeded()) {
  console.log('Parsing succeeded!');
} else {
  console.log('Parsing failed.');
}
Semantic Actions
Ohm allows you to define semantic actions that can be performed on the parse tree. This example defines an 'eval' operation to evaluate arithmetic expressions parsed by the grammar.
const semantics = grammar.createSemantics().addOperation('eval', {
  Exp: function(e) { return e.eval(); },
  AddExp_plus: function(a, _, b) { return a.eval() + b.eval(); },
  AddExp_minus: function(a, _, b) { return a.eval() - b.eval(); },
  MulExp_times: function(a, _, b) { return a.eval() * b.eval(); },
  MulExp_divide: function(a, _, b) { return a.eval() / b.eval(); },
  PriExp_paren: function(_1, e, _2) { return e.eval(); },
  number: function(digits) { return parseInt(this.sourceString, 10); }
});
const result = semantics(matchResult).eval();
console.log(result);
PEG.js is a simple parser generator for JavaScript that produces fast parsers with excellent error reporting. Like Ohm, it is based on parsing expression grammars (PEGs), but PEG.js grammars embed JavaScript actions inline in the grammar, whereas Ohm keeps semantic actions separate from the grammar.
ANTLR (Another Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It is more feature-rich and supports multiple target languages, making it more versatile than Ohm, but also more complex to use.
Nearley is a simple, fast, and powerful parser toolkit for JavaScript. It uses Earley parsing, which can handle any context-free grammar, including ambiguous ones that PEG-based parsers like PEG.js cannot express. That flexibility can make Nearley harder to use than Ohm for simple grammars.
Ohm is a parsing toolkit consisting of a library and a domain-specific language. You can use it to parse custom file formats or quickly build parsers, interpreters, and compilers for programming languages.
The Ohm language is based on parsing expression grammars (PEGs), which are a formal way of describing syntax, similar to regular expressions and context-free grammars. The Ohm library provides a JavaScript interface for creating parsers, interpreters, and more from the grammars you write.
Some awesome things people have built using Ohm:
The easiest way to get started with Ohm is to use the interactive editor. Alternatively, you can play with one of the following examples on JSFiddle:
For use in the browser:
Download ohm.js (development version, with full source and comments) or ohm.min.js (a minified version for faster page loads).
Add a new script tag to your page, and set the src attribute to the path of the file you just downloaded. E.g.:
<script src="ohm.js"></script>
This creates a global variable named ohm.
If you are using Node.js, you can just install the ohm-js package using npm:
npm install ohm-js
This will install Ohm in the local node_modules folder. Use require to access it from a Node script:
const ohm = require('ohm-js');
To use Ohm, you need a grammar that is written in the Ohm language. The grammar provides a formal definition of the language or data format that you want to parse. There are a few different ways you can define an Ohm grammar:
The simplest option is to define the grammar directly in a JavaScript string and instantiate it using ohm.grammar(). In most cases, you should use a template literal with String.raw:
const myGrammar = ohm.grammar(String.raw`
  MyGrammar {
    greeting = "Hello" | "Hola"
  }
`);
In Node.js, you can define the grammar in a separate file, then read the file's contents and instantiate it using ohm.grammar(contents):
In myGrammar.ohm:
MyGrammar {
  greeting = "Hello" | "Hola"
}
In JavaScript:
const fs = require('fs');
const ohm = require('ohm-js');
const contents = fs.readFileSync('myGrammar.ohm', 'utf-8');
const myGrammar = ohm.grammar(contents);
For more information, see Instantiating Grammars in the API reference.
Once you've instantiated a grammar object, use the grammar's match() method to recognize input:
const userInput = 'Hello';
const m = myGrammar.match(userInput);
if (m.succeeded()) {
  console.log('Greetings, human.');
} else {
  console.log("That's not a greeting!");
}
The result is a MatchResult object. You can use the succeeded() and failed() methods to see whether the input was recognized or not.
For more information, see the main documentation.
Ohm has two tools to help you debug grammars: a text trace, and a graphical visualizer.
You can try the visualizer online.
To see the text trace for a grammar g, just use the g.trace() method instead of g.match. It takes the same arguments, but instead of returning a MatchResult object, it returns a Trace object. Calling its toString method returns a string describing all of the decisions the parser made when trying to match the input. For example, here is the result of g.trace('ab').toString() for the grammar G { start = letter+ }:
ab         ✓ start ⇒  "ab"
ab           ✓ letter+ ⇒  "ab"
ab             ✓ letter ⇒  "a"
ab                 ✓ lower ⇒  "a"
ab                   ✓ Unicode [Ll] character ⇒  "a"
b              ✓ letter ⇒  "b"
b                  ✓ lower ⇒  "b"
b                    ✓ Unicode [Ll] character ⇒  "b"
               ✗ letter
                   ✗ lower
                     ✗ Unicode [Ll] character
                   ✗ upper
                     ✗ Unicode [Lu] character
                   ✗ unicodeLtmo
                     ✗ Unicode [Ltmo] character
           ✓ end ⇒  ""
If you've written an Ohm grammar that you'd like to share with others, see our suggestions for publishing grammars.
All you need to get started:
git clone https://github.com/harc/ohm.git
cd ohm
npm install
NOTE: We recommend using the latest Node.js stable release.
npm test runs the unit tests.
npm run test-watch re-runs the unit tests every time a file changes.
npm run build builds dist/ohm.js and dist/ohm.min.js, which are stand-alone bundles that can be included in a webpage.
When editing Ohm's own grammar (in src/ohm-grammar.ohm), run npm run bootstrap to re-build Ohm and test your changes.
Before submitting a pull request, be sure to add tests, and ensure that npm run prepublish runs without errors.
FAQs
An object-oriented language for parsing and pattern matching
The npm package ohm-js receives a total of 157,046 weekly downloads. As such, ohm-js popularity was classified as popular.
We found that ohm-js demonstrated an unhealthy version release cadence and project activity because the last version was released a year ago. It has 3 open source maintainers collaborating on the project.