
bson-transpilers - npm Package Compare versions

Comparing version 0.13.4 to 0.13.5

download-antlr.js

127

CONTRIBUTING.md

@@ -7,3 +7,3 @@ # Contributing to bson-transpilers

to create a parse tree. As `ANTLR` is written in Java, you will need to set up a
few tools before being able to compile this locally.

@@ -15,20 +15,5 @@ Make sure you have Java installed:

Download `ANTLR4`:
```shell
$ cd /usr/local/lib && curl -O http://www.antlr.org/download/antlr-4.7.2-complete.jar
```
You will then need to add it to your `$CLASSPATH`:
```shell
$ export CLASSPATH=".:/usr/local/lib/antlr-4.7.2-complete.jar:$CLASSPATH"
```
Alias `antlr4` and `grun`:
```shell
$ alias antlr4='java -Xmx500M -cp "/usr/local/lib/antlr-4.7.2-complete.jar:$CLASSPATH" org.antlr.v4.Tool' && alias grun='java org.antlr.v4.gui.TestRig'
```
_I strongly suggest using an IDE that will help you visualize ANTLR trees (JetBrains has a good plugin).
Otherwise you can use the java version of the grammar and compile it with
`javac <Language>*.java && grun <Language> <StartRule> -gui`.
[This might be helpful](https://github.com/antlr/antlr4/blob/master/doc/getting-started.md)._

@@ -45,19 +30,21 @@

- __OUTPUT=:__ comma-separated output languages you want to test. Also called "target" language.
- __MODE=:__ comma-separated names of the test files (without .yaml) that you want to run
```shell
OUTPUT=csharp INPUT=shell MODE=native,bson npm run test
```
# How it works
See also the original presentation: https://drive.google.com/file/d/1jvwtR3k9oBUzIjL4z_VtpHvdWahfcjTK/view
## Compilation Stages
Similar to how many transpilers work, this package parses the input
string into a tree and then generates code from the tree using the [Visitor
pattern](https://en.wikipedia.org/wiki/Visitor_pattern).
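As a rough illustration of the parse-then-visit flow described above, here is a minimal sketch of a visitor walking a small tree. The node shapes (`type`, `children`, `key`, `value`, `text`) and the `SketchVisitor` class are hypothetical and are not the structures ANTLR4 or this package actually produce.

```javascript
// Hypothetical parse-tree nodes; the real trees are built by ANTLR4.
const tree = {
  type: 'Object',
  children: [
    { type: 'Pair', key: 'x', value: { type: 'String', text: '1' } }
  ]
};

// A visitor has one method per node type and dispatches on node.type.
class SketchVisitor {
  visit(node) {
    const method = this[`visit${node.type}`];
    if (typeof method !== 'function') {
      throw new Error(`No visit method for node type ${node.type}`);
    }
    return method.call(this, node);
  }

  visitObject(node) {
    const pairs = node.children.map((child) => this.visit(child));
    return `{${pairs.join(', ')}}`;
  }

  visitPair(node) {
    return `'${node.key}': ${this.visit(node.value)}`;
  }

  visitString(node) {
    return `'${node.text}'`;
  }
}

const generated = new SketchVisitor().visit(tree);
console.log(generated); // {'x': '1'}
```

Each `visit<Type>` method only knows how to render its own node and delegates to `visit` for children, which is what lets code generation be layered on top of a fixed tree shape.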
### Step 1: Parsing
Parsing and tree generation are handled by ANTLR4.
The grammar files are located in the `grammars` folder, and the JavaScript
parser/lexer/etc. generated from the grammar are located in `lib/antlr`. To make
changes to the grammar, you have to modify the `.g4` file in `grammars`, then
run `npm run compile`. You should never directly modify files in `lib`.

@@ -70,3 +57,3 @@

ANTLR generates a "shell" visitor class for each tree in
`lib/antlr/<grammar name>Visitor.js`. It contains an empty method

@@ -79,3 +66,3 @@ for each node in the parse tree.

Because the project is designed to handle multiple input languages and multiple
output languages, the tree visitation stage is split into parts. The first part

@@ -85,8 +72,8 @@ is handled in the visitor class defined in `codegeneration/<input language>/Visitor.js`.

This visitor class <b>is specific to the input language</b> and can only visit
a tree generated by that grammar. The visitor visits each node and uses a
[string template](#templates) defined in either the [symbol table](#symbols)
or the [type table](#types) to generate code in the output language.
For expressions that are too complex for a string template, the visitor will call an
`emit` method defined in the [Generator](#step-3:-generator). The general rule is
that emit methods aren't required unless you're doing something very unusual! Or

@@ -96,6 +83,6 @@ if you need to do any tree manipulation, since the templates only have access to the

If the node requires special treatment for all output languages, the visitor will
define a `process<type>` method that will do some pre-processing before calling
the appropriate string template or `emit` method. An example is `processDate` in
the JS visitor, which constructs a date object from the input and passes it to the
Date template.
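To make the `process<type>` idea concrete, the sketch below pre-processes a date node into a real `Date` object before handing it to a template. The node shape (`args`), the `SketchVisitor` class, and the `ISODate(...)` output format are assumptions made for illustration; the actual `processDate` in the JS visitor may differ.

```javascript
// Hypothetical visitor holding the templates for one output language.
class SketchVisitor {
  constructor(templates) {
    this.templates = templates;
  }

  // Pre-processing shared by every output language: turn the raw
  // argument into a real Date object before any template runs.
  processDate(node) {
    const date = new Date(node.args[0]);
    if (Number.isNaN(date.getTime())) {
      throw new Error(`Invalid date argument: ${node.args[0]}`);
    }
    // Only the final string rendering is output-language specific.
    return this.templates.Date(date);
  }
}

// Hypothetical Date template: receives the constructed Date object.
const visitor = new SketchVisitor({
  Date: (date) => `ISODate("${date.toISOString()}")`
});

const rendered = visitor.processDate({ type: 'Date', args: ['2019-01-01T00:00:00Z'] });
console.log(rendered); // ISODate("2019-01-01T00:00:00.000Z")
```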

@@ -111,6 +98,6 @@

### Step 3: Generator
The other half of the tree visitation stage. Each output language will
have a Generator class defined in `codegeneration/<output language>/Generator.js`.
The Generator class generates code, so it is <b>specific to the output language</b>.
The Generator class is a subclass of the input language's visitor class.
So for example, translating between JS and Python, the order of inheritance will be:

@@ -122,3 +109,3 @@ 1. `lib/antlr/ECMAScriptVisitor.js` ["empty" superclass, specific to the tree built by ANTLR]

For nodes that cannot be translated using
templates, the Generator class will define a method called `emit<type>` which

@@ -131,17 +118,17 @@ takes in a tree node, some optional metadata, and returns the transformed string.
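The inheritance chain and the `emit<type>` hook can be sketched as three small classes. All class names, the node shape, and the `Int64(...)` output are hypothetical stand-ins; the real ANTLR-generated visitor contains only empty methods, and the real classes live in the files named above.

```javascript
// Stand-in for the ANTLR-generated visitor (simplified: it dispatches on type).
class AntlrVisitorSketch {
  visit(node) {
    return this[`visit${node.type}`](node);
  }

  visitChildren(node) {
    return (node.children || []).map((child) => this.visit(child));
  }
}

// Stand-in for the input-language visitor.
class JsVisitorSketch extends AntlrVisitorSketch {
  visitNumber(node) {
    return node.text;
  }

  visitCall(node) {
    // Prefer an emit<type> method if the Generator subclass defines one.
    const emit = this[`emit${node.name}`];
    if (typeof emit === 'function') {
      return emit.call(this, node, {});
    }
    return `${node.name}(${this.visitChildren(node).join(', ')})`;
  }
}

// Stand-in for the output-language Generator.
class PythonGeneratorSketch extends JsVisitorSketch {
  // emit<type>: takes a tree node and optional metadata, returns a string.
  emitLong(node, metadata) {
    const [value] = this.visitChildren(node);
    return `Int64(${value})`;
  }
}

const generator = new PythonGeneratorSketch();
const viaEmit = generator.visit({
  type: 'Call',
  name: 'Long',
  children: [{ type: 'Number', text: '5' }]
});
const viaDefault = generator.visit({ type: 'Call', name: 'ObjectId', children: [] });
console.log(viaEmit, viaDefault); // Int64(5) ObjectId()
```

Because the Generator sits at the bottom of the chain, the input-language visitor can look up `emit` methods on `this` without knowing which output language it is paired with.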

When the visitor in [step #1](#step-1:-parsing) reaches a function call, variable, attribute access, or other "identifier"
expression it needs a way of knowing what that symbol evaluates to in order to know if it is valid.
### Symbols
Each input language has its own set of symbols that are part of the
language. The majority of symbols supported in the input languages are BSON types
(e.g. `Int32`, `ObjectId`, etc.) but there are a few native types like `RegExp` and
`Date` that are not BSON-specific. In order for the transpiler to know if a symbol
is undefined, we store symbol information in a
[Symbol Table](https://en.wikipedia.org/wiki/Symbol_table).
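A symbol table used this way can be as simple as a keyed lookup that fails for unknown names. The sketch below assumes a `Map` with a made-up `kind` field; it is not the package's actual table format, which is compiled from YAML.

```javascript
// Hypothetical symbol table: identifier -> metadata.
const symbolTable = new Map([
  ['ObjectId', { kind: 'bson' }],
  ['Int32', { kind: 'bson' }],
  ['RegExp', { kind: 'native' }],
  ['Date', { kind: 'native' }]
]);

// Look up an identifier; an unknown name means the symbol is undefined.
function resolveSymbol(name) {
  const symbol = symbolTable.get(name);
  if (symbol === undefined) {
    throw new Error(`Symbol '${name}' is undefined`);
  }
  return symbol;
}

console.log(resolveSymbol('Date').kind); // native
```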
#### Symbol Metadata
The visitor uses the symbol table to determine if a symbol is undefined, but the
symbol table also stores some metadata so the visitor can do type and other validity checks. The symbols
are defined in [YAML](https://en.wikipedia.org/wiki/YAML) in the
`symbols/<input language>/symbols.yaml` file. A symbol definition looks like:

@@ -184,8 +171,8 @@

### Types
Each input language also has a set of types that are part of the language.
The set of types that are universal for all languages (e.g. "primitives" and
"literals", like `string`, `integer`, etc.) are defined in the file
`symbols/basic_types.yaml`.
Types that are specific to the input language are defined in
`symbols/<input language>/types.yaml`. These include BSON types, e.g. classes like `ObjectId`, and

@@ -195,23 +182,23 @@ language-specific types like `RegExp` and `Date`. The types are defined in the same

NOTE: It is important not to mix up symbols and types. They can share
the same identifier and are closely related, but the distinction matters
because otherwise we will end up with invalid code.
The **symbol** `ObjectId` has attributes like `ObjectId.fromString(...)`
and is a constructor, so `ObjectId()` is valid. The **type** `ObjectId` has
attributes like `ObjectId().toString()` and is *a variable*, so `ObjectId()()`
is not valid and will fail with `ObjectId() is not callable` or a similar error.
You can think of types as instantiated symbols, if that's helpful.
So: `ObjectId.toString()` and `ObjectId().fromString('x')` are both invalid, while
`ObjectId().toString()` and `ObjectId.fromString('x')` are both valid.
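The symbol/type distinction can be modelled with two separate entries that share an identifier. The field names below (`id`, `callable`, `returns`, `attr`) and the helper functions are assumptions for this sketch, not the package's real metadata schema.

```javascript
// The type: what you get after calling the constructor. Not callable.
const ObjectIdType = {
  id: 'ObjectId',
  callable: false,
  attr: { toString: {} }
};

// The symbol: the constructor itself. Callable, returns the type.
const ObjectIdSymbol = {
  id: 'ObjectId',
  callable: true,
  returns: ObjectIdType,
  attr: { fromString: {} }
};

// Calling an entry is only valid if it is marked callable.
function call(entry) {
  if (!entry.callable) {
    throw new Error(`${entry.id}() is not callable`);
  }
  return entry.returns;
}

// Attribute access is only valid for attributes the entry itself declares.
function access(entry, name) {
  if (!Object.prototype.hasOwnProperty.call(entry.attr, name)) {
    throw new Error(`${name} is not an attribute of ${entry.id}`);
  }
  return entry.attr[name];
}

// Helper: run a check and report whether it passed validation.
function isValid(check) {
  try {
    check();
    return true;
  } catch (error) {
    return false;
  }
}

console.log(isValid(() => access(call(ObjectIdSymbol), 'toString'))); // ObjectId().toString()
console.log(isValid(() => access(ObjectIdSymbol, 'fromString')));     // ObjectId.fromString
console.log(isValid(() => access(ObjectIdSymbol, 'toString')));       // ObjectId.toString
console.log(isValid(() => call(call(ObjectIdSymbol))));               // ObjectId()()
```

The first two checks pass and the last two fail, matching the valid and invalid expressions listed in the note above.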
## Templates
The symbol table includes an additional piece of metadata, called a `template`.
These are functions that accept strings and return strings, and are responsible for
doing the string transformations from one language's syntax to another language's syntax.
They are defined in `symbols/<output language>/templates.yaml`. This is where
the majority of code generation happens, so the templates are **specific to the output language**.
Some templates take additional arguments, which are commented in `symbols/sample_template.yaml`.
Templates can be split into `template` and `argsTemplate`. For symbols that are function
calls, the `argsTemplate` is a function that gets applied to the arguments in case they
need rearranging between languages.
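As a sketch of the `template` / `argsTemplate` split, the example below renders a function call whose arguments need to be swapped in the output language. The `Swap` symbol, the `make_swap` output name, and the `renderCall` helper are invented for illustration; the real template signatures are documented in `symbols/sample_template.yaml`.

```javascript
// Hypothetical templates for one output language, keyed by symbol name.
const templates = {
  Swap: {
    // template: renders the callee from the input-language name.
    template: (name) => `make_${name.toLowerCase()}`,
    // argsTemplate: renders the argument list, reordering it on the way.
    argsTemplate: (first, second) => `(${second}, ${first})`
  }
};

// Combine both templates into the final call expression.
function renderCall(name, args) {
  const entry = templates[name];
  if (entry === undefined) {
    throw new Error(`No template for ${name}`);
  }
  return entry.template(name) + entry.argsTemplate(...args);
}

const call = renderCall('Swap', ["'a'", '1']);
console.log(call); // make_swap(1, 'a')
```

Both functions take strings and return strings, so they can be stored as plain text in a YAML file and compiled into functions when the symbol table is built.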

@@ -232,3 +219,3 @@

- `codegeneration/<output language>/Generator.js` - The generator for the specific output language.
- `lib/symbol-table/<input language>to<output language>.js` - The symbol table for
the input+output combination.

@@ -304,3 +291,3 @@

}
/* ... and every other language that can compile to your language.
* Make sure you update the getTree method, as well as the input-language

@@ -338,13 +325,13 @@ * specific visitor and the ANTLR visitor to match the input lang. */

};
``
```
9. Next thing is tests! You must go through each test file and add the results of
compiling each input into your output language under the `output` field.
```yaml
Document:
- input:
javascript: "{x: '1'}"
shell: "{x: '1'}"
python: "{'x': '1'}"
output:
javascript: "{\n 'x': '1'\n}"

@@ -351,0 +338,0 @@ python: "{\n 'x': '1'\n}"

package.json

{
"name": "bson-transpilers",
"version": "0.13.4",
"version": "0.13.5",
"apiVersion": "0.0.1",

@@ -15,8 +15,9 @@ "productName": "BSON Transpilers",

"start": "node index.js",
"precompile": "node download-antlr.js",
"compile": "npm run antlr4-js && npm run antlr4-py && npm run symbol-table",
"antlr4-js": "java -Xmx500M -cp '/usr/local/lib/antlr-4.7.2-complete.jar:$CLASSPATH' org.antlr.v4.Tool -Dlanguage=JavaScript -lib grammars -o lib/antlr -visitor -Xexact-output-dir grammars/ECMAScript.g4",
"antlr4-py": "java -Xmx500M -cp '/usr/local/lib/antlr-4.7.2-complete.jar:$CLASSPATH' org.antlr.v4.Tool -Dlanguage=JavaScript -lib grammars -o lib/antlr -visitor -Xexact-output-dir grammars/Python3.g4",
"antlr4-js": "java -Xmx500M -cp './antlr-4.7.2-complete.jar:$CLASSPATH' org.antlr.v4.Tool -Dlanguage=JavaScript -lib grammars -o lib/antlr -visitor -Xexact-output-dir grammars/ECMAScript.g4",
"antlr4-py": "java -Xmx500M -cp './antlr-4.7.2-complete.jar:$CLASSPATH' org.antlr.v4.Tool -Dlanguage=JavaScript -lib grammars -o lib/antlr -visitor -Xexact-output-dir grammars/Python3.g4",
"symbol-table": "node compile-symbol-table.js",
"test": "npm run symbol-table && mocha",
"prepublish": "npm run compile",
"prepublishOnly": "npm run compile",
"check": "mongodb-js-precommit './codegeneration/**/*{.js,.jsx}' './test/**/*.js' index.js",

@@ -23,0 +24,0 @@ "ci": "npm run check && npm run test"

README.md

@@ -9,2 +9,4 @@ # BSON-Transpilers

See also the original presentation: https://drive.google.com/file/d/1jvwtR3k9oBUzIjL4z_VtpHvdWahfcjTK/view
# Usage

@@ -60,3 +62,3 @@

### Errors
There are a few different error classes thrown by `bson-transpilers`, each with
their own error code:

@@ -102,3 +104,3 @@

#### BsonTranspilersSyntaxError
###### code: E_BSONTRANSPILERS_SYNTAX
This will throw if you have a syntax error. For example missing a colon in

@@ -114,3 +116,3 @@ Object assignment, or forgetting a comma in array definition:

// ✔: neither of these will throw
{ key: 'beep' }

@@ -134,4 +136,4 @@ [ 'beep', 'boop', 'beepBoop' ]

###### code: E_BSONTRANSPILERS_UNIMPLEMENTED
Thrown if there is a feature in the input code that is not currently supported by the
transpiler.

@@ -141,3 +143,3 @@ #### BsonTranspilersRuntimeError

A generic runtime error will be thrown for all errors that are not covered by the
above list of errors. These are usually constructor requirements, for example
when an unsupported flag is passed to `RegExp()`:

@@ -155,3 +157,3 @@

###### code: E_BSONTRANSPILERS_INTERNAL
Thrown when something has gone wrong within compilation and an error has
occurred. If you see this error, please create [an issue](https://github.com/mongodb-js/bson-transpilers/issues) on GitHub!
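Callers may want to branch on the error code rather than on the class. The sketch below assumes the code is exposed as a `code` property on the thrown error, and uses a stand-in `compileSketch` function instead of the real transpiler so that it runs on its own; neither is confirmed by the text above.

```javascript
// Stand-in for a compile call; throws an error carrying a `code` property.
function compileSketch(input) {
  if (!input.includes(':')) {
    const error = new Error('missing colon in object assignment');
    error.code = 'E_BSONTRANSPILERS_SYNTAX';
    throw error;
  }
  return input;
}

// Map the error codes listed above to a short, user-facing description.
function describeFailure(error) {
  switch (error.code) {
    case 'E_BSONTRANSPILERS_SYNTAX':
      return 'syntax error in input';
    case 'E_BSONTRANSPILERS_UNIMPLEMENTED':
      return 'feature not supported by the transpiler';
    case 'E_BSONTRANSPILERS_INTERNAL':
      return 'internal error, please file an issue';
    default:
      return 'unexpected error';
  }
}

// Wrap compilation so callers get a result object instead of an exception.
function tryCompile(input) {
  try {
    return { ok: true, value: compileSketch(input) };
  } catch (error) {
    return { ok: false, reason: describeFailure(error) };
  }
}

console.log(tryCompile("{ key: 'beep' }")); // ok: true
console.log(tryCompile("{ key 'beep' }"));  // ok: false, syntax error in input
```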

@@ -158,0 +160,0 @@
