cherow - npm Package Compare versions

Comparing version 0.0.15 to 0.0.16

package.json

		{
		"name": "cherow",
		"version": "0.0.15",
		"version": "0.0.16",
		"description": "",
		@@ -63,10 +63,3 @@ "main": "dist/cherow.js",
		"typescript": "^2.5.2"
		},
		"dependencies": {
		"acorn": "^5.1.2",
		"acorn-jsx": "^4.0.1",
		"benchmark": "^2.1.4",
		"esprima": "^4.0.0",
		"jazzle": "^0.5.880"
		}
		}

README.md

		@@ -1,131 +0,174 @@
		[![Build Status](https://travis-ci.org/cherow/cherow.svg?branch=master)](https://travis-ci.org/cherow/cherow)
		# Cherow

		Cherow is a very fast, standard-compliant [ECMAScript](http://www.ecma-international.org/publications/standards/Ecma-262.htm) parser written in ECMAScript.
		Work in progress

		It strictly follows the ECMAScript® 2018 Language Specification and should parse according to that specification.
		ANNOUNCEMENT: I have stopped playing around and am now actually finishing this parser.

		It's safe to use in production even though I'm not done with this parser. I'm finishing the parser in the dev branch.
		Everything now parses according to the ECMAScript specification.

		## Features
		It's now safe to use Cherow in development, and it covers the same ground as Esprima and Acorn
		do. It even parses many cases that those parsers fail on.

		- Full support for ECMAScript® 2018 [(ECMA-262 8th Edition)](http://www.ecma-international.org/publications/standards/Ecma-262.htm)
		- Stage 3 proposals (experimental)
		- Support for JSX, a syntax extension for React
		- Optional tracking of syntax node location (index-based and line-column)
		- 4650 unit tests with full code coverage
		In fact, Cherow now parses everything according to the ECMAScript specs.

		## ESNext features
		NOTE: There is no point in opening issue tickets. I know what I'm doing :)

		`Stage 3` feature support. These features need to be enabled with the `next` option.
		It's safe to use Cherow now, but I'm still not willing to move the code to the master branch until the internals are fixed and
		my remaining TODO items are done.

		- Dynamic Import
		- Async generators
		- Async Await
		- Object spread
		- Optional catch binding
		- Regular Expression's new `DotAll` flag
		Over 4400 unit tests should tell you that Cherow works!

		## Options
		Be aware that the code will change rapidly as I make progress.

		* `next` - Enables `ECMAScript Next` support and lets you use proposals at `stage 3` or higher, such as `Dynamic Import`.
		* `directives` - Enables support for [directive nodes](https://github.com/estree/estree/pull/152)
		* `raw` - Enables the raw property on literal nodes (Esprima and Acorn feature)
		* `comments` - Enables comment collection. Optional; either an array or a function. Works like [Acorn](https://github.com/ternjs/acorn)'s `onComment`.
		* `tokens` - If enabled, each token found will be passed to a function or collected in an array (work in progress)
		* `ranges` - Enables the start and end character offsets on the AST node.
		* `locations` - Enables location tracking. (4 min fix, but on hold for now)
		* `jsx` - Enables JSX
		## Current stage

		# API
		At the current stage Cherow is working 100%, but it is not optimized. It's tested against both Acorn and Esprima, and
		I'm also parsing huge libraries with this parser, both ES5 and ES6 code.

		Cherow can be used to perform syntactic analysis of a JavaScript program, or lexical analysis (tokenization).
		To parse something, you can do:

		```js

		import { parseModule, parseScript } from 'cherow';

		// Parsing script
		cherow.parseScript('const fooBar = 123;');
		// parse in sloppy mode
		parseScript('function foo() { return "bar"; } ');

		// Parsing module code
		cherow.parseModule('const fooBar = 123;');
		// parse module code
		parseModule('function foo() { return "bar"; } ');

		```
		## Parsing with options
		// parse and get an ESTree output like Esprima

		parseScript('function foo() { return "bar"; } ', { raw: true, directives: true });

		```js
		// parse and get an ESTree output like Acorn

		// Parsing script
		cherow.parseScript('const fooBar = 123;', { ranges: true, raw: true, next: true});
		parseScript('function foo() { return "bar"; } ', { raw: true, ranges: true });

		```

		### Libraries fail to parse

		### Collecting comments
		A few libraries fail to parse because I haven't implemented all of the code yet, or the code is unfinished.
		One example is that I haven't finished `for in`, so a known issue is that it will throw
		on `unexpected token in`. This is expected. Note that this hasn't been completed yet because computed
		properties should be allowed with `in`, something most open source parsers haven't implemented.

		Collecting comments works just the same way as for Acorn
		```js
		And 2 libraries can't be parsed in strict mode due to issues in their source code. This is out of scope for Cherow.

		// Function
		cherow.parseScript('// foo',
		{
		comments: function(type, comment, start, end) {}
		}
		);
		I can also mention that Acorn has multiple regular expression bugs in its source code. This has no
		effect right now because I haven't implemented my regular expression parser, but it is still worth mentioning.

		// Array
		const commentArray = [];
		Here is the complete list, as far as I know, of libraries that Cherow can't parse at the moment:

		cherow.parseScript('// foo',
		{
		comments: commentArray
		}
		);
		- MooTools 1.4.5
		- jquery-1.9.1.js (fails on invalid `in` keyword)
		- yui-3.12.0 (fails on invalid `in` keyword)

		```
		Libraries with issues in source code

		## Acorn and Esprima
		- DashDash (fails only in strict or module code; parses in sloppy mode)
		- RX (onaggregates module only)

		If you prefer Acorn, you can use some of the options to let Cherow parse and give you the same output as you would get
		from Acorn. The same goes for Esprima.
		## Roadmap

		Here is how you do it:
		Mostly get tests to fail where they should fail according to TC39.

		Acorn
		3. Make sure things parse as they should
		4. Add back in the TypeScript parser code (it's written; it needs to be added)
		5. General cleanup and reduce amount of bitmasks
		6. Reduce code size

		```js
		## Important

		cherow.parseScript('{ a: b}', { raw: true, ranges: true });
		```
		JSX is 99% complete, and templates are missing support for escape sequences. This is on purpose: TypeScript handles JSX and templates differently, so this will be fixed after the TS parser code has been implemented.

		Esprima
		# Development process

		This is the most complex thing I'm doing. I'm using `http://astexplorer.net/` to keep track of what fails and what doesn't in various parsers. I'm also
		running `SpiderMonkey` and `V8`, and testing against `Node.js` itself, to validate that I'm doing the right things.

		Besides that, for every change I make in the code base, I parse around 30 different libraries just to validate that I didn't break anything!

		On top of that I'm running various benchmarks.

		So everyone can be sure that what I'm pushing to this repo actually works :)
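The "parse around 30 libraries on every change" check described above could be sketched roughly as follows. This is only an illustration: `runRegression` and `toyParse` are hypothetical names, and `toyParse` is a stand-in that merely checks brace balance so the snippet runs without Cherow installed. In practice the `parse` argument would be something like `parseScript`.

```javascript
// Hypothetical harness (not part of Cherow): runs a parse function over a
// list of library sources and collects the ones that throw.
function runRegression(parse, libraries) {
  const failures = [];
  for (const { name, source } of libraries) {
    try {
      parse(source);
    } catch (err) {
      failures.push({ name, message: err.message });
    }
  }
  return failures;
}

// Stand-in parser for this sketch only: it rejects unbalanced braces.
function toyParse(source) {
  let depth = 0;
  for (const ch of source) {
    if (ch === '{') depth++;
    else if (ch === '}' && --depth < 0) throw new SyntaxError('Unexpected token }');
  }
  if (depth !== 0) throw new SyntaxError('Unexpected end of input');
  return { type: 'Program' };
}

const failures = runRegression(toyParse, [
  { name: 'ok.js', source: 'function foo() { return 1; }' },
  { name: 'broken.js', source: 'function foo() { return 1;' },
]);
console.log(failures);
```

An empty `failures` array would mean every library still parses after the change.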


		## Location tracker

		As it stands, this parser has options you can use to get the same AST output as either Esprima or Acorn. By default, Esprima doesn't put any location tracking on the node; Acorn does. To get Acorn-style output, use `ranges: true` as an option when you parse.

		Columns and lines are still in progress.

		Note that Esprima and Acorn calculate ranges differently. You can see this in a template string with a simple identifier: Acorn sets the identifier's start value to 2 (after the template head), while Esprima calculates it as 0.

		Because of these differences, I'm still thinking about how to do this. For example, `Espree` uses Acorn but follows Esprima's location tracking layout.

		However, columns and lines can be activated with `ranges: true`, but the output values will be wrong because this is far from completed.

		I may end up adding an Esprima and Acorn mode option for backward compatibility.
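Until such a mode option exists, the two option sets used in the examples earlier in this README could be wrapped in a small helper. This is a minimal sketch: `FLAVORS` and `optionsFor` are hypothetical names and are not part of Cherow. Only the option names (`raw`, `directives`, `ranges`, `next`) come from this README.

```javascript
// Hypothetical helper, not part of Cherow: maps a flavor name to the option
// sets this README uses for Esprima-like and Acorn-like output.
const FLAVORS = {
  esprima: { raw: true, directives: true },
  acorn: { raw: true, ranges: true },
};

function optionsFor(flavor, extra = {}) {
  const base = FLAVORS[flavor];
  if (!base) throw new Error(`Unknown flavor: ${flavor}`);
  // Caller-supplied options win over the flavor defaults.
  return Object.assign({}, base, extra);
}

const acornOptions = optionsFor('acorn', { next: true });
console.log(acornOptions);
```

The result would be passed as the second argument when parsing, for example `parseScript(source, optionsFor('esprima'))`.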

		## Error tracking

		This is a complex process. Almost all open source parsers report either the wrong location or the wrong token position. One example
		is Esprima, which fails on an invalid computed property, `({[x]})`. In this case Esprima will throw and report the last brace, `}`, as the wrong token. In fact this has nothing to do with the invalid computed shorthand property.

		Cherow is designed from the ground up to fix all of these things and report errors correctly. In the case mentioned, Cherow will report the first bracket, `[`, as the wrong token, which is the start of the computed property.

		Here is an example of how I do it.

		It should fail on both `function static() { "use strict"; }` and `"use strict"; function static() {}` with the correct error location. Note the `"use strict";`
		directive in the function's body.

		So the source code for it looks like this:

		```js
		// if already in strict mode code, throw
		this.error(Errors.UnexpectedStrictReserved);
		// ... else record current location and mark
		// that we found a reserved word
		this.errorLocation = this.trackErrorLocation();
		this.flags |= Flags.ReservedWord;
		```
		In the function's body I check whether the parser state is in strict mode, and if that's the case, I check for the bitmask and throw the
		error at the recorded location.
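The record-now, throw-later idea can be shown end to end with a toy model. This is a sketch under assumptions, not Cherow's actual source: `ToyParser`, `onReservedName`, `onUseStrict` and `check` are hypothetical names, and the parser is driven by hand-written events instead of real tokens.

```javascript
// Toy model of the technique described above, not Cherow's real code.
const Flags = { Strict: 1 << 0, ReservedWord: 1 << 1 };

class ToyParser {
  constructor() {
    this.flags = 0;
    this.errorLocation = null;
  }

  // Called when a strict mode reserved word such as `static` is used as a name.
  onReservedName(index) {
    // Already in strict mode code: throw right away at the name.
    if (this.flags & Flags.Strict) this.error(index);
    // Otherwise record the location and mark that a reserved word was seen.
    this.errorLocation = index;
    this.flags |= Flags.ReservedWord;
  }

  // Called when a "use strict" directive is found.
  onUseStrict() {
    this.flags |= Flags.Strict;
    // A reserved word was seen earlier: throw at the recorded location.
    if (this.flags & Flags.ReservedWord) this.error(this.errorLocation);
  }

  error(index) {
    const err = new SyntaxError('Unexpected strict mode reserved word');
    err.index = index;
    throw err;
  }
}

// Feeds hand-written events to the toy parser and returns the error index.
function check(events) {
  const parser = new ToyParser();
  try {
    for (const [kind, index] of events) {
      if (kind === 'name') parser.onReservedName(index);
      else parser.onUseStrict();
    }
    return null;
  } catch (err) {
    return err.index;
  }
}

// function static() { "use strict"; }  -> `static` at 9, directive at 20
const late = check([['name', 9], ['strict', 20]]);
// "use strict"; function static() {}  -> directive at 0, `static` at 23
const early = check([['strict', 0], ['name', 23]]);
console.log(late, early);
```

In both orderings the reported index points at `static` and not at the directive or a later token, which is the behaviour the paragraph above describes.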

		cherow.parseScript('{ a: b}', { raw: true, directives: true });
		No magic! Just simple coding.

		```
		## Performance

		## Benchmarks
		Once again, almost all open source parsers have two or more deopts. Cherow is designed to avoid this. A good example is accessing another object shape to check values, such as `obj.type === "AssignmentExpression"`. This can and will cause a deopt.

		See the benchmarks [here](BENCHMARK.md)
		Performance is one of many reasons why developing this parser takes such a long time.

		## ESTree
		## TypeScript

		Cherow outputs a sensible syntax tree format as standardized by the [ESTree project](https://github.com/estree/estree), and does
		not add any "extra" properties to any of its nodes like [Esprima](https://github.com/jquery/esprima).
		Cherow will be extended so it can parse TypeScript. This will add around 1450 extra lines of code to the code base.

		However, there is a small difference from other parsers: Cherow outputs an `await` property on the `ForStatement` node.
		This is because of the support for `For Await` and `Async Generators`.
		TypeScript parsing code will be added back in after Cherow passes TC39 100% and the code size has been optimized and reduced.

		Flow will not be supported.

		## Contribution

		You are welcome to contribute. As a golden rule, always run benchmarks to verify that you haven't created any
		bottlenecks or done something that you shouldn't.
		## TC39

		Terms of contribution:
		At this stage Cherow is 90% TC39 compatible. There are a few tests that don't fail when they
		should. I'm working on it :)

		- Think twice before you try to implement anything
		- Minimum 1.5 million ops/sec for lightweight cases, and 800k to 1 million ops/sec for "heavy" cases
		- Avoid duplicating the source code
		- Create tests that cover what you have implemented
		There can also be bugs in bleeding-edge cases like Stage 3 features (enabled with the `next` option). One example
		is object rest spread. A few weeks ago the specs changed again, and I haven't updated it yet.

		## Other

		### Tokenizing

		It's there, but it is a work in progress. Use `tokens: true`. See TODO.

		### Comment collecting

		It's there, and working. It should work just like Acorn's `onComment`. Just use `comments: true`.

		### Tolerate mode

		I have yet to find a valid use case for it. Once I do, I will add it.

LICENSE
