Comparing version 1.0.0-rc.4 to 1.0.0
<!--mdast setext--> | ||
<!--lint disable maximum-line-length--> | ||
<!--lint disable maximum-line-length no-multiple-toplevel-headings--> | ||
<!--lint disable no-multiple-toplevel-headings--> | ||
1.0.0 / 2015-09-16 | ||
================== | ||
1.0.0-rc.4 / 2015-08-25 | ||
======================= | ||
* Refactor with final changes before 1.0.0 ([da92a14](https://github.com/wooorm/retext/commit/da92a14)) | ||
* Refactor parsers to work with retext ([772a6eb](https://github.com/wooorm/retext/commit/772a6eb)) | ||
* Add positional information to NLCST parsers ([d1ab2f0](https://github.com/wooorm/retext/commit/d1ab2f0)) | ||
* Refactor retext ([30c0e21](https://github.com/wooorm/retext/commit/30c0e21)) | ||
* Refactor parsers to work with retext ([9bec852](https://github.com/wooorm/retext/commit/9bec852)) | ||
* Update dependencies, dev-dependencies ([7f2afa6](https://github.com/wooorm/retext/commit/7f2afa6)) | ||
1.0.0-rc.3 / 2015-08-09 | ||
======================= | ||
* Add positional information to NLCST parsers ([5de2f80](https://github.com/wooorm/retext/commit/5de2f80)) | ||
1.0.0-rc.2 / 2015-07-31 | ||
======================= | ||
* Fix missing files ([7dac502](https://github.com/wooorm/retext/commit/7dac502)) | ||
1.0.0-rc.1 / 2015-07-31 | ||
======================= | ||
* Refactor retext ([e68af6a](https://github.com/wooorm/retext/commit/e68af6a)) | ||
0.6.0 / 2015-07-31 | ||
@@ -29,0 +14,0 @@ ================== |
@@ -28,5 +28,4 @@ /** | ||
'name': 'retext', | ||
'type': 'cst', | ||
'Parser': Parser, | ||
'Compiler': Compiler | ||
}); |
@@ -84,3 +84,3 @@ /** | ||
function compile() { | ||
return toString(this.file.namespace('retext').cst); | ||
return toString(this.file.namespace('retext').tree); | ||
} | ||
@@ -87,0 +87,0 @@ |
{ | ||
"name": "retext", | ||
"version": "1.0.0-rc.4", | ||
"version": "1.0.0", | ||
"description": "Extensible system for analysing and manipulating natural language", | ||
@@ -17,3 +17,3 @@ "license": "MIT", | ||
"parse-latin": "^2.0.0", | ||
"unified": "^1.0.0" | ||
"unified": "^2.0.0" | ||
}, | ||
@@ -36,3 +36,2 @@ "repository": { | ||
"jscs-jsdoc": "^1.0.0", | ||
"matcha": "^0.6.0", | ||
"mdast": "^1.0.0", | ||
@@ -42,3 +41,3 @@ "mdast-comment-config": "^1.0.0", | ||
"mdast-lint": "^1.0.0", | ||
"mdast-slug": "^1.0.0", | ||
"mdast-slug": "^2.0.0", | ||
"mdast-validate-links": "^1.0.0", | ||
@@ -60,5 +59,4 @@ "mocha": "^2.0.0", | ||
"build-md": "mdast . --quiet", | ||
"build": "npm run build-md && npm run build-bundle", | ||
"benchmark": "matcha benchmark.js" | ||
"build": "npm run build-md && npm run build-bundle" | ||
} | ||
} |
260
readme.md
@@ -5,15 +5,15 @@ # ![Retext](https://cdn.rawgit.com/wooorm/retext/master/logo.svg) | ||
**retext** is an extensible natural language system—by default using | ||
[**parse-latin**](https://github.com/wooorm/parse-latin) to transform natural | ||
language into **[NLCST](https://github.com/wooorm/nlcst/)**. | ||
**Retext** provides a pluggable system for analysing and manipulating natural | ||
language in JavaScript. NodeJS and the browser. Tests provide 100% coverage. | ||
**retext** is an extensible natural language processor with support for | ||
multiple languages. **Retext** provides a pluggable system for analysing | ||
and manipulating natural language in JavaScript. Node and the browser. | ||
100% coverage. | ||
> Rather than being a do-all library for Natural Language Processing (such as | ||
> [NLTK](http://www.nltk.org) or [OpenNLP](https://opennlp.apache.org)), | ||
> **retext** aims to be useful for more practical use cases (such as censoring | ||
> profane words or decoding emoticons, but the possibilities are endless) | ||
> instead of more academic goals (research purposes). | ||
> **retext** aims to be useful for more practical use cases (such as checking | ||
> for [insensitive words](https://github.com/wooorm/alex) or decoding | ||
> [emoticons](https://github.com/wooorm/retext-emoji)) instead of more academic | ||
> goals (research purposes). | ||
> **retext** is inherently modular—it uses plugins (similar to | ||
> [rework](https://github.com/reworkcss/rework/) for CSS) instead of providing | ||
> [mdast](https://github.com/wooorm/mdast/) for markdown) instead of providing | ||
> everything out of the box (such as | ||
@@ -31,34 +31,11 @@ > [Natural](https://github.com/NaturalNode/natural)). This makes **retext** a | ||
[Component.js](https://github.com/componentjs/component): | ||
**retext** is also available for [bower](http://bower.io/#install-packages), | ||
and [duo](http://duojs.org/#getting-started), and as an AMD, CommonJS, and | ||
globals module, [uncompressed](retext.js) and [compressed](retext.min.js). | ||
```bash | ||
component install wooorm/retext | ||
``` | ||
[Bower](http://bower.io/#install-packages): | ||
```bash | ||
bower install retext | ||
``` | ||
[Duo](http://duojs.org/#getting-started): | ||
```javascript | ||
var Retext = require('wooorm/retext'); | ||
``` | ||
UMD (globals/AMD/CommonJS) ([uncompressed](retext.js) and [compressed](retext.min.js)): | ||
```html | ||
<script src="path/to/retext.js"></script> | ||
<script> | ||
var retext = new Retext(); | ||
</script> | ||
``` | ||
## Usage | ||
The following example uses [**retext-emoji**](https://github.com/wooorm/retext-emoji) | ||
(to show emoji) and [**retext-smartypants**](https://github.com/wooorm/retext-smartypants) | ||
(for smart punctuation). | ||
to show emoji and [**retext-smartypants**](https://github.com/wooorm/retext-smartypants) | ||
for smart punctuation. | ||
@@ -84,12 +61,11 @@ Require dependencies: | ||
```javascript | ||
var doc = processor.process( | ||
'The three wise monkeys [. . .] sometimes called the ' + | ||
'three mystic apes--are a pictorial maxim. Together ' + | ||
'they embody the proverbial principle to ("see no evil, ' + | ||
'hear no evil, speak no evil"). The three monkeys are ' + | ||
'Mizaru (:see_no_evil:), covering his eyes, who sees no ' + | ||
'evil; Kikazaru (:hear_no_evil:), covering his ears, ' + | ||
'who hears no evil; and Iwazaru (:speak_no_evil:), ' + | ||
'covering his mouth, who speaks no evil.' | ||
); | ||
var doc = processor.process([ | ||
'The three wise monkeys [. . .] sometimes called the three mystic', | ||
'apes--are a pictorial maxim. Together they embody the proverbial', | ||
'principle to ("see no evil, hear no evil, speak no evil"). The', | ||
'three monkeys are Mizaru (:see_no_evil:), covering his eyes, who', | ||
'sees no evil; Kikazaru (:hear_no_evil:), covering his ears, who', | ||
'hears no evil; and Iwazaru (:speak_no_evil:), covering his mouth,', | ||
'who speaks no evil.' | ||
].join('\n')); | ||
``` | ||
@@ -100,9 +76,9 @@ | ||
```text | ||
The three wise monkeys […] sometimes called the three | ||
mystic apes—are a pictorial maxim. Together they | ||
embody the proverbial principle to (“see no evil, | ||
hear no evil, speak no evil”). The three monkeys are | ||
Mizaru (🙈), covering his eyes, who sees no evil; | ||
Kikazaru (🙉), covering his ears, who hears no evil; | ||
and Iwazaru (🙊), covering his mouth, who speaks no evil. | ||
The three wise monkeys […] sometimes called the three mystic | ||
apes—are a pictorial maxim. Together they embody the proverbial | ||
principle to (“see no evil, hear no evil, speak no evil”). The | ||
three monkeys are Mizaru (🙈), covering his eyes, who | ||
sees no evil; Kikazaru (🙉), covering his ears, who | ||
hears no evil; and Iwazaru (🙊), covering his mouth, | ||
who speaks no evil. | ||
``` | ||
@@ -132,3 +108,3 @@ | ||
`Object`: an instance of Retext: The returned object functions just like | ||
`Object` — an instance of Retext: The returned object functions just like | ||
**retext** (it has the same methods), but caches the `use`d plugins. This | ||
@@ -139,3 +115,3 @@ provides the ability to chain `use` calls to use multiple plugins, but | ||
### [retext](#api).process(value\[, done\]) | ||
### [retext](#api).process(value\[, [done](#function-doneerr-file-doc)\]) | ||
@@ -151,34 +127,44 @@ Parse a text document, apply plugins to it, and compile it into | ||
* `value` (`string`) — Text document; | ||
* `value` ([`VFile`](https://github.com/wooorm/vfile) or `string`) | ||
— Text document; | ||
* `done` (`function(err, doc, file)`, optional) — Callback invoked when the | ||
output is generated with either an error, or a result. Only strictly | ||
needed when async plugins are used. | ||
* `done` ([`Function`](#function-doneerr-file-doc), optional). | ||
**Returns** | ||
`string` or `null`: A document. Formatted in whatever plugins generate. | ||
The result is `null` if a plugin is asynchronous, in which case the callback | ||
`done` should’ve been passed (don’t worry: plugin creators make sure you know | ||
its async). | ||
`string?`: A document. Formatted in whatever plugins generate. The result is | ||
`null` if a plugin is asynchronous, in which case the callback `done` should’ve | ||
been passed (don’t worry: plugin creators make sure you know its async). | ||
### plugin | ||
### function done(err, [file](https://github.com/wooorm/vfile), doc) | ||
A plugin is simply a function, with `function(retext[, options])` as its | ||
signature. The first argument is the **Retext** instance a user attached the | ||
plugin to. The plugin is invoked when a user `use`s the plugin (not when a | ||
document is parsed) and enables the plugin to modify retext. | ||
Callback invoked when the output is generated with either an error, or the | ||
processed document (represented as a virtual file and a string). | ||
The plugin can return another function: `function(NLCSTNode, file[, next])`. | ||
This function is invoked when a document is parsed. | ||
**Parameters** | ||
## Plugins | ||
* `err` (`Error?`) — Reason of failure; | ||
* `file` ([`VFile?`](https://github.com/wooorm/vfile)) — Virtual file; | ||
* `doc` (`string?`) — Generated document. | ||
* [retext-content](https://github.com/wooorm/retext-content) | ||
— Append, prepend, remove, and replace content into/from Retext nodes; | ||
## Plugin | ||
* [retext-cst](https://github.com/wooorm/retext-cst) | ||
— (**[demo](http://wooorm.github.io/retext-cst/)**) | ||
— Encoding and decoding between AST (JSON) and TextOM object model; | ||
### function attacher([retext](#api)\[, options\]) | ||
A plugin is a function, which takes the **Retext** instance a user attached | ||
the plugin on as a first parameter and optional configuration as a second | ||
parameter. | ||
A plugin can return a `transformer`. | ||
### function transformer([node](https://github.com/wooorm/nlcst), [file](https://github.com/wooorm/vfile)\[, next\]) | ||
A transformer changes the provided document (represented as a node and a | ||
virtual file). | ||
Transformers can be asynchronous, in which case `next` must be invoked | ||
(optionally with an error) when done. | ||
## List of Plugins | ||
* [retext-directionality](https://github.com/wooorm/retext-directionality) | ||
@@ -196,2 +182,8 @@ — (**[demo](http://wooorm.github.io/retext-directionality/)**) | ||
* [retext-dutch](https://github.com/wooorm/retext-dutch) | ||
— Dutch language support; | ||
* [retext-english](https://github.com/wooorm/retext-english) | ||
— English language support; | ||
* [retext-emoji](https://github.com/wooorm/retext-emoji) | ||
@@ -201,9 +193,5 @@ — (**[demo](http://wooorm.github.io/retext-emoji/)**) | ||
* [retext-find](https://github.com/wooorm/retext-find) | ||
— Easily find nodes; | ||
* [retext-equality](https://github.com/wooorm/retext-equality) | ||
— Warn about possible insensitive, inconsiderate language; | ||
* [retext-inspect](https://github.com/wooorm/retext-inspect) | ||
— (**[demo](http://wooorm.github.io/retext-inspect/)**) | ||
— Nicely display nodes in `console.log` calls; | ||
* [retext-keywords](https://github.com/wooorm/retext-keywords) | ||
@@ -221,9 +209,2 @@ — (**[demo](http://wooorm.github.io/retext-keywords/)**) | ||
* [retext-link](https://github.com/wooorm/retext-link) | ||
— (**[demo](http://wooorm.github.io/retext-link/)**) | ||
— Detect links in text; | ||
* [retext-live](https://github.com/wooorm/retext-live) | ||
— Change a node based on a (new?) value; | ||
* [retext-metaphone](https://github.com/wooorm/retext-metaphone) | ||
@@ -241,9 +222,2 @@ — (**[demo](http://wooorm.github.io/retext-metaphone/)**) | ||
* [retext-range](https://github.com/wooorm/retext-range) | ||
— Sequences of content within a TextOM tree between two points; | ||
* [retext-search](https://github.com/wooorm/retext-search) | ||
— (**[demo](http://wooorm.github.io/retext-search/)**) | ||
— Search in a TextOM tree; | ||
* [retext-sentiment](https://github.com/wooorm/retext-sentiment) | ||
@@ -265,88 +239,28 @@ — (**[demo](http://wooorm.github.io/retext-sentiment/)**) | ||
* [retext-visit](https://github.com/wooorm/retext-visit) | ||
— (**[demo](http://wooorm.github.io/retext-visit/)**) | ||
— Visit nodes, optionally by type; | ||
## List of Utilities | ||
* [retext-walk](https://github.com/wooorm/retext-walk) | ||
— Walk trees, optionally by type. | ||
The following projects are useful when working with the syntax tree, | ||
[NLCST](https://github.com/wooorm/nlcst): | ||
## Desired Plugins | ||
* [wooorm/nlcst-to-string](https://github.com/wooorm/nlcst-to-string) | ||
— Stringify a node; | ||
> Hey! Want to create one of the following, or any other plugin, for | ||
> **retext** but not sure where to start? I suggest to read **retext-visit**’s | ||
> source code to see how it’s build first (it’s probably the most straight | ||
> forward to learn), and go from there. | ||
> Let me know if you still have any questions, go ahead and send me | ||
> [feedback](mailto:tituswormer@gmail.com) or [raise an | ||
> issue](https://github.com/wooorm/retext/issues). | ||
* [wooorm/nlcst-is-literal](https://github.com/wooorm/nlcst-is-literal) | ||
— Check whether a node is meant literally; | ||
* retext-date | ||
— Detect time and date in text; | ||
* [wooorm/nlcst-test](https://github.com/wooorm/nlcst-test) | ||
— Validate a NLCST node; | ||
* retext-frequent-words | ||
— Like **retext-keywords**, but based on frequency and stop-words | ||
instead of a POS-tagger; | ||
In addition, see [`wooorm/unist`](https://github.com/wooorm/unist#unist-node-utilties) | ||
for other utilities which work with **retext** nodes, but also with | ||
[**mdast**](https://github.com/wooorm/mdast) nodes. | ||
* retext-hyphen | ||
— Insert soft-hyphens where needed; this might have to be implemented | ||
with some sort of node which doesn’t stringify; | ||
And finally, see [`wooorm/vfile`](https://github.com/wooorm/vfile#related-tools) | ||
for a list of utilities for working with virtual files. | ||
* retext-location | ||
— Track the position of nodes (line, column); | ||
* retext-no-pants | ||
— Opposite of **retext-smartypants**; | ||
* retext-no-break | ||
— Inserts [non-breaking spaces](http://en.wikipedia.org/wiki/Non-breaking_space#Non-breaking_behavior) | ||
between things like “100 km”; | ||
* retext-profanity | ||
— Censor profane words; | ||
* retext-punctuation-pair | ||
— Detect which opening or initial punctuation, belongs to which closing | ||
or final punctuation mark (and vice versa); | ||
* retext-summary | ||
— Summarise text; | ||
* retext-sync | ||
— Detect changes in a textarea (or contenteditable?), sync the diffs over | ||
to a **retext** tree, let plugins modify the content, and sync the diffs | ||
back to the textarea; | ||
* retext-typography | ||
— Applies typographic enhancements, like (or using?) retext-smartypants | ||
and retext-hyphen; | ||
* retraverse | ||
— Like Estraverse. | ||
## Parsers | ||
* [parse-latin](https://github.com/wooorm/parse-latin) (**[demo](http://wooorm.github.io/parse-latin/)**) | ||
— default; | ||
* [parse-english](https://github.com/wooorm/parse-english) (**[demo](http://wooorm.github.io/parse-english/)**) | ||
— Specifically for English; | ||
* [parse-dutch](https://github.com/wooorm/parse-dutch) (**[demo](http://wooorm.github.io/parse-dutch/)**) | ||
— Specifically for Dutch; | ||
## Benchmark | ||
On a MacBook Air, it parses about 2 big articles, 25 sections, or 230 | ||
paragraphs per second. | ||
```text | ||
retext.parse(value, callback); | ||
325 op/s » A paragraph (5 sentences, 100 words) | ||
33 op/s » A section (10 paragraphs, 50 sentences, 1,000 words) | ||
3 op/s » An article (100 paragraphs, 500 sentences, 10,000 words) | ||
``` | ||
## Related | ||
* [nlcst](https://github.com/wooorm/nlcst) | ||
* [unist](https://github.com/wooorm/unist) | ||
* [unified](https://github.com/wooorm/unified) | ||
@@ -353,0 +267,0 @@ ## License |
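The new "Plugin" section in the readme hunk describes an attacher that may return a transformer, which receives a node and a virtual file and calls `next` when done. A minimal sketch of that shape follows. The plugin name `upperCaseWords`, its `only` option, and the stand-in tree are hypothetical and only illustrate the signatures named in the diff; nothing here calls retext itself.

```javascript
// Hypothetical plugin: `upperCaseWords` and its `only` option are
// illustrative names, not part of retext.
function upperCaseWords(retext, options) {
  var only = (options || {}).only || null;

  // Recursively walk an NLCST-like tree and upper-case text inside words.
  function walk(node, inWord) {
    var isWord = inWord || node.type === 'WordNode';

    if (isWord && node.type === 'TextNode') {
      if (!only || node.value === only) {
        node.value = node.value.toUpperCase();
      }
    }

    (node.children || []).forEach(function (child) {
      walk(child, isWord);
    });
  }

  // The transformer declares `next`, so it reports completion (or an
  // error) by invoking it, as the readme text requires.
  return function transformer(node, file, next) {
    var failure = null;

    try {
      walk(node, false);
    } catch (err) {
      failure = err;
    }

    next(failure);
  };
}

// Stand-in tree and file; plain objects play the roles of NLCST and VFile.
var tree = {
  type: 'RootNode',
  children: [
    {type: 'WordNode', children: [{type: 'TextNode', value: 'evil'}]},
    {type: 'WhiteSpaceNode', value: ' '}
  ]
};
var result = {called: false, err: undefined};

upperCaseWords(null, {})(tree, {}, function (err) {
  result.called = true;
  result.err = err;
});
```

In real use the attacher would presumably be handed to `use` rather than called directly; the direct call above only keeps the sketch self-contained.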
+ Added extend@3.0.2 (transitive)
+ Added unified@2.1.4 (transitive)
- Removed unified@1.0.0 (transitive)
Updated unified@^2.0.0
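One change in the readme hunk that is easy to miss: the documented callback for `process` goes from `function(err, doc, file)` to `done(err, file, doc)`. Assuming the runtime order matches the new documentation, callbacks written against the old order would receive their last two arguments swapped. The wrapper below is a hypothetical helper (`adaptLegacyDone` is not part of retext) sketching one way to keep such callbacks working:

```javascript
// Hypothetical helper, not part of retext: wraps a callback written for the
// old `done(err, doc, file)` order so it can be passed where the new
// `done(err, file, doc)` order is expected.
function adaptLegacyDone(legacyDone) {
  return function done(err, file, doc) {
    // Swap the last two arguments back into the old order.
    return legacyDone(err, doc, file);
  };
}

// Usage sketch with stand-in values; a plain object plays the role of a VFile.
var calls = [];
var done = adaptLegacyDone(function (err, doc, file) {
  calls.push({err: err, doc: doc, file: file});
});

done(null, {filename: 'example'}, 'Some text.');
```

The wrapper can be dropped once callers are updated to the new order.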