TextOM
Object model for manipulating natural language in JavaScript.
- For parsing capabilities, see parse-latin;
- For a pluggable system for analysing and manipulating natural language, see retext;
- For semantics of natural language nodes, see NLCST.
Installation
npm:
$ npm install textom
Component.js:
$ component install wooorm/textom
Bower:
$ bower install textom
Duo:
var TextOMConstructor = require('wooorm/textom');
UMD (globals/AMD/CommonJS) (uncompressed and minified):
<script src="path/to/textom.js"></script>
<script>
var TextOM = new TextOMConstructor();
</script>
Usage
var TextOMConstructor = require('textom');
var TextOM = new TextOMConstructor();
var root = new TextOM.RootNode();
API
See below for IDL definitions.
Assume that each of the following examples starts with the code below.
Any changes made by an example are discarded when it ends.
var TextOMConstructor = require('textom');
var TextOM = new TextOMConstructor();
var root = new TextOM.RootNode();
var paragraph = new TextOM.ParagraphNode();
root.append(paragraph);
var sentence = new TextOM.SentenceNode();
paragraph.append(sentence);
var dogs = sentence.append(new TextOM.WordNode()),
    space0 = sentence.append(new TextOM.WhiteSpaceNode(' ')),
    ampersand = sentence.append(new TextOM.SymbolNode('&')),
    space1 = sentence.append(new TextOM.WhiteSpaceNode(' ')),
    cats = sentence.append(new TextOM.WordNode()),
    fullStop = sentence.append(new TextOM.PunctuationNode('.'));
var dogsText = dogs.append(new TextOM.TextNode('Dogs')),
    catsText = cats.append(new TextOM.TextNode('cats'));
root.toString();
TextOM
Object.
TextOM.Node()
Constructor.
TextOM.Node.on(name, listener)
TextOM.RootNode.on('someeventname', function () {});
Subscribe listener to name events on instances of Node.
TextOM.Node.off(name?, listener?)
TextOM.WordNode.off('someeventname');
- off(name, listener): Unsubscribe listener from name events on instances of Node;
- off(name): Unsubscribe from name events on instances of Node;
- off(): Unsubscribe from events on instances of Node.
TextOM.Node#on(name, listener)
root.on('someeventname', function () {});
Subscribe listener to name events on node.
TextOM.Node#off(name?, listener?)
dogs.off('someeventname');
- off(name, listener): Unsubscribe listener from name events on node;
- off(name): Unsubscribe from name events on node;
- off(): Unsubscribe from events on node.
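The three forms of off() can be pictured with a small listener registry. This is only a sketch of the documented behaviour, not TextOM's implementation; the Registry constructor and its listeners property are hypothetical names introduced for illustration.

```javascript
// Hypothetical stand-in for a node's listener storage; not TextOM's code.
function Registry() {
  this.listeners = {}; // event name -> array of listener functions
}

// Subscribe `listener` to `name` events.
Registry.prototype.on = function (name, listener) {
  (this.listeners[name] = this.listeners[name] || []).push(listener);
};

// Supports off(name, listener), off(name), and off().
Registry.prototype.off = function (name, listener) {
  var list, index;

  if (name === undefined) {
    this.listeners = {}; // off(): drop every listener
  } else if (listener === undefined) {
    delete this.listeners[name]; // off(name): drop all listeners for `name`
  } else {
    list = this.listeners[name] || [];
    index = list.indexOf(listener);

    if (index !== -1) {
      list.splice(index, 1); // off(name, listener): drop just this one
    }
  }
};

var registry = new Registry();

function first() {}
function second() {}

registry.on('someeventname', first);
registry.on('someeventname', second);
registry.on('othereventname', first);

registry.off('someeventname', first);
var afterListenerOff = registry.listeners.someeventname.length; // 1

registry.off('someeventname');
var afterNameOff = 'someeventname' in registry.listeners; // false

registry.off();
var afterAllOff = Object.keys(registry.listeners).length; // 0
```

The same three branches apply whether the registry belongs to a single node or to a constructor.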
TextOM.Node#emit(name, parameters...)
TextOM.WordNode.on('someeventname', function () {
  this;
});
dogs.emit('someeventname');
- emit(name, parameters...): Fire a name event with parameters on node;
- emit(name): Fire a name event on node.
Bubbles through node's constructors. In the case of dogs: WordNode, Element, Child, Parent, Node.
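The constructor bubbling described above can be modelled in a few lines. The sketch below does not use TextOM: stub, on, and emit are hypothetical helpers, the constructor order is copied from the text, and firing on the instance before its constructors is an assumption.

```javascript
// Hypothetical stand-ins: each "constructor" only stores listeners by name.
function stub(name) {
  return {name: name, listeners: {}};
}

// Constructor order for a WordNode instance, as listed in the text.
var chain = ['WordNode', 'Element', 'Child', 'Parent', 'Node'].map(stub);
var dogs = {name: 'dogs', listeners: {}, chain: chain};
var log = [];

function on(target, name, listener) {
  (target.listeners[name] = target.listeners[name] || []).push(listener);
}

// Invoke listeners on the instance, then on every constructor in its chain,
// always with the instance as `this`. Firing on the instance first is an
// assumption of this sketch.
function emit(instance, name) {
  var parameters = Array.prototype.slice.call(arguments, 2);

  [instance].concat(instance.chain).forEach(function (target) {
    (target.listeners[name] || []).forEach(function (listener) {
      listener.apply(instance, parameters);
    });
  });
}

on(dogs, 'someeventname', function () {
  log.push('instance: ' + this.name);
});

chain.forEach(function (ctor) {
  on(ctor, 'someeventname', function () {
    log.push(ctor.name + ': ' + this.name);
  });
});

emit(dogs, 'someeventname');
// log now lists the instance, then WordNode, Element, Child, Parent, Node.
```

Every listener sees the emitting instance as this, which is why a listener registered on a constructor can tell which node fired.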
TextOM.Node#trigger(name, context, parameters...)
root.on('someeventnameinside', function (context) {
  this;
  context;
});
dogsText.trigger('someeventname', dogs);
- trigger(name, context, parameters...): Fire a name event with parameters on node;
- trigger(name, context): Fire a name event on node;
- trigger(name): Same as TextOM.Node#emit(name).
Emits an event, and triggers name + "inside" events on context and its parents, and on their constructors.
In the case of dogsText: someeventname is emitted on dogsText and TextNode, and someeventnameinside is triggered on dogs and WordNode; sentence and SentenceNode; paragraph and ParagraphNode; root and RootNode.
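As a rough illustration of how trigger walks up from context, the sketch below models nodes as plain objects with a parent pointer. It leaves out the constructor-level listeners for brevity, and makeNode, on, emit, and trigger are hypothetical helpers rather than TextOM's code. Passing the originating node as the first parameter of the "inside" listeners mirrors the changetextinside example in this document and is otherwise an assumption.

```javascript
// Hypothetical tree nodes: a name, a parent pointer, and listeners by name.
function makeNode(name, parent) {
  return {name: name, parent: parent || null, listeners: {}};
}

function on(node, name, listener) {
  (node.listeners[name] = node.listeners[name] || []).push(listener);
}

function emit(node, name, parameters) {
  (node.listeners[name] || []).forEach(function (listener) {
    listener.apply(node, parameters);
  });
}

// Emit `name` on `node`, then emit name + 'inside' on `context` and each of
// its ancestors. Without a context this reduces to a plain emit.
function trigger(node, name, context) {
  var parameters = Array.prototype.slice.call(arguments, 3);
  var ancestor = context;

  emit(node, name, parameters);

  while (ancestor) {
    emit(ancestor, name + 'inside', [node].concat(parameters));
    ancestor = ancestor.parent;
  }
}

var root = makeNode('root');
var paragraph = makeNode('paragraph', root);
var sentence = makeNode('sentence', paragraph);
var dogs = makeNode('dogs', sentence);
var dogsText = makeNode('dogsText', dogs);
var seen = [];

on(dogsText, 'someeventname', function () {
  seen.push('emitted on ' + this.name);
});

[dogs, sentence, paragraph, root].forEach(function (ancestor) {
  on(ancestor, 'someeventnameinside', function (origin) {
    seen.push(this.name + ' <- ' + origin.name);
  });
});

trigger(dogsText, 'someeventname', dogs);
// seen: dogsText first, then dogs, sentence, paragraph, root.
```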
TextOM.Node#nodeName
Identifier for Nodes.
TextOM.Node#TextOM
root.TextOM === TextOM;
TextOM object associated with node.
TextOM.Node#ROOT_NODE
Identifier for RootNodes.
TextOM.Node#PARAGRAPH_NODE
Identifier for ParagraphNodes.
TextOM.Node#SENTENCE_NODE
Identifier for SentenceNodes.
TextOM.Node#WORD_NODE
Identifier for WordNodes.
TextOM.Node#SYMBOL_NODE
Identifier for SymbolNodes.
TextOM.Node#PUNCTUATION_NODE
Identifier for PunctuationNodes.
TextOM.Node#WHITE_SPACE_NODE
Identifier for WhiteSpaceNodes.
TextOM.Node#SOURCE_NODE
Identifier for SourceNodes.
TextOM.Node#TEXT_NODE
Identifier for TextNodes.
TextOM.Node#NODE
Identifier for Nodes.
TextOM.Node#PARENT
Identifier for Parents.
TextOM.Node#ELEMENT
Identifier for Elements.
TextOM.Node#CHILD
Identifier for Childs.
TextOM.Node#TEXT
Identifier for Texts.
TextOM.Parent()
Constructor (Node).
TextOM.Parent#nodeName
Identifier for Parents.
TextOM.Parent#head
paragraph.head;
sentence.head;
First Child of parent or null.
TextOM.Parent#tail
paragraph.tail;
sentence.tail;
Last Child of parent (if more than one child exists) or null.
TextOM.Parent#length
root.length;
sentence.length;
Number of children in parent.
TextOM.Parent#prepend(child)
sentence.head;
sentence.prepend(fullStop);
sentence.head;
Insert child as parent's first child.
TextOM.Parent#prependAll(child[])
sentence.head;
sentence.prependAll([fullStop, cats]);
sentence.head;
sentence.head.next;
Insert every child in children at the start of parent.
Adheres to sorting (the first child in children will become parent's head).
TextOM.Parent#append(child)
sentence.tail;
sentence.append(dogs);
sentence.tail;
Insert child as parent's last child.
TextOM.Parent#appendAll(child[])
sentence.tail;
sentence.appendAll([dogs, cats]);
sentence.tail;
sentence.tail.prev;
Insert every child in children at the end of parent.
Adheres to sorting (the last child in children will become parent's tail).
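The ordering rules of prependAll and appendAll can be checked against a plain-array model. The sketch below is not TextOM code: arrays of names stand in for a parent's children, and the helper names are hypothetical. It also assumes that children which are already attached are removed from their old position before being inserted again.

```javascript
// Drop `moved` entries from `children`, modelling nodes that are detached
// from their old position before being inserted again (an assumption).
function without(children, moved) {
  return children.filter(function (child) {
    return moved.indexOf(child) === -1;
  });
}

// The first entry of `added` becomes the head.
function prependAll(children, added) {
  return added.concat(without(children, added));
}

// The last entry of `added` becomes the tail.
function appendAll(children, added) {
  return without(children, added).concat(added);
}

var sentence = ['dogs', 'space0', 'ampersand', 'space1', 'cats', 'fullStop'];

var prepended = prependAll(sentence, ['fullStop', 'cats']);
// head is 'fullStop', followed by 'cats'

var appended = appendAll(sentence, ['dogs', 'cats']);
// tail is 'cats', preceded by 'dogs'
```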
TextOM.Parent#item(index?)
root.item();
sentence.item(0);
sentence.item(5);
sentence.item(6);
- item(index): Get Child at index in parent or null;
- item(): Get parent's first Child or null.
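A minimal sketch of the lookup rule, using a plain array in place of a parent (the item helper and the child names are illustrative, not TextOM's implementation):

```javascript
// Hypothetical array-backed stand-in for a parent's children.
function item(children, index) {
  // item() behaves like item(0); anything out of range yields null.
  var child = children[index === undefined ? 0 : index];

  return child === undefined ? null : child;
}

var children = ['dogs', 'space0', 'ampersand', 'space1', 'cats', 'fullStop'];

var first = item(children);      // 'dogs'
var last = item(children, 5);    // 'fullStop'
var missing = item(children, 6); // null
var empty = item([]);            // null
```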
TextOM.Parent#toString()
root.toString();
'' + sentence;
Get parent's content.
TextOM.Parent#valueOf()
dogs.valueOf();
Get parent's NLCST representation.
TextOM.Child()
Constructor (Node).
TextOM.Child#nodeName
Identifier for Childs.
TextOM.Child#parent
dogs.parent;
sentence.parent;
paragraph.parent;
child's Parent or null.
TextOM.Child#prev
dogs.prev;
space0.prev;
child's preceding sibling (Child) or null.
TextOM.Child#next
cats.next;
fullStop.next;
child's following sibling (Child) or null.
TextOM.Child#before(sibling)
dogs.prev;
dogs.before(cats);
dogs.prev;
Insert sibling (Child) as child's preceding sibling in parent.
TextOM.Child#beforeAll(child[])
dogs.prev;
dogs.beforeAll([cats, space0]);
dogs.prev;
dogs.prev.prev;
Insert every sibling (Child) in siblings (Array) before child in parent.
Adheres to sorting (the last sibling in siblings will become child's prev).
TextOM.Child#after(child)
cats.next;
cats.after(dogs);
cats.next;
Insert sibling (Child) as child's following sibling in parent.
TextOM.Child#afterAll(child[])
cats.next;
cats.afterAll([space0, dogs]);
cats.next;
cats.next.next;
Insert every sibling (Child) in siblings (Array) after child in parent.
Adheres to sorting (the first sibling in siblings will become child's next).
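To see why beforeAll and afterAll end up with the orderings described above, the sketch below models siblings as a doubly linked list with prev and next pointers. It is a simplified stand-in rather than TextOM's implementation: makeChild, detach, before, after, beforeAll, and afterAll are hypothetical helpers, and parents are left out entirely.

```javascript
// Hypothetical doubly linked siblings; not TextOM's implementation.
function makeChild(name) {
  return {name: name, prev: null, next: null};
}

// Unlink `child` from its neighbours.
function detach(child) {
  if (child.prev) {
    child.prev.next = child.next;
  }
  if (child.next) {
    child.next.prev = child.prev;
  }
  child.prev = child.next = null;
}

// Link `sibling` directly before `child`.
function before(child, sibling) {
  detach(sibling);
  sibling.next = child;
  sibling.prev = child.prev;
  if (child.prev) {
    child.prev.next = sibling;
  }
  child.prev = sibling;
}

// Link `sibling` directly after `child`.
function after(child, sibling) {
  detach(sibling);
  sibling.prev = child;
  sibling.next = child.next;
  if (child.next) {
    child.next.prev = sibling;
  }
  child.next = sibling;
}

// Inserting in the given order leaves the last sibling directly before `child`.
function beforeAll(child, siblings) {
  siblings.forEach(function (sibling) {
    before(child, sibling);
  });
}

// Inserting in reverse order leaves the first sibling directly after `child`.
function afterAll(child, siblings) {
  siblings.slice().reverse().forEach(function (sibling) {
    after(child, sibling);
  });
}

var dogs = makeChild('dogs');
var space0 = makeChild('space0');
var cats = makeChild('cats');
var fullStop = makeChild('fullStop');

// Build the list: dogs, space0, cats, fullStop.
afterAll(dogs, [space0, cats, fullStop]);

// List becomes: cats, space0, dogs, fullStop.
afterAll(cats, [space0, dogs]);
var afterOrder = [cats.next.name, cats.next.next.name];

// List becomes: space0, fullStop, cats, dogs.
beforeAll(dogs, [fullStop, cats]);
var beforeOrder = [dogs.prev.name, dogs.prev.prev.name];
```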
TextOM.Child#remove()
root.toString();
fullStop.remove();
root.toString();
Remove child from parent.
TextOM.Child#replace(sibling)
root.toString();
cats.replace(dogs);
root.toString();
Replace child with sibling (Child) in parent.
TextOM.Element()
Constructor (Parent and Child).
TextOM.Element#nodeName
Identifier for Elements.
TextOM.Element#split(position?)
sentence.prev;
sentence.toString();
sentence.split(2);
sentence.toString();
sentence.prev.toString();
Split element in two.
- split(position): A new node, prependee (a new instance of element's constructor), is inserted before element in parent. prependee receives the children from 0 to position (not including). element receives the children from position (including);
- split(): A new node, prependee (a new instance of element's constructor), is inserted before element in parent.
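The half-open split described above can be sketched with arrays. In this model an element is a plain object with a children array and its parent is an array of elements; both are hypothetical stand-ins, not TextOM nodes. Treating split() as split(0), so that prependee starts out empty, is an assumption.

```javascript
// Split `element` at `position`, inserting the new node before it in `siblings`.
function split(siblings, element, position) {
  // Assumption: calling without a position behaves like position 0.
  var at = position === undefined ? 0 : position;
  // prependee receives children 0 up to (not including) `at`.
  var prependee = {children: element.children.slice(0, at)};

  // element keeps the children from `at` onwards.
  element.children = element.children.slice(at);
  siblings.splice(siblings.indexOf(element), 0, prependee);

  return prependee;
}

var sentence = {children: ['Dogs', ' ', '&', ' ', 'cats', '.']};
var paragraph = [sentence];
var prependee = split(paragraph, sentence, 2);

var left = prependee.children.join(''); // 'Dogs '
var right = sentence.children.join(''); // '& cats.'
```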
TextOM.Text(value?) [NLCST:Text]
Constructor (Child).
TextOM.Text#nodeName
Identifier for Texts.
TextOM.Text#toString()
dogsText.toString();
space1.toString();
fullStop.toString();
Get text's value.
TextOM.Text#valueOf()
dogsText.valueOf();
Get text's NLCST representation.
TextOM.Text#fromString(value?)
root.toString();
catsText.fromString();
root.toString();
catsText.fromString("Lions");
root.toString();
- fromString(value): Set text's value to value;
- fromString(): Remove text's value.
TextOM.Text#split(position?)
catsText.prev;
catsText.toString();
catsText.split(2);
catsText.toString();
catsText.prev.toString();
Split text in two.
- split(position): A new node, prependee (a new instance of text's constructor), is inserted before text in parent. prependee receives the value from 0 to position (not including). text receives the value from position (including);
- split(): A new node, prependee (a new instance of text's constructor), is inserted before text in parent.
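For text the same rule applies to characters instead of children. The sketch below uses a plain object with a value string and an array as its parent; splitText is a hypothetical helper, and treating a missing position as 0 is again an assumption.

```javascript
// Split the value of `text` at `position`, inserting the new node before it.
function splitText(siblings, text, position) {
  // Assumption: calling without a position behaves like position 0.
  var at = position === undefined ? 0 : position;
  // prependee receives characters 0 up to (not including) `at`.
  var prependee = {value: text.value.slice(0, at)};

  // text keeps the characters from `at` onwards.
  text.value = text.value.slice(at);
  siblings.splice(siblings.indexOf(text), 0, prependee);

  return prependee;
}

var catsText = {value: 'cats'};
var cats = [catsText];
var prependee = splitText(cats, catsText, 2);

var values = cats.map(function (node) {
  return node.value;
});
// values: ['ca', 'ts']
```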
TextOM.RootNode()
Constructor (Parent).
TextOM.RootNode#type
Identifier for RootNodes.
TextOM.ParagraphNode()
Constructor (Element).
TextOM.ParagraphNode#type
Identifier for ParagraphNodes.
TextOM.SentenceNode()
Constructor (Element).
TextOM.SentenceNode#type
Identifier for SentenceNodes.
TextOM.WordNode()
Constructor (Element).
TextOM.WordNode#type
Identifier for WordNodes.
TextOM.SymbolNode(value?)
Constructor (Text).
TextOM.SymbolNode#type
Identifier for SymbolNodes.
TextOM.PunctuationNode(value?)
Constructor (SymbolNode).
TextOM.PunctuationNode#type
Identifier for PunctuationNodes.
TextOM.WhiteSpaceNode(value?)
Constructor (SymbolNode).
TextOM.WhiteSpaceNode#type
Identifier for WhiteSpaceNodes.
TextOM.SourceNode(value?)
Constructor (Text).
TextOM.SourceNode#type
Identifier for SourceNodes.
TextOM.TextNode(value?)
Constructor (Text).
TextOM.TextNode#type
Identifier for TextNodes.
IDL
The IDL-like document below gives a short overview of the interfaces defined by TextOM.
module textom
{
  [Constructor]
  interface Node {
    const string nodeName = "Node";
    const string NODE = "Node";
    const string PARENT = "Parent";
    const string ELEMENT = "Element";
    const string CHILD = "Child";
    const string TEXT = "Text";
    const string ROOT_NODE = "RootNode";
    const string PARAGRAPH_NODE = "ParagraphNode";
    const string SENTENCE_NODE = "SentenceNode";
    const string WORD_NODE = "WordNode";
    const string SYMBOL_NODE = "SymbolNode";
    const string PUNCTUATION_NODE = "PunctuationNode";
    const string WHITE_SPACE_NODE = "WhiteSpaceNode";
    const string SOURCE_NODE = "SourceNode";
    const string TEXT_NODE = "TextNode";
    void on(String type, Function callback);
    void off(optional String type = null, optional Function callback = null);
  };
  [Constructor,
   ArrayClass]
  interface Parent {
    readonly attribute string nodeName = "Parent";
    getter Child? item(unsigned long index);
    readonly attribute unsigned long length;
    readonly attribute Child? head;
    readonly attribute Child? tail;
    Child prepend(Child child);
    Child append(Child child);
    Child[] prependAll(Child[] children);
    Child[] appendAll(Child[] children);
    [NewObject] Object valueOf();
    string toString();
  };
  Parent implements Node;
  [Constructor]
  interface Child {
    readonly attribute string nodeName = "Child";
    readonly attribute Parent? parent;
    readonly attribute Child? prev;
    readonly attribute Child? next;
    Child before(Child child);
    Child after(Child child);
    Child replace(Child child);
    Child remove();
    Child[] beforeAll(Child[] children);
    Child[] afterAll(Child[] children);
  };
  Child implements Node;
  [Constructor]
  interface Element {
    readonly attribute string nodeName = "Element";
    [NewObject] Element split(unsigned long position);
  };
  Element implements Child;
  Element implements Parent;
  [Constructor(optional String value = "")]
  interface Text {
    readonly attribute string nodeName = "Text";
    [NewObject] Object valueOf();
    string toString();
    string fromString(String value);
    [NewObject] Text split(unsigned long position);
  };
  Text implements Child;
  [Constructor]
  interface RootNode {
    readonly attribute string type = "RootNode";
  };
  RootNode implements Parent;
  [Constructor]
  interface ParagraphNode {
    readonly attribute string type = "ParagraphNode";
  };
  ParagraphNode implements Element;
  [Constructor]
  interface SentenceNode {
    readonly attribute string type = "SentenceNode";
  };
  SentenceNode implements Element;
  [Constructor]
  interface WordNode {
    readonly attribute string type = "WordNode";
  };
  WordNode implements Element;
  [Constructor(optional String value = "")]
  interface SymbolNode {
    readonly attribute string type = "SymbolNode";
  };
  SymbolNode implements Text;
  [Constructor(optional String value = "")]
  interface PunctuationNode {
    readonly attribute string type = "PunctuationNode";
  };
  PunctuationNode implements SymbolNode;
  [Constructor(optional String value = "")]
  interface WhiteSpaceNode {
    readonly attribute string type = "WhiteSpaceNode";
  };
  WhiteSpaceNode implements SymbolNode;
  [Constructor(optional String value = "")]
  interface TextNode {
    readonly attribute string type = "TextNode";
  };
  TextNode implements Text;
  [Constructor(optional String value = "")]
  interface SourceNode {
    readonly attribute string type = "SourceNode";
  };
  SourceNode implements Text;
}
Events
TextOM provides events which can be subscribed to, to get notified when something changes.
Events can be subscribed to through on() methods, and unsubscribed from through off() methods. These methods exist on every instance and on every constructor.
When subscribing to an instance's events, listener is invoked for changes to that specific instance. When subscribing to a constructor's events, listener is invoked for changes to any of that constructor's instances.
List of events
remove [non-bubbling]
dogs.on('remove', function (previous) {
  this === dogs;
  previous === sentence;
});
dogs.remove();
Fires when a Child is removed from its Parent.
- this: Removed Child;
- parameters:
  - previous: Parent the Child was removed from.
insert [non-bubbling]
dogs.on('insert', function () {
  this === dogs;
});
sentence.append(dogs);
Fires when a Child is inserted into a Parent.
changetext [non-bubbling]
dogsText.on('changetext', function (current, previous) {
  this === dogsText;
  current === 'Poodles';
  previous === 'Dogs';
});
dogsText.fromString('Poodles');
Fires when a Text changes value.
- this: Changed Text;
- parameters:
  - current: Current value;
  - previous: Previous value.
change [non-bubbling]
dogs.on('change', function () {
  this === dogs;
});
dogsText.fromString('Poodles');
Fires on a parent when one of its direct children changes: the child's value changes, a new child is inserted, or a child is removed.
changetextinside [bubbling]
root.on('changetextinside', function (node, current, previous) {
  this === root;
  node === catsText;
  current === 'lions';
  previous === 'cats';
});
catsText.fromString('lions');
Fires when a Text changes value inside an ancestor.
- this: Ancestor of a Text;
- parameters:
  - node: Changed Text;
  - current: Current value;
  - previous: Previous value.
insertinside [bubbling]
sentence.on('insertinside', function (node) {
  this === sentence;
  node === ampersand;
});
sentence.append(ampersand);
Fires when a Child is inserted inside an ancestor.
- this: Ancestor of a Child;
- parameters:
  - node: Inserted Child.
removeinside [bubbling]
root.on('removeinside', function (node, previous) {
  this === root;
  node === dogs;
  previous === sentence;
});
dogs.remove();
Fires when a Child is removed inside an ancestor.
- this: Ancestor of a Child;
- parameters:
  - node: Removed Child;
  - previous: Parent the Child was removed from.
changeinside [bubbling]
root.on('changeinside', function (parent) {
  this === root;
  parent === sentence;
});
dogs.remove();
Fires when a Child changes inside an ancestor, for example when it is removed.
- this: Ancestor of a Child;
- parameters:
  - parent: Parent of the change.
Bubbling & Non-bubbling events
TextOM provides two types of events: bubbling and non-bubbling. In API terms, bubbling event names end with "inside".
Non-bubbling (“normal”) events
Normal events fire on instances of Child and do not fire on ancestors. They additionally fire on all constructors of the instance.
Let’s say we have the example code given in API, and add the following line to it:
dogsText.fromString('Poodles');
A "changetext" event fires on dogsText. Because dogsText is a TextNode, the event fires on TextNode too. Because TextNode inherits from Text, the event also fires on Text, continuing with Child, and finally Node.
Bubbling events
Bubbling events start on a Parent and continue through its ancestors. These events also fire on the ancestors' constructors.
Let’s say we have the example code given in API, and add the following line to it:
dogsText.fromString('Wolves');
A "changetextinside" event fires on dogsText's parent, dogs, and because dogs is a WordNode, the event fires on WordNode too, continuing with sentence and SentenceNode, paragraph and ParagraphNode, and finally root and RootNode.
Performance
Not that interesting. Fast enough. These numbers are only used to check for performance regressions when adding new features.
Parent
80,260 op/s » Append 1 new node to an empty parent
41,993 op/s » Append 2 new nodes to an empty parent
28,576 op/s » Append 3 new nodes to an empty parent
42,302 op/s » Append 1 attached node to an empty parent
21,049 op/s » Append 2 attached nodes to an empty parent
14,704 op/s » Append 3 attached nodes to an empty parent
416 op/s » Append 100 attached nodes to an empty parent
81,406 op/s » Prepend 1 new node to an empty parent
41,051 op/s » Prepend 2 new nodes to an empty parent
28,024 op/s » Prepend 3 new nodes to an empty parent
41,487 op/s » Prepend 1 attached node to an empty parent
21,122 op/s » Prepend 2 attached nodes to an empty parent
14,269 op/s » Prepend 3 attached nodes to an empty parent
405 op/s » Prepend 100 attached nodes to an empty parent
Child
39,223 op/s » Insert 1 new node after an only child
26,724 op/s » Insert 2 new nodes after an only child
19,999 op/s » Insert 3 new nodes after an only child
26,779 op/s » Insert 1 attached node after an only child
16,336 op/s » Insert 2 attached nodes after an only child
11,969 op/s » Insert 3 attached nodes after an only child
386 op/s » Insert 100 attached nodes after a first child
39,727 op/s » Insert 1 new node before a first child
26,254 op/s » Insert 2 new nodes before a first child
20,122 op/s » Insert 3 new nodes before a first child
27,033 op/s » Insert 1 attached node before a first child
16,842 op/s » Insert 2 attached nodes before a first child
12,220 op/s » Insert 3 attached nodes before a first child
398 op/s » Insert 100 attached nodes before a first child
Parent: all
78,500 op/s » Append 1 new node to an empty parent
55,596 op/s » Append 2 new nodes to an empty parent
42,358 op/s » Append 3 new nodes to an empty parent
40,838 op/s » Append 1 attached node to an empty parent
24,457 op/s » Append 2 attached nodes to an empty parent
16,675 op/s » Append 3 attached nodes to an empty parent
552 op/s » Append 100 attached nodes to an empty parent
78,364 op/s » Prepend 1 new node to an empty parent
56,193 op/s » Prepend 2 new nodes to an empty parent
42,430 op/s » Prepend 3 new nodes to an empty parent
38,994 op/s » Prepend 1 attached node to an empty parent
23,559 op/s » Prepend 2 attached nodes to an empty parent
16,943 op/s » Prepend 3 attached nodes to an empty parent
568 op/s » Prepend 100 attached nodes to an empty parent
Child: all
40,011 op/s » Insert 1 new node after an only child
32,734 op/s » Insert 2 new nodes after an only child
27,943 op/s » Insert 3 new nodes after an only child
26,499 op/s » Insert 1 attached node after an only child
18,801 op/s » Insert 2 attached nodes after an only child
14,400 op/s » Insert 3 attached nodes after an only child
560 op/s » Insert 100 attached nodes after a first child
39,973 op/s » Insert 1 new node before a first child
32,195 op/s » Insert 2 new nodes before a first child
27,934 op/s » Insert 3 new nodes before a first child
26,765 op/s » Insert 1 attached node before a first child
19,257 op/s » Insert 2 attached nodes before a first child
14,836 op/s » Insert 3 attached nodes before a first child
594 op/s » Insert 100 attached nodes before a first child
License
MIT © Titus Wormer