The Programmable Mind (aka Entodicton)
This is the client for a server that processes natural language statements
into JSON. Instead of using a grammar-based parser, the server uses a
generalized operator precedence parser and neural nets.
Demo Walkthrough
Running node_modules/entodiction/test.js runs a sample
program against the API.
This is the input
operators = ['(i [([went] ([to] (<the> store)))])']
bridges = [
{ "id": "the", "level": 0, "bridge": "{ ...after, determiner: 'the' }" },
{ "id": "to", "level": 0, "bridge": "{ ...next(operator), after: after[0] }" },
{ "id": "went", "level": 0, "bridge": "{ ...squish(after[0]), ...next(operator) }" },
{ "id": "went", "level": 1, "bridge": "{ ...next(operator), who: before[0] }" },
{ "id": "went", "level": 2, "bridge": "{ action: 'go', marker: 'go', actor: operator.who, place: operator.to }" },
]
utterances = ["joe went to the store"]
Operators
The 'operators' definition does two things: it specifies the priority of the operators and their argument structure. The priority
is used to train a neural net. The idea is to give sample sentences that are marked up so that a graph
of priorities can be built and fed to a neural net. '[]' and '<>' mark operators. In a generalized
operator precedence parser, the result of applying an operator can be another operator. '[]' means
there is a next operator; '<>' means there is not. The operators that this example defines are
Operator/Level    Arity
the/0             prefix operator
to/0              prefix operator
went/0            prefix operator
went/1            postfix operator
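To make the markup concrete, here is a minimal sketch that lists the operators marked directly with '[]' or '<>' in a sample sentence. The helper name `markedOperators` and the `hasNext` field are hypothetical and not part of the client. The sketch also ignores brackets that wrap a whole sub-expression, such as the outer '[...]' around '([went] ...)'.

```javascript
// Hypothetical helper, not part of the client API.
// '[word]' marks an operator that has a next operator,
// '<word>' marks an operator that does not.
// Brackets wrapping a whole sub-expression are not handled in this sketch.
function markedOperators(markup) {
  const pattern = /\[(\w+)\]|<(\w+)>/g;
  const found = [];
  for (const match of markup.matchAll(pattern)) {
    if (match[1] !== undefined) {
      found.push({ id: match[1], hasNext: true });
    } else {
      found.push({ id: match[2], hasNext: false });
    }
  }
  return found;
}

const sample = '(i [([went] ([to] (<the> store)))])';
console.log(markedOperators(sample));
```

For the sample sentence this reports 'went' and 'to' as having a next operator and 'the' as having none.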
The priorities, in order of application, are
the/1 > to/1 > went/1 > went/2
'Went' could be defined as infix, but once I implement conjunction, this definition will allow
sentences such as "I went to the store bought a coffee and chips and jumped on the bus and I got there".
Bridges
This works by combining contexts. Each context has a marker that indicates which operator it is. The
bridge language specifies how to combine contexts to get the next context. This abstraction
supports multiple languages that ultimately map to the same JSON. It's a generalization of what I did
before in the v4 design seen in the YouTube video. The syntax is
{
"id": <id of the operator>,
"level": <level of the operator>,
"bridge": <how to calculate new context>
}
'after' is the list of arguments after the operator. 'before' is the list of arguments before the operator. 'operator' is the
operator. They are all contexts. The '...' operator works like the spread operator in JS. 'next(operator)'
increments the level of the operator. 'squish()' takes the marker of a context and
uses it as the property name for that context. Here is an example. For this bridge
{
"id": "went",
"level": 0,
"bridge": "{ ...squish(after[0]), ...next(operator) }"
}
and initial state
operator = { 'marker': went/0 }
after = [{
'marker': to/0,
'after': { 'marker': 'store', 'determiner': 'the' }
}]
the result is
{
'marker': went/1,
'to': {
'marker': 'store',
'determiner': 'the'
}
}
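The bridge above can be sketched in plain JavaScript. This is only an illustration of the description, not the server's implementation. It rests on two assumptions: a marker such as went/0 is written as a hypothetical { id, level } object, and 'squish' stores the context's 'after' value under the marker's id, which is what the example result suggests.

```javascript
// Hypothetical marker shape: { id, level } stands in for the 'went/0' notation.

// next(operator): copy the operator context with its level incremented.
function next(operator) {
  const { id, level } = operator.marker;
  return { ...operator, marker: { id, level: level + 1 } };
}

// squish(context): use the id of the context's marker as a property name.
// Assumption: the value stored under that name is the context's 'after'
// property, which is what the example result suggests.
function squish(context) {
  return { [context.marker.id]: context.after };
}

// The went/0 bridge: "{ ...squish(after[0]), ...next(operator) }"
function wentLevel0Bridge({ operator, after }) {
  return { ...squish(after[0]), ...next(operator) };
}

const operator = { marker: { id: 'went', level: 0 } };
const after = [{
  marker: { id: 'to', level: 0 },
  after: { marker: 'store', determiner: 'the' },
}];

const result = wentLevel0Bridge({ operator, after });
console.log(JSON.stringify(result, null, 2));
```

In this sketch the resulting context carries the 'to' property with the store context inside it, and its marker is 'went' at level 1, because 'next(operator)' increments the level.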
Utterances
This is a list of statements that will be processed using the given definitions.
Priorities
If a request fails to process correctly, one of the main causes is operator ordering. The 'operators' definition is used to generate training data for the ordering neural net. Sometimes that is not enough. There is a 'priorities' property that can be used to supply additional training data. Each entry in 'priorities' is a list of operators. The last operator is the preferred one. The logs show the order in which operators were run. If it is wrong, look for a message like
Context for choosing the operator ('wantMcDonalds', 0) was [('i', 0), ('wantMcDonalds', 0), ('aEnglish', 0), ('fromM', 0)]
In this case I wanted 'fromM' to apply before 'wantMcDonalds', so I added this to the priorities array
[['i', 0], ['wantMcDonalds', 0], ['aEnglish', 0], ['fromM', 0]]
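As a rough sketch, a priorities entry can be built from the operator list printed in the log message. The helper `priorityEntry` is hypothetical and not part of the client. It only assumes the tuple format shown in the log line and the rule that the preferred operator goes last.

```javascript
// Hypothetical helper, not part of the client API.
// Takes the list part of the log message (the text after "was") and
// returns [id, level] pairs with the preferred operator moved to the end.
function priorityEntry(logContext, preferredId) {
  const pairs = [];
  // Matches tuples such as ('fromM', 0)
  for (const match of logContext.matchAll(/\('([^']+)',\s*(\d+)\)/g)) {
    pairs.push([match[1], Number(match[2])]);
  }
  const preferred = pairs.filter(([id]) => id === preferredId);
  const others = pairs.filter(([id]) => id !== preferredId);
  return [...others, ...preferred];
}

const logContext = "[('i', 0), ('wantMcDonalds', 0), ('aEnglish', 0), ('fromM', 0)]";
const priorities = [];
priorities.push(priorityEntry(logContext, 'fromM'));
console.log(JSON.stringify(priorities));
```

Here 'fromM' is already last in the logged context, so the entry comes out in the same order as the log.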
Generators
A generator describes how to map JSON back to strings. This is an example
({ 'marker': 'tankConcept', 'number': { '>': 0 } }, '${number} ${word}')
The first part is a condition that is used to select the context. This example matches a context where the property 'marker' equals 'tankConcept' and the property 'number' is a number greater than zero. The second part can access properties in the context and generate a string. Here it accesses the properties 'number' and 'word' to generate the string.
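The matching and string generation described above could look roughly like this. Both `matches` and `render` are hypothetical helpers written for illustration, and the sample context with the word 'tanks' is made up. Only the '>' comparison is sketched because it is the only one the example uses.

```javascript
// Hypothetical matcher: a plain value must equal the context's property;
// an object such as { '>': 0 } is read as a comparison. Any comparison
// other than '>' is treated as not matching in this sketch.
function matches(condition, context) {
  return Object.entries(condition).every(([key, expected]) => {
    const actual = context[key];
    if (expected !== null && typeof expected === 'object') {
      return Object.entries(expected).every(([op, bound]) =>
        op === '>' && typeof actual === 'number' && actual > bound);
    }
    return actual === expected;
  });
}

// Hypothetical renderer: replaces each ${name} with the context's property.
function render(template, context) {
  return template.replace(/\$\{(\w+)\}/g, (_, name) => String(context[name]));
}

// The generator from the example: [condition, template].
const generator = [
  { marker: 'tankConcept', number: { '>': 0 } },
  '${number} ${word}',
];

// Made-up context for illustration.
const context = { marker: 'tankConcept', number: 2, word: 'tanks' };
if (matches(generator[0], context)) {
  console.log(render(generator[1], context)); // prints "2 tanks"
}
```

The template is kept in single quotes so that JavaScript treats '${number}' as plain text and leaves the substitution to `render`.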
List the default generators first. For example, if you want English to be the default, list the generator for English first and list the generators for other languages, each with a language selector, later.