brill-pos-tagger
npm install brill-pos-tagger
var Tagger = require("./lib/brill_pos_tagger");
var base_folder = "/home/hugo/workspace/brill-pos-tagger";
var rules_file = base_folder + "/data/tr_from_pos.txt";
var lexicon_file = base_folder + "/data/lexicon.json";
var default_category = 'N';
var tagger = new Tagger(lexicon_file, rules_file, default_category, function(error) {
  if (error) {
    console.log(error);
  }
  else {
    var sentence = ["I", "see", "the", "man", "with", "the", "telescope"];
    console.log(JSON.stringify(tagger.tag(sentence)));
  }
});
The lexicon is either a JSON file that has the following structure:
{
  "word1": ["cat1"],
  "word2": ["cat2", "cat3"],
  ...
}
or a text file:
word1 cat1 cat2
word2 cat3
...
Words may have multiple categories in the lexicon file. The tagger uses only the first one.
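The two lexicon formats carry the same information, so they can be read into one structure. The following is a minimal sketch, not the package's own loader: `parseLexicon` and `initialCategory` are hypothetical names, and falling back to the default category for unknown words is an assumption based on the `default_category` argument in the usage example.

```javascript
// Hypothetical helper: parse lexicon text in either format into
// an object that maps each word to its list of categories.
function parseLexicon(text) {
  var trimmed = text.trim();
  // Assumption: a JSON lexicon starts with "{"; anything else is
  // treated as the plain-text format "word cat1 cat2 ...".
  if (trimmed.charAt(0) === "{") {
    return JSON.parse(trimmed);
  }
  var lexicon = {};
  trimmed.split(/\r?\n/).forEach(function (line) {
    var fields = line.trim().split(/\s+/);
    if (fields.length >= 2) {
      lexicon[fields[0]] = fields.slice(1);
    }
  });
  return lexicon;
}

// Only the first category of a word is used; words that are not in
// the lexicon get the default category (an assumption, see above).
function initialCategory(lexicon, word, defaultCategory) {
  var known = Object.prototype.hasOwnProperty.call(lexicon, word);
  var categories = known ? lexicon[word] : [];
  return categories.length > 0 ? categories[0] : defaultCategory;
}
```

Both `parseLexicon('{"word1": ["cat1"]}')` and `parseLexicon("word1 cat1")` yield an object in which `word1` maps to an array whose first element is `cat1`.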
Transformation rules are specified as follows:
OLD_CAT NEW_CAT PREDICATE PARAMETER
This means that if the category of the word at the current position is OLD_CAT and the predicate is true, the category is replaced by NEW_CAT. The predicate may use the parameter in different ways. Sometimes the parameter specifies the required outcome of the predicate:
NN CD CURRENT-WORD-IS-NUMBER YES
This means that if the outcome of CURRENT-WORD-IS-NUMBER is YES, the category is replaced by CD.
The parameter can also be used to check the category of a word in the sentence:
VBD NN PREV-TAG DT
Here the category of the previous word must be DT for the rule to be applied.
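A rule line in this format can be split into its four parts. The sketch below is illustrative only: `parseRule` and the field names of the returned object are hypothetical and do not describe how the package reads its rules file. Extra fields after the predicate are collected in an array, since some predicates take more than one parameter.

```javascript
// Hypothetical helper: split "OLD_CAT NEW_CAT PREDICATE PARAMETER"
// into a plain object.
function parseRule(line) {
  var fields = line.trim().split(/\s+/);
  if (fields.length < 4) {
    throw new Error("Expected OLD_CAT NEW_CAT PREDICATE PARAMETER, got: " + line);
  }
  return {
    oldCategory: fields[0],
    newCategory: fields[1],
    predicateName: fields[2],
    // One or more parameters, kept in rule order.
    parameters: fields.slice(3)
  };
}

var rule = parseRule("VBD NN PREV-TAG DT");
console.log(rule.predicateName, rule.parameters);
```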
The tagger applies transformation rules that may change the category of words. The input sentence must be split into words, each of which is assigned a category. The tagged sentence is then processed from left to right. At each position all rules are applied once, in the order in which they are specified. Algorithm:
function(sentence) {
  var tagged_sentence = new Array(sentence.length);
  // snip
  // Apply transformation rules
  for (var i = 0, size = sentence.length; i < size; i++) {
    this.transformation_rules.forEach(function(rule) {
      rule.apply(tagged_sentence, i);
    });
  }
  return(tagged_sentence);
}
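To make the loop concrete, here is a small self-contained sketch of a rule object with an `apply` method and the left-to-right pass over a sentence. Everything in it is an assumption made for illustration: `makeRule`, `applyRules` and `prev_tag` are hypothetical names, and each tagged word is represented as an object with `word` and `tag` fields, which may differ from the representation the package uses.

```javascript
// Hypothetical rule factory: the rule fires when the current tag is
// oldCategory and the predicate holds at position i.
function makeRule(oldCategory, newCategory, predicate, parameter) {
  return {
    apply: function (tagged_sentence, i) {
      if (tagged_sentence[i].tag === oldCategory &&
          predicate(tagged_sentence, i, parameter)) {
        tagged_sentence[i].tag = newCategory;
      }
    }
  };
}

// Sketch of a PREV-TAG style predicate: true if the previous word
// carries the given tag. The first word has no predecessor.
function prev_tag(tagged_sentence, i, parameter) {
  return i > 0 && tagged_sentence[i - 1].tag === parameter;
}

// One left-to-right pass; at each position all rules run once,
// in the order given.
function applyRules(tagged_sentence, rules) {
  for (var i = 0; i < tagged_sentence.length; i++) {
    rules.forEach(function (rule) {
      rule.apply(tagged_sentence, i);
    });
  }
  return tagged_sentence;
}

var tagged = [
  { word: "the", tag: "DT" },
  { word: "cut", tag: "VBD" }
];
applyRules(tagged, [makeRule("VBD", "NN", prev_tag, "DT")]);
console.log(JSON.stringify(tagged));
```

Because rules run in order at each position, a later rule sees the tag that an earlier rule may already have changed at that position.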
Predicates are defined in the module lib/Predicate.js. To add a predicate, create a function in that file. A predicate accepts a tagged sentence, the current position in the sentence that is being tagged, and the parameter(s) given in the rule (for example, the expected outcome). An example of a predicate that checks the category of the current word:
function current_word_is_tag(tagged_sentence, i, parameter) {
  return(tagged_sentence[i][0] === parameter);
}
Some predicates accept two parameters. The next step is to map a keyword to this predicate so that it can be used in the transformation rules. The mapping is also defined in the grammar file:
var predicates = {
  "CURRENT-WORD-IS-TAG": current_word_is_tag,
  "PREV-WORD-IS-CAP": prev_word_is_cap
};
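As an illustration of both steps together, the sketch below writes one possible `prev_word_is_cap` and looks it up by keyword. It is not the package's implementation: the `{ word, tag }` representation of a tagged word, the YES/NO parameter convention (borrowed from the CURRENT-WORD-IS-NUMBER example), and the `evaluatePredicate` helper are all assumptions.

```javascript
// Sketch of a predicate: is the previous word capitalised?
// The parameter is the expected outcome, "YES" or "NO".
function prev_word_is_cap(tagged_sentence, i, parameter) {
  if (i === 0) {
    return false; // assumption: no previous word, so the rule never fires
  }
  var first = tagged_sentence[i - 1].word.charAt(0);
  // A character that changes when lowercased is an uppercase letter.
  var isCap = first !== first.toLowerCase();
  return (isCap ? "YES" : "NO") === parameter;
}

// Keyword-to-function mapping, mirroring the predicates object above.
var predicates = {
  "PREV-WORD-IS-CAP": prev_word_is_cap
};

// Hypothetical helper: resolve a keyword from a rule and evaluate it.
function evaluatePredicate(keyword, tagged_sentence, i, parameter) {
  var predicate = predicates[keyword];
  if (typeof predicate !== "function") {
    throw new Error("Unknown predicate: " + keyword);
  }
  return predicate(tagged_sentence, i, parameter);
}

var example = [
  { word: "New", tag: "NNP" },
  { word: "york", tag: "NN" }
];
console.log(evaluatePredicate("PREV-WORD-IS-CAP", example, 1, "YES"));
```

Failing loudly on an unknown keyword makes a typo in the rules file visible at load or run time instead of silently skipping the rule.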
Part of speech tagger based on Eric Brill's algorithm