compromise
npm install compromise
Welcome to v12! - Release Notes here 👍
compromise makes it simple to interpret and match text:
let doc = nlp(entireNovel)
doc.if('the #Adjective of times').text()
// "it was the blurst of times??"
if (doc.has('simon says #Verb')) {
return true
}
conjugate and negate verbs in any tense:
let doc = nlp('she sells seashells by the seashore.')
doc.verbs().toPastTense()
doc.text()
// 'she sold seashells by the seashore.'
transform nouns to plural and possessive forms:
let doc = nlp('the purple dinosaur')
doc.nouns().toPlural()
doc.text()
// 'the purple dinosaurs'
interpret plaintext numbers
nlp.extend(require('compromise-numbers'))
let doc = nlp('ninety five thousand and fifty two')
doc.numbers().add(2)
doc.text()
// 'ninety five thousand and fifty four'
grab subjects in a text:
let doc = nlp(buddyHolly)
doc
.people()
.if('mary')
.json()
// [{text:'Mary Tyler Moore'}]
let doc = nlp(freshPrince)
doc
.places()
.first()
.text()
// 'West Philadelphia'
doc = nlp('the opera about richard nixon visiting china')
doc.topics().json()
// [
// { text: 'richard nixon' },
// { text: 'china' }
// ]
work with contracted and implicit words:
let doc = nlp("we're not gonna take it, no we ain't gonna take it.")
// match an implicit term
doc.has('going') // true
// transform
doc.contractions().expand()
doc.text()
// 'we are not going to take it, no we are not going to take it.'
Use it on the client-side:
<script src="https://unpkg.com/compromise"></script>
<script src="https://unpkg.com/compromise-numbers"></script>
<script>
nlp.extend(compromiseNumbers)
var doc = nlp('two bottles of beer')
doc.numbers().minus(1)
document.body.innerHTML = doc.text()
// 'one bottle of beer'
</script>
or as an es-module:
import nlp from 'compromise'
var doc = nlp('London is calling')
doc.verbs().toNegative()
// 'London is not calling'
or if you don't care about POS-tagging, you can use the tokenize-only build: (90kb!)
<script src="https://unpkg.com/compromise/builds/compromise-tokenize.js"></script>
<script>
var doc = nlp('No, my son is also named Bort.')
//you can see the text has no tags
console.log(doc.has('#Noun')) //false
//but the whole api still works
console.log(doc.has('my .* is .? named /^b[oa]rt/')) //true
</script>
compromise is 170kb (minified).
it's pretty fast, and it can run on keypress.
it works mainly by conjugating many forms of a basic word list.
The final lexicon is ~14,000 words.
you can read more about how it works, here.
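As a rough illustration of the idea above (expanding a short base word list into its conjugated forms), here is a minimal sketch in plain JavaScript. It is not compromise's actual implementation: the `conjugate` and `buildLexicon` helpers and the naive suffix rules are assumptions made for this example, and only the tag names mirror ones compromise uses.

```javascript
// Hypothetical sketch: grow a lexicon from a tiny list of regular verbs.
// The suffix rules are deliberately naive and ignore irregular verbs.
const baseVerbs = ['walk', 'talk', 'jump']

// produce the conjugated forms of one regular verb
function conjugate(verb) {
  return {
    Infinitive: verb,
    PresentTense: verb + 's',
    PastTense: verb + 'ed',
    Gerund: verb + 'ing',
  }
}

// map every generated form to its tag
function buildLexicon(verbs) {
  const lexicon = {}
  for (const verb of verbs) {
    for (const [tag, form] of Object.entries(conjugate(verb))) {
      lexicon[form] = tag
    }
  }
  return lexicon
}

const lexicon = buildLexicon(baseVerbs)
console.log(Object.keys(lexicon).length) // 12 entries from 3 base words
console.log(lexicon.walked) // 'PastTense'
```

A real word list would also need irregular forms and spelling rules, which this sketch leaves out.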
set a custom interpretation of your own words:
let myWords = {
kermit: 'FirstName',
fozzie: 'FirstName',
}
let doc = nlp(muppetText, myWords)
or make more changes with a compromise-plugin.
const nlp = require('compromise')
nlp.extend((Doc, world) => {
// add new tags
world.addTags({
Character: {
isA: 'Person',
notA: 'Adjective',
},
})
// add or change words in the lexicon
world.addWords({
kermit: 'Character',
gonzo: 'Character',
})
// add methods to run after the tagger
world.postProcess(doc => {
doc.match('light the lights').tag('#Verb . #Plural')
})
// add a whole new method
Doc.prototype.kermitVoice = function() {
this.sentences().prepend('well,')
this.match('i [(am|was)]').prepend('um,')
return this
}
})
(these methods are on the nlp object)
.json() result
(all match methods use the match-syntax.)
'wash-out'
'(939) 555-0113'
'$2.50'
'#nlp'
'hi@compromise.cool'
:)
💋
'@nlp_compromise'
'compromise.cool'
'quickly'
'he'
'but'
'of'
'Mrs.'
people() + places() + organizations()
"she would" -> "she'd"
"Spencer's"
'FBI'
'eats, shoots, and leaves'
'football captain' → 'football captains'
'turnovers' → 'turnover'
adds 's to the end, in a safe manner.
'will go' → 'went'
'walked' → 'walks'
'walked' → 'will walk'
'walks' → 'walk'
'walks' → 'walking'
'went' → 'did not go'
"didn't study" → 'studied'
These are some helpful extensions:
npm install compromise-adjectives
quick
quick → quickest
quick → quicker
quick → quickly
quick → quicken
quick → quickness
npm install compromise-dates
June 8th or 03/03/18
npm install compromise-numbers
25 kilos
1/3rd
five or fifth
5 or 5th
fifth or 5th
five or 5
npm install compromise-export
npm install compromise-html
npm install compromise-hash
npm install compromise-keypress
npm install compromise-ngrams
npm install compromise-paragraphs
this plugin creates a wrapper around the default sentence objects.
npm install compromise-sentences
he walks -> he walked
he walked -> he walks
he walks -> he will walk
he walks -> he didn't walk
he doesn't walk -> he walks
?
!
? or !
!
?
.
npm install compromise-syllables
TypeScript support is still a work in progress. So far, plugin support is mostly complete, and plugins can be used to extend nlp in a type-safe way.
import nlp from 'compromise'
import ngrams from 'compromise-ngrams'
import numbers from 'compromise-numbers'
// .extend() can be chained
const nlpEx = nlp.extend(ngrams).extend(numbers)
nlpEx('This is type safe!').ngrams({ min: 1 })
nlpEx('This is type safe!').numbers()
The .extend() function returns an nlp type with updated Document and World types (Phrase, Term and Pool are not currently supported). The global nlp also receives the plugin at runtime, but its type will not be updated; this is a limitation of TypeScript.
Type-safe plugins can be created by using the nlp.Plugin type:
interface myExtendedDoc {
sayHello(): string
}
interface myExtendedWorld {
hello: string
}
const myPlugin: nlp.Plugin<myExtendedDoc, myExtendedWorld> = (Doc, world) => {
world.hello = 'Hello world!'
Doc.prototype.sayHello = () => world.hello
}
const _nlp = nlp.extend(myPlugin)
const doc = _nlp('This is safe!')
doc.sayHello()
doc.world.hello = "Hello again!"
compromise_1.default is not a function - This is a problem with your tsconfig.json. It can be solved by adding "esModuleInterop": true. Make sure to run tsc --init when starting a new TypeScript project.
slash-support:
We currently split slashes up as different words, like we do for hyphens, so things like this don't work:
nlp('the koala eats/shoots/leaves').has('koala leaves') //false
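One possible workaround for the slash limitation is to expand the alternatives before the text reaches nlp. The sketch below is plain JavaScript and is not part of compromise; `expandSlashes` is a hypothetical helper name, and it assumes that words are separated by single spaces.

```javascript
// Hypothetical helper: expand each slash-separated word
// into one text variant per alternative.
function expandSlashes(text) {
  let variants = ['']
  for (const word of text.split(' ')) {
    const options = word.split('/')
    const next = []
    for (const prefix of variants) {
      for (const option of options) {
        // join with a space, except for the very first word
        next.push(prefix ? prefix + ' ' + option : option)
      }
    }
    variants = next
  }
  return variants
}

console.log(expandSlashes('the koala eats/shoots/leaves'))
// [ 'the koala eats', 'the koala shoots', 'the koala leaves' ]
```

Each variant could then be checked on its own, for example with `expandSlashes(text).some(t => nlp(t).has('koala leaves'))`. The number of variants grows quickly when a text contains several slashed words.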
inter-sentence match:
By default, sentences are the top-level abstraction.
Inter-sentence, or multi-sentence matches aren't supported:
nlp("that's it. Back to Winnipeg!").has('it back')//false
nested match syntax:
the beauty (and danger) of regex is that you can recurse indefinitely.
Our match syntax is much weaker. Things like this are not (yet) possible:
doc.match('(modern (major|minor))? general')
complex matches must be achieved with successive .match() statements.
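As a sketch of what successive .match() statements might look like, the nested pattern can be unrolled by hand into flat patterns that are tried in order, most specific first. The `firstMatch` helper and the `generalPatterns` list are hypothetical names for this example; the only compromise features assumed are `.match()` and the `.found` property on its result.

```javascript
// Hypothetical helper: try flat patterns in order and return
// the first non-empty result, or null if none of them match.
function firstMatch(doc, patterns) {
  for (const pattern of patterns) {
    const m = doc.match(pattern)
    if (m.found) {
      return m
    }
  }
  return null
}

// '(modern (major|minor))? general' unrolled into two flat patterns
const generalPatterns = ['modern (major|minor) general', 'general']
```

It could be called as `firstMatch(nlp(text), generalPatterns)`. This is only an approximation of the nested pattern: once the longer pattern matches anywhere, bare occurrences of "general" elsewhere in the text are not returned.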
dependency parsing: Proper sentence transformation requires understanding the syntax tree of a sentence, which we don't currently do. We should! Help wanted with this.
MIT
13.1.0
.lookup() for major speed improvements.
[word?] syntax parsing
modest natural language processing
The npm package compromise receives a total of 41,136 weekly downloads. As such, compromise popularity was classified as popular.
We found that compromise demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 3 open source maintainers collaborating on the project.