Obscenity
Robust, extensible profanity filter for Node.js.
Why Obscenity?
- Accurate: Though Obscenity is far from perfect (as with all profanity filters), it makes reducing false positives as simple as possible: adding whitelisted phrases is as easy as adding a new string to an array, and using word boundaries is equally simple.
- Robust: Obscenity's transformer-based design allows it to match on variants of phrases other libraries are typically unable to, e.g. fuuuuuuuckkk, ʃṳ𝒸𝗄, wordsbeforefuckandafter and so on. There's no need to manually write out all the variants either: just adding the pattern fuck will match all of the cases above by default.
- Extensible: With Obscenity, you aren't locked into anything - removing phrases that you don't agree with from the default set of words is trivial, as is disabling any transformations you don't like (perhaps you feel that leet-speak decoding is too error-prone for you).
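To give a rough intuition for what a transformer-based design means, here is a minimal, self-contained sketch. It is not Obscenity's implementation or API: the names resolveLookalikes, collapseDuplicates, normalize and containsPattern are hypothetical, the look-alike table is a tiny made-up sample, and a mild placeholder word is used. The idea shown is that each character passes through a chain of small transformers before being compared against the pattern, so one pattern covers many spellings.

```javascript
// Illustrative sketch only: NOT Obscenity's real implementation or API.
// A "transformer" here maps one character to another, or to undefined to drop it.

// Tiny sample table of look-alike characters (an assumption for this example).
const LOOKALIKES = new Map([['3', 'e'], ['0', 'o'], ['@', 'a'], ['$', 's']]);
const resolveLookalikes = () => (ch) => LOOKALIKES.get(ch) ?? ch;

// Lowercase every character.
const toLowerCase = () => (ch) => ch.toLowerCase();

// Drop a character when it repeats the previously emitted one.
const collapseDuplicates = () => {
  let previous;
  return (ch) => {
    if (ch === previous) return undefined;
    previous = ch;
    return ch;
  };
};

// Run every character of `text` through the transformer chain, in order.
function normalize(text, transformerFactories) {
  // Create fresh transformers per call so stateful ones start clean.
  const transformers = transformerFactories.map((create) => create());
  let out = '';
  for (const original of text) {
    let ch = original;
    for (const transform of transformers) {
      ch = transform(ch);
      if (ch === undefined) break; // character was dropped
    }
    if (ch !== undefined) out += ch;
  }
  return out;
}

const pipeline = [resolveLookalikes, toLowerCase, collapseDuplicates];

// Normalize both sides so the pattern and the text are compared in the same form.
function containsPattern(text, pattern) {
  return normalize(text, pipeline).includes(normalize(pattern, pipeline));
}

console.log(containsPattern('what the H3EEECKK', 'heck')); // true
console.log(containsPattern('hello there', 'heck')); // false
```

Removing an entry from the pipeline array in this sketch is analogous to disabling a transformation you don't want, at the cost of missing the variants it would have caught.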
Installation
$ npm install obscenity
$ yarn add obscenity
$ pnpm add obscenity
Example usage
First, import Obscenity:
const {
  RegExpMatcher,
  TextCensor,
  englishDataset,
  englishRecommendedTransformers,
} = require('obscenity');
Or, in TypeScript/ESM:
import {
  RegExpMatcher,
  TextCensor,
  englishDataset,
  englishRecommendedTransformers,
} from 'obscenity';
Now, we can create a new matcher using the English preset.
const matcher = new RegExpMatcher({
  ...englishDataset.build(),
  ...englishRecommendedTransformers,
});
Now, we can use our matcher to search for profanities in the text. Here are two examples of what you can do:
Check if there are any matches in some text:
if (matcher.hasMatch('fuck you')) {
  console.log('The input text contains profanities.');
}
Output the positions of all matches along with the original word used:
const matches = matcher.getAllMatches('ʃ𝐟ʃὗƈk ỹоứ 𝔟ⁱẗ𝙘ɦ', true);
for (const match of matches) {
  const { phraseMetadata, startIndex, endIndex } =
    englishDataset.getPayloadWithPhraseMetadata(match);
  console.log(
    `Match for word ${phraseMetadata.originalWord} found between ${startIndex} and ${endIndex}.`,
  );
}
Censoring matched text:
To censor text, we'll need to import another class: the TextCensor.
Some other imports and creation of the matcher have been elided for simplicity.
const { TextCensor /* , ... */ } = require('obscenity');
const censor = new TextCensor();
const input = 'fuck you little bitch';
const matches = matcher.getAllMatches(input);
console.log(censor.applyTo(input, matches));
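Conceptually, censoring amounts to replacing each matched span of the input with generated text. The sketch below illustrates that idea in isolation. It is not the TextCensor implementation: censorWith and its strategy parameter are hypothetical names, and it assumes each match carries a startIndex and an inclusive endIndex, and that matches may overlap.

```javascript
// Illustrative sketch only: NOT Obscenity's TextCensor implementation.
// Assumes matches look like { startIndex, endIndex } with an inclusive endIndex.
function censorWith(input, matches, strategy = (length) => '*'.repeat(length)) {
  // Process matches left to right without mutating the caller's array.
  const sorted = [...matches].sort((a, b) => a.startIndex - b.startIndex);
  let out = '';
  let cursor = 0; // index of the first character not yet emitted
  for (const { startIndex, endIndex } of sorted) {
    if (endIndex < cursor) continue; // fully covered by an earlier match
    const start = Math.max(startIndex, cursor); // trim any overlap
    out += input.slice(cursor, start); // clean text before the match
    out += strategy(endIndex - start + 1); // replacement for the matched span
    cursor = endIndex + 1;
  }
  return out + input.slice(cursor); // clean text after the last match
}

const sampleInput = 'oh heck no';
const sampleMatches = [{ startIndex: 3, endIndex: 6 }];
console.log(censorWith(sampleInput, sampleMatches)); // oh **** no
console.log(censorWith(sampleInput, sampleMatches, () => '[redacted]')); // oh [redacted] no
```

Passing the replacement logic in as a function keeps the span handling separate from the choice of replacement text, so swapping asterisks for a fixed phrase needs no change to the loop.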
This is just a small slice of what Obscenity can do: for more, check out the documentation.
Accuracy
Note: As with all swear filters, Obscenity is not perfect (nor will it ever be). Use its output as a heuristic, and not as the sole judge of whether some content is appropriate or not.
With the English preset, Obscenity (correctly) finds matches in all of the following texts:
- you are a little fucker
- fk you
- ffuk you
- i like a$$es
- ʃ𝐟ʃὗƈk ỹоứ
...and it does not match on the following:
- the pen is mightier than the sword
- i love bananas so yeah
- this song seems really banal
- grapes are really yummy
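Avoiding false positives like these generally comes down to the two tools mentioned earlier: whitelisted phrases and word boundaries. The following self-contained sketch shows one way those two checks could work. It is not how Obscenity implements them: findAll, findProfanity and the option names are hypothetical, and a mild placeholder word stands in for a real pattern.

```javascript
// Illustrative sketch only: NOT Obscenity's real implementation or API.

// Return [start, end] (end inclusive) for every occurrence of `needle` in `text`.
function findAll(text, needle) {
  if (needle === '') return []; // avoid looping forever on empty needles
  const spans = [];
  let from = text.indexOf(needle);
  while (from !== -1) {
    spans.push([from, from + needle.length - 1]);
    from = text.indexOf(needle, from + 1);
  }
  return spans;
}

// Treat ASCII letters and digits as word characters; undefined means out of range.
const isWordChar = (ch) => ch !== undefined && /[a-z0-9]/i.test(ch);

function findProfanity(text, { patterns, whitelist = [], wordBoundaries = false }) {
  const lower = text.toLowerCase();
  // Spans covered by whitelisted terms; matches fully inside them are ignored.
  const safe = whitelist.flatMap((term) => findAll(lower, term.toLowerCase()));
  return patterns
    .flatMap((pattern) => findAll(lower, pattern.toLowerCase()))
    .filter(([s, e]) => !safe.some(([ws, we]) => ws <= s && e <= we))
    .filter(
      ([s, e]) =>
        !wordBoundaries || (!isWordChar(lower[s - 1]) && !isWordChar(lower[e + 1])),
    );
}

const text = 'the heckler left';
console.log(findProfanity(text, { patterns: ['heck'] }).length); // 1 (false positive)
console.log(findProfanity(text, { patterns: ['heck'], whitelist: ['heckler'] }).length); // 0
console.log(findProfanity(text, { patterns: ['heck'], wordBoundaries: true }).length); // 0
console.log(findProfanity('oh heck no', { patterns: ['heck'], wordBoundaries: true }).length); // 1
```

Whitelisting is the more targeted of the two: it suppresses one known false positive, whereas word boundaries also stop the pattern from matching inside longer words where a match might be wanted.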
Documentation
For a step-by-step guide on how to use Obscenity, check out the guide.
Otherwise, refer to the auto-generated API documentation.
Contributing
Issues can be reported using the issue tracker.
If you'd like to submit a pull request, please read the contribution guide first.
Author
Obscenity © Joe L. under the MIT license. Authored and maintained by Joe L.
GitHub @jo3-l