node-icu-charset-detector
Character set detection is the process of determining the character set, or encoding, of character data in an unknown format.
A simple binding of ICU character set detection (http://userguide.icu-project.org/conversion/detection) for Node.js.
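To see why detection matters, note that the same bytes produce different text depending on the encoding used to decode them. The following is a minimal sketch that uses only Node's built-in Buffer; it does not involve ICU, and the byte values are chosen purely for illustration.

```javascript
// The three bytes below are the UTF-8 encoding of the character U+3042.
var bytes = Buffer.from([0xe3, 0x81, 0x82]);

// Decoded with the correct charset, the bytes become a single character.
var asUtf8 = bytes.toString("utf8");

// Decoded as Latin-1, every byte becomes its own character, so the text is garbled.
var asLatin1 = bytes.toString("latin1");

console.log(asUtf8.length, asLatin1.length);
```

A detector such as ICU's inspects the bytes and guesses which of these interpretations is the intended one.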
First, install libicu on your system. Debian users can install it with apt-get:
sudo apt-get install libicu-dev
After that, install node-icu-charset-detector from npm.
npm install node-icu-charset-detector
If you prefer to install the package by hand, run the following commands.
git clone git://github.com/mooz/node-icu-charset-detector.git
cd node-icu-charset-detector
node-waf configure
node-waf build
node-waf install
node-icu-charset-detector provides a class, CharsetMatch, whose constructor takes an instance of Buffer as its first argument. An instance of CharsetMatch has the three methods below.
CharsetMatch.prototype.getName()
CharsetMatch.prototype.getLanguage()
CharsetMatch.prototype.getConfidence()
Here is a simple usage example of node-icu-charset-detector.
var fs = require("fs");
var charsetDetector = require("node-icu-charset-detector");
var CharsetMatch = charsetDetector.CharsetMatch;

// `path` is the path of the file whose encoding is unknown.
var byteArray = fs.readFileSync(path); // returns a Buffer when no encoding is given
var charsetMatch = new CharsetMatch(byteArray);

var detectedCharsetName = charsetMatch.getName();
var detectedLanguage = charsetMatch.getLanguage();
var detectionConfidence = charsetMatch.getConfidence();
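Detection is a guess, so it can help to check the confidence before trusting the detected name. The sketch below is one way to do that. `pickCharset` is a hypothetical helper, not part of this package, and it assumes that getConfidence() returns a number from 0 to 100, as ICU's detector reports. A stand-in object replaces a real CharsetMatch so the example runs without the native module.

```javascript
// Hypothetical helper: not part of node-icu-charset-detector.
// `match` is any object with the getName() and getConfidence() methods.
// Assumption: getConfidence() returns a number from 0 to 100.
function pickCharset(match, minConfidence, fallbackCharset) {
  var name = match.getName();
  var confidence = match.getConfidence();
  if (name && confidence >= minConfidence) {
    return name;
  }
  // The guess is too weak (or empty), so use the caller's default instead.
  return fallbackCharset;
}

// Stand-in for a CharsetMatch instance so the sketch runs on its own.
var fakeMatch = {
  getName: function () { return "Shift_JIS"; },
  getConfidence: function () { return 35; }
};

console.log(pickCharset(fakeMatch, 50, "UTF-8"));
```

The threshold of 50 is arbitrary; a suitable value depends on how short or ambiguous the input is.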
Since this module only exposes ICU's character set detection and does not convert character sets, you may need to use node-iconv (https://github.com/bnoordhuis/node-iconv), which has a powerful character set conversion feature.
Here is a simple example that uses node-iconv to convert character sets that are not supported natively by Node.js.
var Iconv = require("iconv").Iconv;

function bufferToString(buffer, charset) {
  try {
    // Works when Node.js supports the charset natively (e.g. "utf8").
    return buffer.toString(charset);
  } catch (x) {
    // Otherwise fall back to node-iconv and convert to UTF-8 first.
    var charsetConverter = new Iconv(charset, "utf8");
    return charsetConverter.convert(buffer).toString();
  }
}

var charsetMatch = new CharsetMatch(byteArray);
var bufferString = bufferToString(byteArray, charsetMatch.getName());
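As an alternative to relying on an exception, the native-versus-fallback decision can be made explicitly with Buffer.isEncoding. The following is a sketch rather than the package's own approach: `decodeBuffer` and `convertToUtf8` are hypothetical names, and the converter is passed in as a function so the example runs without node-iconv installed.

```javascript
// Hypothetical variant of bufferToString. `convertToUtf8` is an injected
// function (for example, a wrapper around node-iconv) that returns a UTF-8 Buffer.
function decodeBuffer(buffer, charset, convertToUtf8) {
  // Buffer.isEncoding reports whether Node can decode this charset natively.
  if (Buffer.isEncoding(charset)) {
    return buffer.toString(charset);
  }
  return convertToUtf8(buffer, charset).toString("utf8");
}

// Stand-in converter: it records the requested charset and returns the
// buffer unchanged, which is enough to show which path was taken.
var requested = [];
function fakeConverter(buffer, charset) {
  requested.push(charset);
  return buffer;
}

var nativeText = decodeBuffer(Buffer.from("hello"), "UTF-8", fakeConverter);
var convertedText = decodeBuffer(Buffer.from("hello"), "EUC-JP", fakeConverter);
console.log(nativeText, convertedText, requested);
```

With node-iconv, the injected function could wrap the same Iconv calls used above, constructing `new Iconv(charset, "utf8")` and returning the result of its `convert(buffer)`.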