
node-icu-charset-detector - npm Package Compare versions

Comparing version 0.0.5 to 0.0.6

node-icu-charset-detector.js

package.json
{
"name" : "node-icu-charset-detector",
"version" : "0.0.5",
"main" : "./build/Release/node-icu-charset-detector",
"version" : "0.0.6",
"main" : "./node-icu-charset-detector.js",
"description" : "Simple binding for ICU charset detector",

@@ -6,0 +6,0 @@ "keywords" : ["charset-detection", "icu"],

@@ -17,10 +17,2 @@ # ICU Character Set Detection for Node.js

If you prefer to install the package by hand, try following commands.
git clone git://github.com/mooz/node-icu-charset-detector.git
cd node-icu-charset-detector
node-waf configure
node-waf build
node-waf install
## Usage

@@ -30,35 +22,34 @@

`node-icu-charset-detector` provides a class `CharsetMatch` which takes a instance of `Buffer` for the first argument of the constructor. A instance of `CharsetMatch` has three methods below.
`node-icu-charset-detector` provides a function `detectCharset(buffer)`, where `buffer` is an instance of `Buffer` whose charset should be detected.
- `CharsetMatch.prototype.getName()`
- returns the name of detected character set.
- `CharsetMatch.prototype.getLanguage()`
- returns the language for detected character set.
- `CharsetMatch.prototype.getConfidence()`
- returns the confidence of detection.
var charsetDetector = require("node-icu-charset-detector");
Here is a simple usage of `node-icu-charset-detector`.
var charsetDetector = require("node-icu-charset-detector");
var CharsetMatch = charsetDetector.CharsetMatch;
var buffer = fs.readFileSync("/path/to/the/file");
var charset = charsetDetector.detectCharset(buffer);
var byteArray = fs.readFileSync(path);
var charsetMatch = new CharsetMatch(byteArray);
var detectedCharsetName = charsetMatch.getName();
var detectedLanguage = charsetMatch.getLanguage();
var detectionConfidence = charsetMatch.getConfidence();
console.log("charset name: " + charset.toString());
console.log("language: " + charset.language);
console.log("detection confidence: " + charset.confidence);
`detectCharset(buffer)` returns the detected charset name for `buffer`, and the returned charset name has two extra properties `language` and `confidence`:
- `charset.language`
- language name for the detected character set.
- `charset.confidence`
- confidence of the charset detection for `charset`.
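The return shape described above (a charset name carrying `language` and `confidence`) can be sketched in plain JavaScript. This is only an illustration, assuming the result is a `String` object with properties attached; the helper `toCharsetResult` and the stub match object are hypothetical and not part of the package, and the stub values are made up.

```javascript
// Hypothetical helper (not part of the package): wraps any object exposing
// getName(), getLanguage() and getConfidence() into a String object that
// carries the two extra properties.
function toCharsetResult(match) {
  var name = new String(match.getName());
  name.language = match.getLanguage();
  name.confidence = match.getConfidence();
  return name;
}

// Stub standing in for a real detection result; the values are made up.
var stubMatch = {
  getName: function () { return "UTF-8"; },
  getLanguage: function () { return "en"; },
  getConfidence: function () { return 80; }
};

var charset = toCharsetResult(stubMatch);
console.log("charset name: " + charset.toString());
console.log("language: " + charset.language);
console.log("detection confidence: " + charset.confidence);
```

Under this assumption the result is an object rather than a primitive string, so a strict comparison such as `charset === "UTF-8"` is false. Calling `toString()` first, as the usage example does, avoids that surprise.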
### Leveraging node-iconv
Since ICU itself does not have a feature to convert character sets, you may need to use `node-iconv` (https://github.com/bnoordhuis/node-iconv) which has a powerful character sets converting feature.
Since ICU itself does not have a feature to convert character sets, you may need to use `node-iconv` (https://github.com/bnoordhuis/node-iconv), which has a powerful character sets converting feature.
Here is a simple example to leverage `node-iconv` to convert character sets which is not supported by native Node.js.
Here is a simple example to leverage `node-iconv` to convert character sets not supported by Node itself.
var Iconv = require("iconv").Iconv;
function bufferToString(buffer, charset) {
function bufferToString(buffer) {
var charsetDetector = require("node-icu-charset-detector");
var charset = charsetDetector.detectCharset(buffer).toString();
try {
return buffer.toString(charset);
} catch (x) {
var Iconv = require("iconv").Iconv;
var charsetConverter = new Iconv(charset, "utf8");

@@ -68,4 +59,4 @@ return charsetConverter.convert(buffer).toString();

}
var charsetMatch = new CharsetMatch(byteArray);
var bufferString = bufferToString(byteArray, charsetMatch.getName());
var buffer = fs.readFileSync("/path/to/the/file");
var bufferString = bufferToString(buffer);
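The try/catch fallback used by `bufferToString` can be exercised without `node-iconv` installed. The sketch below is a minimal illustration: `decodeWithFallback` and its `fallbackConvert` parameter are hypothetical names, and the Latin-1 fallback only stands in for a real converter such as `Iconv`.

```javascript
// Hypothetical helper: try Node's built-in decoder first, and hand the
// buffer to a caller-supplied converter when Node rejects the charset name.
function decodeWithFallback(buffer, charset, fallbackConvert) {
  try {
    // Buffer#toString throws for encoding names Node does not know.
    return buffer.toString(charset);
  } catch (x) {
    return fallbackConvert(buffer, charset);
  }
}

// Stand-in for an iconv-based converter; it only handles Latin-1 data.
function latin1Fallback(buffer) {
  return buffer.toString("latin1");
}

// "ISO-8859-1" is not an encoding name Node accepts, so the fallback runs.
var latin1Bytes = Buffer.from([0x63, 0x61, 0x66, 0xe9]);
var decoded = decodeWithFallback(latin1Bytes, "ISO-8859-1", latin1Fallback);
console.log(decoded);

// "utf8" is handled natively, so the fallback is never called.
var nativeDecoded = decodeWithFallback(Buffer.from("abc"), "utf8", latin1Fallback);
console.log(nativeDecoded);
```

Passing the converter in as an argument keeps the sketch free of native dependencies. In real use the fallback would build a converter from the detected charset, as the README example does with `new Iconv(charset, "utf8")`.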

Sorry, the diff of this file is not supported yet
