jschardet

Character encoding auto-detection in JavaScript (port of Python's chardet)

Version: 1.1.0

Version published: 11 years ago

Weekly downloads: 200K (down 18.02%)

Maintainers: 1

Created: 13 years ago

What is jschardet?

jschardet is a JavaScript library for character encoding detection. It is a port of the Python chardet library and is used to detect the character encoding of a given text. This can be particularly useful when dealing with text data from various sources where the encoding is unknown.

What are jschardet's main functionalities?

Detect Character Encoding

This feature allows you to detect the character encoding of a given text. The `detect` method returns an object with the encoding and confidence level.

const jschardet = require('jschardet');
const text = 'Some text with unknown encoding';
const result = jschardet.detect(text);
console.log(result);
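
Because the result carries a confidence level, callers usually need to decide what to do with a weak guess. The following is a minimal sketch, not part of jschardet: `detectWithFallback`, the `minConfidence` default of 0.5, and the `"utf-8"` fallback are all assumptions chosen for illustration. It accepts any detect-style function, so it can be tried without the package installed.

```javascript
// Hypothetical helper (not part of jschardet): wraps any function that
// returns { encoding, confidence } and falls back when the guess is weak.
function detectWithFallback(detect, text, options = {}) {
  // Both defaults are illustrative assumptions, not library settings.
  const { minConfidence = 0.5, fallback = "utf-8" } = options;
  const result = detect(text) || {};

  const confident =
    typeof result.encoding === "string" &&
    typeof result.confidence === "number" &&
    result.confidence >= minConfidence;

  return {
    encoding: confident ? result.encoding : fallback,
    confidence: confident ? result.confidence : 0,
    usedFallback: !confident,
  };
}
```

With the package installed, the call would look like `detectWithFallback(jschardet.detect, text, { minConfidence: 0.8 })`.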

JsChardet

Port of Python's chardet (http://chardet.feedparser.org/).

License

LGPL

How To Use It

npm install jschardet

var jschardet = require("jschardet")

// "àíàçã" in UTF-8
jschardet.detect("\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3")
// { encoding: "utf-8", confidence: 0.9690625 }

// "次常用國字標準字體表" in Big5 
jschardet.detect("\xa6\xb8\xb1\x60\xa5\xce\xb0\xea\xa6\x72\xbc\xd0\xb7\xc7\xa6\x72\xc5\xe9\xaa\xed")
// { encoding: "Big5", confidence: 0.99 }
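
The inputs above are strings in which each character stands for one raw byte (`\xc3`, `\xa0`, and so on). When the bytes come from a file or a socket, they first have to be turned into that form. The sketch below assumes Node.js, where `Buffer` and its `latin1` encoding are available; the helper names `bytesToBinaryString` and `binaryStringToBytes` are hypothetical and not part of jschardet.

```javascript
// Hypothetical helpers, assuming Node.js (Buffer is a Node global).

// Turn raw bytes into a string with one character per byte (0x00 to 0xff).
function bytesToBinaryString(bytes) {
  return Buffer.from(bytes).toString("latin1");
}

// Reverse direction: recover the raw bytes from such a string.
function binaryStringToBytes(binaryString) {
  return Buffer.from(binaryString, "latin1");
}

// "àíàçã", written with escapes so the source file encoding does not matter.
const sample = "\u00e0\u00ed\u00e0\u00e7\u00e3";

// Encode it as UTF-8 bytes, then as a one-character-per-byte string.
const utf8Bytes = Buffer.from(sample, "utf8");
const input = bytesToBinaryString(utf8Bytes);

// Prints the hex bytes: c3a0c3adc3a0c3a7c3a3
console.log(utf8Bytes.toString("hex"));
```

Here `input` is the same string as the first `detect` argument in the example above, so it could be passed to `jschardet.detect(input)` directly.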

Supported Charsets

Big5, GB2312/GB18030, EUC-TW, HZ-GB-2312, and ISO-2022-CN (Traditional and Simplified Chinese)
EUC-JP, SHIFT_JIS, and ISO-2022-JP (Japanese)
EUC-KR and ISO-2022-KR (Korean)
KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, and windows-1251 (Russian)
ISO-8859-2 and windows-1250 (Hungarian)
ISO-8859-5 and windows-1251 (Bulgarian)
windows-1252
ISO-8859-7 and windows-1253 (Greek)
ISO-8859-8 and windows-1255 (Visual and Logical Hebrew)
TIS-620 (Thai)
UTF-32 BE, LE, 3412-ordered, or 2143-ordered (with a BOM)
UTF-16 BE or LE (with a BOM)
UTF-8 (with or without a BOM)
ASCII
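
Several entries in this list are recognized only "with a BOM", that is, by a byte order mark at the start of the data. The sketch below illustrates that idea and is not jschardet's own code: `sniffBom`, the `BOMS` table, and the labels it returns are assumptions made for this example. The byte sequences are the standard marks for UTF-8, UTF-16, and UTF-32, plus the two unusual UTF-32 byte orders named in the list.

```javascript
// Illustrative BOM table (not taken from jschardet's source).
// Four-byte marks come first: the UTF-32 LE mark starts with the same
// two bytes as the UTF-16 LE mark, so the order of checks matters.
const BOMS = [
  { label: "UTF-32 BE", bytes: [0x00, 0x00, 0xfe, 0xff] },
  { label: "UTF-32 LE", bytes: [0xff, 0xfe, 0x00, 0x00] },
  { label: "UTF-32 (3412-ordered)", bytes: [0xfe, 0xff, 0x00, 0x00] },
  { label: "UTF-32 (2143-ordered)", bytes: [0x00, 0x00, 0xff, 0xfe] },
  { label: "UTF-8", bytes: [0xef, 0xbb, 0xbf] },
  { label: "UTF-16 BE", bytes: [0xfe, 0xff] },
  { label: "UTF-16 LE", bytes: [0xff, 0xfe] },
];

// Hypothetical helper: returns the label of the first matching BOM,
// or null when the data does not start with one.
function sniffBom(bytes) {
  for (const { label, bytes: bom } of BOMS) {
    const matches =
      bytes.length >= bom.length && bom.every((b, i) => bytes[i] === b);
    if (matches) return label;
  }
  return null;
}
```

For instance, `sniffBom([0xef, 0xbb, 0xbf, 0x61])` returns `"UTF-8"`, while data without a mark returns `null` and has to be classified by its content instead.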

Technical Information

I haven't been able to create tests to correctly detect:

ISO-2022-CN
windows-1250 in Hungarian
windows-1251 in Bulgarian
windows-1253 in Greek
EUC-CN

A one-file minimized version is missing.

Authors

Ported from Python to JavaScript by António Afonso (https://github.com/aadsm/jschardet)
Transformed into an npm package by Markus Ast (https://github.com/brainafk)

Package last updated on 12 Oct 2013

Other packages similar to jschardet

chardet

iconv-lite

node-icu-charset-detector
