New Research: Supply Chain Attack on Axios Pulls Malicious Dependency from npm.Details →
Socket
Book a DemoSign in
Socket

diacritical

Package Overview
Dependencies
Maintainers
1
Versions
5
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

diacritical

Implment HTML corrections of Baha'i terms using the 'accents' database

latest
Source
npmnpm
Version
0.1.6
Version published
Weekly downloads
4
-20%
Maintainers
1
Weekly downloads
 
Created
Source

Diacriṭícál

Bahá’í Term Suggestion Script

Home Page: http://chadananda.github.io/diacritical

The big idea:

Formatting Bahá’í electronic texts is a pain. Even the best quality electronic texts from the BWC contain thousands of errors. And by far the most common type of error is the darned diacritcals on all those transliterated Arabic and Persian terms.

This project attempts to address that fundamental problem by providing an automated replacement and suggestion system which can utilize a dictionary of authoritatively spelled terms.

Some Examples:

See how many errors the current dictionary can find in some reference.bahai.org books:

Get involved, help build the dictionary!

The project to actually gather that massive correct dictionary is over here: https://github.com/chadananda/accents

To get involved, simply contact me at chadananda@gmail.com. I'll set up user credentials for you which allow you to log in and start adding words to the dictionary.

A JSON object containing the current wordlist can be pulled down using this REST URL:

Using the Library

After adding diacritical.js to a web page simply provide a dictionary of UTF-8 terms as a Javascript array. For example:

var diacrital = New Diacritical();
var dictionary = [
    'Abu’l-Faḍl',
    'Aḥmad',
    'Aḥmad',
    'Aḥmad',
    'Ahmad',
    'aḥmad',
    'Aḥsanu’l-Qiṣaṣ',
    'Ba_ghdád',
    'Baqí‘',
    'Báqir-i-Ra_shtí',
    'Dalílu’l-Mutaḥayyirín',
    'Fatḥ-‘Alí',
];
var bad_text = "This is some sample text with bad diacriticals: Ahmad, Baghdad, "+
  "Fath-Ali and Baqi. Notice that the dictionary uses _sh to indicate an "+
  "underscore. Also, notice that the dictionary is case sensitive but the "+
  "replacement system is still smart enough to deal with all-caps like AHMAD. "+
  "Moreover, the dictionary ideally should have multiple versions of each word in "+
  "order to help weed out misspellings. (Notice one of the spellings of Ahmad is "+
  "wrong in the dictionary but the word is still corrected correctly.)";

var fixed_text = diacritical.replaceText(bad_text, dictionary);

Using the Library as a node module

Diacritical is now a node module so you can simply install it in your node project with:

npm install diacritical

Then instantiate a new diacritical object with:

var Diacritical = require('diacritical'),
    diacritical = new Diacritical;

You'll have to fetch a wordlist like this:

  var accents_url = 'http://diacritics.iriscouch.com/accents/_design/terms_list/_view/terms_list';
  var dictionary = [];
  request(accents_url, function (error, response, body) {
    if (!error && response.statusCode == 200) {
      var data = JSON.parse(body);
      for (i = 0; i < data.rows.length; ++i) dictionary.push(data.rows[i]['key']);    
    } else {
      console.log("Error Response: " + error);
    }
  });

Test harness: http://chadananda.github.io/diacritical/tests

Keywords

baha'i

FAQs

Package last updated on 12 Oct 2014

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts