
words-count
Word count for multi-language paragraphs that mix words, numbers, and punctuation.
One rule applies to all.
npm i words-count
import wordsCount from 'words-count';
// const wordsCount = require('words-count').default;
console.log(wordsCount('Hello World'));
words_to_be_count = 'Hello “世界”';
words-count.js -> 3
words_to_be_count.length -> 10
words_to_be_count.split(' ').length -> 2
Countable.js -> 2
PHP str_word_count(words_to_be_count) -> 1
PHP mb_strlen(words_to_be_count) -> 10
Office Word -> 5
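The two plain-JavaScript rows in the comparison above can be reproduced without any library. This is a minimal sketch using only built-in string methods; the variable name `wordsToBeCount` is ours, not part of the package.

```javascript
// The sample string from the comparison: 5 Latin letters, a space,
// an opening curly quote, two Chinese characters, a closing curly quote.
const wordsToBeCount = 'Hello “世界”';

// .length counts UTF-16 code units, so every character here counts once.
const codeUnits = wordsToBeCount.length;

// Splitting on a single space only sees two chunks: 'Hello' and '“世界”'.
const spaceSeparated = wordsToBeCount.split(' ').length;

console.log(codeUnits, spaceSeparated); // 10 2
```

Neither number matches the 3 words a reader would count, which is the gap the package is meant to close.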
Numbers count as 1 word
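To make the counting idea concrete, here is a rough sketch of one possible rule: each Chinese (Han) character is one word, and each unbroken run of other letters or digits is one word. This is an illustration only and an assumption on our part, not the algorithm words-count actually uses; the function name `sketchCount` is hypothetical, and the sketch ignores cases such as decimals or other scripts.

```javascript
// Hypothetical illustration, NOT the words-count implementation.
// Alternative 1: a single Han character is one word.
// Alternative 2: a run of non-Han letters or digits is one word.
const WORD_PATTERN = /\p{Script=Han}|(?:(?!\p{Script=Han})[\p{L}\p{N}])+/gu;

function sketchCount(text) {
  // match() returns null when nothing matches, so fall back to an empty list.
  const matches = text.match(WORD_PATTERN) || [];
  return matches.length;
}

console.log(sketchCount('Hello “世界”')); // 3
console.log(sketchCount('2024'));         // 1
```

Under this rule a whole number such as `2024` is a single run of digits and therefore a single word.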
const words = "Some words ...";
// Treat punctuation as word breaker
const totalWithBreaker = wordsCount(words, {
punctuationAsBreaker: true
});
// Treat more characters as punctuation
const totalWithExtraPunctuation = wordsCount(words, {
punctuation: ['-', 'a', 'b']
});
// Disable default built-in punctuation list
const totalWithoutDefaults = wordsCount(words, {
disableDefaultPunctuation: true
});
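The difference a "punctuation as breaker" setting can make is easiest to see on a hyphenated word. The sketch below assumes that, without the option, punctuation is simply stripped, and that with it, punctuation is replaced by a space. That reading of `punctuationAsBreaker` is an assumption, and the helper `countWithPunctuation` and its short punctuation list are hypothetical, not part of words-count.

```javascript
// Hypothetical illustration; the real package has its own punctuation list.
const PUNCTUATION = /[.,'!?;:-]/g;

function countWithPunctuation(text, asBreaker) {
  // As a breaker, punctuation becomes a space; otherwise it is removed.
  const cleaned = text.replace(PUNCTUATION, asBreaker ? ' ' : '');
  // Split on whitespace and drop empty tokens.
  return cleaned.trim().split(/\s+/).filter(Boolean).length;
}

console.log(countWithPunctuation('well-known', false)); // 1
console.log(countWithPunctuation('well-known', true));  // 2
```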
const text = "Some words ...";
const { wordsSplit, wordsDetect } = require('words-count');
const splittedWords = wordsSplit(text);
// Rename the destructured `words` so it does not clash with another `words` variable
const { words: detectedWords, count } = wordsDetect(text);
Test cases are based on best-effort assumptions.
Original Content:
Google's free service instantly translates words, phrases, and web pages between English and over 100 other languages.
Basic Test Content:
The original content, translated into each target language with Google Translate.
Test Case Coverage:
English, Chinese, Chinese-Traditional, Japanese, Korean, French, German,
Italian, Spanish, Portuguese, Russian, Ukrainian, Arabic, Hebrew, Afrikaans,
Albanian, Amharic, Armenian, Azerbaijani, Basque, Belarusian, Bengali,
Bulgarian, Bosnian, Catalan, Cebuano, Croatian, Chichewa, Corsican, Czech,
Danish, Dutch, Esperanto, Estonian, Filipino, Finnish, Frisian, Greek, Galician,
Georgian, Gujarati, Haitian Creole, Hausa, Hindi, Hmong, Hungarian, Icelandic,
Igbo, Indonesian, Irish, Javanese, Kannada, Kazakh, Kurdish, Kyrgyz, Latin,
Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam,
Nepali, Norwegian, Polish, Romanian, Serbian, Slovenian, Swedish, Turkish, Welsh,
Zulu, Maori, Marathi, Mongolian, Pashto, Persian, Punjabi, Scots Gaelic, Sesotho,
Shona, Sindhi, Sinhala, Slovak, Somali, Sundanese, Swahili, Tajik,
Urdu, Uzbek, Vietnamese, Xhosa, Yiddish, Yoruba
Failed/Unknown:
Hawaiian, Khmer, Lao, Maltese, Myanmar, Tamil, Thai
http://php.net/manual/en/function.str-word-count.php#109733
https://www.key-shortcut.com/en/writing-systems/%E6%96%87%E5%AD%97-chinese-cjk/cjk-characters-1/
http://jrgraphix.net/r/Unicode/0D00-0D7F
The npm package words-count receives a total of 6,176 weekly downloads, which classifies its popularity as popular.
The version release cadence and project activity of words-count were found to be unhealthy because the last version was released a year ago. The project has 1 open source maintainer.