
words-count
Word count for multi-language paragraphs mixed with numbers and punctuation.
One rule applies to all.
npm i words-count
import wordsCount from 'words-count';
// const wordsCount = require('words-count').default;
console.log(wordsCount('Hello World'));
words_to_be_count = 'Hello “世界”';
words-count.js -> 3
words_to_be_count.length -> 10
words_to_be_count.split(' ').length -> 2
Countable.js -> 2
PHP str_word_count(words_to_be_count) -> 1
PHP mb_strlen(words_to_be_count) -> 10
Office Word -> 5
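The two plain-JavaScript rows in the comparison can be reproduced without installing anything. A minimal sketch; the variable names `sample`, `codeUnits`, and `spaceChunks` are ours and not part of the package:

```javascript
// The sample string from the comparison: "Hello", a space,
// and two CJK characters wrapped in curly quotes.
const sample = 'Hello “世界”';

// String.prototype.length counts UTF-16 code units, not words.
const codeUnits = sample.length;

// Splitting on a single space only sees whitespace-delimited chunks,
// so the two CJK characters and their quotes collapse into one chunk.
const spaceChunks = sample.split(' ').length;

console.log(codeUnits);   // 10
console.log(spaceChunks); // 2
```

Neither number is a word count in the sense used here, which is why a script-aware counter is needed for mixed-language text.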
Numbers count as 1 word
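To illustrate how a single rule can cover mixed scripts and numbers, here is a rough sketch. It is not the package's implementation: `sketchCount` and its regular expression are hypothetical, it only treats Han characters as single-character words, and it does not keep decimals such as 3.14 together as one number.

```javascript
// Hypothetical illustration only; the real package handles many more scripts.
// A token is either one Han character, or a run of letters/digits that are not Han.
const TOKEN = /\p{Script=Han}|(?:(?!\p{Script=Han})[\p{L}\p{N}])+/gu;

function sketchCount(text) {
  const tokens = String(text).match(TOKEN);
  return tokens ? tokens.length : 0;
}

console.log(sketchCount('Hello “世界”'));      // 3
console.log(sketchCount('I ate 42 apples')); // 4
```

In this sketch the run of digits `42` is matched as a single token, which mirrors the rule that a number counts as one word.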
const words = "Some words ...";
// Treat punctuation as a word breaker
const totalWithBreaker = wordsCount(words, {
  punctuationAsBreaker: true
});
// Treat additional characters as punctuation
const totalWithExtraPunctuation = wordsCount(words, {
  punctuation: ['-', 'a', 'b']
});
// Disable the default built-in punctuation list
const totalWithoutDefaults = wordsCount(words, {
  disableDefaultPunctuation: true
});
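The difference between stripping punctuation and treating it as a word breaker can be illustrated with a small stand-alone sketch. This reflects an assumption about what the option means, not the package's code: `sketchCountWithPunctuation` is a hypothetical helper that only splits on whitespace.

```javascript
// Hypothetical helper, not part of words-count.
// Removes each listed punctuation character, or replaces it with a space
// when asBreaker is true, then counts whitespace-separated chunks.
function sketchCountWithPunctuation(text, punctuation, asBreaker) {
  const replacement = asBreaker ? ' ' : '';
  let cleaned = String(text);
  for (const mark of punctuation) {
    cleaned = cleaned.split(mark).join(replacement);
  }
  return cleaned.split(/\s+/).filter(Boolean).length;
}

console.log(sketchCountWithPunctuation('well-known fact', ['-'], false)); // 2
console.log(sketchCountWithPunctuation('well-known fact', ['-'], true));  // 3
```

Under this assumption, a hyphenated term is one word when the hyphen is simply removed and two words when the hyphen acts as a breaker.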
const text = "Some words ...";
const { wordsSplit, wordsDetect } = require('words-count');
const splitWords = wordsSplit(text);
// The input is named `text` so it does not clash with the destructured `words`
const { words: detectedWords, count } = wordsDetect(text);
Test cases are based on best-effort assumptions.
Original Content:
Google's free service instantly translates words, phrases, and web pages between English and over 100 other languages.
Basic Test Content:
The original content, translated into each target language with Google Translate.
Test Case Coverage:
English, Chinese, Chinese-Traditional, Japanese, Korean, French, German,
Italian, Spanish, Portuguese, Russian, Ukrainian, Arabic, Hebrew, Afrikaans,
Albanian, Amharic, Armenian, Azerbaijani, Basque, Belarusian, Bengali,
Bulgarian, Bosnian, Catalan, Cebuano, Croatian, Chichewa, Corsican, Czech,
Danish, Dutch, Esperanto, Estonian, Filipino, Finnish, Frisian, Greek, Galician,
Georgian, Gujarati, Haitian Creole, Hausa, Hindi, Hmong, Hungarian, Icelandic,
Igbo, Indonesian, Irish, Javanese, Kannada, Kazakh, Kurdish, Kyrgyz, Latin,
Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam,
Nepali, Norwegian, Polish, Romanian, Serbian, Slovenian, Swedish, Turkish, Welsh,
Zulu, Maori, Marathi, Mongolian, Pashto, Persian, Punjabi, Scots Gaelic, Sesotho,
Shona, Sindhi, Sinhala, Slovak, Somali, Sundanese, Swahili, Tajik,
Urdu, Uzbek, Vietnamese, Xhosa, Yiddish, Yoruba
Failed/Unknown:
Hawaiian, Khmer, Lao, Maltese, Myanmar, Tamil, Thai
http://php.net/manual/en/function.str-word-count.php#109733
https://www.key-shortcut.com/en/writing-systems/%E6%96%87%E5%AD%97-chinese-cjk/cjk-characters-1/
http://jrgraphix.net/r/Unicode/0D00-0D7F