truncate-html
Truncate an HTML string (even one containing emoji characters) while keeping its tags intact. You can customize the ellipsis sign, ignore unwanted elements, and truncate HTML by words.
Notice: This is a Node.js module that depends on cheerio, so it can only run on Node.js. If you need a browser version, you may consider truncate or nodejs-html-truncate.
const truncate = require('truncate-html')
truncate('<p><img src="xxx.jpg">Hello from earth!</p>', 2, { byWords: true })
Installation
npm install truncate-html
or
yarn add truncate-html
Try it online
Click https://npm.runkit.com/truncate-html to try.
API
truncate(html, [length], [options])
truncate.setup(options)
Default options
{
byWords: false,
stripTags: false,
ellipsis: '...',
decodeEntities: false,
keepWhitespaces: false,
excludes: '',
reserveLastWord: false
}
You can change the default options with truncate.setup, e.g.
truncate.setup({ stripTags: true, length: 10 })
truncate('<p><img src="xxx.jpg">Hello from earth!</p>')
or use an existing cheerio instance
import * as cheerio from 'cheerio'
truncate.setup({ stripTags: true, length: 10 })
const $ = cheerio.load('<p><img src="xxx.jpg">Hello from earth!</p>', {
decodeEntities: true
}, false)
truncate($)
Notice
TypeScript support
This lib is written in TypeScript and ships with a type definition file. If you encounter typing errors, you may need to update your tsconfig.json by adding "esModuleInterop": true to the compilerOptions; see #19.
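A minimal tsconfig.json fragment with that flag set might look like the following. Only esModuleInterop comes from the note above; any other compiler options in your project are assumed to stay as they are.

```json
{
  "compilerOptions": {
    "esModuleInterop": true
  }
}
```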
About final string length
If the HTML string's content is shorter than options.length, no ellipsis is appended to the final HTML string. If it is longer, the length of the final content is options.length plus the length of options.ellipsis. If you set reserveLastWord to true or to a non-zero number, the final length will vary.
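The length rule can be illustrated on plain text. This is only a sketch: truncateText is a hypothetical helper, not part of truncate-html, it ignores tags entirely, and it assumes characters are counted by code point so that an emoji counts as one character.

```javascript
// Hypothetical helper for illustration only; truncate-html itself works on HTML.
function truncateText(text, length, ellipsis = '...') {
  // Array.from splits by code point, so an emoji counts as a single character.
  const chars = Array.from(text)
  if (chars.length <= length) {
    // Content fits within the limit: returned unchanged, no ellipsis appended.
    return text
  }
  // Content is too long: keep `length` characters, then append the ellipsis.
  return chars.slice(0, length).join('') + ellipsis
}

const short = truncateText('Hello', 10)
const long = truncateText('Hello from earth!', 10)
const emoji = truncateText('poo 💩💩💩💩💩', 6)

console.log(short) // Hello
console.log(long) // Hello from...
console.log(emoji) // poo 💩💩...
```

In the truncated cases the result holds length characters plus the ellipsis, which is the arithmetic described above.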
All HTML comments <!-- xxx --> will be removed.
About dealing with non-alphabetic languages
Non-alphabetic languages such as Chinese, Japanese, and Korean do not separate words with whitespace, so the byWords and reserveLastWord options only work well with alphabetic languages.
In addition, cheerio, the only dependency of this project, has an issue when dealing with non-alphabetic languages; see Known Issues for details.
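To see why word-based options struggle here, consider a whitespace-based word count. This is a sketch under the assumption that words are delimited by whitespace; countWords is a hypothetical helper and not the library's implementation.

```javascript
// Hypothetical helper: counts "words" by splitting on runs of whitespace.
function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length
}

const english = countWords('Hello from earth')
const chinese = countWords('来自地球的问候')

console.log(english) // 3
console.log(chinese) // 1, the whole sentence is treated as a single word
```

Because the Chinese sentence contains no whitespace, any limit expressed in words either keeps it whole or drops it entirely.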
Using existing cheerio instance
If you want to use an existing cheerio instance, the truncate option decodeEntities will not work; set it in your own cheerio instance instead:
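const cheerio = require('cheerio')
const truncate = require('truncate-html')
const html = '<p><img src="abc.png">This is a string</p> for test.'
// decodeEntities is set on the cheerio instance, not passed to truncate
const $ = cheerio.load(html, {
decodeEntities: true
}, false)
truncate($, 10)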
Examples
var truncate = require('truncate-html')
// truncate html to 10 characters
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10)
// truncate a string containing emoji
var string = '<p>poo 💩💩💩💩💩<p>'
truncate(string, 6)
// with options: remove all tags
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { stripTags: true })
// with options: truncate by words (works best with alphabetic languages)
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 3, { byWords: true })
// with options: keep whitespaces
var html = '<p> <img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { keepWhitespaces: true })
// pass length together with the other options
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
length: 10,
stripTags: true
})
// custom ellipsis sign
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
length: 10,
ellipsis: '~'
})
// exclude elements by selector
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
length: 10,
ellipsis: '~',
excludes: 'img'
})
// exclude more than one kind of element
var html =
'<p><img src="abc.png">This is a string</p><div class="something-unwanted"> unwanted string inserted ( ´•̥̥̥ω•̥̥̥` )</div> for test.'
truncate(html, {
length: 20,
stripTags: true,
ellipsis: '~',
excludes: ['img', '.something-unwanted']
})
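// handle encoded characters
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
length: 20,
decodeEntities: true
})
// the same input with decodeEntities set to false
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
length: 20,
decodeEntities: false
})
// decodeEntities: true with CJK characters (see Known Issues)
var html = '<p> test for &lt;p&gt; 中文 string</p>'
truncate(html, {
length: 20,
decodeEntities: true
})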
For more usages, check truncate.spec.ts.
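The decodeEntities examples above depend on whether an entity such as &lt; is counted in its encoded or decoded form. The sketch below only illustrates how the character count changes between the two forms; decodeBasicEntities is a hypothetical helper covering a handful of entities, not how truncate-html or cheerio decode text.

```javascript
// Hypothetical helper: decodes only a few named entities, for illustration.
function decodeBasicEntities(text) {
  const table = { '&lt;': '<', '&gt;': '>', '&amp;': '&', '&nbsp;': '\u00a0' }
  return text.replace(/&(?:lt|gt|amp|nbsp);/g, (entity) => table[entity])
}

const encoded = 'test for &lt;p&gt; encoded string'
const decoded = decodeBasicEntities(encoded)

console.log(encoded.length) // 33
console.log(decoded.length) // 27
console.log(decoded) // test for <p> encoded string
```

A 20-character limit therefore lands at a different point in the text depending on which form is being counted.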
Credits
Thanks to: