Brightml
Smart utility rendering markdown-ready HTML.
Install
$ npm install brightml
Use
Clean all HTML at once:
var brightml = require('brightml');
var HTMLString = '<table><tr><td>Title 1</td><td>Title 2</td></tr><tr><td>Data 1</td><td>Data 2</td></tr></table>';
var cleanHTML = brightml.clean(HTMLString);
Or call the module's functions individually, as required:
var brightml = require('brightml');
var HTMLString = '<table><tr><td>Title 1</td><td>Title 2</td></tr><tr><td>Data 1</td><td>Data 2</td></tr></table>';
brightml.parse(HTMLString);
brightml.formatTables();
var cleanHTML = brightml.render();
What it does
Calling brightml.clean(html) performs the following operations, in order.
brightml.parse(HTMLString)
Convert the HTML to a DOM using cheerio.
For cross-referenced links, retrieve the footnotes/endnotes and place them before the next <h1> tag, so that notes stay within their chapter section.
The footnotes are then formatted as follows:
<h1>Footnotes</h1>
<p>
See how to properly format a footnote<sup id="footnote-ref"><a href="#footnote">1</a></sup>.
</p>
<p>
<sup id="footnote">
Footnotes are in a paragraph and a sup tag. Link to go back to reference is at the end of the footnote.
<a href="#footnote-ref"></a>
</sup>
</p>
brightml.setAnchorsId()
Try to move the id attribute of each <a> tag to its direct parent, if possible.
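The idea can be sketched on a plain-object tree. This is only an illustration, not brightml's implementation: the node shape (`name`, `attribs`, `parent`), the `moveAnchorIdToParent` helper, and the rule that a parent which already has an id is left untouched are all assumptions.

```javascript
// Hypothetical sketch: nodes are plain objects, not real cheerio elements.
function moveAnchorIdToParent(anchor) {
  const parent = anchor.parent;
  // Assumption: the move is "possible" only when the anchor has an id
  // and its direct parent exists and has no id of its own.
  if (!parent || !anchor.attribs.id || parent.attribs.id) {
    return false;
  }
  parent.attribs.id = anchor.attribs.id;
  delete anchor.attribs.id;
  return true;
}

const paragraph = { name: 'p', attribs: {}, parent: null };
const anchor = { name: 'a', attribs: { id: 'intro' }, parent: paragraph };
moveAnchorIdToParent(anchor);
console.log(paragraph.attribs.id); // intro
```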
brightml.cleanElements()
- Remove empty tags.
- Remove forbidden HTML tags and place their HTML content in a <p> instead.
- Remove forbidden HTML attributes.
- Remove disallowed link schemes in HTML attributes.
This operation uses the rules.js file to determine which tags, attributes, and schemes are allowed.
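A minimal sketch of rule-based attribute filtering is shown here. The `rules` object, its key names, and both helper functions are hypothetical; the real rules.js may be organised quite differently.

```javascript
// Hypothetical rules object; the real rules.js may use another layout.
const rules = {
  allowedAttributes: ['id', 'href'],
  allowedSchemes: ['http', 'https', 'mailto']
};

// Returns true when a link has no scheme or uses an allowed one.
function isAllowedLink(href, schemes) {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(href.trim());
  if (!match) {
    return true; // no scheme: relative URL or in-page anchor
  }
  return schemes.includes(match[1].toLowerCase());
}

// Keeps only allowed attributes and drops hrefs with a disallowed scheme.
function filterAttributes(attribs, rules) {
  const kept = {};
  for (const [name, value] of Object.entries(attribs)) {
    if (!rules.allowedAttributes.includes(name)) continue;
    if (name === 'href' && !isAllowedLink(value, rules.allowedSchemes)) continue;
    kept[name] = value;
  }
  return kept;
}

const cleaned = filterAttributes(
  { href: 'javascript:alert(1)', style: 'color:red', id: 'x' },
  rules
);
console.log(cleaned); // { id: 'x' }
```

In-page anchors such as `#footnote` carry no scheme, so this sketch keeps them.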
brightml.normalizeTitlesId()
Set an id attribute on each heading (<h>) tag. The id is based on the heading's text content.
Every reference to the previous id is updated accordingly. For example:
<h1 id="some-id">A great title</h1>
<a href="#some-id">Back to a great title</a>
will become:
<h1 id="a_great_title">A great title</h1>
<a href="#a_great_title">Back to a great title</a>
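The transformation above can be approximated with two small string helpers. Both `titleToId` and `updateReferences` are hypothetical names, and the exact slug rules (lowercasing, replacing runs of non-alphanumeric ASCII characters with an underscore) are an assumption inferred from the single example; brightml may treat accents or punctuation differently.

```javascript
// Hypothetical helper: derives an id from a heading's text.
// Assumption: lowercase, runs of non-alphanumeric characters become "_".
function titleToId(title) {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
}

// Rewrites in-page links that still point at the old id.
function updateReferences(html, oldId, newId) {
  return html.split('href="#' + oldId + '"').join('href="#' + newId + '"');
}

const newId = titleToId('A great title');
console.log(newId); // a_great_title

const link = updateReferences('<a href="#some-id">Back</a>', 'some-id', newId);
console.log(link); // <a href="#a_great_title">Back</a>
```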
brightml.removeNestedTables()
Replace nested <table> tags with a warning message followed by their content in a simple <td> tag.
brightml.formatTables()
Ensure every <table> element has the same structure.
Schema used:
<caption></caption>
<table>
<thead>
<tr>
<th>Title 1</th>
<th>Title 2</th>
</tr>
</thead>
<tbody>
<tr>
<td>Row 1 - Data 1</td>
<td>Row 1 - Data 2</td>
</tr>
<tr>
<td>Row 2 - Data 1</td>
<td>Row 2 - Data 2</td>
</tr>
</tbody>
</table>
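To make the target structure concrete, the sketch below builds a table in this shape from rows of cell text. It is not how brightml works internally: `buildTable` is a hypothetical helper, promoting the first row to the header is an assumption, and the caption and HTML escaping are left out for brevity.

```javascript
// Hypothetical sketch: rows is an array of arrays of plain cell text.
// Assumption: the first row is promoted to the header row.
function buildTable(rows) {
  const [header, ...body] = rows;
  const headerCells = header.map((text) => '<th>' + text + '</th>').join('');
  const bodyRows = body
    .map((row) => '<tr>' + row.map((text) => '<td>' + text + '</td>').join('') + '</tr>')
    .join('');
  return '<table><thead><tr>' + headerCells + '</tr></thead>' +
    '<tbody>' + bodyRows + '</tbody></table>';
}

const tableHtml = buildTable([
  ['Title 1', 'Title 2'],
  ['Data 1', 'Data 2']
]);
console.log(tableHtml);
```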
brightml.cleanTableCells()
Ensure that <th> and <td> tags don't contain a <p> tag, to prevent line breaks inside cells.
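One way to picture this step is a helper that unwraps paragraphs in a cell's inner HTML. The `unwrapParagraphs` name and the choice to join consecutive paragraphs with a single space are assumptions; a regex is used here only to keep the sketch dependency-free, whereas a DOM-based approach would be more robust.

```javascript
// Hypothetical sketch working on a single cell's inner HTML.
// Assumption: consecutive paragraphs are joined with one space.
function unwrapParagraphs(cellHtml) {
  return cellHtml
    // Turn a paragraph boundary into a space.
    .replace(/<\/p>\s*<p(?:\s[^>]*)?>/gi, ' ')
    // Drop any remaining opening or closing <p> tags (but not <pre>).
    .replace(/<\/?p(?:\s[^>]*)?>/gi, '')
    .trim();
}

const cell = unwrapParagraphs('<p>Data 1</p><p>continued</p>');
console.log(cell); // Data 1 continued
```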
brightml.render()
Returns the current state of the HTMLString passed to brightml.parse(HTMLString).