@ridi/epub-parser
Common EPUB2 data parser for Ridibooks services
npm install @ridi/epub-parser
Basic:
import { EpubParser } from '@ridi/epub-parser';
// or const { EpubParser } = require('@ridi/epub-parser');
const parser = new EpubParser('./foo/bar.epub'); // or an unzipped directory: './unzippedPath'
parser.parse(/* { parseOptions } */).then((book) => {
  parser.readItems(book.spines/*, { readOptions } */).then((results) => {
    // ...
  });
  // ...
});
with AesCryptor:
import { CryptoProvider, AesCryptor } from '@ridi/epub-parser';
// or const { CryptoProvider, AesCryptor } = require('@ridi/epub-parser');
const { Purpose } = CryptoProvider;
const { Mode, Padding } = AesCryptor;
class ContentCryptoProvider extends CryptoProvider {
constructor(key) {
super();
this.cryptor = new AesCryptor(Mode.ECB, { key });
}
getCryptor(filePath, purpose) {
return this.cryptor;
}
// If use as follows:
// const provider = new ContentCryptoProvider(...);
// const parser = new EpubParser('encrypted.epub', provider);
// const book = await parser.parse({ unzipPath: ... });
// const firstSpine = await parser.readItem(book.spines[0]);
//
// It will be called as follows:
// 1. run(data, 'encrypted.epub', Purpose.READ_IN_DIR)
// 2. run(data, 'META-INF/container.xml', Purpose.READ_IN_ZIP)
// 3. run(data, 'OEBPS/content.opf', Purpose.READ_IN_ZIP)
// ...
// 4. run(data, 'mimetype', Purpose.WRITE)
// ...
// 5. run(data, 'OEBPS/Text/Section0001.xhtml', Purpose.READ_IN_DIR)
//
run(data, filePath, purpose) {
const cryptor = this.getCryptor(filePath, purpose);
const padding = Padding.AUTO;
if (purpose === Purpose.READ_IN_DIR) {
return cryptor.decrypt(data, { padding });
} else if (purpose === Purpose.WRITE) {
return cryptor.encrypt(data, { padding });
}
return data;
}
}
const cryptoProvider = new ContentCryptoProvider(key);
const parser = new EpubParser('./encrypted.epub', cryptoProvider); // or an unzipped directory: './unzippedPath'
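The purpose-based dispatch in run can be tried without the library. The following is a minimal sketch that swaps in Node's built-in crypto module for AesCryptor; the Purpose string values, the standalone run signature, and the 16-byte key are assumptions made for illustration, not the library's actual API.

```javascript
const crypto = require('crypto');

// Hypothetical stand-ins for the library's Purpose values.
const Purpose = {
  READ_IN_ZIP: 'read_in_zip',
  READ_IN_DIR: 'read_in_dir',
  WRITE: 'write',
};

// Mirrors the run method above: decrypt when reading from the unzipped
// directory, encrypt when writing, and pass data through otherwise.
// AES-128-ECB takes no IV, so null is passed; PKCS#7 padding is the default.
function run(data, key, purpose) {
  if (purpose === Purpose.READ_IN_DIR) {
    const decipher = crypto.createDecipheriv('aes-128-ecb', key, null);
    return Buffer.concat([decipher.update(data), decipher.final()]);
  }
  if (purpose === Purpose.WRITE) {
    const cipher = crypto.createCipheriv('aes-128-ecb', key, null);
    return Buffer.concat([cipher.update(data), cipher.final()]);
  }
  return data;
}
```

Data written with Purpose.WRITE and read back with Purpose.READ_IN_DIR round-trips to the original bytes, while Purpose.READ_IN_ZIP leaves the data untouched.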
Log level setting:
import { LogLevel, ... } from '@ridi/epub-parser';
const parser = new EpubParser(/* path */, /* cryptoProvider */, /* logLevel */)
// or const parser = new EpubParser(/* path */, /* logLevel */)
parser.logger.logLevel = LogLevel.VERBOSE; // SILENT, ERROR, WARN(default), INFO, DEBUG, VERBOSE
parse(parseOptions)
Returns Promise<EpubBook>, or throws an exception.
parseOptions: ?object
readItem(item, readOptions)
Returns a Promise that resolves to a string or a Buffer, or throws an exception:
SpineItem, CssItem, InlineCssItem, NcxItem, SvgItem: string
Other items: Buffer
item: Item (see: Item Types)
readOptions: ?object
readItems(items, readOptions)
Returns a Promise that resolves to a string[] or a Buffer[], or throws an exception:
SpineItem, CssItem, InlineCssItem, NcxItem, SvgItem: string[]
Other items: Buffer[]
items: Item[] (see: Item Types)
readOptions: ?object
Returns Promise<boolean>. It resolves to true if the unzip is successful or the archive has already been unzipped; otherwise it throws an exception.
string
boolean
Reports the parser's progress through the onProgress callback.
const { Action } = EpubParser; // PARSE, READ_ITEMS
parser.onProgress = (step, totalStep, action) => {
console.log(`[${action}] ${step} / ${totalStep}`);
}
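A handler with the same (step, totalStep, action) signature can also report a percentage. This is a small sketch; createProgressHandler is a hypothetical helper name, not part of the library.

```javascript
// Hypothetical helper: builds a handler with the (step, totalStep, action)
// signature used by onProgress above and sends each line to `write`.
function createProgressHandler(write) {
  return (step, totalStep, action) => {
    // Guard against division by zero when no steps are reported.
    const percent = totalStep > 0 ? Math.round((step / totalStep) * 100) : 0;
    write(`[${action}] ${step} / ${totalStep} (${percent}%)`);
  };
}

// Usage (assuming a parser instance as in the examples above):
// parser.onProgress = createProgressHandler(console.log);
```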
Type | Value |
---|---|
UNDEFINED | undefined |
UNKNOWN | unknown |
ADAPTER | adp |
ANNOTATOR | ann |
ARRANGER | arr |
ARTIST | art |
ASSOCIATEDNAME | asn |
AUTHOR | aut |
AUTHOR_IN_QUOTATIONS_OR_TEXT_EXTRACTS | aqt |
AUTHOR_OF_AFTER_WORD_OR_COLOPHON_OR_ETC | aft |
AUTHOR_OF_INTRODUCTIONOR_ETC | aui |
BIBLIOGRAPHIC_ANTECEDENT | ant |
BOOK_PRODUCER | bkp |
COLLABORATOR | clb |
COMMENTATOR | cmm |
DESIGNER | dsr |
EDITOR | edt |
ILLUSTRATOR | ill |
LYRICIST | lyr |
METADATA_CONTACT | mdc |
MUSICIAN | mus |
NARRATOR | nrt |
OTHER | oth |
PHOTOGRAPHER | pht |
PRINTER | prt |
REDACTOR | red |
REVIEWER | rev |
SPONSOR | spn |
THESIS_ADVISOR | ths |
TRANSCRIBER | trc |
TRANSLATOR | trl |
Type | Value |
---|---|
UNDEFINED | undefined |
UNKNOWN | unknown |
CREATION | creation |
MODIFICATION | modification |
PUBLICATION | publication |
Type | Value |
---|---|
UNDEFINED | undefined |
UNKNOWN | unknown |
DOI | doi |
ISBN | isbn |
ISBN13 | isbn13 |
ISBN10 | isbn10 |
ISSN | issn |
UUID | uuid |
URI | uri |
Type | Value |
---|---|
UNDEFINED | undefined |
UNKNOWN | unknown |
COVER | cover |
TITLE_PAGE | title-page |
TOC | toc |
INDEX | index |
GLOSSARY | glossary |
ACKNOWLEDGEMENTS | acknowledgements |
BIBLIOGRAPHY | bibliography |
COLOPHON | colophon |
COPYRIGHT_PAGE | copyright-page |
DEDICATION | dedication |
EPIGRAPH | epigraph |
FOREWORD | foreword |
LOI | loi |
LOT | lot |
NOTES | notes |
PREFACE | preface |
TEXT | text |
Type | Value |
---|---|
UNDEFINED | undefined |
UNKNOWN | unknown |
NOT_EXISTS | not_exists |
NOT_SPINE | not_spine |
NOT_NCX | not_ncx |
NOT_SUPPORT_TYPE | not_support_type |
boolean
If true, validate the package against the IDPF specifications listed below.
Used only if the input is an EPUB file.
The mimetype file must be the first file in the archive.
The mimetype file should not be compressed.
The mimetype file should contain only the string application/epub+zip.
Default: false
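The three mimetype rules can be checked by reading the first local file header of the ZIP archive. The sketch below illustrates the checks and is not the library's implementation; checkMimetypeEntry is a hypothetical helper, and it assumes the header carries the real sizes (archives written in streaming mode with a data descriptor store 0 there and are not handled).

```javascript
// Sketch: inspect the first local file header of a ZIP archive held in a Buffer.
// Returns a list of rule violations; an empty list means all checks passed.
function checkMimetypeEntry(zip) {
  const LOCAL_FILE_HEADER = 0x04034b50; // "PK\x03\x04" read as little-endian
  if (zip.length < 30 || zip.readUInt32LE(0) !== LOCAL_FILE_HEADER) {
    return ['not a ZIP archive'];
  }
  const method = zip.readUInt16LE(8);       // 0 means stored (no compression)
  const size = zip.readUInt32LE(18);        // compressed size
  const nameLength = zip.readUInt16LE(26);
  const extraLength = zip.readUInt16LE(28);
  const name = zip.toString('latin1', 30, 30 + nameLength);

  if (name !== 'mimetype') {
    return ['mimetype must be the first file in the archive'];
  }
  const errors = [];
  if (method !== 0) {
    errors.push('mimetype must not be compressed');
  } else {
    // File data follows the fixed header, the file name, and the extra field.
    const start = 30 + nameLength + extraLength;
    const content = zip.toString('latin1', start, start + size);
    if (content !== 'application/epub+zip') {
      errors.push('mimetype must contain only application/epub+zip');
    }
  }
  return errors;
}
```

Calling checkMimetypeEntry(fs.readFileSync('./foo/bar.epub')) would return an empty array for a conforming archive.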
boolean
If false, stop parsing when the NCX file does not exist.
Default: true
unzipPath: ?string
If specified, unzip to that path.
Used only if the input is an EPUB file.
Default: undefined
boolean
If true, overwrite the contents of unzipPath when unzipping.
Used only if unzipPath is specified.
Default: true
parseStyle: boolean
If true, the styles used by each spine are described, and one namespace is assigned per CSS file or inline style.
Otherwise, CssItem.namespace and SpineItem.styles are undefined.
In any list (EpubBook.styles, EpubBook.items, SpineItem.styles, ...), InlineCssItem is always positioned after CssItem.
Default: true
string
Prepend the given string to each namespace for identification.
Available only if parseStyle is true.
Default: 'ridi_style'
?string
If specified, add the given inline styles to all spines.
Available only if parseStyle is true.
Default: undefined
boolean
If true, ignore any exceptions that occur within the parser.
Default: false
basePath: ?string
If specified, change the base path of the paths used by spines and CSS.
HTML: SpineItem
...
<!-- Before -->
<div>
<img src="../Images/cover.jpg">
</div>
<!-- After -->
<div>
<img src="{basePath}/OEBPS/Images/cover.jpg">
</div>
...
CSS: CssItem, InlineCssItem
/* Before */
@font-face {
font-family: NotoSansRegular;
src: url("../Fonts/NotoSans-Regular.ttf");
}
/* After */
@font-face {
font-family: NotoSansRegular;
src: url("{basePath}/OEBPS/Fonts/NotoSans-Regular.ttf");
}
Default: undefined
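The Before/After pairs above are consistent with resolving each relative URL against the directory of the item that contains it and then prefixing the base path. A minimal sketch of that resolution follows; rebasePath and its parameters are hypothetical names, and the library's actual logic may differ.

```javascript
const path = require('path');

// Sketch: resolve a relative URL against the directory of the item that
// references it, then prefix the result with the base path.
// POSIX joins are used because paths inside an EPUB always use "/".
function rebasePath(basePath, itemPath, relativeUrl) {
  const itemDir = path.posix.dirname(itemPath);
  return path.posix.join(basePath, itemDir, relativeUrl);
}
```

For example, rebasePath('/books/1', 'OEBPS/Text/Section0001.xhtml', '../Images/cover.jpg') yields '/books/1/OEBPS/Images/cover.jpg'.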
extractBody: boolean|function
If true, extract only the body. Otherwise, return the full string. If a function is specified instead of true, use it to transform the body.
false:
'<!doctype><html>\n<head>\n</head>\n<body style="background-color: #000000;">\n <p>Extract style</p>\n <img src=\"../Images/api-map.jpg\"/>\n</body>\n</html>'
true:
'<body style="background-color: #000000;">\n <p>Extract style</p>\n <img src=\"../Images/api-map.jpg\"/>\n</body>'
function:
readOptions.extractBody = (innerHTML, attrs) => {
  // Serialize the <body> attributes so they can be carried over to <article>.
  const string = attrs.map((attr) => `${attr.key}="${attr.value}"`).join(' ');
  return `<article ${string}>${innerHTML}</article>`;
};
'<article style="background-color: #000000;">\n <p>Extract style</p>\n <img src=\"../Images/api-map.jpg\"/>\n</article>'
Default: false
boolean
If true, replace the file path of each anchor in a spine with the index of the target spine.
...
<spine toc="ncx">
<itemref idref="Section0001.xhtml"/> <!-- index: 0 -->
<itemref idref="Section0002.xhtml"/> <!-- index: 1 -->
<itemref idref="Section0003.xhtml"/> <!-- index: 2 -->
...
</spine>
...
<!-- Before -->
<a href="./Text/Section0002.xhtml#title">Chapter 2</a>
<!-- After -->
<a href="1#title">Chapter 2</a>
Default: false
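The replacement shown above can be sketched as a lookup of the anchor's target in the spine order. serializeAnchor, currentDir, and spinePaths are hypothetical names, and the sketch assumes the anchor is resolved relative to a directory such as OEBPS; the library's own resolution rules may differ.

```javascript
const path = require('path');

// Sketch: replace the file part of an anchor href with the index of the
// matching spine, keeping any fragment. Hrefs that do not point at a
// spine (external links, same-file fragments) are returned unchanged.
function serializeAnchor(href, currentDir, spinePaths) {
  const [file, fragment] = href.split('#');
  const target = path.posix.join(currentDir, file);
  const index = spinePaths.indexOf(target);
  if (index < 0) {
    return href;
  }
  return fragment === undefined ? String(index) : `${index}#${fragment}`;
}
```

With the spine order from the example, serializeAnchor('./Text/Section0002.xhtml#title', 'OEBPS', spinePaths) returns '1#title' when spinePaths lists 'OEBPS/Text/Section0002.xhtml' at index 1.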
boolean
If true, ignore all scripts within the HTML.
Default: false
string[]
Remove the specified at-rules.
Default: []
string[]
Remove selectors that point to the specified tags.
Default: []
string[]
Remove selectors that point to the specified ids.
Default: []
string[]
Remove selectors that point to the specified classes.
Default: []