@ridi/epub-parser

Common EPUB2 data parser for Ridibooks services

  • 0.3.0-alpha.5

Features

  • EPUB2 parsing
  • EPUB3 parsing
  • Optional package validation
  • Optional unzipping of the EPUB file while parsing
  • Read files
    • Extract inner HTML of body in Spine with option
    • Change base path of Spine, CSS and Inline style with option
    • Customize CSS, Inline Style with options
  • Encryption and decryption when parsing, reading, or unzipping
  • More spec
    • encryption.xml
    • manifest.xml
    • metadata.xml
    • rights.xml
    • signatures.xml
  • Debug mode
  • Environment
    • Node
    • CLI
    • Browser
  • Online demo

Install

npm install @ridi/epub-parser

Usage

Basic:

import { EpubParser } from '@ridi/epub-parser';
// or const { EpubParser } = require('@ridi/epub-parser');

// Pass either an EPUB file path or the path of an unzipped EPUB directory.
const parser = new EpubParser('./foo/bar.epub'); // or './unzippedPath'
parser.parse(/* { parseOptions } */).then((book) => {
  parser.readItems(book.spines/*, { readOptions } */).then((results) => {
    // ...
  });
  // ...
});

With Cryptor:

import { CryptoProvider, Cryptor, EpubParser } from '@ridi/epub-parser';
// or const { CryptoProvider, Cryptor, EpubParser } = require('@ridi/epub-parser');

const { Purpose } = CryptoProvider;
const { Modes, Padding } = Cryptor;

class ContentCryptoProvider extends CryptoProvider {
  constructor(key) {
    super();
    this.cryptor = new Cryptor(Modes.ECB, { key });
  }

  getCryptor(filePath, purpose) {
    return this.cryptor;
  }

  // If used as follows:
  // const provider = new ContentCryptoProvider(...);
  // const parser = new EpubParser('encrypted.epub', provider);
  // const book = await parser.parse({ unzipPath: ... });
  // const firstSpine = await parser.readItem(book.spines[0]);
  //
  // run() will be called as follows:
  // 1. run(data, 'encrypted.epub', Purpose.READ_IN_DIR)
  // 2. run(data, 'META-INF/container.xml', Purpose.READ_IN_ZIP)
  // 3. run(data, 'OEBPS/content.opf', Purpose.READ_IN_ZIP)
  // ...
  // 4. run(data, 'mimetype', Purpose.WRITE)
  // ...
  // 5. run(data, 'OEBPS/Text/Section0001.xhtml', Purpose.READ_IN_DIR)
  //
  run(data, filePath, purpose) {
    const cryptor = this.getCryptor(filePath, purpose);
    const padding = Padding.AUTO;
    if (purpose === Purpose.READ_IN_DIR) {
      return cryptor.decrypt(data, padding);
    } else if (purpose === Purpose.WRITE) {
      return cryptor.encrypt(data, padding);
    }
    return data;
  }
}

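// `key` is the encryption key supplied by the caller.
const cryptoProvider = new ContentCryptoProvider(key);
// Pass either an encrypted EPUB file path or the path of an unzipped EPUB directory.
const parser = new EpubParser('./encrypted.epub', cryptoProvider); // or './unzippedPath'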

API

parse(parseOptions)

Returns a Promise<Book> that resolves with:

  • Book: An instance with metadata, spine list, table of contents, etc.

Or throws an exception.

parseOptions: ?object (see: Parse Options)

readItem(item, readOptions)

Returns a Promise that resolves with a string or Buffer, or throws an exception.

item: Item (see: Item Types)
readOptions: ?object (see: Read Options)

readItems(items, readOptions)

Returns a Promise that resolves with a string[] or Buffer[], or throws an exception.

items: Item[] (see: Item Types)
readOptions: ?object (see: Read Options)

Model

Book

Author

  • name: ?string
  • fileAs: ?string
  • role: string (Default: Author.Roles.UNDEFINED)

DateTime

  • value: ?string
  • event: string (Default: DateTime.Events.UNDEFINED)

Identifier

  • value: ?string
  • scheme: string (Default: Identifier.Schemes.UNDEFINED)

Meta

  • name: ?string
  • content: ?string

Guide

  • title: ?string
  • type: string (Default: Guide.Types.UNDEFINED)
  • href: ?string
  • item: ?Item

Item Types

Item
  • id: ?string
  • href: ?string
  • mediaType: ?string
  • size: ?number
  • isFileExists: boolean (size !== undefined)

SpineItem (extends Item)
  • index: number (Default: -1)
  • isLinear: boolean (Default: true)
  • styles: ?CssItem[]

NcxItem (extends Item)

CssItem (extends Item)
  • namespace: string

InlineCssItem (extends CssItem)
  • style: string (Default: '')

ImageItem (extends Item)
  • isCover: boolean (Default: false)

SvgItem (extends ImageItem)

FontItem (extends Item)

DeadItem (extends Item)
  • reason: string (Default: DeadItem.Reason.UNDEFINED)

NavPoint

  • id: ?string
  • label: ?string
  • src: ?string
  • anchor: ?string
  • depth: number (Default: 0)
  • children: NavPoint[]
  • spine: ?SpineItem

Version

  • major: number
  • minor: number
  • patch: number
  • isValid: boolean (Only 2.x.x is valid because the current epub-parser supports only EPUB2.)
  • toString(): string
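
The isValid rule above can be illustrated with a small stand-in class. This is a sketch only: SketchVersion, its string-taking constructor, and the zero default for missing parts are assumptions made for illustration, not the library's own Version implementation.

```javascript
// Hypothetical stand-in for the Version model; not the library's own class.
class SketchVersion {
  constructor(text) {
    // Missing or non-numeric parts fall back to 0 (an assumption of this sketch).
    const [major = 0, minor = 0, patch = 0] = String(text)
      .split('.')
      .map((part) => parseInt(part, 10) || 0);
    this.major = major;
    this.minor = minor;
    this.patch = patch;
  }

  // Only 2.x.x is treated as valid, mirroring the EPUB2-only rule above.
  get isValid() {
    return this.major === 2;
  }

  toString() {
    return `${this.major}.${this.minor}.${this.patch}`;
  }
}

const v = new SketchVersion('2.0');
console.log(v.toString(), v.isValid);
```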

Parse Options


validatePackage: boolean

If true, validate the package against the IDPF specifications listed below.

Used only if the input is an EPUB file.

  • The ZIP header should not be corrupted.
  • The mimetype file must be the first file in the archive.
  • The mimetype file should not be compressed.
  • The mimetype file should contain only the string application/epub+zip.
  • The mimetype file should not use the extra field feature of the ZIP format.

Default: false
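
The checks listed above can be sketched against the first local file header of a ZIP archive. This is an illustration, not the library's validation code: checkMimetypeEntry and makeEntry are hypothetical helpers, and the sketch assumes the sizes are stored in the local header (no data descriptor).

```javascript
// Hypothetical helpers; not part of @ridi/epub-parser.
const EPUB_MIMETYPE = 'application/epub+zip';
const LOCAL_HEADER_SIGNATURE = 0x04034b50; // bytes "PK\x03\x04"

function checkMimetypeEntry(zip) {
  // The archive must start with a well-formed 30-byte local file header.
  if (zip.length < 30 || zip.readUInt32LE(0) !== LOCAL_HEADER_SIGNATURE) {
    return ['ZIP header is corrupted'];
  }
  const problems = [];
  const method = zip.readUInt16LE(8);       // 0 means stored (uncompressed)
  const size = zip.readUInt32LE(18);        // compressed size
  const nameLength = zip.readUInt16LE(26);
  const extraLength = zip.readUInt16LE(28);
  const name = zip.toString('latin1', 30, 30 + nameLength);
  if (name !== 'mimetype') problems.push('mimetype is not the first file');
  if (method !== 0) problems.push('mimetype is compressed');
  if (extraLength !== 0) problems.push('mimetype uses the extra field');
  const start = 30 + nameLength + extraLength;
  const content = zip.toString('latin1', start, start + size);
  if (content !== EPUB_MIMETYPE) problems.push('mimetype has unexpected content');
  return problems;
}

// Builds a minimal first entry by hand so the checks can be exercised.
function makeEntry(name, content, method = 0) {
  const nameBuf = Buffer.from(name, 'latin1');
  const dataBuf = Buffer.from(content, 'latin1');
  const header = Buffer.alloc(30);
  header.writeUInt32LE(LOCAL_HEADER_SIGNATURE, 0);
  header.writeUInt16LE(method, 8);
  header.writeUInt32LE(dataBuf.length, 18);
  header.writeUInt32LE(dataBuf.length, 22);
  header.writeUInt16LE(nameBuf.length, 26);
  return Buffer.concat([header, nameBuf, dataBuf]);
}

console.log(checkMimetypeEntry(makeEntry('mimetype', EPUB_MIMETYPE)));
```

An empty array means the first entry passed every check in this sketch.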


allowNcxFileMissing: boolean

If false, stop parsing when the NCX file does not exist.

Default: true


unzipPath: ?string

If specified, unzip to that path.

Used only if the input is an EPUB file.

Default: undefined


overwrite: boolean

If true, overwrite unzipPath when unzipping.

Used only if unzipPath is specified.

Default: true


ignoreLinear: boolean

If true, ignore the index difference caused by the isLinear property of SpineItem.

// e.g. Left: ignoreLinear is false. Right: ignoreLinear is true.
[{ index: 0, isLinear: true, ... },       [{ index: 0, isLinear: true, ... },
 { index: 1, isLinear: true, ... },        { index: 1, isLinear: true, ... },
 { index: -2, isLinear: false, ... },      { index: 2, isLinear: false, ... },
 { index: 3, isLinear: true, ... }]        { index: 3, isLinear: true, ... }]

Default: false
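
The example above can be reproduced with a short sketch. It assumes that -2 is a fixed marker for non-linear items, which the example suggests but the text does not state; IGNORED_INDEX and assignSpineIndexes are hypothetical names, not the library's API.

```javascript
// Hypothetical marker inferred from the example above.
const IGNORED_INDEX = -2;

// Assigns each spine entry its position, unless it is non-linear
// and ignoreLinear is false, in which case it gets the marker.
function assignSpineIndexes(itemrefs, ignoreLinear) {
  return itemrefs.map((ref, position) => ({
    ...ref,
    index: ref.isLinear || ignoreLinear ? position : IGNORED_INDEX,
  }));
}

const itemrefs = [
  { isLinear: true },
  { isLinear: true },
  { isLinear: false },
  { isLinear: true },
];
console.log(assignSpineIndexes(itemrefs, false).map((item) => item.index));
console.log(assignSpineIndexes(itemrefs, true).map((item) => item.index));
```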


parseStyle: boolean

If true, the styles used by each spine are described, and one namespace is assigned per CSS file or inline style.

Otherwise, CssItem.namespace and SpineItem.styles are undefined.

In any list (Book.styles, Book.items, SpineItem.styles, ...), InlineCssItem is always positioned after CssItem.

Default: true


styleNamespacePrefix: string

Prepend the given string to each namespace for identification.

Used only if parseStyle is true.

Default: 'ridi_style'
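
For reference, the documented parse option defaults can be collected in one object. The values come from the option descriptions above; resolveParseOptions is a hypothetical helper showing one way caller values could override them, not how the library merges options internally.

```javascript
// Defaults as listed in the Parse Options section.
const defaultParseOptions = {
  validatePackage: false,
  allowNcxFileMissing: true,
  unzipPath: undefined,
  overwrite: true,
  ignoreLinear: false,
  parseStyle: true,
  styleNamespacePrefix: 'ridi_style',
};

// Hypothetical helper: caller-supplied values override the defaults.
function resolveParseOptions(options = {}) {
  return { ...defaultParseOptions, ...options };
}

console.log(resolveParseOptions({ unzipPath: './unzipped', ignoreLinear: true }));
```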


Read Options


basePath: ?string

If specified, change the base path of the paths used by spines and CSS.

HTML: SpineItem

...
  <!-- Before -->
  <div>
    <img src="../Images/cover.jpg">
  </div>
  <!-- After -->
  <div>
    <img src="{basePath}/OEBPS/Images/cover.jpg">
  </div>
...

CSS: CssItem, InlineCssItem

/* Before */
@font-face {
  font-family: NotoSansRegular;
  src: url("../Fonts/NotoSans-Regular.ttf");
}
/* After */
@font-face {
  font-family: NotoSansRegular;
  src: url("{basePath}/OEBPS/Fonts/NotoSans-Regular.ttf");
}

Default: undefined
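
The before/after pairs above amount to resolving a relative URL against the directory of the item that contains it, then prefixing basePath. A minimal sketch, assuming item hrefs are relative to the archive root and basePath is a file-system-style path; rebase is a hypothetical helper, not the library's API.

```javascript
const path = require('path');

// Hypothetical helper; not part of @ridi/epub-parser.
// itemHref is assumed to be relative to the archive root,
// e.g. 'OEBPS/Text/Section0001.xhtml'.
function rebase(basePath, itemHref, relativeUrl) {
  const itemDir = path.posix.dirname(itemHref);
  // join() also normalizes the '..' segments.
  return path.posix.join(basePath, itemDir, relativeUrl);
}

console.log(rebase('/books/1', 'OEBPS/Text/Section0001.xhtml', '../Images/cover.jpg'));
console.log(rebase('/books/1', 'OEBPS/Styles/style.css', '../Fonts/NotoSans-Regular.ttf'));
```

Because path.posix.join collapses repeated slashes, this sketch is not suitable for URL-style base paths such as 'https://...'.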


extractBody: boolean|function

If true, extract the body. Otherwise, return the full string. If a function is specified instead of true, use that function to transform the body.

false:

'<!doctype><html>\n<head>\n</head>\n<body style="background-color: #000000;">\n  <p>Extract style</p>\n  <img src=\"../Images/api-map.jpg\"/>\n</body>\n</html>'

true:

'<body style="background-color: #000000;">\n  <p>Extract style</p>\n  <img src=\"../Images/api-map.jpg\"/>\n</body>'

function:

readOptions.extractBody = (innerHTML, attrs) => {
  // Serialize the body attributes as key="value" pairs separated by one space.
  const string = attrs.map((attr) => {
    return `${attr.key}="${attr.value}"`;
  }).join(' ');
  return `<article ${string}>${innerHTML}</article>`;
};
// Result:
'<article style="background-color: #000000;">\n  <p>Extract style</p>\n  <img src=\"../Images/api-map.jpg\"/>\n</article>'

Default: false
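
The three modes of extractBody can be sketched with a regex-based helper. This is illustrative only and not the library's implementation: applyExtractBody is a hypothetical name, and the regular expressions handle only simple, double-quoted attributes.

```javascript
// Hypothetical helper; not part of @ridi/epub-parser.
function applyExtractBody(html, extractBody) {
  // false (or unset): return the full string unchanged.
  if (!extractBody) return html;
  const match = /<body([^>]*)>([\s\S]*)<\/body>/i.exec(html);
  if (!match) return html;
  const [, rawAttrs, innerHTML] = match;
  // true: return only the body element.
  if (typeof extractBody !== 'function') {
    return `<body${rawAttrs}>${innerHTML}</body>`;
  }
  // function: pass the inner HTML and the attributes as { key, value } pairs.
  const attrs = [...rawAttrs.matchAll(/([\w-]+)="([^"]*)"/g)]
    .map(([, key, value]) => ({ key, value }));
  return extractBody(innerHTML, attrs);
}

const html = '<html><head></head><body class="ch"><p>Hi</p></body></html>';
console.log(applyExtractBody(html, true));
console.log(applyExtractBody(html, (inner, attrs) => `<article data-attrs="${attrs.length}">${inner}</article>`));
```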


serializedAnchor: boolean

If true, replace the file path of each anchor in a spine with the index of the spine it points to.

...
<spine toc="ncx">
  <itemref idref="Section0001.xhtml"/> <!-- index: 0 -->
  <itemref idref="Section0002.xhtml"/> <!-- index: 1 -->
  <itemref idref="Section0003.xhtml"/> <!-- index: 2 -->
  ...
</spine>
...
<!-- Before -->
<a href="./Text/Section0002.xhtml#title">Chapter 2</a>
<!-- After -->
<a href="1#title">Chapter 2</a>

Default: false
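
The rewrite above can be sketched as a lookup of the anchor's target file in the spine order. This is an assumption-laden illustration: serializeAnchor is a hypothetical helper, and it matches by file name only, which may differ from how the library resolves paths.

```javascript
const path = require('path');

// Hypothetical helper; not part of @ridi/epub-parser.
function serializeAnchor(href, spineHrefs) {
  const [file, fragment] = href.split('#');
  const target = path.posix.basename(file);
  const index = spineHrefs.findIndex(
    (spineHref) => path.posix.basename(spineHref) === target
  );
  // Leave anchors that do not point to a spine untouched.
  if (index < 0) return href;
  return fragment === undefined ? String(index) : `${index}#${fragment}`;
}

const spineHrefs = [
  'OEBPS/Text/Section0001.xhtml',
  'OEBPS/Text/Section0002.xhtml',
  'OEBPS/Text/Section0003.xhtml',
];
console.log(serializeAnchor('./Text/Section0002.xhtml#title', spineHrefs));
```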


removeAtrules: string[]

Remove the specified at-rules.

Default: []


removeTags: string[]

Remove selectors that point to the specified tags.

Default: []


removeIds: string[]

Remove selectors that point to the specified ids.

Default: []


removeClasses: string[]

Remove selectors that point to the specified classes.

Default: []
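
To make the selector-removal options concrete, here is a rough sketch of what removing class selectors could look like. It is not the library's implementation: removeClassSelectors is a hypothetical helper that handles only flat rules (no nested at-rules) and assumes class names contain no regular expression metacharacters.

```javascript
// Hypothetical helper; not part of @ridi/epub-parser.
function removeClassSelectors(css, removeClasses) {
  // Matches flat "selectors { declarations }" rules only.
  return css.replace(/([^{}]+)\{([^{}]*)\}/g, (rule, selectorText, body) => {
    const kept = selectorText
      .split(',')
      .map((selector) => selector.trim())
      .filter((selector) => !removeClasses.some(
        // ".ad" must not be followed by more name characters (e.g. ".ad-banner").
        (cls) => new RegExp(`\\.${cls}(?![\\w-])`).test(selector)
      ));
    // Drop the whole rule when no selector survives.
    return kept.length > 0 ? `${kept.join(', ')} {${body}}` : '';
  });
}

const css = '.ad, p { color: red; } .ad-banner { margin: 0; } div.ad { top: 0; }';
console.log(removeClassSelectors(css, ['ad']));
```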



Package last updated on 20 Dec 2018
