@ridi/epub-parser

Common EPUB2 data parser for Ridibooks services

Version: 0.3.0-alpha.5

Version published: 6 years ago

Weekly downloads: 47 (increased by 123.81%)

Maintainers: 12

Created: 6 years ago

Features

Install

npm install @ridi/epub-parser

Usage

Basic:

import { EpubParser } from '@ridi/epub-parser';
// or const { EpubParser } = require('@ridi/epub-parser');

// The input is either an EPUB file or a directory with the unzipped contents.
const parser = new EpubParser('./foo/bar.epub'); // or './unzippedPath'
parser.parse(/* { parseOptions } */).then((book) => {
  parser.readItems(book.spines/*, { readOptions } */).then((results) => {
    // ...
  });
  // ...
});
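
The same flow can also be written with async/await. The sketch below is a hypothetical helper, `readAllSpines`, that accepts any object exposing the `parse` and `readItems` methods described on this page (for example an `EpubParser` instance). It assumes that `readItems` resolves with results in the same order as the items passed in, which this page does not state explicitly.

```javascript
// Hypothetical helper (not part of @ridi/epub-parser).
// `parser` is any object with parse() and readItems(), e.g. an EpubParser.
async function readAllSpines(parser, parseOptions, readOptions) {
  const book = await parser.parse(parseOptions);
  const contents = await parser.readItems(book.spines, readOptions);
  // Assumption: contents[i] corresponds to book.spines[i].
  return book.spines.map((spine, i) => ({ spine, content: contents[i] }));
}
```

Taking the parser as an argument keeps the helper easy to exercise with a stub object before pointing it at a real EPUB file.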

With Cryptor:

import { CryptoProvider, Cryptor, EpubParser } from '@ridi/epub-parser';
// or const { CryptoProvider, Cryptor, EpubParser } = require('@ridi/epub-parser');

const { Purpose } = CryptoProvider;
const { Modes, Padding } = Cryptor;

class ContentCryptoProvider extends CryptoProvider {
  constructor(key) {
    super();
    this.cryptor = new Cryptor(Modes.ECB, { key });
  }

  getCryptor(filePath, purpose) {
    return this.cryptor;
  }

  // If used as follows:
  // const provider = new ContentCryptoProvider(...);
  // const parser = new EpubParser('encrypted.epub', provider);
  // const book = await parser.parse({ unzipPath: ... });
  // const firstSpine = await parser.readItem(book.spines[0]);
  //
  // run() will be called as follows:
  // 1. run(data, 'encrypted.epub', Purpose.READ_IN_DIR)
  // 2. run(data, 'META-INF/container.xml', Purpose.READ_IN_ZIP)
  // 3. run(data, 'OEBPS/content.opf', Purpose.READ_IN_ZIP)
  // ...
  // 4. run(data, 'mimetype', Purpose.WRITE)
  // ...
  // 5. run(data, 'OEBPS/Text/Section0001.xhtml', Purpose.READ_IN_DIR)
  //
  run(data, filePath, purpose) {
    const cryptor = this.getCryptor(filePath, purpose);
    const padding = Padding.AUTO;
    if (purpose === Purpose.READ_IN_DIR) {
      return cryptor.decrypt(data, padding);
    } else if (purpose === Purpose.WRITE) {
      return cryptor.encrypt(data, padding);
    }
    return data;
  }
}

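// `key` is the encryption key supplied by the caller.
const cryptoProvider = new ContentCryptoProvider(key);
// The input is either an encrypted EPUB file or a directory with the unzipped contents.
const parser = new EpubParser('./encrypted.epub', cryptoProvider); // or './unzippedPath'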

API

parse(parseOptions)

Returns Promise<Book> with:

Book: Instance with metadata, spine list, table of contents, etc.

Or throws an exception.

parseOptions: `?object` (see: Parse Options)

readItem(item, readOptions)

Returns a Promise that resolves to a string or a Buffer, depending on the item type:

SpineItem, CssItem, InlineCssItem, NcxItem, SvgItem:
- string
Other items:
- Buffer

Or throws an exception.

item: `Item` (see: Item Types)

readOptions: `?object` (see: Read Options)

readItems(items, readOptions)

Returns a Promise that resolves to a string[] or a Buffer[], depending on the item type:

SpineItem, CssItem, InlineCssItem, NcxItem, SvgItem:
- string[]
Other items:
- Buffer[]

Or throws an exception.

items: `Item[]` (see: Item Types)

readOptions: `?object` (see: Read Options)

Model

Book

titles: string[]
creators: Author[]
subjects: string[]
description: ?string
publisher: ?string
contributors: Author[]
dates: DateTime[]
type: ?string
format: ?string
identifiers: Identifier[]
source: ?string
language: ?string
relation: ?string
rights: ?string
version: Version
metas: Meta[]
items: Item[]
spines: SpineItem[]
ncx: ?NcxItem
fonts: FontItem[]
cover: ?ImageItem
images: ImageItem[]
styles: CssItem[]
guides: Guide[]
deadItems: DeadItem[]

Author

name: ?string
fileAs: ?string
role: string (Default: Author.Roles.UNDEFINED)
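
As a rough illustration of how the Book and Author fields above might be consumed, the sketch below builds a one-line summary. `describeBook` and `sampleBook` are hypothetical names, and the sample values are made up; only the documented fields `titles`, `creators`, `language` and `Author.name` are read.

```javascript
// Hypothetical helper: builds a one-line summary from a Book-shaped object.
function describeBook(book) {
  const title = book.titles.length > 0 ? book.titles[0] : '(untitled)';
  // Author.name is nullable, so drop entries without a name.
  const authors = book.creators
    .map((creator) => creator.name)
    .filter((name) => typeof name === 'string' && name.length > 0);
  const byline = authors.length > 0 ? authors.join(', ') : '(unknown author)';
  const language = book.language ? ` [${book.language}]` : '';
  return `${title} by ${byline}${language}`;
}

// Plain object standing in for a parsed Book; the values are made up.
const sampleBook = {
  titles: ['Sample Title'],
  creators: [{ name: 'Jane Doe' }, { name: undefined }],
  language: 'en',
};
console.log(describeBook(sampleBook)); // Sample Title by Jane Doe [en]
```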

DateTime

value: ?string
event: string (Default: DateTime.Events.UNDEFINED)

Identifier

value: ?string
scheme: string (Default: Identifier.Schemes.UNDEFINED)

Guide

title: ?string
type: string (Default: Guide.Types.UNDEFINED)
href: ?string
item: ?Item

Item Types

Item

id: ?string
href: ?string
mediaType: ?string
size: ?number
isFileExists: boolean (size !== undefined)

SpineItem (extend Item)

index: number (Default: -1)
isLinear: boolean (Default: true)
styles: ?CssItem[]

InlineCssItem (extend CssItem)

style: string (Default: '')

DeadItem (extend Item)

reason: string (Default: DeadItem.Reason.UNDEFINED)

NavPoint

id: ?string
label: ?string
src: ?string
anchor: ?string
depth: number (Default: 0)
children: NavPoint[]
spine: ?SpineItem
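
Because each NavPoint carries its own `depth` and `children`, a table of contents can be printed with a short recursive walk. The sketch below is a hypothetical helper; this page does not show how the top-level NavPoint list is obtained, so it is passed in as a parameter, and the sample data is made up.

```javascript
// Hypothetical helper: flattens a NavPoint tree into an indented outline.
// Reads only the documented fields label, depth and children.
function flattenNavPoints(navPoints, lines = []) {
  for (const navPoint of navPoints) {
    const indent = '  '.repeat(navPoint.depth);
    lines.push(`${indent}${navPoint.label || '(no label)'}`);
    flattenNavPoints(navPoint.children, lines);
  }
  return lines;
}

// Plain objects standing in for NavPoint instances.
const sampleNavPoints = [
  {
    label: 'Chapter 1',
    depth: 0,
    children: [{ label: 'Section 1.1', depth: 1, children: [] }],
  },
  { label: 'Chapter 2', depth: 0, children: [] },
];
console.log(flattenNavPoints(sampleNavPoints).join('\n'));
```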

Version

major: number
minor: number
patch: number
isValid: boolean (Only 2.x.x is valid because current epub-parser only supports EPUB2.)
toString(): string

Parse Options

validatePackage: `boolean`

If true, validates the package against the IDPF specifications listed below.

Only used if the input is an EPUB file.

- The ZIP header should not be corrupt.
- The mimetype file must be the first file in the archive.
- The mimetype file should not be compressed.
- The mimetype file should contain only the string application/epub+zip.
- The mimetype file should not use the extra field feature of the ZIP format.

Default: false

allowNcxFileMissing: `boolean`

If false, stops parsing when the NCX file does not exist.

Default: true

unzipPath: `?string`

If specified, the EPUB file is uncompressed to that path.

Only used if the input is an EPUB file.

Default: undefined

overwrite: `boolean`

If true, overwrites the contents of unzipPath when uncompressing.

Only used if unzipPath is specified.

Default: true

ignoreLinear: `boolean`

If true, ignores the index difference caused by the isLinear property of SpineItem.

// e.g. the left side is the result with false, the right side with true.
[{ index: 0, isLinear: true, ... },       [{ index: 0, isLinear: true, ... },
{ index: 1, isLinear: true, ... },        { index: 1, isLinear: true, ... },
{ index: -2, isLinear: false, ... },      { index: 2, isLinear: false, ... },
{ index: 3, isLinear: true, ... }]        { index: 3, isLinear: true, ... }]

Default: false

parseStyle: `boolean`

If true, the styles used by each spine are described, and one namespace is given per CSS file or inline style.

Otherwise, CssItem.namespace and SpineItem.styles are undefined.

In any list, InlineCssItem is always positioned after CssItem. (Book.styles, Book.items, SpineItem.styles, ...)

Default: true

styleNamespacePrefix: `string`

Prepends the given string to each namespace for identification.

Only used if parseStyle is true.

Default: 'ridi_style'
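
The defaults documented above can be collected into a single object, which makes it easier to see what a call to `parse()` with no options implies. The sketch below does that and adds a hypothetical helper, `buildParseOptions`, that rejects misspelled option names before they reach the parser. Whether the library itself validates option names is not stated on this page.

```javascript
// Defaults as documented in the Parse Options section.
const defaultParseOptions = {
  validatePackage: false,
  allowNcxFileMissing: true,
  unzipPath: undefined,
  overwrite: true,
  ignoreLinear: false,
  parseStyle: true,
  styleNamespacePrefix: 'ridi_style',
};

// Hypothetical helper: merges overrides and rejects unknown option names.
function buildParseOptions(overrides = {}) {
  for (const name of Object.keys(overrides)) {
    if (!(name in defaultParseOptions)) {
      throw new Error(`Unknown parse option: ${name}`);
    }
  }
  return { ...defaultParseOptions, ...overrides };
}

// './unzipped' is a made-up path used for illustration.
const parseOptions = buildParseOptions({ unzipPath: './unzipped', validatePackage: true });
```

The resulting object could then be passed as `parser.parse(parseOptions)`.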

Read Options

basePath: `?string`

If specified, changes the base path of the paths used by spine and CSS items.

HTML: SpineItem

...
  <!-- Before -->
  <div>
    <img src="../Images/cover.jpg">
  </div>
  <!-- After -->
  <div>
    <img src="{basePath}/OEBPS/Images/cover.jpg">
  </div>
...

CSS: CssItem, InlineCssItem

/* Before */
@font-face {
  font-family: NotoSansRegular;
  src: url("../Fonts/NotoSans-Regular.ttf");
}
/* After */
@font-face {
  font-family: NotoSansRegular;
  src: url("{basePath}/OEBPS/Fonts/NotoSans-Regular.ttf");
}

Default: undefined
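
The Before/After pairs above are consistent with resolving each relative reference against the directory of the file that contains it and then prefixing basePath. The sketch below reproduces that arithmetic with Node's `path` module. It is illustrative only and not the library's implementation; `rebase` is a hypothetical name, and the item locations (`OEBPS/Text/...`, `OEBPS/Styles/...`) and the base path `/books/1` are assumptions.

```javascript
const path = require('path');

// Illustrative only: not the library's implementation.
// itemHref: location of the spine or CSS file inside the EPUB.
// reference: relative path found in that file (src, url(...), ...).
function rebase(basePath, itemHref, reference) {
  const itemDir = path.posix.dirname(itemHref);
  return path.posix.join(basePath, itemDir, reference);
}

console.log(rebase('/books/1', 'OEBPS/Text/Section0001.xhtml', '../Images/cover.jpg'));
// /books/1/OEBPS/Images/cover.jpg
console.log(rebase('/books/1', 'OEBPS/Styles/style.css', '../Fonts/NotoSans-Regular.ttf'));
// /books/1/OEBPS/Fonts/NotoSans-Regular.ttf
```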

extractBody: `boolean|function`

If true, extracts the body. Otherwise, the full string is returned. If a function is specified instead of true, that function is used to transform the body.

false:

'<!doctype><html>\n<head>\n</head>\n<body style="background-color: #000000;">\n  <p>Extract style</p>\n  <img src=\"../Images/api-map.jpg\"/>\n</body>\n</html>'

true:

'<body style="background-color: #000000;">\n  <p>Extract style</p>\n  <img src=\"../Images/api-map.jpg\"/>\n</body>'

function:

readOptions.extractBody = (innerHTML, attrs) => {
  // Build the attribute list, e.g. style="background-color: #000000;"
  const string = attrs.map((attr) => {
    return `${attr.key}="${attr.value}"`;
  }).join(' ');
  return `<article ${string}>${innerHTML}</article>`;
};

'<article style="background-color: #000000;">\n  <p>Extract style</p>\n  <img src=\"../Images/api-map.jpg\"/>\n</article>'

Default: false
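
A transform function can be tried out on its own before it is handed to the parser. The sketch below is one possible variant that also copes with a body that has no attributes. `wrapInArticle` is a hypothetical name, and the `{ key, value }` attribute shape is taken from the example above.

```javascript
// A possible extractBody transform. It receives the body's inner HTML and
// its attributes as { key, value } pairs.
function wrapInArticle(innerHTML, attrs) {
  // Prefix every attribute with a space so an empty list yields "<article>".
  const attrString = attrs
    .map((attr) => ` ${attr.key}="${attr.value}"`)
    .join('');
  return `<article${attrString}>${innerHTML}</article>`;
}

// Passed to readItem() or readItems() as part of the read options.
const readOptions = { extractBody: wrapInArticle };

console.log(wrapInArticle('\n  <p>Extract style</p>\n', [
  { key: 'style', value: 'background-color: #000000;' },
]));
```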

serializedAnchor: `boolean`

If true, replaces the file path of each anchor in a spine with the spine index.

...
<spine toc="ncx">
  <itemref idref="Section0001.xhtml"/> <!-- index: 0 -->
  <itemref idref="Section0002.xhtml"/> <!-- index: 1 -->
  <itemref idref="Section0003.xhtml"/> <!-- index: 2 -->
  ...
</spine>
...

<!-- Before -->
<a href="./Text/Section0002.xhtml#title">Chapter 2</a>
<!-- After -->
<a href="1#title">Chapter 2</a>

Default: false
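
The rewrite shown above amounts to looking up the anchor's file path in the spine order and keeping the fragment. The sketch below mimics that lookup. It is illustrative only and not the library's implementation; `serializeAnchor` is a hypothetical name, and it assumes the spine paths and the anchor's href are relative to the same directory.

```javascript
const path = require('path');

// Illustrative only: not the library's implementation.
// spineHrefs: spine file paths in spine order, relative to the same
// directory as the anchor's href (an assumption of this sketch).
function serializeAnchor(href, spineHrefs) {
  const hashAt = href.indexOf('#');
  const filePath = hashAt === -1 ? href : href.slice(0, hashAt);
  const fragment = hashAt === -1 ? '' : href.slice(hashAt);
  const index = spineHrefs.indexOf(path.posix.normalize(filePath));
  // Leave anchors that do not point at a spine item untouched.
  return index === -1 ? href : `${index}${fragment}`;
}

const spineHrefs = [
  'Text/Section0001.xhtml',
  'Text/Section0002.xhtml',
  'Text/Section0003.xhtml',
];
console.log(serializeAnchor('./Text/Section0002.xhtml#title', spineHrefs)); // 1#title
```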

removeAtrules: `string[]`

Removes the specified at-rules.

Default: []

removeTags: `string[]`

Removes selectors that point to the specified tags.

Default: []

removeIds: `string[]`

Removes selectors that point to the specified ids.

Default: []

removeClasses: `string[]`

Removes selectors that point to the specified classes.

Default: []

Package last updated on 20 Dec 2018
