A comic book archive metadata reader and writer.
Comicbox reads CBZ, CBR, CBT, and optionally PDF. Comicbox archives and writes CBZ archives and PDF metadata.
Comicbox reads and writes:
Comicbox's primary purpose is to serve as a library for the Codex comic reader. The API isn't well documented, but you can infer what it does fairly easily from comicbox.comic_archive, the primary interface.
The command line is increasingly useful and can read and write metadata recursively and extract pages.
Comicbox does not use popular metadata database APIs or have a GUI!
Comictagger is a popular alternative. It does most of what Comicbox does but also automatically tags comics with the ComicVine API and has a desktop UI.
pip install comicbox
Comicbox supports PDFs as an extra when installed like:
pip install comicbox[pdf]
Comicbox generally works without any binary dependencies but requires unrar to be on the path to convert CBR into CBZ or extract files from CBRs.
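Before converting CBR files, a script can check whether unrar is reachable. The following is a minimal sketch using only the Python standard library; the helper name `unrar_available` is hypothetical and is not part of Comicbox.

```python
import shutil


def unrar_available(binary: str = "unrar") -> bool:
    """Return True if the named executable can be found on the PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if unrar_available():
        print("unrar found on the path")
    else:
        print("unrar not found on the path; CBR conversion and extraction need it")
```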
Type comicbox -h to see the CLI help.
comicbox test.cbz -m "{Tags: a,b,c, story_arcs: {d: 1, e: '', f: 3}}" -m "Publisher: SmallComics" -w cr
This writes those tags to comicinfo.xml in the archive.
Be sure to add spaces after colons so they are detected as valid YAML key value pairs. This is easy to forget.
But it's probably better to use the --print action to see what it's going to do before you actually write to the archive:
comicbox test.cbz -m "{Tags: a,b,c, story_arcs: {d: 1, e: '', f: 3}}" -m "Publisher: SmallComics" -p
A recursive example:
comicbox --recurse -m "publisher: 'SC Comics'" -w cr ./SmallComicsComics/
This recursively changes the publisher to "SC Comics" for every comic found under the SmallComicsComics directory.
The -m command line argument accepts the YAML language for tags. Certain characters like \,:;_()$%^@ are part of the YAML language. To successfully include them as data in your tags, look up "Escaping YAML" documentation online.
To delete metadata from the CLI, you're best off exporting the current metadata, editing the file, and then re-importing it with the delete previous metadata option:
# export the current metadata
comicbox --export cix "My Overtagged Comic.cbz"
# Adjust the metadata in an editor.
nvim comicinfo.xml
# Check that importing the metadata will look how you like
comicbox --import comicinfo.xml -p "My Overtagged Comic.cbz"
# Delete all previous metadata from the comic (careful!)
comicbox --delete "My Overtagged Comic.cbz"
# Import the metadata into the file and write it.
comicbox --import comicinfo.xml --write cix "My Overtagged Comic.cbz"
The comicbox.yaml format represents the ComicInfo.xml Web tag as an identifiers.url tag. Fear not, you don't have to remember this. The CLI accepts heterogeneous tag types with the -m option, so you can type:
comicbox -p -m "Web: https://foo.com" mycomic.cbz
and the identifier tag should appear in comicbox.yaml as:
identifiers:
  nss: foo.com
  url: https://foo.com
Comicbox actually installs three different packages:
- comicbox: The main API and CLI script.
- comicfn2dict: A separate library for parsing comic filenames into dicts. It also includes a CLI script.
- pdffile: A utility library for reading and writing PDF files with an API like Python's ZipFile.

Comicbox accepts command line arguments but also an optional config file and environment variables.
The variables have defaults specified in a default YAML file.
The environment variables are the variable names prefixed with COMICBOX_ (e.g. COMICBOX_COMICINFOXML=0).
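To illustrate the naming convention, here is a small sketch that collects COMICBOX_-prefixed variables from an environment mapping. It is not Comicbox's own config loader; the function name `env_overrides` and the choice to lower-case the option names are assumptions made for the example.

```python
import os
from typing import Mapping, Optional

PREFIX = "COMICBOX_"


def env_overrides(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Collect COMICBOX_-prefixed variables, keyed by option name.

    Lower-casing the option name is an assumption for readability only.
    """
    environ = os.environ if environ is None else environ
    return {
        name[len(PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PREFIX)
    }


if __name__ == "__main__":
    sample = {"COMICBOX_COMICINFOXML": "0", "HOME": "/home/me"}
    print(env_overrides(sample))
```

Only the prefixed variable is picked up; unrelated variables such as HOME are ignored.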
To change the logging level:
LOGLEVEL=ERROR comicbox -p <path>
You may access most development tasks from the makefile. Run make to see documentation.
I didn't like Comictagger's API, so I built this for myself as an educational exercise and to use as a library for Codex comic reader.
Comicbox supports reading and writing several comic book metadata schemas.
Comicbox includes a pretty good comic archive filename parser. It can extract a number of common fields from comic archive filenames.
Location | Name |
---|---|
Archive | The archive filename |
Import/Export | comicbox-filename.txt |
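As a rough idea of what extracting fields from a filename can look like, here is a deliberately simple regular-expression sketch. It is not the comicfn2dict algorithm, which handles far more cases; the pattern, the function name `parse_filename`, and the field names are all assumptions for illustration.

```python
import re

# Hypothetical pattern: "<series> #<issue> (<year>).<ext>", year optional.
PATTERN = re.compile(
    r"^(?P<series>.+?)\s+#?(?P<issue>\d+)\s*"
    r"(?:\((?P<year>\d{4})\))?\.(?P<ext>cb[zrt]|pdf)$",
    re.IGNORECASE,
)


def parse_filename(filename: str) -> dict:
    """Return the fields found in a comic filename, or {} if nothing matches."""
    match = PATTERN.match(filename)
    if match is None:
        return {}
    # Drop optional groups that did not participate in the match.
    return {key: value for key, value in match.groupdict().items() if value}


if __name__ == "__main__":
    print(parse_filename("Small Comics #003 (1999).cbz"))
```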
The PDF metadata standard. It can be exported as an XML file or written directly to the PDF itself.
Adobe PDF Namespace
Adobe PDF Standard § 14.3.3 Document Information Dictionary
PDF metadata is only read from and written to PDF files.
Location | Name |
---|---|
Archive | PDF internal |
Import/Export | pdf-metadata.xml |
keywords
Comicbox will read almost any metadata standard it supports from the keywords field. If that fails, it will treat the keywords field as a comma-delimited "Tags" field.
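The try-structured-formats-then-fall-back behavior described above can be sketched as follows. This is an illustration using the standard library only, not Comicbox's implementation: the function name `parse_keywords`, the returned dict shape, and the choice of JSON and XML as the structured formats to try are assumptions.

```python
import json
import xml.etree.ElementTree as ET


def parse_keywords(keywords: str) -> dict:
    """Try structured formats first, then fall back to comma-delimited tags."""
    text = keywords.strip()
    # Attempt 1: a JSON object.
    try:
        data = json.loads(text)
        if isinstance(data, dict):
            return {"format": "json", "data": data}
    except ValueError:
        pass
    # Attempt 2: an XML document, flattened to {child tag: child text}.
    try:
        root = ET.fromstring(text)
        return {"format": "xml", "data": {child.tag: child.text for child in root}}
    except ET.ParseError:
        pass
    # Fallback: treat the whole field as comma-delimited tags.
    tags = [tag.strip() for tag in text.split(",") if tag.strip()]
    return {"format": "tags", "data": {"tags": tags}}


if __name__ == "__main__":
    print(parse_keywords("horror, sci-fi"))
```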
keywords
By default, Comicbox will write ComicInfo XML to the keywords field (e.g. -w pdf).
Codex supports this because it uses Comicbox. Other comic readers do not support PDF embedded ComicInfo.xml, but since they already have ComicInfo.xml parsers it's possible that they might someday.
If Comicbox JSON is included in the write formats (e.g. -w pdf,json), Comicbox will write comicbox.json to the keywords field instead. It is unlikely that any comic reader other than Codex will ever support this.
An old and uncommon comic metadata standard from a defunct comic book reader.
Location | Name |
---|---|
Archive | comet.xml |
Import/Export | comet.xml |
The Comic Book Lover schema. A rare but still encountered JSON schema. It probably survives because Comictagger supports writing it.
Location | Name |
---|---|
Archive | Zip & Rar Comments |
Import/Export | comic-book-info.json |
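Because this schema lives in the archive comment rather than in a file inside the archive, it can be read with Python's zipfile module alone. The sketch below shows the idea for Zip archives only; the helper names and the sample payload are hypothetical, and this is not how Comicbox itself is implemented.

```python
import io
import json
import zipfile
from typing import Any, Optional


def read_zip_comment_json(archive) -> Optional[Any]:
    """Return the JSON stored in a Zip archive comment, or None if absent or invalid."""
    with zipfile.ZipFile(archive) as zf:
        comment = zf.comment  # raw bytes; empty when no comment is set
    if not comment:
        return None
    try:
        return json.loads(comment.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None


def make_sample_archive(payload: dict) -> io.BytesIO:
    """Build an in-memory Zip whose comment holds a (hypothetical) JSON payload."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("page001.jpg", b"")
        zf.comment = json.dumps(payload).encode("utf-8")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = make_sample_archive({"publisher": "SmallComics"})
    print(read_zip_comment_json(sample))
```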
The Comic Rack schema. The de facto standard of comic book metadata. The Comic Rack reader is defunct, but the Anansi Project now publishes the ComicInfo spec and has compatibly and conservatively extended it.
Anansi ComicInfo v2.1 Spec
Also, an unofficial, undocumented Mylar extension to ComicInfo.xml that encodes multiple Story Arcs and Story Arc Numbers as CSV values.
Location | Name |
---|---|
Archive | comicinfo.xml |
Import/Export | comicinfo.xml |
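To make the CSV-encoded story arcs concrete, here is a sketch that pairs the values of the StoryArc and StoryArcNumber elements by position. Since the Mylar extension is undocumented, the positional pairing and the handling of missing numbers are assumptions, and the function name `story_arcs` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional


def _split_csv(text: Optional[str]) -> List[str]:
    """Split a comma-separated value, keeping empty slots to preserve positions."""
    if not text:
        return []
    return [part.strip() for part in text.split(",")]


def story_arcs(comicinfo_xml: str) -> Dict[str, Optional[int]]:
    """Map each story arc name to its number, or None when no number is given."""
    root = ET.fromstring(comicinfo_xml)
    names = _split_csv(root.findtext("StoryArc"))
    numbers = _split_csv(root.findtext("StoryArcNumber"))
    arcs: Dict[str, Optional[int]] = {}
    for index, name in enumerate(names):
        if not name:
            continue
        raw = numbers[index] if index < len(numbers) else ""
        arcs[name] = int(raw) if raw.isdigit() else None
    return arcs


if __name__ == "__main__":
    xml = (
        "<ComicInfo><StoryArc>d,e,f</StoryArc>"
        "<StoryArcNumber>1,,3</StoryArcNumber></ComicInfo>"
    )
    print(story_arcs(xml))
```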
The most useful comic book metadata writer is ComicTagger. It supports the ComicVine API, is extensible to other APIs, and features a nice desktop GUI. Internally, ComicTagger keeps a metadata object to work with the schemas it supports. This schema allows the import and export of that metadata object.
ComicTagger genericmetadata.py
This schema may only be useful to developers. The author of ComicTagger offers no promises as to the stability of this API and I am very lazy, so the chances of this drifting out of date are anyone's guess. It was included because it was easy to do.
Location | Name |
---|---|
Archive | comictagger.json |
Import/Export | comictagger.json |
The Comicbox internal data structure, which acts as a superset of the above schemas to allow interpolating between them.
Location | Name |
---|---|
Archive | comicbox.json |
Import/Export | comicbox.json |
YAML is a superset of JSON, so the JSON schema applies here.
Location | Name |
---|---|
Archive | comicbox.yaml |
Import/Export | comicbox.yaml |
The Comicbox CLI uses "flow style" YAML, an all-on-one-line format for entering metadata on the command line.
Specifying metadata on the command line like this is additive.
Location | Name |
---|---|
Comicbox CLI | -m --metadata |
Archive | comicbox-cli.yaml |
Import/Export | comicbox-cli.yaml |
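Since YAML is a superset of JSON, one way to produce a one-line metadata argument from a script is to serialize a dict with json.dumps, which also puts a space after every colon. The sketch below only builds the argument list and does not run Comicbox; the helper names are hypothetical, and it uses only the -p and -m options shown earlier.

```python
import json
import shlex
from typing import List


def metadata_argument(metadata: dict) -> str:
    """Serialize metadata as one-line JSON, which is also valid flow-style YAML."""
    return json.dumps(metadata, ensure_ascii=False)


def build_print_command(path: str, metadata: dict) -> List[str]:
    """Build an argument list that previews the metadata change with -p."""
    return ["comicbox", "-p", "-m", metadata_argument(metadata), path]


if __name__ == "__main__":
    command = build_print_command("test.cbz", {"Publisher": "SmallComics"})
    # shlex.join shows how the command would be quoted in a POSIX shell.
    print(shlex.join(command))
```

Passing the list to a process launcher without a shell avoids quoting problems with characters that are special in both YAML and the shell.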
There is a special environment variable, DEBUG_TRANSFORM, that will print verbose schema transform information.