@citeproc-rs/wasm
This is a front-end to
citeproc-rs
, a citation
processor written in Rust and compiled to WebAssembly.
It contains builds appropriate for:
- Node.js
- Browsers, using a bundler like Webpack.js
- Browsers directly importing an ES Module from a webserver
Installation / Release channels
There are two release channels:
Stable is each versioned release. (At the time of writing, there are no
versioned releases.) Install with:
yarn add @citeproc-rs/wasm
Canary tracks the master branch on
GitHub. Its version numbers follow
the format 0.0.0-canary-GIT_COMMIT_SHA
, so version ranges in your
package.json
are not meaningful. But you can install the latest one with:
yarn add @citeproc-rs/wasm@canary
yarn add @citeproc-rs/wasm@0.0.0-canary-COMMIT_SHA
If you use NPM, replace yarn add
with npm install
.
Including in your project
For Node.js, simply import the package as normal. Typescript definitions are
provided, though parts of the API that cannot have auto-generated type
definitions are alluded to in doc comments with an accompanying type you can
import.
// Node.js
const { Driver } = require("@citeproc-rs/wasm");
Microsoft Edge
Note the caveats in around Microsoft Edge's TextEncoder/TextDecoder support in
the wasm-bindgen
tutorial.
Using Webpack
When loading on the web, for technical reasons and because the compiled
WebAssembly is large, you must load the package asynchronously. Webpack comes
with the ability to import packages asynchronously like so:
import("@citeproc-rs/wasm")
.then(go)
.catch(console.error);
function go(wasm) {
const { Driver } = wasm;
}
When you do this, your code will trigger a download (and streaming parse) of
the binary, and when that is complete, your go
function will be called. The
download can of course be cached if your web server is set up correctly, making
the whole process very quick.
You can use the regular-import Driver as a TypeScript type anywhere, just don't
use it to call new Driver()
.
React
If you're writing a React app, you may wish to use React.lazy
like so:
import React, { Suspense } from "react";
const AsyncCiteprocEnabledComponent = React.lazy(async () => {
await import("@citeproc-rs/wasm");
return await import("./CiteprocEnabledComponent");
});
const App = () => (
<Suspense
fallback={<div>Loading citation formatting engine...</div>}>
<AsyncCiteprocEnabledComponent />
</Suspense>
);
import { Driver } from "@citeproc-rs/wasm";
Importing it in a script tag (web
target)
To directly import it without a bundler in a (modern) web browser with ES
modules support, the procedure is different. You must:
- Make the
_web
subdirectory of the published NPM package available in a
content directory on your webserver, or use a CDN like unpkg. - Include a
<script type="module">
tag in your page's <body>
, like so:
<script type="module">
import init, { Driver } from './path/to/_web/citeproc_rs_wasm.js';
async function run() {
await init();
}
run()
</script>
Careful: This method does not ensure the package is loaded only once. If
you call init again, it will invalidate any previous Drivers you created.
Importing it in a script tag (no-modules
target)
This is based on the wasm-bindgen guide
entry,
noting the caveats. You will, similarly to the web
target, need to make the
contents of the _no_modules
subdirectory of the published NPM package
available on a webserver or via a CDN. But it has ONE ADDITIONAL FILE to
import via a script tag.
Careful: This method does not ensure the package is loaded only once. If
you call init again, it will invalidate any previous Drivers you created.
<html>
<head>
<meta content="text/html;charset=utf-8" http-equiv="Content-Type"/>
</head>
<body>
<!-- Include these TWO JS files -->
<script src='path/to/@citeproc-rs/wasm/_no_modules/citeproc_rs_wasm_include.js'></script>
<script src='path/to/@citeproc-rs/wasm/_no_modules/citeproc_rs_wasm.js'></script>
<script>
// Like with the `--target web` output the exports are immediately
// available but they won't work until we initialize the module. Unlike
// `--target web`, however, the globals are all stored on a
// `wasm_bindgen` global. The global itself is the initialization
// function and then the properties of the global are all the exported
// functions.
//
// Note that the name `wasm_bindgen` will at some point be configurable with the
// `--no-modules-global` CLI flag (https://github.com/rustwasm/wasm-pack/issues/729)
const { Driver } = wasm_bindgen;
async function run() {
// Note the _bg.wasm ending
await wasm_bindgen('path/to/@citeproc-rs/wasm/_no_modules/citeproc_rs_wasm_bg.wasm');
// Use Driver
}
run();
</script>
</body>
</html>
Usage in Zotero
There is a special build for Zotero and the legacy Firefox ESR extensions API,
which wants a CommonJS module format but without the Node.js fs
APIs, and
no-modules
' loading mechanisms but without the use of window
as a global as
it doesn't exist. The files are in the _zotero
directory of the NPM package.
Usage is essentially the same as no-modules; you'll need all three files:
@citeproc-rs/wasm/_zotero/citeproc_rs_wasm_include.js
@citeproc-rs/wasm/_zotero/citeproc_rs_wasm.js
@citeproc-rs/wasm/_zotero/citeproc_rs_wasm_bg.wasm
Apart from the CommonJS shims, the main difference is that the API will be
loaded onto the Zotero.CiteprocRs
object, in order for it all to be linked
together.
Careful: This method does not ensure the package is loaded only once. If
you call initWasmModule
again, it will invalidate any previous Drivers you
created.
require("citeproc_rs_wasm_include");
const initWasmModule = require("citeproc_rs_wasm");
const wasmBinaryPromise = Zotero.HTTP
.request('GET',
'resource://zotero/citeproc_rs_wasm_bg.wasm',
{ responseType: "arraybuffer" })
.then(xhr => xhr.response);
await initWasmModule(wasmBinaryPromise);
let driver;
try {
driver = new Zotero.CiteprocRs.Driver({...});
} catch (e) {
if (e instanceof Zotero.CiteprocRs.CslStyleError) {
}
}
Usage
Overview
The basic pattern of interactive use is:
- Create a driver instance with your style
- Edit the references or the citation clusters as you please
- Call
driver.batchedUpdates()
- Apply the updates to your document (e.g. GUI)
- Go to step 2 when a user makes a change
Step three is the important one. Each time you edit a cluster or a reference,
it is common for only one or two visible modifications to result. Therefore,
the driver only gives you those clusters or bibliography entries that have
changed, or have been caused to change by an edit elsewhere. You can submit any
number of edits between each call.
The API also allows for non-interactive use. See below.
Error handling
Many Driver methods can throw errors.
If you want to handle the errors from this library specifically, you can, and
this is mainly useful for showing style parse or validation errors. Some error
types have structured data attached to them.
try {
let driver = new Driver({ ... });
} catch (error) {
if (error instanceof CslStyleError) {
console.error("Could not parse CSL, error:", error);
} else if (error instanceof CiteprocRsDriverError) {
console.error("Error in usage of Driver", error);
} else if (error instanceof CiteprocRsError) {
console.error("Catch-all error", error);
} else {
throw error;
}
} finally {
if (driver) {
driver.free()
}
}
The error types must unfortunately be global exports, on window/global/self.
1. Creating a driver instance
First, create a driver. Note that for now, you must also call .free()
on the
Driver when you are finished with it to deallocate its memory, but there is a TC39
proposal
in the implementation phase that will make this unnecessary.
A driver needs at least an XML style string, a fetcher (below), and an output
format (one of "html"
, "rtf"
or "plain"
).
let fetcher = ...;
let driver = new Driver({
style: "<style version=\"1.0\" class=\"note\" ... > ... </style>",
format: "html",
formatOptions: {
linkAnchors: true,
},
localeOverride: "de-DE",
fetcher,
});
await driver.fetchLocales();
driver.free()
The library parses and validates the CSL style input. Any validation errors are
reported, with byte offsets to find the CSL fragment responsible, a descriptive
and useful message (in English) and sometimes even a hint for how to fix it.
See Error Handling for how to access this.
Fetcher
There are hundreds of locales, and the locales you need depend on the style
default, any overrides and any fallback locales defined, so the procedure for
retrieving one is asynchronous to allow for fetching one over HTTP. There's not
much more to it than this:
class Fetcher {
async fetchLocale(lang) {
return await fetch("https://some-cdn-with-locales.com/locales-${lang}.xml")
.then(res => res.text());
}
}
let fetcher = new Fetcher();
let driver = new Driver({ ..., fetcher });
await driver.fetchLocales();
Unless you don't have async
syntax, in which case, return a Promise
directly, e.g. return Promise.resolve("<locale> ... </locale>")
.
Declining to provide a locale fetcher in new Driver
or forgetting to call
await driver.fetchLocales()
results in use of the bundled en-US
locale. You
should also never attempt to use the driver instance while it is fetching locales.
2. Edit the references or the citation clusters
References
You can insert a reference like so. This is a CSL-JSON object.
driver.insertReference({ id: "citekey", type: "book", title: "Title" });
driver.insertReferences([ ... many references ... ]);
driver.resetReferences([ ... deletes any others ... ]);
driver.removeReference("citekey");
Citation Clusters and their Cites
A document consists of a series of clusters, each with a series of cites. Each
cluster has an id
, which is any old string.
driver.initClusters([
{ id: "one", cites: [ {id: "citekey"} ] },
{ id: "two", cites: [ {id: "citekey", locator: "56", label: "page" } ] },
]);
driver.insertCluster({ id: "one", cites: [ { id: "updated_citekey" } ] });
let three = driver.randomClusterId();
driver.insertCluster({ id: three, cites: [ { id: "new_cluster_here" } ] });
These clusters do not contain position information, so reordering is a separate
procedure. Without calling setClusterOrder, the driver considers the document
to be empty.
So, setClusterOrder
expresses the ordering of the clusters within the
document. Each one in the document should appear in this list. You can skip
note numbers, which means there were non-citing footnotes in between. Omitting
note
means it's an in-text reference. Note numbers must be monotonic, but you
can have more than one cluster in the same footnote.
driver.setClusterOrder([ { id: "one", note: 1 }, { id: "two", note: 4 } ]);
You will notice that if an interactive user cuts and pastes a paragraph
containing citation clusters, the whole reordering operation can be expressed
in two calls, one after the cut (with some clusters omitted) and one after the
paste (with those same clusters placed somewhere else). No calls to
insertCluster
need be made.
Uncited items
Sometimes a user wishes to include references in the bibliography even though
they are not mentioned in a citation anywhere in the document.
driver.includeUncited("None");
driver.includeUncited("All");
driver.includeUncited({ Specific: ["citekeyA", "citekeyB"] });
The "All" is based on which references your driver knows about. If you have
this set to "All", simply calling driver.insertReference()
with a new
reference ID will result in an entry being added to the bibliography. Entries
in Specific mode do not have to exist when they are provided here; they can be,
for instance, the citekeys of collection of references in a reference library
which are subsequently provided in full to the driver, at which point they
appear in the bibliography, but not items from elsewhere in the library.
3. Call driver.batchedUpdates()
and apply the diff
This gets you a diff to apply to your document UI. It includes both clusters
that have changed, and bibliography entries that have changed.
let diff = driver.batchedUpdates();
for (let changedCluster of diff.clusters) {
let [id, html] = changedCluster;
myDocument.updateCluster(id, html);
}
if (diff.bibliography != null) {
let bib = diff.bibliography;
for (let key of Object.keys(bib.updatedEntries)) {
let rendered = bib.updatedEntries[key];
myDocument.updateBibEntry(key, rendered);
}
if (bib.entryIds != null) {
myDocument.setBibliographyOrder(bib.entryIds);
}
}
Note, for some intuition, if you call batchedUpdates()
again immediately, the
diff will be empty.
Bibliographies
Beyond the interactive batchedUpdates method, there are two functions for
producing a bibliography statically.
let meta = driver.bibliographyMeta();
let bibliography = driver.makeBibliography();
for (let entry of bibliography) {
console.log(entry.id, entry.value);
}
Preview citation clusters
Sometimes, a user wants to see how a cluster will look while they are editing
it, before confirming the change.
let cluster = { cites: [ { id: "citekey", locator: "45" }, { ... } ] };
let positions = [ ... before, { note: 34 }, ... after ];
let preview = driver.previewCluster(cluster, positions);
let plainPreview = driver.previewCluster(cluster, positions, "plain");
The cluster argument is just a cluster, without an id
field, since it's
ephemeral. The lack of id
field is reflected in the positions
argument as
well.
The positions array is exactly like a call to setClusterOrder
, except exactly
one of the positions omits the id field. This could either:
- Replace an existing cluster's position, and preview a cluster replacement; or
- Represent the position a cluster is hypothetically inserted.
If you passed only one position, it would be like previewing an operation like
"delete the entire document and replace it with this one cluster". That would
mean you would never see "ibid" in a preview. So for maximum utility,
assemble the positions array as you would a call to setClusterOrder
with
exactly the operation you're previewing applied.
The format argument is optional, and works like the format passed to
new Driver
: one of "html"
, "rtf"
or "plain"
. The driver will use that
instead of its normal output format.
AuthorOnly
, SuppressAuthor
& Composite
@citeproc-rs/wasm
supports these flags on clusters (all 3) and cites (except
Composite
), in a similar way to citeproc-js
. See the citeproc-js
documentation on Special Citation
Forms
for reference.
let citeAO = { id: "jones2006", mode: "AuthorOnly" };
let citeSA = { id: "jones2006", mode: "SuppressAuthor" };
let clusterAO = { id: "one", cites: [...], mode: "AuthorOnly" };
let clusterSA = { id: "one", cites: [...], mode: "SuppressAuthor" };
let clusterSA_First = { id: "one", cites: [...], mode: "SuppressAuthor", suppressFirst: 3 };
let clusterC = { id: "one", cites: [...], mode: "Composite" };
let clusterC_Infix = { id: "one", cites: [...], mode: "Composite", infix: ", whose book" };
let clusterC_Full = { id: "one", cites: [...], mode: "Composite", infix: ", whose books", suppressFirst: 0 };
It does support one extra option with SuppressAuthor
and Composite
on
clusters: suppressFirst
, which limits the effect to the first N name groups
(or if cite grouping is disabled, first N names). Setting it to 0 means
unlimited.
<intext>
element with AuthorOnly
etc.
citeproc-rs
supports the <intext>
element described in the citeproc-js
docs linked above, but it is not enabled by default. It also supports <intext and="symbol">
or and="text"
, which will swap out the last intext layout
delimiter (<layout delimiter="; ">
) for either the ampersand or the and
term.
If you want to use the <intext>
element in CSL, you may either:
Option 1: Add a feature flag to the style wishing to use it
<style class="in-text">
<features>
<feature name="custom-intext" />
</features>
...
</style>
AFAIK no other processors support this syntax yet.
Option 2: Enable the custom-intext
feature for all styles via new Driver
let driver = new Driver({ ..., cslFeatures: ["custom-intext"] });
Non-Interactive use, or re-hydrating a previously created document
If you are working non-interactively, or re-hydrating a previously created
document for interactive use, you may want to do one pass over all the clusters
in the document, so that each cluster and bibliography entry reflects the
correct value.
let allNotes = myDocument.footnotes.map(fn => {
return { cluster: getCluster(fn), number: fn.number }
});
driver.resetReferences(myDocument.allReferences);
driver.initClusters(allNotes.map(fn => fn.cluster));
driver.setClusterOrder(allNotes.map(fn => { id: fn.cluster.id, note: fn.number }));
let render = driver.fullRender();
for (let fn of allNotes) {
fn.renderedHtml = render.allClusters[fn.cluster.id];
}
let allBibKeys = render.bibEntries.map(entry => entry.id);
for (let bibEntry of render.bibEntries) {
myDocument.bibliographyMap[entry.id] = entry.value;
}
updateUserInterface(allNotes, myDocument, whatever);
parseStyleMetadata
Sometimes you want information about a CSL style without actually booting up a
whole driver. One important use case is a dependent style, which can't be used
with new Driver()
because it doesn't have the ability to render citations on
its own, and is essentially just a container for three pieces of information:
- A journal name
- An independent parent style
- A possible default-locale override
@citeproc-rs/wasm
provides an API for finding out what's in a CSL style file.
let styleMeta = parseStyleMetadata("<style ...> ... </style>");
This function can still throw a CslStyleError
, but this is less likely than
with new Driver() as it will not actually attempt to parse and validate all the
parts of a style. It will throw if the XML is malformed or if the <info>
block is too invalid to salvage.
Here's how to use parseStyleMetadata
to parse and use a dependent style.
let dependentStyle = "<style ...> ... </style>";
let meta = parseStyleMetadata(dependentStyle);
let isDependent = meta.info.parent != null;
let parentStyleId = isDependent && meta.info.parent.href;
let localeOverride = meta.defaultLocale;
let parentStyle = await downloadStyleWithId(parentStyleId);
let driver = new Driver({
style: parentStyle,
localeOverride,
...
});
await driver.fetchLocales();
let parentMeta = parseStyleMetadata(parentStyle);
if (parentMeta.independentMeta.hasBibliography) {
let bib = driver.makeBibliography();
}
driver.free();
setOutputFormat
and setStyle
If you wish to change the output format of the entire driver, you can use
setOutputFormat(format, formatOptions)
. The format is a string, one of "html" | "rtf" | "plain"
just like the new Driver
method. The options is an optional
argument with the same value as formatOptions
in new Driver
.
setStyle(xmlString)
will change the CSL style used by the driver.
Both of these methods will require throwing out almost all cached computation,
so use sparingly.
If you need to render a preview in a different format, there is an argument on
previewCluster
for doing just that. It does not throw out all the
computation. citeproc-rs
' disambiguation procedures do take formatting into
account, so <i>Title</i>
can be distinct from <b>Title</b>
in HTML and RTF,
but not if the whole driver's output format is "plain"
, since they both look
identical in plain text. previewCluster
will simply translate the formatting
into another format, without re-computing all the disambiguation.