
= PandocRefeqMathml - ad hoc tool to modify pandoc-converted MathML from LaTeX
== Summary
This Ruby command-line tool modifies a MathML file converted with pandoc from LaTeX.
Whereas pandoc is a great text-ish file converter, there are a few caveats, at the time of writing, in converting a LaTeX file to MathML.
A major caveat is that the generated MathML displays neither the equation numbers that LaTeX auto-generates by default for the equation and eqnarray environments nor their (LaTeX) labels. All the (LaTeX) references remain as they are, i.e., as raw label names, which read as cryptic codes to readers.
Another caveat is the alignments of equations in the eqnarray environment.
This tool is a somewhat ad hoc (dirty) hack to correct these points in some basic situations. "Basic" here means just the standard LaTeX commands, not external package-specific commands.
The full package of this module is found on the {PandocRefeqMathml Ruby Gems page}[http://rubygems.org/gems/pandoc_refeq_mathml] (with documentation generated from source annotations with yard) and on {Github}[https://github.com/masasakano/pandoc_refeq_mathml].
== Background and constraints
Pandoc-converted MathML.html from LaTeX lacks the equation numbers that are present in the original LaTeX. {pandoc-crossref}[https://github.com/lierdakil/pandoc-crossref] offers a way to tackle the problem; however, its fix is far from perfect, with three or four major caveats.
In LaTeX, you may reference equations 1 and 3 in a single +eqnarray+ environment separately. However, because of point (1), that is not possible in pandoc-generated MathML. Besides, they are not referenced with equation numbers in the MathML in the first place (point 3).
This tool (a command-line command) offers a way to fix these problems, albeit in a crude way. The command adds equation numbers that are guessed from the text in the annotation fields in ++ and from the LaTeX aux file (which is automatically generated as a byproduct when you compile a LaTeX document). Not all the numbers are recovered, only those that are referenced somewhere in the MathML file.
(Note that in principle, it should not be too difficult to modify the program so that all the labelled equations in LaTeX are labelled again in MathML. Nevertheless, it would be tricky to label equations that are not explicitly labelled in LaTeX because implicit numbering information is not available in the LaTeX aux file.)
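As a rough illustration of how label-to-number information can be read from a LaTeX aux file, here is a minimal sketch. It is not the gem's actual code: the method name +parse_aux_labels+ is hypothetical, and it assumes the plain <tt>\newlabel{label}{{number}{page}}</tt> entries that standard LaTeX writes (packages such as hyperref add further fields, which this sketch ignores).

```ruby
# Hypothetical helper (not part of the gem): build a Hash that maps each
# LaTeX label to the number LaTeX assigned to it, based on \newlabel lines.
def parse_aux_labels(aux_text)
  labels = {}
  aux_text.each_line do |line|
    # Assumed entry format: \newlabel{eq_trivial}{{2}{1}}  (label, number, page)
    m = line.match(/\A\\newlabel\{([^}]+)\}\{\{([^{}]*)\}/)
    labels[m[1]] = m[2] if m
  end
  labels
end

# Sample aux content, made up for this illustration.
sample_aux = <<~'AUX'
  \relax
  \newlabel{my_xe}{{1}{1}}
  \newlabel{eq_trivial}{{2}{1}}
AUX

numbers = parse_aux_labels(sample_aux)
puts numbers.inspect
```

Note that such a table contains every label in the document, including those of sections, tables, and figures; the aux entry alone does not say which kind of object a label belongs to.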
Essentially, LaTeX has a huge amount of freedom and so I am afraid it would be a somewhat futile effort to deal with every possibility...
=== Output MathML by pandoc-2.19 converted from LaTeX
Ordinary LaTeX inline maths expressions (e.g., +$5^2$+) are expressed as follows:
5π 5\pi
LaTeX's +begin{equation}+ is as follows (n.b., the ++ tag may not be closed immediately after ++; other ordinary sentences may follow):
x±ϵ x \pm \epsilon \label{my_xe}
LaTeX's +begin{eqnarray}+ is as follows:
1+x = 1−x = 21x \begin{aligned} 1+x & = & 1-x \nonumber\\ & = & \frac{2}{1x} \label{eq_trivial} \end{aligned}
They are referred to from other text as follows:
Eq.[eq_trivial] was easy...
=== Algorithm
For fixing the alignments to follow the standard eqnarray alignments (right, centre, and left in this order), the program searches for ++ and rewrites the columnalign attributes in the ++ tags.
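The alignment fix described above can be sketched as follows. This is only an illustration under stated assumptions, not the gem's implementation (which relies on Nokogiri): it assumes each eqnarray row is an +mtr+ element whose cells are +mtd+ elements carrying a +columnalign+ attribute, it uses plain regular expressions instead of an XML parser, and the name +fix_eqnarray_alignment+ is hypothetical.

```ruby
# Standard eqnarray column alignments, in order.
EQNARRAY_ALIGNS = %w[right center left].freeze

# Hypothetical helper: set columnalign on the first three cells of every row.
# Assumes <mtr>/<mtd> markup and no nested tables inside a row.
def fix_eqnarray_alignment(mathml)
  mathml.gsub(%r{<mtr\b[^>]*>.*?</mtr>}m) do |row|
    column = -1
    row.gsub(/<mtd\b([^>]*)>/) do |cell_tag|
      column += 1
      align = EQNARRAY_ALIGNS[column]
      next cell_tag if align.nil? # leave any extra cells untouched

      attrs = Regexp.last_match(1)
      if attrs.match?(/\bcolumnalign="[^"]*"/)
        # Overwrite the existing attribute value.
        "<mtd#{attrs.sub(/\bcolumnalign="[^"]*"/, %(columnalign="#{align}"))}>"
      else
        # No attribute yet: add one.
        %(<mtd columnalign="#{align}"#{attrs}>)
      end
    end
  end
end

row = '<mtr><mtd columnalign="right"><mn>1</mn></mtd>' \
      '<mtd columnalign="right"><mo>=</mo></mtd>' \
      '<mtd columnalign="right"><mi>x</mi></mtd></mtr>'
puts fix_eqnarray_alignment(row)
```

A real implementation would be better served by a proper XML parser, since a regular expression cannot reliably handle nested tables (e.g., a matrix inside an equation).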
For fixing the equation numbers and links, the program
Each inserted equation number is placed next to the corresponding equation, inside the ++ tags. In ++ (for LaTeX +\eqnarray{}+), it is inserted as a new ++ cell. In both cases, the text is right-aligned with some padding to the left. However, the position is relative to either the equation or the set of equations that contains the relevant equation (for LaTeX +\eqnarray{}+); it is unlike the original LaTeX, where equation numbers in parentheses are by default always located at the right edge of the page.
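To make the link-fixing part more concrete, here is a hedged sketch of replacing the raw label text of a reference (such as <tt>[eq_trivial]</tt>) with its equation number. It assumes that pandoc writes each reference as an +a+ element with a +data-reference+ attribute holding the label, and that a label-to-number Hash (here called +numbers+) has already been built from the aux file. The method name +number_references+ and the choice to print the bare number are assumptions for illustration, not the gem's actual behaviour.

```ruby
# Hypothetical helper: swap "[label]" link text for the equation number.
# Returns the rewritten HTML and the list of labels that had no number.
def number_references(html, numbers)
  missing = []
  pattern = %r{(<a\b[^>]*\bdata-reference="([^"]+)"[^>]*>)\[[^\]]*\](</a>)}
  fixed = html.gsub(pattern) do |whole|
    opening, label, closing = Regexp.last_match.captures
    if numbers.key?(label)
      "#{opening}#{numbers[label]}#{closing}"
    else
      missing << label # e.g., a label of a section, table, or figure
      whole            # keep the original link unchanged
    end
  end
  [fixed, missing]
end

html = '<p>Eq.<a href="#eq_trivial" data-reference-type="ref" ' \
       'data-reference="eq_trivial">[eq_trivial]</a> was easy...</p>'
fixed, missing = number_references(html, { 'eq_trivial' => '2' })
puts fixed
warn "WARNING: unmatched labels: #{missing.join(', ')}" unless missing.empty?
```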
== How to use the command
Once you have installed it according to the standard RubyGems procedure (see section Install), the main Ruby executable (command) pandoc_refeq_mathml should be in your command-search path.
It reads a MathML file from either the first command-line argument or STDIN, together with a LaTeX aux file specified on the command line, and then outputs the modified (corrected) MathML to STDOUT.
Any warnings are printed to either STDERR or a log file specified with a command-line option.
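The input and logging conventions described above follow a common Unix-filter pattern, which might be sketched as below. The helper names +read_mathml+ and +open_log+ are hypothetical and exist only for this illustration; the real command may organise its I/O differently.

```ruby
require 'stringio'

# Hypothetical helper: read the MathML from a file when a path is given,
# otherwise from the supplied stream (STDIN by default).
def read_mathml(path, stdin: $stdin)
  path.nil? ? stdin.read : File.read(path)
end

# Hypothetical helper: warnings go to STDERR unless a log filename is given.
def open_log(path, fallback: $stderr)
  path.nil? ? fallback : File.open(path, 'a')
end

# Usage with an in-memory stream standing in for STDIN.
html = read_mathml(nil, stdin: StringIO.new('<p>sample</p>'))
log  = open_log(nil)
log.puts "read #{html.bytesize} bytes"
$stdout.print html
```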
Failures to match the labels from an HTML tag with any of the MathML equations are printed as warnings (to STDERR by default). Although a failure may genuinely indicate a label that does not exist in the original LaTeX source, it is far more likely that the label belongs to a section (or a table or figure), because the algorithm cannot tell the type (section, table, figure, equation, or something else) of each label's origin.
=== Help doc
The help doc for the command-line interface is displayed with the +-h+ (or +--help+) option:
  % pandoc_refeq_mathml -h
  Usage: pandoc_refeq_mathml [options] [--] [MathML.html] > STDOUT
         pandoc_refeq_mathml [options] [--] < STDIN > STDOUT

  Description (Version=0.1): This fixes issues, label-references of equations and eqnarray alignments, of pandoc-converted MathML from LaTeX.

  Specific options:
      -a, --aux [FILENAME]    (mandatory) LaTeX aux filename
          --log [FILENAME]    Log filename (Default: STDERR). /dev/null to disable it.
          --[no-]fixalign     Fix eqnarray-alignment problems? (Def: true)
      -v, --[no-]verbose      Run verbosely (Def: true)

  Common options:
      -h, --help              Show this message
          --version           Show version
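For readers who want to see how such an option set maps onto code, the following is a minimal sketch using Ruby's standard +OptionParser+. It only mirrors the options listed in the help text above; the method name +parse_cli+, the Hash keys, and the error raised for a missing +--aux+ are assumptions for illustration and do not necessarily reflect how the command is implemented.

```ruby
require 'optparse'

# Hypothetical helper: parse the documented options and return
# [options_hash, remaining_arguments] without modifying argv.
def parse_cli(argv)
  opts = { aux: nil, log: nil, fixalign: true, verbose: true }
  parser = OptionParser.new do |o|
    o.banner = 'Usage: pandoc_refeq_mathml [options] [--] [MathML.html] > STDOUT'
    o.on('-a', '--aux [FILENAME]', '(mandatory) LaTeX aux filename') { |v| opts[:aux] = v }
    o.on('--log [FILENAME]', 'Log filename (Default: STDERR)') { |v| opts[:log] = v }
    o.on('--[no-]fixalign', 'Fix eqnarray-alignment problems? (Def: true)') { |v| opts[:fixalign] = v }
    o.on('-v', '--[no-]verbose', 'Run verbosely (Def: true)') { |v| opts[:verbose] = v }
  end
  rest = parser.parse(argv) # non-destructive; returns non-option arguments
  raise ArgumentError, '--aux is mandatory' if opts[:aux].nil?

  [opts, rest]
end

opts, files = parse_cli(%w[--aux=mydoc.aux --log=error.log mydoc.html])
puts opts.inspect
puts files.inspect
```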
=== Examples
% pandoc_refeq_mathml --aux=mydoc.aux --log=error.log mydoc.html > revised1.html
% head -n 90 mydoc.html | pandoc_refeq_mathml --aux=mydoc.aux --no-fixalign > revised2.html
Also, in the +test/data/+ directory, there is a sample LaTeX file. You can run +make+ in the directory to generate and correct an HTML/MathML file. Read the comments in the +Makefile+ to see the options, such as the LaTeX executable in your environment.
== Install
The standard RubyGems install procedure suffices:
% gem install pandoc_refeq_mathml
which should also install the dependent {Nokogiri gem}[https://rubygems.org/gems/nokogiri/].
Alternatively, it is possible to download the library file lib/pandoc_refeq_mathml.rb somewhere in your local directory, set the environment variable RUBYLIB to also point to the directory for the library, and execute
% ruby bin/pandoc_refeq_mathml
where ruby is optional. Note that the {Nokogiri gem}[https://rubygems.org/gems/nokogiri/] must be available in your Ruby library path.
In the developer's environment {diff-lcs gem}[https://rubygems.org/gems/diff-lcs] is also required.
This tool requires {Ruby}[http://www.ruby-lang.org] Version 2.0 or above.
== Developer's note
The source code is also maintained on {Github}[https://github.com/masasakano/pandoc_refeq_mathml], though with no intuitive interface for annotations.
=== Tests
The Ruby files under the test/ directory are the test scripts. You can run them from the top directory as ruby test/test_****.rb or simply run make test or rake test.
== Known bugs and ToDo items
== Copyright
Author:: Masa Sakano < info a_t wisebabel dot com >
Versions:: The versions of this package follow Semantic Versioning (2.0.0) http://semver.org/
License:: MIT
Warranty:: No warranty.