Entangled
Entangled is a solution for Literate Programming, a technique in which the programmer writes a human narrative first, then implements the program in code blocks.
Literate programming was introduced by Donald Knuth in 1984 and has since seen several surges in popularity. One thing holding back the popularity of literate programming is the lack of maintainability under increasing program complexity. Entangled solves this issue by offering a two-way synchronisation mechanism. You can edit and debug your code as normal in your favourite IDE or text editor. Entangled will make sure that your Markdown files stay up to date with your code and vice versa. Because Entangled works with Markdown, you can use it with most static document generators. To summarise, you keep using:
- your favourite editor: Entangled runs as a daemon in the background, keeping your text files synchronised.
- your favourite programming language: Entangled is agnostic to programming languages.
- your favourite document generator: Entangled is configurable to any dialect of Markdown.
We’re trying to increase the visibility of Entangled. If you like Entangled, please consider adding this badge to the appropriate location in your project:
[![Entangled badge](https://img.shields.io/badge/entangled-Use%20the%20source!-%2300aeff)](https://entangled.github.io/)
Get started
To install Entangled, all you need is a Python (version ≥3.11) installation. If you use `poetry` and you start a new project,
poetry init
poetry add entangled-cli
The `poetry init` command will create a `pyproject.toml` file and a virtual environment to install Python dependencies in. To activate the virtual environment, run `poetry shell` inside the project directory.
Or, if you prefer plain old `pip`,
pip install entangled-cli
Use
Run the `entangled watch` daemon in the root of your project folder. By default all Markdown files are monitored for fenced code blocks like so:
``` {.rust #hello file="src/world.rs"}
...
```
The syntax of code block properties is the same as CSS properties: `#hello` gives the block the `hello` identifier, `.rust` adds the `rust` class and the `file` attribute is set to `src/world.rs` (quotes are optional). For Entangled to know how to tangle this block, you need to specify a language and a target file. However, now comes the cool stuff. We can split our code into meaningful components using cross-references.
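To make the property syntax concrete, here is a minimal sketch of how such a header could be split into an identifier, classes and attributes. The function `parse_properties` is hypothetical and written only for illustration; it is not Entangled's actual parser and handles only the simple cases shown above.

```python
import re

# One alternative per property kind: #id, .class, or key=value (value optionally quoted).
PROPERTY = re.compile(
    r'#(?P<id>[\w-]+)'
    r'|\.(?P<cls>[\w-]+)'
    r'|(?P<key>[\w-]+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S+))'
)

def parse_properties(header: str) -> dict:
    """Split a header such as '{.rust #hello file="src/world.rs"}' into its parts."""
    inner = header.strip().strip("{}")
    result = {"id": None, "classes": [], "attributes": {}}
    for m in PROPERTY.finditer(inner):
        if m.group("id"):
            result["id"] = m.group("id")
        elif m.group("cls"):
            result["classes"].append(m.group("cls"))
        else:
            quoted = m.group("quoted")
            value = quoted if quoted is not None else m.group("bare")
            result["attributes"][m.group("key")] = value
    return result

props = parse_properties('{.rust #hello file="src/world.rs"}')
print(props)
```

The same function accepts the unquoted form, so `file=src/world.rs` and `file="src/world.rs"` give the same attribute value.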
Hello World in C++
The combined code blocks in this example make up compilable source code for "Hello World". For didactic reasons we don't always give the listing of an entire source file in one go. Instead, we use a system of references known as noweb (after Ramsey 1994).
Inside source fragments you may encounter a line with `<<...>>` marks like,
``` {.cpp file=hello_world.cc}
#include <cstdlib>
#include <iostream>
<<example-main-function>>
```
which is then elsewhere specified. Order doesn't matter,
``` {.cpp #hello-world}
std::cout << "Hello, World!" << std::endl;
```
So we can reference the `<<hello-world>>` code block later on.
``` {.cpp #example-main-function}
int main(int argc, char **argv)
{
    <<hello-world>>
}
```
A definition can be appended with more code as follows (in this case, order does matter!):
``` {.cpp #hello-world}
return EXIT_SUCCESS;
```
These blocks of code can be tangled into source files.
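As a rough illustration of what tangling means, the sketch below expands noweb references recursively. It is a simplified model under stated assumptions, not Entangled's implementation: the `expand` function and the `blocks` dictionary are hypothetical, there is no cycle detection, and nothing that Entangled needs for synchronising changes back to Markdown is emitted.

```python
import re

# A reference line: optional indentation, then <<name>> and nothing else.
REFERENCE = re.compile(r"^(?P<indent>\s*)<<(?P<name>[\w-]+)>>\s*$")

def expand(name: str, blocks: dict[str, list[str]]) -> list[str]:
    """Recursively replace <<name>> lines by the code they refer to."""
    lines = []
    for chunk in blocks[name]:  # chunks sharing an identifier are concatenated in order
        for line in chunk.splitlines():
            m = REFERENCE.match(line)
            if m:
                # Keep the indentation of the reference on every expanded line.
                lines += [m["indent"] + sub for sub in expand(m["name"], blocks)]
            else:
                lines.append(line)
    return lines

# The code blocks of the example, keyed by file name or identifier.
blocks = {
    "hello_world.cc": ["#include <cstdlib>\n#include <iostream>\n<<example-main-function>>"],
    "example-main-function": ["int main(int argc, char **argv)\n{\n    <<hello-world>>\n}"],
    "hello-world": ['std::cout << "Hello, World!" << std::endl;', "return EXIT_SUCCESS;"],
}

source = "\n".join(expand("hello_world.cc", blocks))
print(source)
```

The two `hello-world` chunks end up in the body of `main` in the order in which they were defined, which is why order matters when appending to a definition.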
Configuring
Entangled is configured by putting an `entangled.toml` in the root of your project.
version = "2.0"
watch_list = ["docs/**/*.md"]
ignore_list = ["docs/**/examples.md"]
You may add languages as follows:
[[languages]]
name = "Java"
identifiers = ["java"]
comment = { open = "//" }
[[languages]]
name = "XML"
identifiers = ["xml", "html", "svg"]
comment = { open = "<!--", close = "-->" }
The `identifiers` are the tags that you may use in your code block header to identify the language. Using the above config, you should be able to write:
``` {.html file=index.html}
<!DOCTYPE html>
<html lang="en">
<<header>>
<<body>>
</html>
```
And so on...
Reading from pyproject.toml
If you have a `pyproject.toml` file, either because you use `poetry` to set up Entangled or because you're actually developing a Python project, you may want to put the configuration in `pyproject.toml` instead. Add a `tool.entangled` table like so:
[tool.entangled]
version = "2.0"
watch_list = ["docs/**/*.md"]
To add languages in your `pyproject.toml`, add `tool.entangled.languages` sections.
Be aware that this is an array of tables rather than a single table, so you will need to use double brackets, like so:
[[tool.entangled.languages]]
name = "Java"
identifiers = ["java"]
comment = { open = "//" }
Working with Git
When using Entangled in conjunction with Git, there are a few tricks that you may want to know about.
Restoring files when both Markdown and code have changed
If you have edited both Markdown and code without the daemon running, you may need a few extra steps to get back into a consistent state.
git add .
git commit -m 'fixed everything' # save everything you did
entangled tangle --force # overwrites some changes you made
git restore src/brilliant_code.c # retrieve from latest commit
entangled stitch # apply changes back to markdown
git add .
git commit --amend # amend your commit to perfection
There may be better/faster ways to do this.
Entangled conflicts after merging branches
Entangled can get confused when you merge and there is a conflict on `.entangled/filedb.json`. This file keeps track of which files are sources for Entangled and which ones are generated by Entangled. That way, Entangled will never overwrite files it isn't supposed to and, conversely, when you rename a target the old one gets removed. It is very hard to merge this file though. When you need to, you can regenerate this file using:
entangled tangle -r
This will perform the tangle as if it is the first time, but it won't actually write files.
Hooks
Entangled has a system of hooks: these add actions to the tangling process:
- `build`: trigger actions in a generated Makefile.
- `brei`: trigger actions (or tasks) using Brei, which is automatically installed along with Entangled. This is now preferred over the `build` hook.
- `quarto_attributes`: add attributes to the code block in Quarto style with `#|` comments at the top of the code block.
- `shebang`: takes the first line if it starts with `#!` and puts it at the top of the file.
`build` hook
You can enable this hook in `entangled.toml`:
version = "2.0"
watch_list = ["docs/**/*.md"]
hooks = ["build"]
Then in your Markdown, you may enter code tagged with the `.build` tag.
``` {.python .build target=docs/fig/plot.svg}
from matplotlib import pyplot as plt
import numpy as np
x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)
plt.plot(x, y)
plt.savefig("docs/fig/plot.svg")
```
This code will be saved into a Python script in the `.entangled/build` directory, or in some other location if you specify the `file=` attribute. In addition, a `Makefile` is generated in `.entangled/build`, which can be invoked as,
make -f .entangled/build/Makefile
You may configure how code from different languages is evaluated in `entangled.toml`. For example, to add Gnuplot support, and also make Julia code run through `DaemonMode.jl`, you may do the following:
[hook.build.runners]
Gnuplot = "gnuplot {script}"
Julia = "julia --project=. --startup-file=no -e 'using DaemonMode; runargs()' {script}"
Once you have the code in place to generate figures and markdown tables, you can use the syntax at your disposal to include those into your Markdown. In this example that would be
![My awesome plot](fig/plot.svg)
In the case of tables or other rich content, Standard Markdown (or CommonMark) has no syntax for including other Markdown files, so you'll have to check with your own document generator how to do that. In MkDocs, you could use `mkdocs-macro-plugin`, Pandoc has `pandoc-include`, etc.
You can also specify intermediate data generation like so:
``` {.python .build target="data/result.csv"}
import numpy as np
import pandas as pd
result = np.random.normal(0.0, 1.0, (100, 2))
df = pd.DataFrame(result, columns=["x", "y"])
df.to_csv("data/result.csv")
```
``` {.python .build target="fig/plot.svg" deps="data/result.csv"}
import pandas as pd
df = pd.read_csv("data/result.csv")
ax = df.plot()  # DataFrame.plot returns a matplotlib Axes, which has no savefig
ax.get_figure().savefig("fig/plot.svg")
```
The snippet for generating the data is given as a dependency for that data; to generate the figure, both `result.csv` and the code snippet are dependencies.
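The dependencies described above fix the order in which targets have to be built. With the `build` hook that resolution is left to Make; the sketch below only illustrates the idea using the standard-library `graphlib` module. The `snippet:` names are hypothetical labels standing in for the two code blocks.

```python
from graphlib import TopologicalSorter

# Map each target to the things it depends on. The "snippet:" entries are
# made-up labels for the code blocks that generate each target.
dependencies = {
    "data/result.csv": {"snippet:generate-data"},
    "fig/plot.svg": {"data/result.csv", "snippet:plot"},
}

# static_order() yields every node only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
targets = [node for node in order if not node.startswith("snippet:")]
print(targets)  # ['data/result.csv', 'fig/plot.svg']
```

Because the figure depends on the data file, the data is always generated first, regardless of the order in which the code blocks appear in the Markdown.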
`quarto_attributes` hook
Sometimes using the `build` hook (or the `brei` hook, see below) leads to long header lines. It is then better to specify attributes in a header section of your code. The Quarto project came up with a syntax in which the header is indicated by a comment with a vertical bar, e.g. `#|` or `//|` etc. The `quarto_attributes` hook reads those attributes and adds them to the properties of the code block.
Example with the `brei` hook:
``` {.python .task}
#| description: Draw a triangle
#| creates: docs/fig/triangle.svg
#| collect: figures
from matplotlib import pyplot as plt
plt.plot([-1, 1, 0, -1], [-0.5, -0.5, 1, -0.5])  # x and y coordinates of the closed triangle
plt.savefig("docs/fig/triangle.svg")
```
![](fig/triangle.svg)
Using these attributes it is possible to write in Entangled using completely standard Markdown syntax. The following configuration disables the curly braces altogether, though currently the Quarto tags encoding the metadata will end up in the tangled code.
version="2.0"
watch_list=["*.typ"]
hooks=["quarto_attributes"]
[markers]
open="^(?P<indent>\\s*)```(?P<properties>.*)$"
close="^(?P<indent>\\s*)```\\s*$"
Then you can write code like so:
```python
#| id: hello
print("Hello, World!")
```
```python
#| file: test.py
if __name__ == "__main__":
    <<hello>>
```
The `id` attribute is reserved for the code's identifier (normally indicated with `#`) and the `classes` attribute can be used to indicate a list of classes in addition to the language class already given.
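The idea behind the header section can be sketched in a few lines of Python. The helper `split_quarto_header` is hypothetical and far simpler than what the hook has to do: it assumes flat `key: value` lines only and a single-character comment prefix, which is an assumption made for this illustration.

```python
def split_quarto_header(code: str, comment: str = "#") -> tuple[dict[str, str], str]:
    """Separate leading '#| key: value' lines from the rest of a code block."""
    prefix = comment + "|"
    attributes = {}
    lines = code.splitlines()
    body_start = 0
    for i, line in enumerate(lines):
        if not line.startswith(prefix):
            break  # the header ends at the first line without the marker
        key, _, value = line[len(prefix):].partition(":")
        attributes[key.strip()] = value.strip()
        body_start = i + 1
    return attributes, "\n".join(lines[body_start:])

block = """#| description: Draw a triangle
#| creates: docs/fig/triangle.svg
#| collect: figures
from matplotlib import pyplot as plt"""

attrs, body = split_quarto_header(block)
print(attrs)
print(body)
```

Passing `comment="//"` would handle the `//|` form used by languages with C-style comments.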
Brei
Entangled has a small build engine (similar to GNU Make) embedded, called Brei. You may give it a list of tasks (specified in TOML) that may depend on one another. Brei will run these when dependencies are newer than the target. Execution is lazy and in parallel. Brei supports:
- Running tasks by passing a script to any configured interpreter, e.g. Bash, Python, Lua etc.
- Redirecting `stdout` or `stdin` to or from files.
- Defining so called "phony" targets.
- Define `template` for programmable reuse.
- `include` other Brei files, even ones that need to be generated by another task.
- Variable substitution, including writing `stdout` to variables.
Brei is available as a separate package, see the Brei documentation.
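The rule that a task runs when its dependencies are newer than its target can be sketched with file modification times, in the way Make does. The function `is_stale` below is a hypothetical illustration of that rule, not Brei's code, and the file names are borrowed from the examples that follow.

```python
import os
import tempfile

def is_stale(target: str, dependencies: list[str]) -> bool:
    """Return True when target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_time for dep in dependencies)

with tempfile.TemporaryDirectory() as tmp:
    dep = os.path.join(tmp, "secret.txt")
    target = os.path.join(tmp, "msg.txt")

    open(dep, "w").close()
    missing = is_stale(target, [dep])   # target does not exist yet

    open(target, "w").close()
    os.utime(dep, (1_000, 1_000))       # pretend the dependency is old
    os.utime(target, (2_000, 2_000))
    fresh = is_stale(target, [dep])     # target is newer: nothing to do

    os.utime(dep, (3_000, 3_000))       # now the dependency has changed
    outdated = is_stale(target, [dep])

print(missing, fresh, outdated)  # True False True
```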
Examples
To write out "Hello, World!" to a file `msg.txt`, we may do the following. First, a task writes a ROT13-encoded message to `secret.txt`,
[[task]]
stdout = "secret.txt"
language = "Python"
script = """
print("Uryyb, Jbeyq!")
"""
To have this message decoded, define a pattern,
[pattern.rot13]
stdout = "{stdout}"
stdin = "{stdin}"
language = "Bash"
script = """
tr a-zA-Z n-za-mN-ZA-M
"""
[[call]]
pattern = "rot13"
[call.args]
stdin = "secret.txt"
stdout = "msg.txt"
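For readers unfamiliar with the `tr` invocation in the pattern, this Python sketch applies the same ROT13 mapping to the message from the first task. It is an illustration only and plays no part in the Brei script.

```python
import codecs
import string

# Same mapping as `tr a-zA-Z n-za-mN-ZA-M`: rotate every letter by 13 places.
lower, upper = string.ascii_lowercase, string.ascii_uppercase
ROT13 = str.maketrans(lower + upper, lower[13:] + lower[:13] + upper[13:] + upper[:13])

secret = "Uryyb, Jbeyq!"
message = secret.translate(ROT13)
print(message)  # Hello, World!

# The standard library ships the same transformation as a text codec.
assert message == codecs.decode(secret, "rot13")
```

Applying the mapping twice returns the original text, so the same pattern could also be used to encode.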
To define a phony target "all",
[[task]]
name = "all"
requires = ["msg.txt"]
The `brei` hook
The following example uses both the `brei` and `quarto_attributes` hooks. To add a Brei task, tag a code block with the `.task` class.
First we generate some data.
``` {.python #some-functions}
# define some functions
```
Now we show what that data would look like:
``` {.python .task}
#| description: Generate data
#| creates: data/data.npy
<<some-functions>>
# generate and save data
```
Then we plot in another task.
``` {.python .task}
#| description: Plot data
#| creates: docs/fig/plot.svg
#| requires: data/data.npy
#| collect: figures
# load data and plot
```
The `collect` attribute tells the Brei hook to add the `docs/fig/plot.svg` target to the `figures` collection. All figures can then be rendered as follows, having in `entangled.toml`
version = "2.0"
watch_list = ["docs/**/*.md"]
hooks = ["quarto_attributes", "brei"]
[brei]
include = [".entangled/tasks.json"]
And run
entangled brei figures
You can use `${variable}` syntax inside Brei tasks just as you would in a stand-alone Brei script.
Support for Document Generators
Entangled has been used successfully with the following document generators. Note that some of these examples were built using older versions of Entangled, but they should work just the same.
Pandoc
Pandoc is a very versatile tool for converting documents in any format. It specifically has very wide support for different forms of Markdown syntax out in the wild, including a filter system that lets you extend the workings of Pandoc. Those filters can be written in any language through an API, for instance Python filters can be written using `panflute`, but there is also native support for Lua.
To work with Entangled style literate documents, there is a set of Pandoc filters available. The major downside of Pandoc is that it offers no help in making your output HTML look beautiful. One option is to use the Bootstrap template, but you may want to try out others as well, or design your own. These days a lot can be done with a single well-designed CSS file.
- :heavy_plus_sign: dynamic
- :heavy_plus_sign: supports most Markdown syntax out of the box
- :heavy_plus_sign: excellent for science: citation, numbered figures, tables and equations
- :heavy_plus_sign: support for LaTeX
- :heavy_minus_sign: harder to set up
- :heavy_minus_sign: takes work to make look good
Example: Hello World in C++
MkDocs
MkDocs is specifically tailored towards converting Markdown into good looking, easy to navigate HTML, especially when used in combination with the `mkdocs-material` theme. To use Entangled style code blocks with MkDocs, you'll need to install the `mkdocs-entangled-plugin` as well.
- :heavy_plus_sign: specifically designed for Markdown to HTML, i.e. software documentation
- :heavy_plus_sign: pretty output, out of the box
- :heavy_plus_sign: easy to install
- :heavy_minus_sign: not intended for scientific use: numbering and referencing equations, figures and tables is hard to set up
- :heavy_minus_sign: documentation is on par with most Python projects: OK for most things, but really hard if you want specifics
Example: TBD
Typst
Typst has a syntax that is similar to Markdown when it comes to code blocks. Set the code block markers in `entangled.toml` like so:
version="2.0"
watch_list=["*.typ"]
hooks=["quarto_attributes"]
[markers]
open="^(?P<indent>\\s*)```(?P<properties>.*)$"
close="^(?P<indent>\\s*)```\\s*$"
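To check what these marker expressions capture, you can try them out with Python's `re` module. The sketch below is an illustration under assumptions: the sample lines are made up, and the fence is assembled from a `FENCE` constant only to avoid writing three backticks inside this example.

```python
import re

FENCE = "`" * 3  # three backticks

# The same expressions as in the configuration; TOML's "\\s" is the regex "\s".
OPEN = re.compile(r"^(?P<indent>\s*)" + FENCE + r"(?P<properties>.*)$")
CLOSE = re.compile(r"^(?P<indent>\s*)" + FENCE + r"\s*$")

opening = OPEN.match("  " + FENCE + "python")
closing = CLOSE.match("  " + FENCE)

print(opening.groupdict())  # {'indent': '  ', 'properties': 'python'}
print(closing.groupdict())  # {'indent': '  '}
```

Note that `OPEN` also matches a bare closing fence (with empty `properties`), so anything using these two expressions has to keep track of whether it is currently inside a code block.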
Documenter.jl
Documenter.jl is the standard tool for writing Julia documentation. It has internal support for evaluating code block contents.
Example: Intro to code generation in Julia
PDoc
PDoc is a tool for documenting smaller Python projects. It grabs all documentation from the doc-strings in your Python library and generates a page from that. To have it include its own literate source, I had to use some very ugly hacks.
Example: check-deps, a Universal dependency checker in Python
Docsify
Docsify serves the Markdown files and does the conversion to HTML in a JavaScript library (in browser).
Example: Guide to C++ on the web through WASM
History
This is a rewrite of Entangled in Python. Older versions were written in Haskell. The rewrite in Python was motivated by ease of installation, larger community and quite frankly, a fit of mental derangement.
Contributing
If you have an idea for improving Entangled, please file an issue before creating a pull request. Code in this repository is formatted using `black` and type checked using `mypy`.
License
Copyright 2023 Netherlands eScience Center, written by Johan Hidding, licensed under the Apache 2 license, see LICENSE.