Alteza is a static site generator driven by PyPage. Examples of other static site generators can be found here.
Alteza can be thought of as a simpler and more flexible alternative to static site generators like Jekyll, Hugo, Zola, Nextra, etc.
The differentiator with Alteza is that the site author (if familiar with Python) has far more fine-grained control over the output than (as far as I'm aware) any of the existing options offer.
The learning curve is also shorter with Alteza. I've tried to follow xmonad's philosophy of keeping things small and simple. Alteza doesn't try to do a lot of things; instead it simply offers the core crucial functionality that is common to most static site generators.
Alteza also imposes very little required structure or a particular "way of doing things" on your website (other than requiring unique names). You retain the freedom to organize your website as you wish. The name Alteza comes from a word that may be translated to illustriousness in Español.
A key design aspect of Alteza is writing little scripts and executing such code to generate your website. Your static site can contain arbitrary Python that is executed at the time of site generation. PyPage, in particular, makes it seamless to include actual Python code inside page templates. (This of course means that you must run Alteza with trusted code, or in an isolated container. For example, in a GitHub action–see instructions below.)
The directory structure is generally mirrored in the generated site.
By default, nothing is copied/published to the generated site.
A file must explicitly indicate, using a `public: true` variable/field, that it is to be published.
Reachability is discovered when the `link` function is used to link to other files.
… are ignored.
All file and directory names, except for index page files, at all depth levels must be unique. This is to simplify use of the `link(name)` function. With unique file and directory names, one can simply link to a file or directory with just its name, without needing to disambiguate a non-unique name with its path. Note: a directory can only be linked to if it contains an index page.
There are two kinds of files: static asset files, and PyPage (i.e. dynamic template/layout or content) files. PyPage files get processed by the PyPage template engine.
Static asset files are not read by Alteza. They are selectively either symlinked or copied to the output directory (you can choose which, with a command-line argument). Here, selectively means that they are exposed in the output directory only if they are linked to from a PyPage file using a special Alteza-provided `link(name)` function.
PyPage files are determined based on their file name extension. They are of two kinds:
Markdown files (i.e. files ending in `.md`).
Any file with a `.py` before its actual extension (i.e. any file with a `.py` before the last `.` in its file name). These are non-Markdown PyPage files.
There is an inherited "environment"/`env` (this is just a collection of Python variables) that is injected into the lexical scope of every PyPage file before it is processed/executed by PyPage. This `env` is a little different for each PyPage invocation: a copy of the inherited `env` is created for each PyPage file. More on `env` in a later point below.
Non-Markdown PyPage files are simply processed with PyPage as-is (there is no template application step for them). The `.py` part is removed from their name, and the output/result is copied to the generated site.
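The extension rules above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not Alteza's actual code; the function names `classify` and `output_name` are made up for this example.

```python
from pathlib import PurePath


def classify(file_name: str) -> str:
    """Return 'markdown', 'pypage', or 'static' for a file name (hypothetical helper)."""
    suffixes = PurePath(file_name).suffixes  # e.g. ['.py', '.css']
    if suffixes and suffixes[-1] == ".md":
        return "markdown"
    # A non-Markdown PyPage file has `.py` right before its last extension.
    if len(suffixes) >= 2 and suffixes[-2] == ".py":
        return "pypage"
    return "static"


def output_name(file_name: str) -> str:
    """Drop the `.py` part from a non-Markdown PyPage file name."""
    if classify(file_name) != "pypage":
        return file_name
    stem, last_ext = file_name.rsplit(".", 1)  # 'pygments-styles.py', 'css'
    return stem[: -len(".py")] + "." + last_ext


print(classify("magic-turtle.md"), classify("pygments-styles.py.css"), classify("logo.png"))
print(output_name("pygments-styles.py.css"))
```

Note that a plain `script.py` counts as a static asset under this reading, because it has no `.py` before its last `.`.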
Markdown files:
Markdown files are first processed with PyPage, with a copy of the inherited `env`.
After this, the Markdown file is converted to HTML using the Python-Markdown library.
Third, their "front matter" (if any) is extracted using Python-Markdown's Meta-Data extension.
A `---` in the Markdown file ends the front matter section.
The fields from the YAML front matter are injected into the `env`/environment.
The HTML is injected into a `content` variable in `env`, and this `env` is passed to a `layout` template specified in configuration, for a second round of processing by PyPage. (Note: PyPage here is invoked on the template.)
Templates are HTML files processed by PyPage. The PyPage-processed Markdown HTML output is passed to the template/layout as the `content` variable. The template itself is then executed by PyPage.
The template/layout should use this `content` value via PyPage (with `{{ content }}`) in order to inject the `content` into itself.
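To make the second round of processing concrete, here is a minimal sketch of how a page's HTML could flow into a layout through `content`. The `render` function is a tiny stand-in written for this example (it only evaluates `{{ expr }}` tags); it is not PyPage, and the `site_name` variable and template text are invented.

```python
import re


def render(template: str, env: dict) -> str:
    """Tiny stand-in for PyPage: evaluate each {{ expr }} against a copy of env."""

    def substitute(match):
        return str(eval(match.group(1).strip(), dict(env)))

    return re.sub(r"\{\{(.*?)\}\}", substitute, template)


inherited_env = {"site_name": "Example Site"}  # hypothetical inherited env
page_html = "<p>Hello from the page.</p>"      # pretend Markdown -> HTML output
layout = "<html><body><h1>{{ site_name }}</h1>{{ content }}</body></html>"

env = dict(inherited_env)    # each page works on its own copy of env
env["content"] = page_html   # the rendered page is exposed as `content`
final_html = render(layout, env)
print(final_html)
```

The point of the sketch is the data flow: the layout never reads the Markdown file directly, it only sees whatever is in `env`, including `content`.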
The template is specified using a `layout` or `layoutRaw` variable declared in a `__config__.py` file. (More on configuration files in a later point below.)
A `layout` variable's value must be the name of a template.
For example, you can write `layout: ordinary-page` in the YAML front matter of a Markdown file.
Or, alternatively, you can also write `layout = "ordinary_page"` in a `__config__.py` file. If a `layout` variable is defined like this in a `__config__.py`, all adjacent and descendant files will inherit this `layout` value.
This can be used as a way of defining a default layout/template.
Of course, the default can be overridden in a Markdown file by specifying a layout name in the YAML front matter, or with a new default in a descendant `__config__.py`.
Lastly, a `layoutRaw` can be defined instead, whose value must be the entire contents of a template PyPage-HTML file. A convenience function `readfile` is provided for this. For example, you can write something like `layoutRaw = readfile('some_layout.html')` in a config file. A `layoutRaw`, if specified, takes precedence over `layout`. Using this `layoutRaw` approach is not recommended.
Layouts/templates may be overridden in descendant `__config__.py` files. They may also be overridden in the Markdown file itself using YAML front matter (by specifying a `layout: ...`), or even in a PyPage multiline code tag (not an inline code tag) inside a PyPage file (with a `layout = ...`).
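The override rules above amount to "the innermost definition wins, and `layoutRaw` beats `layout`". A rough sketch, assuming each config file and the page's front matter can be viewed as a plain dict; `effective_env`, `resolve_layout`, and the `templates` mapping are hypothetical names for this example, not Alteza's API.

```python
from typing import Dict, Optional


def effective_env(*layers: Dict[str, object]) -> Dict[str, object]:
    """Merge layers from the outermost config down to the page; later layers win."""
    merged: Dict[str, object] = {}
    for layer in layers:
        merged.update(layer)
    return merged


def resolve_layout(env: Dict[str, object], templates: Dict[str, str]) -> Optional[str]:
    """Pick the template source: layoutRaw first, then a named layout, else None."""
    if env.get("layoutRaw") is not None:
        return str(env["layoutRaw"])
    name = env.get("layout")
    if name is None:
        return None
    return templates[str(name)]


templates = {
    "ordinary-page": "<main>{{ content }}</main>",
    "blog-post": "<article>{{ content }}</article>",
}
root_config = {"layout": "ordinary-page"}  # default set in the top-level config
blog_config = {"layout": "blog-post"}      # new default in a descendant config
front_matter = {"layout": "ordinary-page"}  # a page overriding it again

print(resolve_layout(effective_env(root_config, blog_config), templates))
print(resolve_layout(effective_env(root_config, blog_config, front_matter), templates))
```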
Markdown files result in a directory with the base name (i.e. without the `.md` extension), with an `index.html` file containing the Markdown's output.
The Environment (`env`) and Configuration (`__config__.py`, etc.):
Note: Python code in both `.md` and other `.py.*` files is run using Python's built-in `exec` (and `eval`) functions, and when it is run, we pass in a dictionary for their `globals` argument. We call that dict the environment, or `env`.
Configuration is done through file(s) called `__config__.py`.
First, we recursively go through all directories top-down.
At each directory (descending downward), we execute a `__config__.py` file, if one is present. After execution, we absorb any variables in it that do not start with a `_` into the `env` dict.
The deepest `.md`/`.py.*` files get executed first. After each one executes, we check whether its `env` contains a field `public` that is set to `True`. If it does, we mark that file for publication. Other than recording the value of `public` after each dynamic file is executed, any modifications to `env` made by a dynamic file are discarded (and not absorbed, unlike with `__config__.py`).
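The difference between config files (whose variables are absorbed) and dynamic files (whose changes are discarded apart from `public`) can be sketched with plain `exec` calls. This is a simplified illustration under the description above, not Alteza's implementation; `absorb_config` and `run_page` are hypothetical helpers, and the sources are passed as strings rather than read from disk.

```python
def absorb_config(env: dict, config_source: str) -> dict:
    """Run a __config__.py-like source and return a child env with its names absorbed."""
    scope = dict(env)            # exec gets its own globals dict
    exec(config_source, scope)
    child_env = dict(env)
    for name, value in scope.items():
        if not name.startswith("_"):  # skips _private names and __builtins__
            child_env[name] = value
    return child_env


def run_page(env: dict, page_source: str) -> bool:
    """Run a dynamic file against a copy of env; keep only whether public is True."""
    scope = dict(env)
    exec(page_source, scope)
    return scope.get("public") is True  # every other change to scope is dropped


root_env = absorb_config({}, "layout = 'ordinary-page'\n_scratch = 42")
blog_env = absorb_config(root_env, "author = 'someone'")
is_public = run_page(blog_env, "public = True\nlayout = 'something-else'")

print(root_env, blog_env, is_public)
```

Because each call works on a copy, a page that reassigns `layout` for itself does not affect its siblings. Note that these are shallow copies, so mutable values would still be shared in this sketch.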
Avoid using `__config__.py` to set `public` as `True`, as that would make the entire directory and all its descendants public (unless that behavior is exactly what is desired). Reachability with `link` (described below) is, in my opinion, a better way to make only reachable content publicly exposed.
The Name Registry and the `link` function.
The name of every file in the input content is stored in a "name registry" of sorts that's used by `link`.
Currently, names, without their file extension, have to be unique across input content. This might change in the future.
The Name Registry will error out if it encounters any non-unique names. (I understand this is a significant limitation, so I might support making this opt-out behavior with a `--nonunique` flag in the future.)
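A name registry of this kind can be pictured as a dictionary keyed by extension-less names that refuses duplicates. The sketch below is an assumption-laden illustration: `registry_name` and `build_registry` are invented names, only files are registered (directories without index pages are ignored), and index pages are registered under their directory's name.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable


def registry_name(path: str) -> str:
    """Name used for lookups: no extension, no `.py` part, index -> directory name."""
    p = PurePosixPath(path)
    stem = p.stem                      # drop the last extension
    if stem.endswith(".py"):
        stem = stem[: -len(".py")]     # drop the `.py` of a PyPage file
    if stem == "index":
        return p.parent.name           # index pages stand for their directory
    return stem


def build_registry(paths: Iterable[str]) -> Dict[str, str]:
    """Map each unique name to its path, erroring out on any duplicate name."""
    registry: Dict[str, str] = {}
    for path in paths:
        name = registry_name(path)
        if name in registry:
            raise ValueError(f"non-unique name {name!r}: {registry[name]} and {path}")
        registry[name] = path
    return registry


registry = build_registry([
    "posts/magic-turtle.md",
    "css/pygments-styles.py.css",
    "about-me/hobbies/index.md",
])
print(registry)
```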
Any non-dynamic content file that has been `link`-ed to is marked for publication (i.e. copying or symlinking).
A Python function named `link` is injected into the top-level `env`.
This function can be used to get relative links to any other file. `link` will automatically determine and return the relative path to a file.
For example, you can write `<a href="{{link('some-other-blog-post')}}">`, and the generated site will have a relative link to it (i.e. to its directory if a Markdown file, and to the file itself otherwise).
Reachability of files is determined using this function, and unreachable files will be treated as non-public (and thus not exist in the generated site).
This function can be called either with a string identifying a file name, or with a reference to the file object itself. `link` will check the type of the argument passed to it, and handle each type appropriately.
This `link` function can also be called with string arguments using wiki-style links in Markdown files. For example, a `[[Happy Cat]]` in a Markdown file is the equivalent of writing `[Happy Cat]({{link('Happy Cat')}})`, or of writing `<a href="{{link('Happy Cat')}}">Happy Cat</a>`.
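The equivalence between a wiki-style link and an explicit `link` call can be shown as a simple text rewrite. This is only a sketch of the equivalence stated above; how Alteza actually handles wiki-style links internally is not described here, and `expand_wiki_links` is a made-up name. Names containing quote characters are not handled.

```python
import re

# Matches [[Some Name]] where the name contains no square brackets.
WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def expand_wiki_links(markdown_text: str) -> str:
    """Rewrite [[Name]] into [Name]({{link('Name')}})."""

    def expand(match):
        name = match.group(1)
        return "[" + name + "]({{link('" + name + "')}})"

    return WIKI_LINK.sub(expand, markdown_text)


print(expand_wiki_links("See [[Happy Cat]] for details."))
```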
A file name's extension must be omitted while using `link` (including the `.py*` for any file with `.py` before its extension).
For example, use `link('magic-turtle')` for the file `magic-turtle.md`, and `link('pygments-styles')` for the file `pygments-styles.py.css`.
An index page such as `about-me/hobbies/index.md` (or `about-me/hobbies/index.py.html`) should just be linked to with `link('hobbies')`.
Certain fields, with certain names, hold special meaning, and are called/used by Alteza. One such variable is `layout` (and `layoutRaw`), which points to the layout/template to be used to render the page (as explained in earlier points above). It can be overridden by descendant directories or pages.
| Built-in | Description |
|---|---|
| `link` | The … The … Note: for Markdown pages, an extra … Availability: … |
| `path` | The … This function is good for use inside templates, to reference parent/ancestor templates for injection. For example, writing something like … Available everywhere. |
| `dir` | The … This object has fields like … In templates, the … Available everywhere. |
| Title | The title is accessed with `page.title`. It is picked up either from PyPage code in the page or a `title` YAML field in the file. If `title` is not defined by the page, then `page.realName` of the file is used, which is the adjusted name of the file with its extension and idea date prefix (if present) removed. The title isn't properly available to Python inside the page itself, or from `__config__.py`, since the page has not been processed when these are executed. If `page.title` is accessed from these (the page or config), or if a title was never defined in the page, then the `.realName` of the file is returned. Note: the title can directly be accessed as … Availability: … |
| YAML fields & other vars | YAML fields (and other variables defined in PyPage code) of a page are: … Availability (same as … |
| Last Modified Date & Time | This is only available on … The last modified date & time for a given file is taken from: a. The date & time of the last commit that modified that file, in git history, if the file is inside a git repo. b. The last modified date & time as provided by the file system. There's a … The … Note: This function spawns a … Available everywhere. |
| Idea Date | This is only available on … The "idea date" for a given file is either: a. For a Markdown file, a date prefix before the Markdown file's name, in the form … b. If not a Markdown file or there's no date prefix, and the file is in a git repo, then the idea date is the date of the first commit that introduced the file into git history. (Note: this breaks if the file was renamed or moved.) c. If there is no date prefix and the file is not in a git repo, there is no idea date for that file (i.e. it's … There's a … The … Note: This function spawns a … Available everywhere. |
| `readfile` | This is just a simple built-in function that reads the contents of a file (assuming utf-8 encoding) into a string, and returns it. Available everywhere. |
| `sh` | This exposes the entire `sh` library. The current working directory (CWD) would be wherever the file being executed is located (regardless of whether the file is a regular page or index page or `__config__.py` or template). If the file is a template, the CWD would be that of the page being processed. See … Available everywhere. |
| `skip` | This environment variable, if specified, is a list of names of files or directories to be skipped. (It must be of type `List[str]`, if defined.) |
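The lookup order described for the last modified date in the table above (git history first, then the file system) could look roughly like the sketch below. It is not Alteza's code: the function names are invented, and using the committer date (`%cI`) is an assumption, since the table does not say which git date is used.

```python
import os
import subprocess
from datetime import datetime, timezone
from typing import Optional


def git_last_commit_time(path: str) -> Optional[datetime]:
    """Date of the last commit touching `path`, or None if git has no answer."""
    directory = os.path.dirname(os.path.abspath(path))
    try:
        result = subprocess.run(
            ["git", "log", "-1", "--format=%cI", "--", os.path.basename(path)],
            cwd=directory, capture_output=True, text=True, check=True,
        )
        stamp = result.stdout.strip()
        return datetime.fromisoformat(stamp) if stamp else None
    except (OSError, ValueError, subprocess.CalledProcessError):
        # git missing, not a repository, or unparseable output
        return None


def last_modified(path: str) -> datetime:
    """Prefer git history; fall back to the file system's modification time."""
    from_git = git_last_commit_time(path)
    if from_git is not None:
        return from_git
    return datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
```

Calling `last_modified("some/page.md")` on a file outside any git repository (or on a machine without git) simply returns the file system timestamp.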
Alteza is available as a GitHub action, for use with GitHub Pages. This is the simplest way to use Alteza, if you intend to use it with GitHub Pages. Using the GitHub action will avoid needing to install or configure Alteza. You can easily create & deploy an Alteza website onto GitHub Pages using this action.
To use the GitHub action, create a workflow file called something like `.github/workflows/alteza.yml`, and paste the following in it:
name: Alteza
on:
  workflow_dispatch:
  push:
    branches: [ "main" ]
jobs:
  build:
    name: Build Website
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.generate.outputs.page_url }}
    steps:
      - name: Generate Alteza Website
        id: generate
        uses: arjun-menon/alteza@v0.9.1
        with:
          path: .
The last parameter `path` should specify which directory in your GitHub repo should be rendered into a website. Also, note: make sure to set the `branches` under `push` correctly (to your branch) so that this action is triggered on each push.
For an example of this GitHub workflow above in action, see alteza-test (yaml, runs).
You can install Alteza easily with pip:
pip install alteza
Try running `alteza -h` to see the command-line options available.
If you've installed Alteza with pip, you can just run `alteza`, e.g.:
alteza -h
If you're working on Alteza itself, then run the `alteza` module itself, from the project directory directly, e.g. `python3 -m alteza -h`.
The `-h` argument above will print the list of available arguments:
usage: __main__.py --content CONTENT --output OUTPUT [--clear_output_dir] [--copy_assets] [--seed SEED] [--watch]
[--ignore [IGNORE ...]] [-h]
options:
--content CONTENT (str, required) Directory to read the input content from.
--output OUTPUT (str, required) Directory to write the generated site to.
--clear_output_dir (bool, default=False) Delete the output directory, if it already exists.
--copy_assets (bool, default=False) Copy static assets instead of symlinking to them.
--seed SEED (str, default={}) Seed JSON data to add to the initial root env.
--watch (bool, default=False) Watch for content changes, and rebuild.
--ignore [IGNORE ...]
(List[str], default=[]) Paths to completely ignore.
-h, --help show this help message and exit
As might be obvious above, you set the `--content` field to your content directory.
The output directory for the generated site is specified with `--output`. You can have Alteza delete it entirely before it is written to (including in `--watch` mode) by setting the `--clear_output_dir` flag.
Normally, Alteza performs a single build and exits. With the `--watch` flag, Alteza monitors the file system for changes, and rebuilds the site automatically.
The `--ignore` flag is a list of paths to files or directories to ignore. This is useful for ignoring paths like `.gitignore`, or other non-pertinent files and directories.
Normal Alteza behavior for static assets is to create symlinks from your generated site to static files in your content directory. You can turn off this behavior with `--copy_assets`.
The `--seed` flag is a JSON string representing seed data for PyPage processing. This seed is injected into every PyPage document. The seed is not global, and so cannot be modified between files; it is copied into each PyPage execution environment.
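The "copied, not global" behavior of the seed can be illustrated as follows. This is a sketch only: `parse_seed` and `env_for_page` are hypothetical names, requiring the seed to be a JSON object is an assumption, and a deep copy is used here to make the isolation obvious (the text above only says the seed is copied).

```python
import copy
import json


def parse_seed(seed_json: str = "{}") -> dict:
    """Parse the --seed string; this sketch assumes it must be a JSON object."""
    seed = json.loads(seed_json)
    if not isinstance(seed, dict):
        raise ValueError("seed must be a JSON object")
    return seed


def env_for_page(seed: dict) -> dict:
    """Give each page its own copy, so one page cannot change what another sees."""
    return copy.deepcopy(seed)


seed = parse_seed('{"site_name": "Example", "tags": ["a"]}')
first_page_env = env_for_page(seed)
first_page_env["tags"].append("b")   # a page mutating its own copy
second_page_env = env_for_page(seed)
print(second_page_env["tags"])
```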
To test against `test_content` (and generate output to `test_output`), run it like this:
python -m alteza --content test_content --output test_output --clear_output_dir
Feel free to send me PRs for this project.
I'm using `black`. To re-format the code, just run: `black alteza`.
Fwiw, I've configured my IDE (PyCharm) to always auto-format with `black`.
To ensure better code quality, Alteza is type-checked with five different type checking systems: Mypy, Meta's Pyre, Microsoft's Pyright, Google's Pytype, and Pyflakes; as well as linted with Pylint.
To run some type checks:
mypy alteza # should have zero errors
pyflakes alteza # should have zero errors
pyre check # should have zero errors as well
pyright alteza # should have zero errors also
pytype alteza # should have zero errors too
Or, all at once with: `mypy alteza ; pyflakes alteza ; pyre check ; pyright alteza ; pytype alteza`. Pytype is pretty slow, so feel free to omit it.
Linting policy is very strict. Pylint must issue a perfect 10/10 score, otherwise the Pylint CI check will fail. On a side note, you can see a UML diagram of the Alteza code if you click on any one of the completed workflow runs for the Pylint CI check.
To test whether lints are passing, simply run:
pylint -j 0 alteza
To run it along with all the type checks (excluding `pytype`), just run: `mypy alteza ; pyre check ; pyright alteza ; pyflakes alteza ; pylint -j 0 alteza`. I run this often.
Of course, when it makes sense, lints should be suppressed next to the relevant line, in code. Also, unlike typical Python code, the naming convention generally followed in this codebase is `camelCase`. Pylint checks for names have mostly been disabled.
Here's the Pylint-generated UML diagram of Alteza's code (that's current as of v0.9.0):
To install dependencies for development, run:
python3 -m pip install -r requirements.txt
python3 -m pip install -r requirements-dev.txt
To use a virtual environment (after creating one with `python3 -m venv venv`):
source venv/bin/activate
# ... install requirements ...
# ... do some development ...
deactivate # end the venv
This project is licensed under the AGPL v3, but I'm reserving the right to re-license it under a license with fewer restrictions, e.g. the Apache License 2.0, and any PRs constitute consent to re-license as such.