mdseo
Analyze Markdown Documents Used In Static Sites For SEO
Installation
pip install mdseo
Usage
`mdseo` provides CLI tools to check various statistics and metadata in markdown files. If an unwanted property is discovered, an error is raised. An overview of the CLI tools is below:
!mdseo_dupe_title -h
usage: mdseo_dupe_title [-h] [--srcdir SRCDIR]
Check for duplicate titles. Ignore with front matter `mdseo-ignore:
[dupe_title]`
optional arguments:
-h, --help show this help message and exit
--srcdir SRCDIR directory of files to check (default: .)
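As a rough illustration of what a duplicate-title check involves, the sketch below groups markdown files by title and reports any title shared by more than one file. This is not mdseo's implementation: the helpers `read_title` and `find_dupe_titles` are hypothetical, and the sketch assumes the title is stored as a `title:` line in the front matter.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical helper: assumes the title is a `title:` line inside a leading
# `---` ... `---` front matter block. mdseo may locate titles differently.
def read_title(text):
    m = re.match(r"---\s*\n(.*?)\n---\s*(\n|$)", text, re.DOTALL)
    if not m:
        return None
    for line in m.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "title":
            return value.strip().strip("\"'")
    return None

def find_dupe_titles(srcdir="."):
    """Map each title used by more than one file to the sorted list of files."""
    seen = defaultdict(list)
    for path in Path(srcdir).rglob("*.md"):
        title = read_title(path.read_text(encoding="utf-8"))
        if title:
            seen[title].append(str(path))
    return {t: sorted(fs) for t, fs in seen.items() if len(fs) > 1}
```

An empty result means no two files share a title under these assumptions.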
!mdseo_len -h
usage: mdseo_len [-h] [--n N] [--srcdir SRCDIR]
Check if docs contain less than `n` words. Ignore with front matter `mdseo-
ignore: [length]`
optional arguments:
-h, --help show this help message and exit
--n N minimum number of words a document should contain (default:
50)
--srcdir SRCDIR directory of files to check (default: .)
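To make the length rule concrete, here is a minimal sketch of a word-count check. It is an approximation under stated assumptions, not mdseo's own code: the names `count_words` and `is_too_short` are hypothetical, words are taken to be whitespace-separated tokens, and front matter is excluded from the count, which mdseo may or may not do.

```python
import re

# Matches a leading `---` ... `---` front matter block (assumed layout).
FRONT_MATTER = re.compile(r"\A---\s*\n.*?\n---\s*(?:\n|\Z)", re.DOTALL)

def count_words(text):
    """Count whitespace-separated tokens in the body, excluding front matter."""
    body = FRONT_MATTER.sub("", text, count=1)
    return len(body.split())

def is_too_short(text, n=50):
    """True when the document has fewer than n words (default mirrors --n)."""
    return count_words(text) < n
```

Markdown syntax such as `#` or link markup counts as words here, so the result is only an estimate of visible text length.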
!mdseo_chk_fm -h
usage: mdseo_chk_fm [-h] [--srcdir SRCDIR] [--minlen MINLEN] [--maxlen MAXLEN]
{description,slug,image,authors}
Check front matter for various rules.
positional arguments:
{description,slug,image,authors} front matter field to check
optional arguments:
-h, --help show this help message and exit
--srcdir SRCDIR directory of files to check (default: .)
--minlen MINLEN the minimum character length allowed for the
field
--maxlen MAXLEN the maximum character length allowed for the
field
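The three kinds of rules this command covers (presence, minimum length, maximum length) can be sketched in a few lines. The code below is illustrative only: `parse_front_matter` and `check_field` are hypothetical names, the parser handles only simple top-level `key: value` lines rather than full YAML, and whether mdseo reports a missing field when `--minlen` or `--maxlen` is given is an assumption.

```python
def parse_front_matter(text):
    """Tiny parser: top-level `key: value` lines between leading `---` markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        # Skip indented (nested) lines and lines without a colon.
        if sep and not line.startswith((" ", "\t")):
            fields[key.strip()] = value.strip()
    return {}  # no closing marker: treat as no front matter

def check_field(text, key, minlen=None, maxlen=None):
    """Return an error string, or None if the field passes every rule."""
    fm = parse_front_matter(text)
    if key not in fm:
        return f"missing field `{key}`"
    n = len(fm[key])
    if minlen is not None and n < minlen:
        return f"`{key}` is less than {minlen} characters"
    if maxlen is not None and n > maxlen:
        return f"`{key}` is greater than {maxlen} characters"
    return None
```

Returning a message instead of raising keeps the sketch easy to call over many files and collect failures in one report.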
Examples
Check that `description` is between 50 and 300 characters:
!mdseo_chk_fm description --minlen 50 --maxlen 300
Traceback (most recent call last):
File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
tfunc(**merge(args, args_from_prog(func, xtra)))
File "/Users/hamel/github/mdseo/mdseo/core.py", line 101, in chk_fm
return _checker(partial(_min_len_err, key=key, n=minlen),
File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files have the field `description` in their front matter that is less than 50 characters:
./test_files/front_matter3.md
Check that the front matter `slug` exists:
!mdseo_chk_fm slug
Traceback (most recent call last):
File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
tfunc(**merge(args, args_from_prog(func, xtra)))
File "/Users/hamel/github/mdseo/mdseo/core.py", line 107, in chk_fm
_checker(partial(_missing_fm, key=key), f"do not have the field `{key}` in their front matter", srcdir)
File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files do not have the field `slug` in their front matter:
./CONTRIBUTING.md
./test_files/false_fm2.md
./test_files/false_fm.md
./test_files/test_docs.md
Check that the front matter `slug` is no longer than 45 characters:
!mdseo_chk_fm slug --maxlen 45
Traceback (most recent call last):
File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
tfunc(**merge(args, args_from_prog(func, xtra)))
File "/Users/hamel/github/mdseo/mdseo/core.py", line 104, in chk_fm
return _checker(partial(_max_len_err, key=key, n=maxlen),
File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files have the field `slug` in their front matter that is greater than 45 characters:
./test_files/front_matter_test_docs.md
Check that the front matter `authors` exists:
!mdseo_chk_fm authors
Traceback (most recent call last):
File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
tfunc(**merge(args, args_from_prog(func, xtra)))
File "/Users/hamel/github/mdseo/mdseo/core.py", line 107, in chk_fm
_checker(partial(_missing_fm, key=key), f"do not have the field `{key}` in their front matter", srcdir)
File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files do not have the field `authors` in their front matter:
./CONTRIBUTING.md
./test_files/front_matter2.md
./test_files/false_fm2.md
./test_files/front_matter_test_docs.md
./test_files/false_fm.md
./test_files/test_docs.md
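Because a failing check ends in an uncaught exception, the command exits with a non-zero status, which makes it straightforward to drive from a script. The wrapper below is a sketch, not part of mdseo: `run_check` is a hypothetical helper, and it assumes the report follows the `Exception: ` prefix on stderr as in the tracebacks above.

```python
import subprocess
import sys

def run_check(cmd):
    """Run a command; return (passed, report text taken from stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # Keep only the part after "Exception: " when present, else all of stderr.
    _, sep, report = proc.stderr.partition("Exception: ")
    return proc.returncode == 0, (report.strip() if sep else proc.stderr.strip())

# Example call (requires mdseo to be installed):
# passed, report = run_check(["mdseo_chk_fm", "slug", "--maxlen", "45"])
# if not passed:
#     print(report, file=sys.stderr)
```

The same wrapper works for any of the commands, since it relies only on the exit status and stderr.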
Ignoring Checks
You may wish to ignore checks on individual files. There are two ways to do this: (1) through a special front matter field called `mdseo-ignore`, or (2) by placing the keyword `mdseo-ignore-all` in your markdown file.
With Front Matter
To ignore a check via front matter, supply the proper value(s) in the `mdseo-ignore` field in your front matter. For example, if you wanted to ignore the `mdseo_dupe_title` and `mdseo_image` checks in a particular markdown file, you would add the following front matter:
---
mdseo-ignore: [dupe_title, image]
---
You can find these values by consulting the help of the appropriate CLI command. For example, `mdseo_dupe_title -h` says:
... Ignore with front matter `mdseo-ignore: [dupe_title]`
If you want to ignore all SEO rules, you can also pass `all` like so:
---
mdseo-ignore: all
---
There is a generic function, `mdseo_chk_fm`, that checks the presence, minimum length, and maximum length of a front matter field. You can ignore any checks conducted by this function by passing the appropriate fields to `mdseo-ignore`. These are the fields that you can ignore: `description`, `slug`, `image`, `authors`.
For example, if you wanted to ignore all of these fields, you could put the following in your front matter:
---
mdseo-ignore: [description,slug,image,authors]
---
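The accepted shapes of `mdseo-ignore` (a bracketed list of names, or the single value `all`) can be interpreted as in the sketch below. This is an assumption about the behavior, not mdseo's code: `ignored_checks` and `is_ignored` are hypothetical helpers that operate on the raw text following `mdseo-ignore:`, whereas a real tool would more likely rely on a YAML parser.

```python
def ignored_checks(raw_value):
    """Normalize a raw `mdseo-ignore` value to a set of check names."""
    value = raw_value.strip()
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]  # drop the list brackets
    return {item.strip() for item in value.split(",") if item.strip()}

def is_ignored(check, raw_value):
    """True when `check` is listed, or when `all` is given."""
    names = ignored_checks(raw_value)
    return "all" in names or check in names
```

For instance, `is_ignored("slug", "all")` and `is_ignored("image", "[dupe_title, image]")` are both true, while `is_ignored("length", "[dupe_title, image]")` is false.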
With The Keyword mdseo-ignore-all
Some markdown files may not have front matter, or it may not be appropriate to add front matter to a file. In this case, you can place the text `mdseo-ignore-all` anywhere in the file and all checks will be ignored. The most common way to add this keyword is with a markdown comment:
<!-- mdseo-ignore-all -->
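Since the keyword only needs to appear somewhere in the file, the filtering step amounts to a substring test. The sketch below shows one way to select the files that would still be checked; `files_to_check` is a hypothetical helper, and the recursive `*.md` search is an assumption about which files are considered.

```python
from pathlib import Path

KEYWORD = "mdseo-ignore-all"

def files_to_check(srcdir="."):
    """Return markdown files under srcdir that do not contain the keyword."""
    return sorted(
        str(p) for p in Path(srcdir).rglob("*.md")
        if KEYWORD not in p.read_text(encoding="utf-8")
    )
```

A plain substring test means the keyword is honored whether or not it is wrapped in a comment.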