github.com/tylerbrittain42/static-site-generator

v0.0.0-20240723012550-791ef10c834b
Source
Go

Version published: 4 months ago

Created: 6 months ago

Static-Site-Generator

Purpose

This project takes a set of Markdown files and converts them into HTML files that are ready to be hosted.

Transpiler approach

The lexer converts a file into a series of tokens
Tokens are passed to the parser (eventually this will happen concurrently with lexing)
The parser outputs a new file
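As a rough illustration of the three steps above, here is a minimal sketch in which the lexer and parser run concurrently over a channel. The names `lex`, `parse`, and `transpile`, the string-typed `category`, and the tag-wrapping output are all assumptions made for the example, not the project's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical placeholder type used only for this sketch.
type token struct {
	category string
	value    string
}

// lex is a stand-in lexer: it emits one token per non-empty line and
// closes the channel when the input is exhausted.
func lex(src string, out chan<- token) {
	defer close(out)
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(line, "# ") {
			out <- token{category: "h1", value: strings.TrimPrefix(line, "# ")}
		} else if line != "" {
			out <- token{category: "p", value: line}
		}
	}
}

// parse consumes tokens as they arrive and renders each one as an HTML element.
func parse(in <-chan token) string {
	var b strings.Builder
	for t := range in {
		fmt.Fprintf(&b, "<%s>%s</%s>\n", t.category, t.value, t.category)
	}
	return b.String()
}

// transpile wires the lexer to the parser; the lexer runs in its own goroutine.
func transpile(src string) string {
	tokens := make(chan token)
	go lex(src, tokens)
	return parse(tokens)
}

func main() {
	fmt.Print(transpile("# Title\nSome text"))
}
```

An unbuffered channel keeps the two stages in lockstep; a buffered channel would let the lexer run ahead of the parser.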

Lexer approach

There are two possible ways to perform this:

Use lexer with a single pass to grab each token
Have lexer perform a second pass when inline values are detected

Single pass

Only one pass, so nothing is parsed twice
Can constantly output tokens (even though the check would most likely be small)
ASTs typically do not care about parentheses (would be difficult)

Multi pass

Second pass would be small and can be triggered by a simple flag set during initial parsing
Most likely only a minor slowdown (only slows down on a "large" token and only impacts that token, not the whole line)
Fits better with most AST implementations

For the reasons discussed above, it makes more sense to go with a multi-pass approach.
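One way the multi-pass idea could look, as a sketch only: the first pass classifies the line and sets a flag when inline markers are present, and the second pass re-scans only flagged tokens. The `hasInline` field and the functions `firstPass` and `secondPass` are hypothetical names, and only `**` bold is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a simplified stand-in; hasInline is a hypothetical flag
// set during the first pass.
type token struct {
	category  string
	value     string
	hasInline bool
}

// firstPass classifies a line and flags it when inline markers are present.
func firstPass(line string) token {
	t := token{category: "p", value: line}
	if strings.HasPrefix(line, "# ") {
		t.category = "h1"
		t.value = strings.TrimPrefix(line, "# ")
	}
	t.hasInline = strings.Contains(t.value, "**")
	return t
}

// secondPass re-scans only flagged tokens. Splitting on "**" puts bold
// text at odd indexes. Unbalanced markers are not handled in this sketch.
func secondPass(t token) []token {
	if !t.hasInline {
		return []token{{category: "text", value: t.value}}
	}
	var out []token
	for i, part := range strings.Split(t.value, "**") {
		if part == "" {
			continue
		}
		cat := "text"
		if i%2 == 1 {
			cat = "bold"
		}
		out = append(out, token{category: cat, value: part})
	}
	return out
}

func main() {
	t := firstPass("# This is a string **with bold** ahaha")
	for _, in := range secondPass(t) {
		fmt.Printf("%s %s: %q\n", t.category, in.category, in.value)
	}
}
```

Lines without inline markers skip the extra work entirely, which is the "minor slowdown" trade-off described above.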

Lexer Algo

Line is passed in
Determine line type (will be an attribute of all tokens in the line)
Begin parsing character by character
Make tokens as needed (appending to a slice)
Return token list OR error (specifying the line and value that broke it)
Eventually consider streaming tokens?
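The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the project's lexer: `lexLine`, the `lineType` field, and the choice of an unterminated `**` as the error case are all made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

type tokenType int

const (
	text tokenType = iota
	bold
)

// token carries the type of its enclosing line, as described in the steps above.
type token struct {
	lineType string
	category tokenType
	value    string
}

// lexLine scans one line character by character and returns its tokens,
// or an error naming the line number and the offending value.
func lexLine(lineNo int, line string) ([]token, error) {
	lineType := "p"
	if strings.HasPrefix(line, "# ") {
		lineType = "h1"
		line = strings.TrimPrefix(line, "# ")
	}

	var tokens []token
	var buf strings.Builder
	inBold := false
	// flush appends the buffered characters as a token of the given category.
	flush := func(cat tokenType) {
		if buf.Len() > 0 {
			tokens = append(tokens, token{lineType: lineType, category: cat, value: buf.String()})
			buf.Reset()
		}
	}

	// Byte-wise scanning is safe here because '*' is ASCII; multi-byte
	// characters are copied through unchanged.
	for i := 0; i < len(line); i++ {
		if line[i] == '*' && i+1 < len(line) && line[i+1] == '*' {
			if inBold {
				flush(bold)
			} else {
				flush(text)
			}
			inBold = !inBold
			i++ // skip the second '*'
			continue
		}
		buf.WriteByte(line[i])
	}
	if inBold {
		return nil, fmt.Errorf("line %d: unterminated bold marker in %q", lineNo, line)
	}
	flush(text)
	return tokens, nil
}

func main() {
	tokens, err := lexLine(1, "# This is a string **with bold** ahaha")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tokens {
		fmt.Printf("%s %d %q\n", t.lineType, t.category, t.value)
	}
}
```

Streaming could later be added by sending each token on a channel inside `flush` instead of appending to the slice.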

Token Stuff

Properties

type token struct {
	category   tokenType
	value      string
	innerToken *token // use this to handle inline?
}
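The struct above refers to a `tokenType` that is not shown. A minimal sketch of how it might be defined follows; the specific constants (`h1`, `paragraph`, `bold`) and the `String` method are assumptions for illustration.

```go
package main

import "fmt"

// tokenType is sketched here as an integer enum; the constant names are
// hypothetical and only cover what the examples in this document need.
type tokenType int

const (
	h1 tokenType = iota
	paragraph
	bold
)

// String makes token categories readable when printed.
func (t tokenType) String() string {
	switch t {
	case h1:
		return "h1"
	case paragraph:
		return "paragraph"
	case bold:
		return "bold"
	}
	return "unknown"
}

type token struct {
	category   tokenType
	value      string
	innerToken *token // use this to handle inline?
}

func main() {
	t := token{category: paragraph, value: "plain text"}
	fmt.Println(t.category, t.innerToken == nil)
}
```

A nil `innerToken` would mean the token has no inline content.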

Example

# This is a string **with bold** ahaha

Option 1 (single token per line with an embedded inner token)

token{
	category: h1,
	value:    "This is a string **with bold** ahaha",
	innerToken: &token{
		category: bold,
		value:    "with bold",
	},
}

Issues:

What if there are multiple inner tokens?
Why do we need the complexity of inner tokens?
What if a paragraph spans multiple lines?

Option 2 (single token per line, with a toggle/flag to decide whether to process inline content embedded in the line)

token{
	category: h1,
	value:    "This is a string <strong>with bold</strong> ahaha",
}

Issues:

What if a paragraph spans multiple lines?
Why do we need the complexity of inner tokens?
Doesn't seem very flexible for future changes
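For Option 2, the inline processing could be as small as the sketch below, which rewrites the value in place. `renderInline` is a hypothetical helper that handles only `**` bold, and it leaves the text untouched when the markers are unbalanced.

```go
package main

import (
	"fmt"
	"strings"
)

// renderInline replaces balanced pairs of "**" with <strong> tags.
// Splitting on "**" puts bold text at odd indexes; an even number of
// parts means an odd number of markers, so the input is returned as-is.
func renderInline(s string) string {
	parts := strings.Split(s, "**")
	if len(parts)%2 == 0 {
		return s
	}
	var b strings.Builder
	for i, p := range parts {
		if i%2 == 1 {
			b.WriteString("<strong>" + p + "</strong>")
		} else {
			b.WriteString(p)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(renderInline("This is a string **with bold** ahaha"))
}
```

Because the HTML ends up baked into `value`, later stages can no longer tell markup from text, which is one way to read the flexibility concern listed above.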

Option 3

# This is a string **with bold** ahaha

[]token{
	{TokenType: Header, String: "#", line: 0},
	{TokenType: STRING, String: "This is a string", line: 0},
	{TokenType: bold, String: "**", line: 0},
	{TokenType: STRING, String: "with bold", line: 0},
	{TokenType: bold, String: "**", line: 0},
	{TokenType: STRING, String: "ahaha", line: 0},
	{TokenType: newline, String: "\n", line: 0},
}

Issues:

So many tokens
Bold start and bold end will be identical
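The identical start and end markers need not be a blocker: a consumer can tell them apart by toggling a flag. The sketch below produces a flat stream in the spirit of Option 3 and renders it that way. `lexFlat`, `render`, and the constant names are hypothetical, and unlike the listing above this sketch keeps surrounding whitespace in string tokens so the output can be reassembled.

```go
package main

import (
	"fmt"
	"strings"
)

type tokenType string

// Hypothetical token categories mirroring the ones used in Option 3.
const (
	headerTok  tokenType = "Header"
	stringTok  tokenType = "STRING"
	boldTok    tokenType = "bold"
	newlineTok tokenType = "newline"
)

type token struct {
	TokenType tokenType
	String    string
	line      int
}

// lexFlat emits a flat token stream for one line: an optional header
// marker, then alternating string and bold-marker tokens, then a newline.
func lexFlat(lineNo int, line string) []token {
	var out []token
	if strings.HasPrefix(line, "# ") {
		out = append(out, token{headerTok, "#", lineNo})
		line = line[2:]
	}
	for i, part := range strings.Split(line, "**") {
		if i > 0 {
			out = append(out, token{boldTok, "**", lineNo})
		}
		if part != "" {
			out = append(out, token{stringTok, part, lineNo})
		}
	}
	return append(out, token{newlineTok, "\n", lineNo})
}

// render distinguishes opening from closing bold markers by toggling inBold.
func render(tokens []token) string {
	var b strings.Builder
	inHeader, inBold := false, false
	for _, t := range tokens {
		switch t.TokenType {
		case headerTok:
			inHeader = true
			b.WriteString("<h1>")
		case boldTok:
			if inBold {
				b.WriteString("</strong>")
			} else {
				b.WriteString("<strong>")
			}
			inBold = !inBold
		case stringTok:
			b.WriteString(t.String)
		case newlineTok:
			if inHeader {
				b.WriteString("</h1>")
				inHeader = false
			}
			b.WriteString("\n")
		}
	}
	return b.String()
}

func main() {
	fmt.Print(render(lexFlat(0, "# This is a string **with bold** ahaha")))
}
```

The token count stays high, but each token is trivial to produce and consume, and multiple bold spans on one line work without any nesting.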

Package last updated on 23 Jul 2024
