Overview
A regex that tokenizes CSS.
var cssTokens = require("css-tokens")
cssString.match(cssTokens)
cssString.replace(cssTokens, function(token) {
  if (token === ".foo") {
    token = "bar"
  }
  return token
})
Installation
npm install css-tokens
var cssTokens = require("css-tokens")
Usage
cssTokens
A regex with the g flag that matches CSS tokens.
The regex always matches, even invalid CSS and the empty string. For
example, cssTokens.exec(string) never returns null.
The next match is always directly after the previous. Each token has its own
capturing group.
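One consequence of the two properties above is that joining all matches gives back the original input, since nothing is skipped between tokens. The following is a minimal sketch of that idea. To stay self-contained it uses a much simplified stand-in regex (standInTokens, a hypothetical name) instead of the real css-tokens regex, and tokenize is an illustrative helper, not part of the package.

```javascript
// Simplified stand-in for illustration only; NOT the real css-tokens regex.
// Like the real one it has the g flag, one capturing group per token type,
// and a catch-all last alternative so that no character is ever skipped.
var standInTokens = /(\s+)|([a-zA-Z_-][\w-]*)|(\d+(?:\.\d+)?)|(^$|[\s\S])/g

// Hypothetical helper: returns every match in order.
function tokenize(regex, string) {
  // With the g flag, match() returns all matches, or null if there are none.
  return string.match(regex) || []
}

var input = "a { margin: 10px }"
var tokens = tokenize(standInTokens, input)

console.log(tokens)
// Because every match starts where the previous one ended,
// concatenating the tokens reproduces the input.
console.log(tokens.join("") === input) // → true
```

Assuming the real regex behaves as described above, the same round trip should hold for cssString.match(cssTokens).join("").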
cssTokens.names
An array of names for each token, in the capturing group order.
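Since the names follow the capturing group order, group i + 1 of a match corresponds to names[i]. The sketch below shows one way to label tokens with that rule. tokenName and namedTokens are hypothetical helpers, and demoRegex and demoNames are simplified stand-ins so the example runs without the package. The names in demoNames are made up for the demonstration and are not the package's actual list.

```javascript
// Hypothetical helper, not part of css-tokens: returns the name that
// belongs to the first capturing group that took part in the match.
function tokenName(match, names) {
  for (var i = 0; i < names.length; i++) {
    // Group i + 1 corresponds to names[i]; unused groups are undefined.
    if (match[i + 1] !== undefined) {
      return names[i]
    }
  }
  return undefined
}

// Hypothetical helper: collects { type, value } pairs for a whole string.
function namedTokens(regex, names, string) {
  var result = []
  var match
  regex.lastIndex = 0
  while (true) {
    match = regex.exec(string)
    // Stop on null, and also on an empty match, which would not advance
    // lastIndex and would otherwise loop forever.
    if (match === null || match[0] === "") break
    result.push({ type: tokenName(match, names), value: match[0] })
  }
  regex.lastIndex = 0
  return result
}

// Stand-in regex and names, for demonstration only.
var demoRegex = /(\s+)|([a-zA-Z_-][\w-]*)|(^$|[\s\S])/g
var demoNames = ["whitespace", "name", "other"]

console.log(namedTokens(demoRegex, demoNames, ".foo bar"))
```

With the package installed, the same helper would presumably be called as namedTokens(cssTokens, cssTokens.names, cssString).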
Invalid code handling
Unterminated strings are still matched as strings. CSS strings cannot contain
(unescaped) newlines, so unterminated strings simply end at the end of the
line.
Unterminated multi-line comments are also still matched as comments. They
simply go on to the end of the string.
Unterminated unquoted urls are also still matched as unquoted urls. They
continue as long as there are valid characters.
Invalid ASCII characters have their own capturing group.
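The handling of unterminated strings and comments can be pictured as patterns whose closing delimiter is optional. The two patterns below are illustrative sketches written for this explanation (stringSketch and commentSketch are hypothetical names); the patterns actually used inside the css-tokens regex may look different.

```javascript
// Illustrative patterns only; the real css-tokens regex may differ.

// A double-quoted string: the closing quote is optional and the body stops
// at an unescaped newline, so an unterminated string ends with its line.
var stringSketch = /"(?:[^"\\\n]|\\[\s\S])*"?/

// A comment: the closing */ is optional, so an unterminated comment
// runs on to the end of the input.
var commentSketch = /\/\*(?:[^*]|\*(?!\/))*(?:\*\/)?/

var brokenString = 'a { content: "oops\n  color: red }'
console.log(brokenString.match(stringSketch)[0]) // → "oops

var brokenComment = "/* never closed\nb { color: red }"
console.log(brokenComment.match(commentSketch)[0] === brokenComment) // → true
```

Terminated input is unaffected by the optional delimiter: '"ok" x' still yields '"ok"', and "/* a */ b" still yields "/* a */".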
Limitations
Tokenizing CSS using regexes (in fact, one single regex) won’t be
perfect, but perfection is not the goal either.
@ or . followed by a single -
Ideally, @- and .- (followed by a non-name character) would be matched as
invalid + operator, but they are in fact matched as names. This could be
fixed, but isn’t, in order to keep the regex simple. It doesn’t really matter.
Note that #- is actually allowed by the spec.
Quoted vs. unquoted urls
The following is hardly a “limitation”, but could be mentioned:
url(http://www.w3.org/2000/svg)
url('http://www.w3.org/2000/svg')
The first line is matched as one single token (unquotedUrl), while the second
is matched as four (name + punctuation + string + punctuation). This could be
fixed, but isn’t, in order to keep the regex simple.
Build
index.js is generated by running node generate-index.js. The regex is written
in regex.coffee. Don’t worry, you don’t need to know anything about
CoffeeScript: regex.coffee should be kept as simple as possible. CoffeeScript
is only used for its block regexes, which have the following benefits:
- Insignificant whitespace.
- Comments.
- No need to escape slashes.
- No need to double-escape everything (as opposed to using
  RegExp("regex as a string. One backslash: \\\\")).
- Plenty of syntax highlighters available.
Everything else is written in JavaScript.
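To make the escaping benefits in the list above concrete, here is a small plain JavaScript comparison. It does not involve regex.coffee or the real css-tokens regex; the pattern is an arbitrary one chosen for illustration.

```javascript
// A regex literal: each backslash is written once, but "/" must be escaped.
var literal = /^\d+\.\d+\/\d+$/

// The same pattern built from a string: every backslash has to be doubled,
// because the string literal consumes one level of escaping first.
var fromString = new RegExp("^\\d+\\.\\d+/\\d+$")

console.log(literal.test("1.5/2"))    // → true
console.log(fromString.test("1.5/2")) // → true

// Forgetting to double the backslashes silently changes the pattern:
// in a string literal "\d" is just "d" and "\." is just ".".
var broken = new RegExp("^\d+\.\d+/\d+$")
console.log(broken.test("1.5/2"))    // → false
console.log(broken.test("dd.dd/dd")) // → true
```

A CoffeeScript block regex avoids both kinds of escaping, which is why the source pattern can stay readable.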
License
The X11 (“MIT”) License.