npm install efrt
if your data looks like this:
var data = {
bedfordshire: 'England',
aberdeenshire: 'Scotland',
buckinghamshire: 'England',
argyllshire: 'Scotland',
cambridgeshire: 'England',
cheshire: 'England',
ayrshire: 'Scotland',
banffshire: 'Scotland'
}
you can compress it like this:
import { pack } from 'efrt'
var str = pack(data)
then _very!_ quickly flip it back into:
import { unpack } from 'efrt'
var obj = unpack(str)
obj['bedfordshire']
Yep, efrt packs category-type data into a very compressed prefix trie format, so that redundancies in the data are shared, and nothing is repeated.
By doing this clever-stuff ahead-of-time, efrt lets you ship much more data to the client-side, without hassle or overhead.
The whole library is 8kb, the unpack half is barely 2kb.
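To make the prefix-sharing idea concrete, here is a toy sketch of a plain prefix trie. It is not efrt's actual packed format, and the `buildTrie` and `countNodes` helpers are hypothetical names used only for illustration. It just counts how many characters need storing once common prefixes are shared.

```javascript
// Toy illustration only: this is NOT how efrt encodes its output.
// Each character becomes a nested object key, so shared prefixes are stored once.
function buildTrie(words) {
  const root = {}
  for (const word of words) {
    let node = root
    for (const ch of word) {
      node[ch] = node[ch] || {}
      node = node[ch]
    }
    node.$ = true // marks the end of a complete word
  }
  return root
}

// Count character nodes in the trie, skipping the end-of-word marker.
function countNodes(node) {
  let total = 0
  for (const key of Object.keys(node)) {
    if (key === '$') continue
    total += 1 + countNodes(node[key])
  }
  return total
}

const words = ['blackberry', 'blueberry', 'blackbird']
const trie = buildTrie(words)
const rawChars = words.join('').length // 28 characters written out in full
const sharedChars = countNodes(trie) // 20 character nodes once prefixes are shared
console.log(rawChars, sharedChars)
```

The more keys share a beginning, the bigger the gap between the two numbers gets.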
Basically,
- get a js object into very compact form
- reduce filesize/bandwidth a bunch
- ensure the unpacking time is negligible
- keep word-lookups on critical-path
import { pack, unpack } from 'efrt'
var foods = {
strawberry: 'fruit',
blueberry: 'fruit',
blackberry: 'fruit',
tomato: ['fruit', 'vegetable'],
cucumber: 'vegetable',
pepper: 'vegetable'
}
var str = pack(foods)
var obj = unpack(str)
console.log(obj.tomato)
or, an Array:
if you pass it an array of strings, it just creates an object with true values:
const data = [
'january',
'february',
'april',
'june',
'july',
'august',
'september',
'october',
'november',
'december'
]
const packd = pack(data)
const sameArray = Object.keys(unpack(packd))
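As a rough picture of the shape described above, here is the same idea in plain JavaScript, without calling efrt at all. The `months` and `lookup` names are only for illustration, and this sketch says nothing about the key order you get back from a real pack/unpack round trip.

```javascript
// Plain-JS stand-in for the described result: every string becomes a key set to true.
const months = ['january', 'february', 'april']
const lookup = Object.fromEntries(months.map((word) => [word, true]))

// Membership checks then work like any other object lookup.
console.log(lookup.april === true)
console.log(Object.keys(lookup).length)
```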
Reserved characters
the keys of the object are normalized. Spaces/unicode are good, but numbers, case-sensitivity, and some punctuation (semicolon, comma, exclamation-mark) are not (yet) supported.
specialChars = new RegExp('[0-9A-Z,;!:|¦]')
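If you want to check your keys before packing, something like this could work. It reuses the character class shown above. The `findUnsupportedKeys` helper is a hypothetical name and is not part of efrt.

```javascript
// Same character class as the specialChars expression above.
const specialChars = /[0-9A-Z,;!:|¦]/

// Hypothetical helper (not an efrt API): list the keys that contain a reserved character.
function findUnsupportedKeys(data) {
  const keys = Array.isArray(data) ? data : Object.keys(data)
  return keys.filter((key) => specialChars.test(key))
}

const bad = findUnsupportedKeys({
  'tony hawk': 'skater', // fine: lowercase letters and a space
  'Tony Hawk': 'skater', // flagged: uppercase letters
  'area 51': 'place', // flagged: digits
  'café': 'place' // fine: unicode is outside the reserved class
})
console.log(bad)
```

Lowercasing keys and spelling out numbers before packing is one way to get past these limits, if that suits your data.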
efrt is built-for, and used heavily in compromise, to expand the amount of data it can ship onto the client-side.
If you find another use for efrt, please drop us a line🎈
Performance
efrt is tuned to be very quick to unzip. Lookups in the unpacked object are O(1). Packing-up the data is the slowest part, which is usually fine:
var compressed = pack(skateboarders)
var trie = unpack(compressed)
trie.hasOwnProperty('tony hawk')
Size
efrt will pack filesize down as much as possible, depending upon the redundancy of the prefixes/suffixes in the words, and the size of the list.
- list of countries - 1.5k -> 0.8k (46% compressed)
- all adverbs in wordnet - 58k -> 24k (58% compressed)
- all adjectives in wordnet - 265k -> 99k (62% compressed)
- all nouns in wordnet - 1,775k -> 692k (61% compressed)
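The percentages in this list read as the share of the original size that was saved, rounded down. That reading is an assumption, not something efrt reports itself, and `percentSaved` is a hypothetical helper. A small sketch to run the same arithmetic on your own before/after sizes:

```javascript
// Assumption: "% compressed" means the fraction of the original size saved, rounded down.
function percentSaved(rawKb, packedKb) {
  return Math.floor((1 - packedKb / rawKb) * 100)
}

// Sizes in kilobytes, taken from the list above.
const results = [
  ['countries', 1.5, 0.8],
  ['adverbs', 58, 24],
  ['adjectives', 265, 99],
  ['nouns', 1775, 692]
].map(([name, raw, packed]) => ({ name, saved: percentSaved(raw, packed) }))

console.log(results)
```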
but there are some things to consider:
- bigger files compress further (see 🎈 birthday problem)
- using efrt will reduce gains from gzip compression, which most webservers quietly use
- english is more suffix-redundant than prefix-redundant, so non-english words may benefit from other styles
Assuming your data has a low category-to-data ratio, you will hit break-even at about 250 keys. If your data is in the thousands, you can be very confident about saving your users some considerable bandwidth.
Use
IE9+
<script src="https://unpkg.com/efrt@latest/builds/efrt.min.cjs"></script>
<script>
var smaller = efrt.pack(['larry', 'curly', 'moe'])
var trie = efrt.unpack(smaller)
console.log(trie['moe'])
</script>
if you're doing the second step in the client, you can load just the CJS unpack-half of the library (~3k):
const unpack = require('efrt/unpack')
<script src="https://unpkg.com/efrt@latest/builds/efrt-unpack.min.cjs"></script>
<script>
var trie = unpack(compressedStuff)
trie.hasOwnProperty('miles davis')
</script>
Thanks to John Resig for his fun trie-compression post on his blog, and Wiktor Jakubczyc for his performance analysis work.
MIT