Dual Window Cache

The highest performance constant complexity cache algorithm.
Maintenance
The source code is maintained on the next source repository.
https://github.com/falsandtru/spica
Efficiency
Mathematical efficiency
Some different cache algorithms require extra memory space to retain evicted keys.
Linear time complexity indicates the existence of batch processing.
Note that admission algorithm doesn't work without eviction algorithm.
Algorithm | Type | Time complexity (Worst case) | Space complexity (Extra) | Key size | Data structures |
---|
LRU | Evict | Constant | Constant | 1x | 1 list |
DWC | Evict | Constant | Constant | 1x | 2 lists |
ARC | Evict | Constant | Linear | 2x | 4 lists |
LIRS | Evict | Linear | Linear | 3-2500x | 2 lists |
TinyLFU | Admit | Linear | Linear | ~1-10x (8bit * 10N * 4) | 5 arrays |
W-TinyLFU | Admit | Linear | Linear | ~1-10x (8bit * 10N * 4) | 1 list 4 arrays |
https://github.com/ben-manes/caffeine/wiki/Efficiency
https://github.com/zhongch4g/LIRS2/blob/master/src/replace_lirs_base.cc
Engineering efficiency
A pointer is 8 bytes, bool and int8 are each 1 byte in C.
8 byte key and value (int64, float64, 8 chars)
Memoize, etc.
Algorithm | Entry overhead | Key size | Total per entry | Attenuation coefficient |
---|
LRU | 16 bytes | 1x | 32 bytes | 100.00% |
DWC | 17 bytes | 1x | 33 bytes | 96.96% |
ARC | 17 bytes | 2x | 58 bytes | 55.17% |
(LIRS) | 33 bytes | 3x | 131 bytes | 24.42% |
(LIRS) | 33 bytes | 10x | 418 bytes | 7.65% |
(TinyLFU) | 56 bytes | 1x | 72 bytes | 44.44% |
W-TinyLFU | 56 bytes | 1x | 72 bytes | 44.44% |
32 byte key and 8 byte value (Session ID / ID)
In-memory KVS, etc.
Algorithm | Entry overhead | Key size | Total per entry | Attenuation coefficient |
---|
LRU | 16 bytes | 1x | 56 bytes | 100.00% |
DWC | 17 bytes | 1x | 57 bytes | 98.24% |
ARC | 17 bytes | 2x | 88 bytes | 63.63% |
(LIRS) | 33 bytes | 3x | 203 bytes | 27.58% |
(LIRS) | 33 bytes | 10x | 658 bytes | 8.51% |
(TinyLFU) | 56 bytes | 1x | 96 bytes | 58.33% |
W-TinyLFU | 56 bytes | 1x | 96 bytes | 58.33% |
16 byte key and 512 byte value (Domain / DNS packet)
DNS cache server, etc.
Algorithm | Entry overhead | Key size | Total per entry | Attenuation coefficient |
---|
LRU | 16 bytes | 1x | 544 bytes | 100.00% |
DWC | 17 bytes | 1x | 545 bytes | 99.81% |
ARC | 17 bytes | 2x | 578 bytes | 94.11% |
(LIRS) | 33 bytes | 3x | 659 bytes | 82.54% |
(LIRS) | 33 bytes | 10x | 1,002 bytes | 54.29% |
(TinyLFU) | 56 bytes | 1x | 584 bytes | 93.15% |
W-TinyLFU | 56 bytes | 1x | 584 bytes | 93.15% |
Resistance
LIRS's burst resistance means the resistance to continuous cache misses for the last LIR entry or the HIR entries.
Algorithm | Type | Scan | Loop | Burst |
---|
LRU | Evict | | | ✓ |
DWC | Evict | ✓ | ✓ | ✓ |
ARC | Evict | ✓ | | ✓ |
LIRS | Evict | ✓ | ✓ | |
TinyLFU | Admit | ✓ | ✓ | |
W-TinyLFU | Admit | ✓ | ✓ | ✓ |
Strategies
- Dynamic partition
- Sampled history injection
- Transitive wide MRU with cyclic replacement
Properties
Generally superior and almost flawless.
- Highest performance
- High hit ratio (DS1, S3, OLTP, GLI)
- Highest hit ratio of all the general-purpose cache algorithms.
- Approximate to W-TinyLFU (DS1, GLI).
- Approximate to ARC (S3, OLTP).
- W-TinyLFU is basically not a general-purpose cache algorithm due to some problems.
- W-TinyLFU is not a general-purpose cache algorithm without dynamic window and incremental reset.
- W-TinyLFU is impossible to efficiently implement without pointer addresses or fast hash functions.
- W-TinyLFU's benchmark settings are not described (Especially suspicious with OLTP).
- Highest engineering hit ratio of all the general cache algorithms.
- As a result of engineering efficiency.
- Low time overhead (High throughput)
- Low latency
- Constant time complexity.
- No batch processing like LIRS and TinyLFU.
- Parallel suitable
- Separated lists are suitable for lock-free processing.
- Efficient
- Low memory usage
- Largest capacity per memory size of all the advanced cache algorithms.
- Constant extra space complexity.
- Retain only keys of resident entries (No history).
- Immediate release of evicted keys
- Primary cache algorithm in the standard library must release memory immediately.
- Low space overhead
- Add only two smallest fields to entries.
- High resistance
- Scan, loop, and burst resistance
- Few tradeoffs
- Not the highest hit ratio
- Highest hit ratio of each workload is resulted by W-TinyLFU or ARC.
- Statistical accuracy dependent
- Very smaller cache size than sufficient can degrade hit ratio.
- Cache size 1,000 or more is recommended.
Tradeoffs
Note that LIRS and TinyLFU are risky cache algorithms.
- LRU
- Low performance
- No resistance
- Scan access clears all entries.
- ARC
- Middle performance
- Inefficient
- High overhead
- Few resistance
- DWC
- Not the highest hit ratio
- Statistical accuracy dependent
- LIRS
- Extremely inefficient
- Spike latency
- Bulk deletion of low-frequency entries takes linear time.
- Vulnerable algorithm
- Continuous cache misses for the last LIR entry or the HIR entries explode key size.
- TinyLFU
- Incomplete algorithm
- TinyLFU is just a vulnerable incomplete base-algorithm of W-TinyLFU.
- Burst access saturates Bloom filters.
- TinyLFU is worse than LRU in theory.
- Language dependent
- Impossible to efficiently implement without pointer addresses or fast hash functions.
- High overhead
- Read and write average 40 array elements per access.
- Restricted delete operation
- Bloom filters don't support delete operation.
- Frequent delete operations degrade performance.
- Spike latency
- Whole reset of Bloom filters takes linear time.
- Vulnerable algorithm
- Burst access degrades performance.
- W-TinyLFU
- Language dependent
- Impossible to efficiently implement without pointer addresses or fast hash functions.
- High overhead
- Read and write average 40 array elements per access.
- Restricted delete operation
- Bloom filters don't support delete operation.
- Frequent delete operations degrade performance.
- Spike latency
- Whole reset of Bloom filters takes linear time.
Hit ratio
Note that another cache algorithm sometimes changes the parameter values per workload to get a favorite result as the paper of TinyLFU has changed the window size of W-TinyLFU.
- DWC's results are measured by the same default parameter values.
- TinyLFU's results are the traces of Caffeine.
- Set the datasets to
./benchmark/trace
(See ./benchmark/ratio.ts
). - Run
npm i
. - Run
npm run bench
. - Click the DEBUG button to open a debug tab.
- Close the previous tab.
- Press F12 key to open devtools.
- Select the console tab.
https://github.com/dgraph-io/benchmarks
https://github.com/ben-manes/caffeine/wiki/Efficiency
https://docs.google.com/spreadsheets/d/1G3deNz1gJCoXBE2IuraUSwLE7H_EMn4Sn2GU0HTpI5Y (https://github.com/jedisct1/rust-arc-cache/issues/1)
DS1

W-TinyLFU, (TinyLFU) > DWC > (LIRS) > ARC > LRU
- DWC is an approximation of W-TinyLFU.
DS1 1,000,000
LRU hit ratio 3.08%
DWC hit ratio 14.73%
DWC - LRU hit ratio delta 11.65%
DWC / LRU hit ratio ratio 477%
DS1 2,000,000
LRU hit ratio 10.74%
DWC hit ratio 27.94%
DWC - LRU hit ratio delta 17.20%
DWC / LRU hit ratio ratio 260%
DS1 3,000,000
LRU hit ratio 18.59%
DWC hit ratio 39.46%
DWC - LRU hit ratio delta 20.87%
DWC / LRU hit ratio ratio 212%
DS1 4,000,000
LRU hit ratio 20.24%
DWC hit ratio 44.20%
DWC - LRU hit ratio delta 23.96%
DWC / LRU hit ratio ratio 218%
DS1 5,000,000
LRU hit ratio 21.03%
DWC hit ratio 50.19%
DWC - LRU hit ratio delta 29.16%
DWC / LRU hit ratio ratio 238%
DS1 6,000,000
LRU hit ratio 33.95%
DWC hit ratio 56.83%
DWC - LRU hit ratio delta 22.88%
DWC / LRU hit ratio ratio 167%
DS1 7,000,000
LRU hit ratio 38.89%
DWC hit ratio 62.55%
DWC - LRU hit ratio delta 23.65%
DWC / LRU hit ratio ratio 160%
DS1 8,000,000
LRU hit ratio 43.03%
DWC hit ratio 70.03%
DWC - LRU hit ratio delta 26.99%
DWC / LRU hit ratio ratio 162%
S3

W-TinyLFU, (TinyLFU) > (LIRS) > DWC > ARC > LRU
- DWC is an approximation of ARC.
S3 100,000
LRU hit ratio 2.32%
DWC hit ratio 10.14%
DWC - LRU hit ratio delta 7.81%
DWC / LRU hit ratio ratio 435%
S3 200,000
LRU hit ratio 4.63%
DWC hit ratio 20.25%
DWC - LRU hit ratio delta 15.61%
DWC / LRU hit ratio ratio 437%
S3 300,000
LRU hit ratio 7.58%
DWC hit ratio 27.39%
DWC - LRU hit ratio delta 19.80%
DWC / LRU hit ratio ratio 360%
S3 400,000
LRU hit ratio 12.03%
DWC hit ratio 32.69%
DWC - LRU hit ratio delta 20.65%
DWC / LRU hit ratio ratio 271%
S3 500,000
LRU hit ratio 22.76%
DWC hit ratio 38.12%
DWC - LRU hit ratio delta 15.35%
DWC / LRU hit ratio ratio 167%
S3 600,000
LRU hit ratio 34.63%
DWC hit ratio 46.82%
DWC - LRU hit ratio delta 12.19%
DWC / LRU hit ratio ratio 135%
S3 700,000
LRU hit ratio 46.04%
DWC hit ratio 55.71%
DWC - LRU hit ratio delta 9.66%
DWC / LRU hit ratio ratio 120%
S3 800,000
LRU hit ratio 56.59%
DWC hit ratio 64.03%
DWC - LRU hit ratio delta 7.43%
DWC / LRU hit ratio ratio 113%
OLTP

ARC > DWC > W-TinyLFU > (LIRS) > LRU > (TinyLFU)
- DWC is an approximation of ARC.
OLTP 250
LRU hit ratio 16.47%
DWC hit ratio 19.59%
DWC - LRU hit ratio delta 3.11%
DWC / LRU hit ratio ratio 118%
OLTP 500
LRU hit ratio 23.44%
DWC hit ratio 29.12%
DWC - LRU hit ratio delta 5.68%
DWC / LRU hit ratio ratio 124%
OLTP 750
LRU hit ratio 28.28%
DWC hit ratio 34.90%
DWC - LRU hit ratio delta 6.62%
DWC / LRU hit ratio ratio 123%
OLTP 1,000
LRU hit ratio 32.83%
DWC hit ratio 37.93%
DWC - LRU hit ratio delta 5.10%
DWC / LRU hit ratio ratio 115%
OLTP 1,250
LRU hit ratio 36.20%
DWC hit ratio 39.96%
DWC - LRU hit ratio delta 3.75%
DWC / LRU hit ratio ratio 110%
OLTP 1,500
LRU hit ratio 38.69%
DWC hit ratio 41.79%
DWC - LRU hit ratio delta 3.09%
DWC / LRU hit ratio ratio 108%
OLTP 1,750
LRU hit ratio 40.78%
DWC hit ratio 43.43%
DWC - LRU hit ratio delta 2.64%
DWC / LRU hit ratio ratio 106%
OLTP 2,000
LRU hit ratio 42.46%
DWC hit ratio 44.70%
DWC - LRU hit ratio delta 2.23%
DWC / LRU hit ratio ratio 105%
GLI

W-TinyLFU, (TinyLFU), (LIRS) > DWC >> ARC > LRU
- DWC is an approximation of W-TinyLFU.
GLI 250
LRU hit ratio 0.93%
DWC hit ratio 15.44%
DWC - LRU hit ratio delta 14.51%
DWC / LRU hit ratio ratio 1658%
GLI 500
LRU hit ratio 0.96%
DWC hit ratio 31.53%
DWC - LRU hit ratio delta 30.56%
DWC / LRU hit ratio ratio 3270%
GLI 750
LRU hit ratio 1.16%
DWC hit ratio 41.55%
DWC - LRU hit ratio delta 40.39%
DWC / LRU hit ratio ratio 3571%
GLI 1,000
LRU hit ratio 11.22%
DWC hit ratio 49.30%
DWC - LRU hit ratio delta 38.08%
DWC / LRU hit ratio ratio 439%
GLI 1,250
LRU hit ratio 21.25%
DWC hit ratio 52.42%
DWC - LRU hit ratio delta 31.16%
DWC / LRU hit ratio ratio 246%
GLI 1,500
LRU hit ratio 36.56%
DWC hit ratio 53.49%
DWC - LRU hit ratio delta 16.92%
DWC / LRU hit ratio ratio 146%
GLI 1,750
LRU hit ratio 45.04%
DWC hit ratio 55.60%
DWC - LRU hit ratio delta 10.55%
DWC / LRU hit ratio ratio 123%
GLI 2,000
LRU hit ratio 57.41%
DWC hit ratio 57.96%
DWC - LRU hit ratio delta 0.54%
DWC / LRU hit ratio ratio 100%
Throughput
60-90% of lru-cache.
No result with 10,000,000 because lru-cache crushes with the next error on the next machine of GitHub Actions.
It is verified that the error was thrown also when benchmarking only lru-cache.
Of course it is verified that DWC works fine under the same condition.
Error: Uncaught RangeError: Map maximum size exceeded
System:
OS: Linux 5.15 Ubuntu 20.04.5 LTS (Focal Fossa)
CPU: (2) x64 Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Memory: 5.88 GB / 6.78 GB
Clock: spica/clock
ISCCache: lru-cache
LRUCache: spica/lru
DW-Cache: spica/cache
'Clock new x 1,681,457 ops/sec ±1.18% (117 runs sampled)'
'ISCCache new x 17,725 ops/sec ±0.84% (119 runs sampled)'
'LRUCache new x 32,201,503 ops/sec ±0.26% (123 runs sampled)'
'DW-Cache new x 6,962,509 ops/sec ±0.28% (123 runs sampled)'
'Clock simulation 100 x 20,067,017 ops/sec ±0.27% (122 runs sampled)'
'ISCCache simulation 100 x 13,461,209 ops/sec ±0.36% (122 runs sampled)'
'LRUCache simulation 100 x 16,250,516 ops/sec ±0.53% (123 runs sampled)'
'DW-Cache simulation 100 x 7,809,927 ops/sec ±0.29% (123 runs sampled)'
'Clock simulation 1,000 x 16,387,821 ops/sec ±0.73% (123 runs sampled)'
'ISCCache simulation 1,000 x 11,893,246 ops/sec ±0.27% (123 runs sampled)'
'LRUCache simulation 1,000 x 13,991,155 ops/sec ±0.50% (123 runs sampled)'
'DW-Cache simulation 1,000 x 7,221,664 ops/sec ±0.33% (123 runs sampled)'
'Clock simulation 10,000 x 14,583,648 ops/sec ±0.51% (123 runs sampled)'
'ISCCache simulation 10,000 x 9,693,221 ops/sec ±0.58% (122 runs sampled)'
'LRUCache simulation 10,000 x 10,558,716 ops/sec ±0.82% (122 runs sampled)'
'DW-Cache simulation 10,000 x 6,604,530 ops/sec ±0.25% (123 runs sampled)'
'Clock simulation 100,000 x 8,904,873 ops/sec ±1.76% (118 runs sampled)'
'ISCCache simulation 100,000 x 5,527,676 ops/sec ±1.70% (114 runs sampled)'
'LRUCache simulation 100,000 x 6,309,913 ops/sec ±2.23% (114 runs sampled)'
'DW-Cache simulation 100,000 x 4,970,814 ops/sec ±2.23% (109 runs sampled)'
'Clock simulation 1,000,000 x 4,708,193 ops/sec ±4.16% (102 runs sampled)'
'ISCCache simulation 1,000,000 x 2,396,780 ops/sec ±4.46% (95 runs sampled)'
'LRUCache simulation 1,000,000 x 2,290,716 ops/sec ±3.21% (107 runs sampled)'
'DW-Cache simulation 1,000,000 x 2,088,055 ops/sec ±3.22% (110 runs sampled)'
const key = random() < 0.9
? random() * capacity * 1 | 0
: random() * capacity * 9 + capacity | 0;
cache.get(key) ?? cache.set(key, {});
API
export namespace Cache {
export interface Options<K, V = undefined> {
readonly capacity?: number;
readonly resource?: number;
readonly age?: number;
readonly eagerExpiration?: boolean;
readonly disposer?: (value: V, key: K) => void;
readonly capture?: {
readonly delete?: boolean;
readonly clear?: boolean;
};
readonly window?: number;
readonly sample?: number;
readonly sweep?: {
readonly threshold?: number;
readonly ratio?: number;
readonly window?: number;
readonly room?: number;
readonly range?: number;
readonly shift?: number;
};
}
}
export class Cache<K, V> {
constructor(capacity: number, opts?: Cache.Options<K, V>);
constructor(opts: Cache.Options<K, V>);
add(key: K, value: V, opts?: { size?: number; age?: number; }): boolean;
add(this: Cache<K, undefined>, key: K, value?: V, opts?: { size?: number; age?: number; }): boolean;
put(key: K, value: V, opts?: { size?: number; age?: number; }): boolean;
put(this: Cache<K, undefined>, key: K, value?: V, opts?: { size?: number; age?: number; }): boolean;
set(key: K, value: V, opts?: { size?: number; age?: number; }): this;
set(this: Cache<K, undefined>, key: K, value?: V, opts?: { size?: number; age?: number; }): this;
get(key: K): V | undefined;
has(key: K): boolean;
delete(key: K): boolean;
clear(): void;
resize(capacity: number, resource?: number): void;
readonly length: number;
readonly size: number;
[Symbol.iterator](): Iterator<[K, V], undefined, undefined>;
}