Dual Window Cache

The highest performance constant complexity cache algorithm.
Maintenance
The source code is maintained on the next source repository.
https://github.com/falsandtru/spica
Efficiency
Mathematical efficiency
Some different cache algorithms require extra memory space to retain evicted keys.
Linear time complexity indicates the existence of batch processing.
Note that admission algorithm doesn't work without eviction algorithm.
Algorithm | Type | Time complexity (Worst case) | Space complexity (Extra) | Key size | Data structures |
---|
LRU | Evict | Constant | Constant | 1x | 1 list |
DWC | Evict | Constant | Constant | 1x | 2 lists |
ARC | Evict | Constant | Linear | 2x | 4 lists |
LIRS | Evict | Linear | Linear | 3-2500x | 2 lists |
TinyLFU | Admit | Linear | Linear | ~1-10x (8bit * 10N * 4) | 5 arrays |
W-TinyLFU | Admit | Linear | Linear | ~1-10x (8bit * 10N * 4) | 1 list 4 arrays |
https://github.com/ben-manes/caffeine/wiki/Efficiency
https://github.com/zhongch4g/LIRS2/blob/master/src/replace_lirs_base.cc
Engineering efficiency
A pointer is 8 bytes, bool and int8 are each 1 byte in C.
8 byte key and value (int64, float64, 8 chars)
Memoize, etc.
Algorithm | Entry overhead | Key size | Total per entry | Attenuation coefficient |
---|
LRU | 16 bytes | 1x | 32 bytes | 100.00% |
DWC | 17 bytes | 1x | 33 bytes | 96.96% |
ARC | 17 bytes | 2x | 58 bytes | 55.17% |
(LIRS) | 33 bytes | 3x | 131 bytes | 24.42% |
(LIRS) | 33 bytes | 10x | 418 bytes | 7.65% |
(TinyLFU) | 56 bytes | 1x | 72 bytes | 44.44% |
W-TinyLFU | 56 bytes | 1x | 72 bytes | 44.44% |
32 byte key and 8 byte value (Session ID / ID)
In-memory KVS, etc.
Algorithm | Entry overhead | Key size | Total per entry | Attenuation coefficient |
---|
LRU | 16 bytes | 1x | 56 bytes | 100.00% |
DWC | 17 bytes | 1x | 57 bytes | 98.24% |
ARC | 17 bytes | 2x | 88 bytes | 63.63% |
(LIRS) | 33 bytes | 3x | 203 bytes | 27.58% |
(LIRS) | 33 bytes | 10x | 658 bytes | 8.51% |
(TinyLFU) | 56 bytes | 1x | 96 bytes | 58.33% |
W-TinyLFU | 56 bytes | 1x | 96 bytes | 58.33% |
16 byte key and 512 byte value (Domain / DNS packet)
DNS cache server, etc.
Algorithm | Entry overhead | Key size | Total per entry | Attenuation coefficient |
---|
LRU | 16 bytes | 1x | 544 bytes | 100.00% |
DWC | 17 bytes | 1x | 545 bytes | 99.81% |
ARC | 17 bytes | 2x | 578 bytes | 94.11% |
(LIRS) | 33 bytes | 3x | 659 bytes | 82.54% |
(LIRS) | 33 bytes | 10x | 1,002 bytes | 54.29% |
(TinyLFU) | 56 bytes | 1x | 584 bytes | 93.15% |
W-TinyLFU | 56 bytes | 1x | 584 bytes | 93.15% |
Resistance
LIRS's burst resistance means the resistance to continuous cache misses for the last LIR entry or the HIR entries.
Algorithm | Type | Scan | Loop | Burst |
---|
LRU | Evict | | | ✓ |
DWC | Evict | ✓ | ✓ | ✓ |
ARC | Evict | ✓ | | ✓ |
LIRS | Evict | ✓ | ✓ | |
TinyLFU | Admit | ✓ | ✓ | |
W-TinyLFU | Admit | ✓ | ✓ | ✓ |
Strategies
- Dynamic partition
- Sampled history injection
- Transitive wide MRU with cyclic replacement
Properties
Generally superior and almost flawless.
- Highest performance
- High hit ratio (DS1, S3, OLTP, GLI)
- Highest hit ratio of all the general-purpose cache algorithms.
- Approximate to W-TinyLFU (DS1, GLI).
- Approximate to ARC (S3, OLTP).
- W-TinyLFU is basically not a general-purpose cache algorithm due to some problems.
- W-TinyLFU is not a general-purpose cache algorithm without dynamic window and incremental reset.
- W-TinyLFU is impossible to efficiently implement without pointer addresses or fast hash functions.
- W-TinyLFU's benchmark settings are not described (Especially suspicious with OLTP).
- Highest engineering hit ratio of all the general cache algorithms.
- As a result of engineering efficiency.
- Low time overhead (High throughput)
- Low latency
- Constant time complexity.
- No batch processing like LIRS and TinyLFU.
- Parallel suitable
- Separated lists are suitable for lock-free processing.
- Efficient
- Low memory usage
- Largest capacity per memory size of all the advanced cache algorithms.
- Constant extra space complexity.
- Retain only keys of resident entries (No history).
- Immediate release of evicted keys
- Primary cache algorithm in the standard library must release memory immediately.
- Low space overhead
- Add only two smallest fields to entries.
- High resistance
- Scan, loop, and burst resistance
- Few tradeoffs
- Not the highest hit ratio
- Highest hit ratio of each workload is resulted by W-TinyLFU or ARC.
- Statistical accuracy dependent
- Very smaller cache size than sufficient can degrade hit ratio.
- Cache size 1,000 or more is recommended.
Tradeoffs
Note that LIRS and TinyLFU are risky cache algorithms.
- LRU
- Low performance
- No resistance
- Scan access clears all entries.
- ARC
- Middle performance
- Inefficient
- High overhead
- Few resistance
- DWC
- Not the highest hit ratio
- Statistical accuracy dependent
- LIRS
- Extremely inefficient
- Spike latency
- Bulk deletion of low-frequency entries takes linear time.
- Vulnerable algorithm
- Continuous cache misses for the last LIR entry or the HIR entries explode key size.
- TinyLFU
- Incomplete algorithm
- TinyLFU is just a vulnerable incomplete base-algorithm of W-TinyLFU.
- Burst access saturates Bloom filters.
- TinyLFU is worse than LRU in theory.
- Language dependent
- Impossible to efficiently implement without pointer addresses or fast hash functions.
- High overhead
- Read and write average 40 array elements per access.
- Restricted delete operation
- Bloom filters don't support delete operation.
- Frequent delete operations degrade performance.
- Spike latency
- Whole reset of Bloom filters takes linear time.
- Vulnerable algorithm
- Burst access degrades performance.
- W-TinyLFU
- Language dependent
- Impossible to efficiently implement without pointer addresses or fast hash functions.
- High overhead
- Read and write average 40 array elements per access.
- Restricted delete operation
- Bloom filters don't support delete operation.
- Frequent delete operations degrade performance.
- Spike latency
- Whole reset of Bloom filters takes linear time.
Hit ratio
Note that another cache algorithm sometimes changes the parameter values per workload to get a favorite result as the paper of TinyLFU has changed the window size of W-TinyLFU.
- DWC's results are measured by the same default parameter values.
- TinyLFU's results are the traces of Caffeine.
- Set the datasets to
./benchmark/trace
(See ./benchmark/ratio.ts
). - Run
npm i
. - Run
npm run bench
. - Click the DEBUG button to open a debug tab.
- Close the previous tab.
- Press F12 key to open devtools.
- Select the console tab.
https://github.com/dgraph-io/benchmarks
https://github.com/ben-manes/caffeine/wiki/Efficiency
https://docs.google.com/spreadsheets/d/1G3deNz1gJCoXBE2IuraUSwLE7H_EMn4Sn2GU0HTpI5Y (https://github.com/jedisct1/rust-arc-cache/issues/1)
DS1

W-TinyLFU, (TinyLFU) > DWC > (LIRS) > ARC > LRU
- DWC is an approximation of W-TinyLFU.
DS1 1,000,000
LRU hit ratio 3.08%
DWC hit ratio 14.73%
DWC - LRU hit ratio delta 11.65%
DWC / LRU hit ratio ratio 477%
DS1 2,000,000
LRU hit ratio 10.74%
DWC hit ratio 27.94%
DWC - LRU hit ratio delta 17.20%
DWC / LRU hit ratio ratio 260%
DS1 3,000,000
LRU hit ratio 18.59%
DWC hit ratio 39.46%
DWC - LRU hit ratio delta 20.87%
DWC / LRU hit ratio ratio 212%
DS1 4,000,000
LRU hit ratio 20.24%
DWC hit ratio 44.20%
DWC - LRU hit ratio delta 23.96%
DWC / LRU hit ratio ratio 218%
DS1 5,000,000
LRU hit ratio 21.03%
DWC hit ratio 50.19%
DWC - LRU hit ratio delta 29.16%
DWC / LRU hit ratio ratio 238%
DS1 6,000,000
LRU hit ratio 33.95%
DWC hit ratio 56.83%
DWC - LRU hit ratio delta 22.88%
DWC / LRU hit ratio ratio 167%
DS1 7,000,000
LRU hit ratio 38.89%
DWC hit ratio 62.55%
DWC - LRU hit ratio delta 23.65%
DWC / LRU hit ratio ratio 160%
DS1 8,000,000
LRU hit ratio 43.03%
DWC hit ratio 70.03%
DWC - LRU hit ratio delta 26.99%
DWC / LRU hit ratio ratio 162%
S3

W-TinyLFU, (TinyLFU) > (LIRS) > DWC > ARC > LRU
- DWC is an approximation of ARC.
S3 100,000
LRU hit ratio 2.32%
DWC hit ratio 10.14%
DWC - LRU hit ratio delta 7.81%
DWC / LRU hit ratio ratio 435%
S3 200,000
LRU hit ratio 4.63%
DWC hit ratio 20.25%
DWC - LRU hit ratio delta 15.61%
DWC / LRU hit ratio ratio 437%
S3 300,000
LRU hit ratio 7.58%
DWC hit ratio 27.39%
DWC - LRU hit ratio delta 19.80%
DWC / LRU hit ratio ratio 360%
S3 400,000
LRU hit ratio 12.03%
DWC hit ratio 32.69%
DWC - LRU hit ratio delta 20.65%
DWC / LRU hit ratio ratio 271%
S3 500,000
LRU hit ratio 22.76%
DWC hit ratio 38.12%
DWC - LRU hit ratio delta 15.35%
DWC / LRU hit ratio ratio 167%
S3 600,000
LRU hit ratio 34.63%
DWC hit ratio 46.82%
DWC - LRU hit ratio delta 12.19%
DWC / LRU hit ratio ratio 135%
S3 700,000
LRU hit ratio 46.04%
DWC hit ratio 55.71%
DWC - LRU hit ratio delta 9.66%
DWC / LRU hit ratio ratio 120%
S3 800,000
LRU hit ratio 56.59%
DWC hit ratio 64.03%
DWC - LRU hit ratio delta 7.43%
DWC / LRU hit ratio ratio 113%
OLTP

ARC > DWC > W-TinyLFU > (LIRS) > LRU > (TinyLFU)
- DWC is an approximation of ARC.
OLTP 250
LRU hit ratio 16.47%
DWC hit ratio 19.59%
DWC - LRU hit ratio delta 3.11%
DWC / LRU hit ratio ratio 118%
OLTP 500
LRU hit ratio 23.44%
DWC hit ratio 29.12%
DWC - LRU hit ratio delta 5.68%
DWC / LRU hit ratio ratio 124%
OLTP 750
LRU hit ratio 28.28%
DWC hit ratio 34.90%
DWC - LRU hit ratio delta 6.62%
DWC / LRU hit ratio ratio 123%
OLTP 1,000
LRU hit ratio 32.83%
DWC hit ratio 37.93%
DWC - LRU hit ratio delta 5.10%
DWC / LRU hit ratio ratio 115%
OLTP 1,250
LRU hit ratio 36.20%
DWC hit ratio 39.96%
DWC - LRU hit ratio delta 3.75%
DWC / LRU hit ratio ratio 110%
OLTP 1,500
LRU hit ratio 38.69%
DWC hit ratio 41.79%
DWC - LRU hit ratio delta 3.09%
DWC / LRU hit ratio ratio 108%
OLTP 1,750
LRU hit ratio 40.78%
DWC hit ratio 43.43%
DWC - LRU hit ratio delta 2.64%
DWC / LRU hit ratio ratio 106%
OLTP 2,000
LRU hit ratio 42.46%
DWC hit ratio 44.70%
DWC - LRU hit ratio delta 2.23%
DWC / LRU hit ratio ratio 105%
GLI

W-TinyLFU, (TinyLFU), (LIRS) > DWC >> ARC > LRU
- DWC is an approximation of W-TinyLFU.
GLI 250
LRU hit ratio 0.93%
DWC hit ratio 15.44%
DWC - LRU hit ratio delta 14.51%
DWC / LRU hit ratio ratio 1658%
GLI 500
LRU hit ratio 0.96%
DWC hit ratio 31.53%
DWC - LRU hit ratio delta 30.56%
DWC / LRU hit ratio ratio 3270%
GLI 750
LRU hit ratio 1.16%
DWC hit ratio 41.55%
DWC - LRU hit ratio delta 40.39%
DWC / LRU hit ratio ratio 3571%
GLI 1,000
LRU hit ratio 11.22%
DWC hit ratio 49.30%
DWC - LRU hit ratio delta 38.08%
DWC / LRU hit ratio ratio 439%
GLI 1,250
LRU hit ratio 21.25%
DWC hit ratio 52.42%
DWC - LRU hit ratio delta 31.16%
DWC / LRU hit ratio ratio 246%
GLI 1,500
LRU hit ratio 36.56%
DWC hit ratio 53.49%
DWC - LRU hit ratio delta 16.92%
DWC / LRU hit ratio ratio 146%
GLI 1,750
LRU hit ratio 45.04%
DWC hit ratio 55.60%
DWC - LRU hit ratio delta 10.55%
DWC / LRU hit ratio ratio 123%
GLI 2,000
LRU hit ratio 57.41%
DWC hit ratio 57.96%
DWC - LRU hit ratio delta 0.54%
DWC / LRU hit ratio ratio 100%
Throughput
No result with 10,000,000 because lru-cache crushes with the next error on the next machine of GitHub Actions.
It is verified that the error was thrown also when benchmarking only lru-cache.
Of course it is verified that DWC works fine under the same condition.
Error: Uncaught RangeError: Map maximum size exceeded
System:
OS: Linux 5.15 Ubuntu 20.04.5 LTS (Focal Fossa)
CPU: (2) x64 Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Memory: 5.88 GB / 6.78 GB
Clock: spica/clock
ISCCache: lru-cache
LRUCache: spica/lru
DW-Cache: spica/cache
'Clock new x 1,650,836 ops/sec ±1.94% (94 runs sampled)'
'ISCCache new x 18,042 ops/sec ±0.66% (105 runs sampled)'
'LRUCache new x 30,098,951 ops/sec ±0.23% (106 runs sampled)'
'DW-Cache new x 7,021,323 ops/sec ±0.30% (105 runs sampled)'
'Clock simulation 100 10% x 9,762,593 ops/sec ±0.36% (107 runs sampled)'
'ISCCache simulation 100 10% x 8,761,469 ops/sec ±0.38% (107 runs sampled)'
'LRUCache simulation 100 10% x 10,769,407 ops/sec ±0.28% (107 runs sampled)'
'DW-Cache simulation 100 10% x 7,242,192 ops/sec ±0.50% (105 runs sampled)'
'Clock simulation 1,000 10% x 9,601,967 ops/sec ±0.48% (107 runs sampled)'
'ISCCache simulation 1,000 10% x 7,986,140 ops/sec ±0.58% (106 runs sampled)'
'LRUCache simulation 1,000 10% x 9,735,550 ops/sec ±0.41% (106 runs sampled)'
'DW-Cache simulation 1,000 10% x 6,592,345 ops/sec ±0.37% (107 runs sampled)'
'Clock simulation 10,000 10% x 9,344,809 ops/sec ±0.40% (105 runs sampled)'
'ISCCache simulation 10,000 10% x 7,193,304 ops/sec ±0.83% (106 runs sampled)'
'LRUCache simulation 10,000 10% x 8,881,517 ops/sec ±0.41% (104 runs sampled)'
'DW-Cache simulation 10,000 10% x 6,020,040 ops/sec ±0.50% (106 runs sampled)'
'Clock simulation 100,000 10% x 5,948,133 ops/sec ±1.22% (101 runs sampled)'
'ISCCache simulation 100,000 10% x 3,654,505 ops/sec ±1.47% (101 runs sampled)'
'LRUCache simulation 100,000 10% x 5,615,930 ops/sec ±1.35% (100 runs sampled)'
'DW-Cache simulation 100,000 10% x 4,255,377 ops/sec ±1.79% (97 runs sampled)'
'Clock simulation 1,000,000 10% x 2,605,647 ops/sec ±3.98% (93 runs sampled)'
'ISCCache simulation 1,000,000 10% x 1,453,643 ops/sec ±2.92% (95 runs sampled)'
'LRUCache simulation 1,000,000 10% x 2,081,983 ops/sec ±4.23% (88 runs sampled)'
'DW-Cache simulation 1,000,000 10% x 2,598,274 ops/sec ±4.42% (89 runs sampled)'
'Clock simulation 100 90% x 25,014,146 ops/sec ±0.33% (107 runs sampled)'
'ISCCache simulation 100 90% x 22,495,828 ops/sec ±0.74% (105 runs sampled)'
'LRUCache simulation 100 90% x 20,969,655 ops/sec ±0.84% (107 runs sampled)'
'DW-Cache simulation 100 90% x 9,730,398 ops/sec ±0.32% (107 runs sampled)'
'Clock simulation 1,000 90% x 23,025,311 ops/sec ±0.51% (107 runs sampled)'
'ISCCache simulation 1,000 90% x 19,347,819 ops/sec ±0.34% (107 runs sampled)'
'LRUCache simulation 1,000 90% x 18,240,448 ops/sec ±0.28% (107 runs sampled)'
'DW-Cache simulation 1,000 90% x 11,382,934 ops/sec ±0.19% (108 runs sampled)'
'Clock simulation 10,000 90% x 20,506,917 ops/sec ±0.25% (105 runs sampled)'
'ISCCache simulation 10,000 90% x 15,441,103 ops/sec ±1.24% (105 runs sampled)'
'LRUCache simulation 10,000 90% x 13,104,661 ops/sec ±0.61% (105 runs sampled)'
'DW-Cache simulation 10,000 90% x 8,747,757 ops/sec ±0.92% (107 runs sampled)'
'Clock simulation 100,000 90% x 12,049,875 ops/sec ±1.49% (100 runs sampled)'
'ISCCache simulation 100,000 90% x 8,173,371 ops/sec ±1.17% (102 runs sampled)'
'LRUCache simulation 100,000 90% x 8,188,424 ops/sec ±2.08% (100 runs sampled)'
'DW-Cache simulation 100,000 90% x 5,973,422 ops/sec ±2.65% (100 runs sampled)'
'Clock simulation 1,000,000 90% x 5,578,321 ops/sec ±4.20% (92 runs sampled)'
'ISCCache simulation 1,000,000 90% x 2,963,294 ops/sec ±2.91% (95 runs sampled)'
'LRUCache simulation 1,000,000 90% x 2,235,658 ops/sec ±2.83% (95 runs sampled)'
'DW-Cache simulation 1,000,000 90% x 1,931,442 ops/sec ±2.32% (98 runs sampled)'
'ISCCache simulation 100 90% expire x 4,172,541 ops/sec ±5.34% (94 runs sampled)'
'DW-Cache simulation 100 90% expire x 8,241,722 ops/sec ±0.42% (107 runs sampled)'
'ISCCache simulation 1,000 90% expire x 4,169,949 ops/sec ±3.98% (97 runs sampled)'
'DW-Cache simulation 1,000 90% expire x 8,218,212 ops/sec ±0.30% (107 runs sampled)'
'ISCCache simulation 10,000 90% expire x 3,539,574 ops/sec ±4.02% (98 runs sampled)'
'DW-Cache simulation 10,000 90% expire x 6,338,384 ops/sec ±1.07% (105 runs sampled)'
'ISCCache simulation 100,000 90% expire x 2,429,074 ops/sec ±4.48% (94 runs sampled)'
'DW-Cache simulation 100,000 90% expire x 1,977,169 ops/sec ±2.71% (86 runs sampled)'
'ISCCache simulation 1,000,000 90% expire x 448,719 ops/sec ±5.09% (82 runs sampled)'
'DW-Cache simulation 1,000,000 90% expire x 629,254 ops/sec ±3.81% (98 runs sampled)'
API
export namespace Cache {
export interface Options<K, V = undefined> {
readonly capacity?: number;
readonly resource?: number;
readonly age?: number;
readonly eagerExpiration?: boolean;
readonly disposer?: (value: V, key: K) => void;
readonly capture?: {
readonly delete?: boolean;
readonly clear?: boolean;
};
readonly window?: number;
readonly sample?: number;
readonly sweep?: {
readonly threshold?: number;
readonly ratio?: number;
readonly window?: number;
readonly room?: number;
readonly range?: number;
readonly shift?: number;
};
}
}
export class Cache<K, V> {
constructor(capacity: number, opts?: Cache.Options<K, V>);
constructor(opts: Cache.Options<K, V>);
add(key: K, value: V, opts?: { size?: number; age?: number; }): boolean;
add(this: Cache<K, undefined>, key: K, value?: V, opts?: { size?: number; age?: number; }): boolean;
put(key: K, value: V, opts?: { size?: number; age?: number; }): boolean;
put(this: Cache<K, undefined>, key: K, value?: V, opts?: { size?: number; age?: number; }): boolean;
set(key: K, value: V, opts?: { size?: number; age?: number; }): this;
set(this: Cache<K, undefined>, key: K, value?: V, opts?: { size?: number; age?: number; }): this;
get(key: K): V | undefined;
has(key: K): boolean;
delete(key: K): boolean;
clear(): void;
resize(capacity: number, resource?: number): void;
readonly length: number;
readonly size: number;
[Symbol.iterator](): Iterator<[K, V], undefined, undefined>;
}