memoize-fs
Node.js solution for memoizing/caching a function and its return state into the file system
Motivation
This project is inspired by the memoize project by Mariusz Nowak aka medikoo.
The motivation behind this module is that sometimes you have to persist cached function calls but do not want to deal with an extra process (i.e. managing a Redis store).
Memoization is a good technique for saving CPU cycles when dealing with repeated operations, at the cost of some storage. For detailed insight see:
http://en.wikipedia.org/wiki/Memoization
Features
Installation
In your project path:
npm install memoize-fs --save
Usage
const assert = require('assert')
const memoizeFs = require('memoize-fs')
const memoizer = memoizeFs({ cachePath: './some-cache' })
console.log(memoizer)
async function main () {
  let idx = 0
  const func = function foo (a, b) {
    idx += a + b
    return idx
  }
  const memoizedFn = await memoizer.fn(func)
  const resultOne = await memoizedFn(1, 2)
  assert.strictEqual(resultOne, 3)
  assert.strictEqual(idx, 3)
  const resultTwo = await memoizedFn(1, 2)
  assert.strictEqual(resultTwo, 3)
  assert.strictEqual(idx, 3)
}
main().catch(console.error)
NOTE: a memoized function is always an async function, so its result is a Promise unless it is awaited, as in the example above.
Signature
See Types and Options sections for more info.
const memoizer = memoizeFs(MemoizeOptions)
console.log(memoizer)
const memoizedFn = await memoizer.fn(FunctionToMemoize, Options?)
Memoizing asynchronous functions
memoize-fs treats a function as asynchronous if the last argument it accepts is a function
and that function itself accepts at least one argument.
So you don't have to do anything differently than when memoizing synchronous functions; just make sure the above condition is fulfilled.
Here is an example of memoizing a function with a callback:
var funAsync = function (a, b, cb) {
  setTimeout(function () {
    cb(null, a + b);
  }, 100);
};
memoizer.fn(funAsync).then(function (memFn) {
  memFn(1, 2, function (err, sum) { if (err) { throw err; } console.log(sum); }).then(function () {
    return memFn(1, 2, function (err, sum) { if (err) { throw err; } console.log(sum); });
  }).then(function () {
    // both calls are done; the second one was served from the cache
  }).catch(console.error);
}).catch(console.error);
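The detection rule described above can be sketched as a small predicate. This is only an illustration of the stated condition, assuming it is checked against the arguments of a call; `looksLikeCallbackStyle` is a hypothetical name and not part of the memoize-fs API.

```javascript
// Hypothetical helper illustrating the rule; not the memoize-fs implementation.
function looksLikeCallbackStyle (args) {
  const last = args[args.length - 1]
  // Function.prototype.length is the number of declared parameters.
  return typeof last === 'function' && last.length > 0
}

const withCallback = looksLikeCallbackStyle([1, 2, function (err, sum) {}])
const withoutCallback = looksLikeCallbackStyle([1, 2])
const zeroArityCallback = looksLikeCallbackStyle([1, 2, function () {}])
console.log(withCallback, withoutCallback, zeroArityCallback)
```

Under this reading, a trailing callback that declares no parameters would not count as asynchronous, which is why the callback in the example above declares `err` and `sum`.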
Memoizing promisified functions
You can also memoize a promisified function. memoize-fs treats a function as promisified if its result is thenable,
which means that the result is an object with a property then of type function.
So again it's the same as with memoizing synchronous functions.
Here is an example of memoizing a promisified function:
var funPromisified = function (a, b) {
  return new Promise(function (resolve, reject) {
    setTimeout(function () { resolve(a + b); }, 100);
  });
};
memoizer.fn(funPromisified).then(function (memFn) {
  memFn(1, 2).then(function (result) {
    assert.strictEqual(result, 3);
    return memFn(1, 2);
  }).then(function (result) {
    assert.strictEqual(result, 3);
  }).catch(console.error);
}).catch(console.error);
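A thenable check along the lines described above might look like the following. `isThenable` is a hypothetical helper written for illustration; the library's own check may differ in detail.

```javascript
// Hypothetical helper: true for any object (or function) with a callable `then`.
function isThenable (value) {
  return value !== null &&
    (typeof value === 'object' || typeof value === 'function') &&
    typeof value.then === 'function'
}

const fromPromise = isThenable(Promise.resolve(3))
const fromCustomObject = isThenable({ then () {} })
const fromNumber = isThenable(3)
const fromNull = isThenable(null)
console.log(fromPromise, fromCustomObject, fromNumber, fromNull)
```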
Types
export interface Options {
  cacheId?: string;
  salt?: string;
  maxAge?: number;
  force?: boolean;
  astBody?: boolean;
  noBody?: boolean;
  retryOnInvalidCache?: boolean;
  serialize?: (val?: any) => string;
  deserialize?: (val?: string) => any;
}
export type MemoizeOptions = Options & { cachePath: string };
export type FnToMemoize = (...args: any[]) => any;
export interface Memoizer {
  fn: (fnToMemoize: FnToMemoize, options?: Options) => Promise<FnToMemoize>;
  invalidate: (id?: string) => Promise<any>;
  getCacheFilePath: (fnToMemoize: FnToMemoize, args: any[], options: Options) => string;
}
declare function memoizeFs(options: MemoizeOptions): Memoizer;
export = memoizeFs;
Options
When memoizing a function, all of the options below can be applied in any combination.
The only required option is cachePath.
cachePath
Path to the location of the cache on the disk. This option is always required.
cacheId
By default all cache files are saved into the root cache which is the folder specified by the cachePath option:
var path = require('path')
var memoizer = require('memoize-fs')({ cachePath: path.join(__dirname, '../../cache') })
The cacheId option, which you can specify when memoizing a function, resolves to the name of a subfolder created inside the root cache folder.
Function calls will be cached inside that folder:
memoizer.fn(fnToMemoize, { cacheId: 'foobar' })
salt
Functions may have references to variables outside their own scope. As a consequence two functions which look exactly the same
(they have the same function signature and function body) can return different results even when executed with identical arguments.
To avoid the same cache being used for two different functions, you can use the salt option,
which mutates the hash key created for the memoized function, which in turn defines the name of the cache file:
memoizer.fn(fnToMemoize, { salt: 'foobar' })
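The problem that salt addresses can be reproduced without the library. In this sketch, `makeAdder` is a hypothetical factory whose returned functions have identical source text but capture different outer variables.

```javascript
// Two closures with the same signature and body, but different outer scope.
function makeAdder (offset) {
  return function add (a) { return a + offset }
}

const addOne = makeAdder(1)
const addTen = makeAdder(10)

// The source text is identical, so a hash over body and arguments cannot tell them apart.
const sameSource = String(addOne) === String(addTen)
// The results differ for the same argument.
const sameResult = addOne(1) === addTen(1)
console.log(sameSource, sameResult)
```

Memoizing each of these with a different salt value gives each one its own cache file.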
maxAge
With the maxAge option you can ensure that the cache for a given call is cleared after a predefined period of time (in milliseconds):
memoizer.fn(fnToMemoize, { maxAge: 10000 })
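One way to reason about maxAge is as a comparison between the age of a cache entry and the configured limit. The sketch below assumes a timestamp-based check; `isExpired` is a hypothetical helper, and the library may well clear entries by other means (for example with a timer).

```javascript
// Hypothetical expiry check; all values are in milliseconds.
function isExpired (writtenAtMs, maxAgeMs, nowMs) {
  return nowMs - writtenAtMs >= maxAgeMs
}

const fresh = isExpired(1000, 10000, 5000) // entry is 4000 ms old
const stale = isExpired(1000, 10000, 11000) // entry is 10000 ms old
console.log(fresh, stale)
```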
force
The force option forces the re-execution of an already memoized function and the re-caching of its outcome:
memoizer.fn(fnToMemoize, { force: true })
astBody
If you want to use the function AST instead of the function body when generating the hash (see serialization), set the option astBody to true. This allows the function source code to be reformatted without busting the cache. See https://github.com/borisdiakur/memoize-fs/issues/6 for details.
memoizer.fn(fnToMemoize, { astBody: true })
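To see why reformatting busts a cache keyed on the function body, compare the source text of two equivalent functions. The SHA-256 digest here is an assumption chosen for illustration and is not necessarily the hash memoize-fs uses.

```javascript
const crypto = require('crypto')

const compact = function (a, b) { return a + b }
const spacious = function (a, b) {
  return a + b
}

// Hash the source text exactly as written, including whitespace.
const digest = (fn) => crypto.createHash('sha256').update(String(fn)).digest('hex')

const sameBehaviour = compact(1, 2) === spacious(1, 2)
const sameDigest = digest(compact) === digest(spacious)
console.log(sameBehaviour, sameDigest)
```

Hashing a representation that ignores formatting, such as the AST, avoids this mismatch.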
noBody
If for some reason you want to omit the function body when generating the hash (see serialization), set the option noBody to true.
memoizer.fn(fnToMemoize, { noBody: true })
retryOnInvalidCache
By default, undefined is returned when trying to read an invalid cache file, for example when trying to parse an empty file with JSON.parse. By enabling retryOnInvalidCache, the memoized function will be called again, and a new cache file will be written.
memoizer.fn(fnToMemoize, { retryOnInvalidCache: true })
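The two behaviours can be sketched with plain Node.js file APIs. `readCached` is a hypothetical, synchronous stand-in written for illustration and is not how memoize-fs is implemented internally.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical reader: returns the parsed cache, or falls back when it is invalid.
function readCached (filePath, compute, retryOnInvalidCache) {
  try {
    return JSON.parse(fs.readFileSync(filePath, 'utf8'))
  } catch (err) {
    if (!retryOnInvalidCache) return undefined
    const fresh = compute() // call the original function again
    fs.writeFileSync(filePath, JSON.stringify(fresh)) // write a new cache file
    return fresh
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'memoize-fs-sketch-'))
const file = path.join(dir, 'entry.json')
fs.writeFileSync(file, '') // an empty file is not valid JSON

const withoutRetry = readCached(file, () => 42, false)
const withRetry = readCached(file, () => 42, true)
const afterRetry = readCached(file, () => 0, false) // now read from the repaired file
fs.rmSync(dir, { recursive: true, force: true })
console.log(withoutRetry, withRetry, afterRetry)
```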
serialize and deserialize
These two options allow you to control how serialization and deserialization work.
By default we use basic JSON.stringify and JSON.parse, but you may need something more advanced.
In the following example we use Yahoo's serialize-javascript to properly cache a return value that contains a function.
const memoizeFs = require('memoize-fs')
const serialize = require('serialize-javascript')
const deserialize = (serializedJsString) => eval(`(${serializedJsString})`)
const memoizer = memoizeFs({ cachePath: './cache', serialize, deserialize })
function someFn (a) {
  const bar = 123
  setTimeout(() => {}, a * 10)
  return {
    bar,
    getBar() { return a + bar }
  }
}
memoizer.fn(someFn)
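If pulling in an extra dependency or using eval is not desirable, a custom pair built on the JSON replacer and reviver hooks can cover specific types. The sketch below round-trips a Map; the `__type` marker is a hypothetical convention invented for this example.

```javascript
// Serialize Map instances as tagged plain objects.
const serialize = (val) => JSON.stringify(val, (key, value) =>
  value instanceof Map ? { __type: 'Map', entries: [...value] } : value
)

// Rebuild Map instances from the tagged objects.
const deserialize = (str) => JSON.parse(str, (key, value) =>
  value && value.__type === 'Map' ? new Map(value.entries) : value
)

const roundTripped = deserialize(serialize({ data: new Map([['a', 1]]) }))
console.log(roundTripped.data instanceof Map, roundTripped.data.get('a'))
```

Both functions match the serialize and deserialize signatures listed under Types, so they could be passed alongside cachePath in the same way as in the example above.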
Manual cache invalidation
You can delete the root cache (all cache files inside the folder specified by the cachePath option):
memoizer.invalidate().then(() => { console.log('cache cleared') })
You can also pass the cacheId argument to the invalidate method. This way you only delete the cache inside the subfolder with given id.
memoizer.invalidate('foobar').then(() => { console.log('cache for "foobar" cleared') })
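In file system terms, the two forms of invalidation amount to removing either a subfolder or the whole root cache folder. The following is a minimal sketch of that idea using a temporary directory; `invalidateSketch` is a hypothetical helper, not the library's implementation.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: remove one cacheId subfolder, or the whole root cache.
function invalidateSketch (cachePath, cacheId) {
  const target = cacheId ? path.join(cachePath, cacheId) : cachePath
  fs.rmSync(target, { recursive: true, force: true })
}

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'memoize-fs-sketch-'))
fs.mkdirSync(path.join(root, 'foobar'))
fs.writeFileSync(path.join(root, 'foobar', 'entry'), '3')
fs.writeFileSync(path.join(root, 'other'), '4')

invalidateSketch(root, 'foobar')
const subfolderGone = !fs.existsSync(path.join(root, 'foobar'))
const restKept = fs.existsSync(path.join(root, 'other'))

invalidateSketch(root)
const rootGone = !fs.existsSync(root)
console.log(subfolderGone, restKept, rootGone)
```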
Serialization
See also options.serialize and options.deserialize.
memoize-fs uses JSON to serialize the results of a memoized function.
It also uses JSON to serialize the arguments of the memoized function in order to create a hash,
which is used as the name of the cache file to be stored or retrieved.
The hash is created from the serialized arguments, the function body and the salt (if provided as an option).
You can get the resulting cache file path using memoizer.getCacheFilePath:
var memoizer = require('memoize-fs')({ cachePath: './' })
memoizer.getCacheFilePath(function () {}, ['arg', 'arg'], { cacheId: 'foobar' })
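The way the three inputs combine into a key can be sketched as follows. The choice of SHA-256, the join separator and the name `cacheKeySketch` are assumptions made for illustration; the actual hash produced by memoize-fs will differ.

```javascript
const crypto = require('crypto')

// Hypothetical key derivation from serialized arguments, function body and salt.
function cacheKeySketch (fn, args, salt) {
  const parts = [JSON.stringify(args), String(fn), salt || '']
  return crypto.createHash('sha256').update(parts.join('\n')).digest('hex')
}

const add = function (a, b) { return a + b }

const keyA = cacheKeySketch(add, [1, 2])
const keyARepeat = cacheKeySketch(add, [1, 2])
const keyOtherArgs = cacheKeySketch(add, [2, 1])
const keySalted = cacheKeySketch(add, [1, 2], 'foobar')
console.log(keyA === keyARepeat, keyA === keyOtherArgs, keyA === keySalted)
```

Identical inputs map to the same key, while changing the arguments or the salt yields a different one.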
Since memoize-fs is using JSON for serialization, you should know how it works around some of its "limitations":
- It ignores circular references silently
- It ignores arguments and attributes of type function silently
- It converts NaN to undefined silently
- It converts all objects, no matter what class they were an instance of, to objects with prototype Object (see #16)
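For context, this is how plain JSON.stringify and JSON.parse behave on such values without any workaround. The snippet uses only built-ins; `Point` is a hypothetical class for the demonstration.

```javascript
class Point {
  constructor (x) { this.x = x }
}

// Function-valued attributes are dropped and class instances come back as plain objects.
const plain = JSON.parse(JSON.stringify({ p: new Point(1), fn () {} }))
const functionDropped = !('fn' in plain)
const prototypeLost = !(plain.p instanceof Point)

// Circular references make plain JSON.stringify throw a TypeError.
const circular = { name: 'loop' }
circular.self = circular
let circularThrows = false
try {
  JSON.stringify(circular)
} catch (err) {
  circularThrows = err instanceof TypeError
}
console.log(functionDropped, prototypeLost, circularThrows)
```

Per the list above, memoize-fs ignores circular references silently instead of throwing.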
Some "limitations" can not (yet?) be worked around:
- Serializing huge objects will fail with one of the following two error messages:
RangeError: Invalid string length
at Object.stringify (native)
at stringifyResult (node_modules/memoize-fs/index.js:x:y) -> line where memoize-fs uses JSON.stringify
FATAL ERROR: JS Allocation failed - process out of memory
Common pitfalls
- Be careful when memoizing a function which uses variables from the outer scope.
  The value of these variables may change during runtime but the cached result will remain the same
  when calling the memoized function with the same arguments as the first time when the result was cached.
- You should know about how memoize-fs handles serialization under the hood.
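The first pitfall can be reproduced with a tiny in-memory stand-in for a memoizer. `memoizeInMemory` is a hypothetical helper used only so the example runs without touching the file system; the same effect applies to a file-backed cache.

```javascript
// Hypothetical in-memory memoizer, keyed on the JSON-serialized arguments.
function memoizeInMemory (fn) {
  const cache = new Map()
  return function (...args) {
    const key = JSON.stringify(args)
    if (!cache.has(key)) cache.set(key, fn(...args))
    return cache.get(key)
  }
}

let factor = 2 // variable from the outer scope
const scale = memoizeInMemory((n) => n * factor)

const first = scale(5) // computed with factor = 2
factor = 3
const second = scale(5) // served from the cache, still based on factor = 2
const uncached = 5 * factor // what a fresh call would return
console.log(first, second, uncached)
```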
Contributing
Issues and pull requests are absolutely welcome. If you want to submit a patch, please make sure that you follow this simple rule:
All code in any code-base should look like a single person typed it, no matter how
many people contributed. — idiomatic.js
Lint with:
npm run lint
Test with:
npm run mocha
Check code coverage with:
npm run cov
Then please commit with a detailed commit message.