glob-cache
Best and fastest file globbing solution for Node.js - can use any glob
library like glob, globby or fast-glob! Streaming, Promise and Hook
APIs, with built in caching layer using cacache. Makes you Instant Fast™.
Please consider following this project's author,
Charlike Mike Reagent, and :star: the project
to show your :heart: and support.
If you have any how-to kind of questions, please read the Contributing
Guide and Code of Conduct documents.
For bug reports and feature requests, please create an issue
or ping @tunnckoCore at Twitter.
Project is semantically versioned & automatically released
from GitHub Actions with
Lerna.
Topic | Contact |
---|---|
Any legal or licensing questions, like private or commercial use | |
For any critical problems and security reports | |
Consulting, professional support, personal or team training | |
For any questions about Open Source, partnerships and sponsoring | |
Install
This project requires Node.js >=10.18 (see
Support & Release Policy).
Install it using yarn or
npm.
We highly recommend using Yarn if you
plan to contribute to this project.
$ yarn add glob-cache
API
Generated using jest-runner-docs.
globCache
A mirror of globCache.stream, and so also an "async generator" function
returning an AsyncIterable. This mirror exists because it's a common practice
to have a (globPatterns, options) signature.
Signature
function(patterns, options)
Params
patterns {string|Array} - string or array of glob patterns
options {object} - see globCache.stream options
Examples
const globCache = require('glob-cache');

// (patterns, options) signature
const iterable = globCache(['src/*.js', 'test/*.{js,ts}'], {
  cwd: './foo/bar',
});

// the same call made through globCache.stream
const iter = globCache.stream({
  include: ['src/*.js', 'test/*.{js,ts}'],
  cwd: './foo/bar',
});
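The relationship between the two signatures can be pictured with a small
wrapper. This is only a sketch of the idea, not the package's actual source:
`mirrorOf` is a hypothetical helper, and it assumes that the patterns argument
simply becomes `options.include`.

```javascript
// Hypothetical helper: builds a (patterns, options) function on top of any
// stream-style function that accepts a single options object.
function mirrorOf(streamFn) {
  return function globCacheMirror(patterns, options = {}) {
    // Assumption: the patterns argument ends up as `options.include`.
    return streamFn({ ...options, include: patterns });
  };
}

// Usage sketch (assuming glob-cache is installed):
//   const globCache = require('glob-cache');
//   const mirrored = mirrorOf(globCache.stream);
//   mirrored(['src/*.js'], { cwd: './foo/bar' });
```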
globCache.stream
Match files and folders with glob patterns, by default using fast-glob's
.stream(). This function is an async generator and returns an "async
iterable", so you can use the for await ... of loop. Note that this loop
should be used inside an async function. Each item is a Context object, which
is also passed to each hook.
Signature
function(options)
Params
options.cwd {string} - working directory, defaults to process.cwd()
options.include {string|Array} - string or array of string glob patterns
options.patterns {string|Array} - alias of options.include
options.exclude {string|Array} - ignore glob patterns, passed to options.globOptions.ignore
options.ignore {string|Array} - alias of options.exclude
options.hooks {object} - an object with hook functions, each hook is passed a Context
options.hooks.found {Function} - called when a cache entry for a file is found
options.hooks.notFound {Function} - called when a file is not found in the cache (usually the first hit)
options.hooks.changed {Function} - called whenever the source file differs from the cache file
options.hooks.notChanged {Function} - called when both the source file and the cache file are "the same"
options.hooks.always {Function} - called always, regardless of the state
options.glob {Function} - a function (patterns, options) => {} or a globbing library like glob, globby, fast-glob
options.globOptions {object} - options passed to the options.glob library
options.cacheLocation {string} - a filepath location of the cache, defaults to .cache/glob-cache in options.cwd
returns {AsyncIterable}
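The aliases and defaults listed above can be summarized as a small
normalization step. The sketch below is illustrative only: `normalizeOptions`
is a hypothetical helper, and the precedence between an option and its alias
is an assumption, since the package's internals may differ.

```javascript
const path = require('path');

// Hypothetical helper, not glob-cache's real implementation.
function normalizeOptions(options = {}) {
  const cwd = options.cwd || process.cwd();
  const toArray = (val) => (val == null ? [] : [].concat(val));

  // `patterns` is an alias of `include`, `ignore` is an alias of `exclude`.
  // Assumption: the primary name wins when both are given.
  const include = toArray(options.include || options.patterns);
  const exclude = toArray(options.exclude || options.ignore);

  return {
    cwd,
    include,
    // Excluded patterns are forwarded to the globbing library as `ignore`.
    globOptions: { ...options.globOptions, ignore: exclude },
    // The default cache location is `.cache/glob-cache` inside `cwd`.
    cacheLocation:
      options.cacheLocation || path.join(cwd, '.cache', 'glob-cache'),
  };
}
```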
Examples
const globCache = require('glob-cache');

(async () => {
  const iterable = globCache.stream({
    include: 'src/*.js',
    cacheLocation: './foo-cache',
  });

  for await (const ctx of iterable) {
    console.log(ctx);
  }
})();
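Since the stream is a plain async iterable, it can be consumed by any helper
that uses for await ... of. As a minimal sketch, the hypothetical
`collectChangedPaths` below reads only the `ctx.changed` and `ctx.file.path`
properties of each Context and works with any async iterable of such objects.

```javascript
// Hypothetical helper: collects the paths of changed files from any async
// iterable of Context-like objects.
async function collectChangedPaths(iterable) {
  const changedPaths = [];
  for await (const ctx of iterable) {
    if (ctx.changed) changedPaths.push(ctx.file.path);
  }
  return changedPaths;
}

// Usage sketch (assuming glob-cache is installed):
//   const globCache = require('glob-cache');
//   const paths = await collectChangedPaths(
//     globCache.stream({ include: 'src/*.js' })
//   );
```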
globCache.promise
Using the Promise API allows you to use the Hooks API, and it is the
recommended way of using the hooks. By default, the returned promise resolves
to an empty array. That's intentional: if you are using the Hooks API, there is
no need to fill memory by collecting huge objects into a "result array". If you
want the results array to contain the Context objects, pass the buffered: true
option.
Signature
function(options)
Params
options {object} - see globCache.stream options, in addition here we have options.buffered too
options.buffered {boolean} - if true, the returned array will contain Context objects, default false
returns {Promise} - if options.buffered: true, resolves to Array<Context>, otherwise to an empty array
Examples
const globCache = require('glob-cache');
const globby = require('globby');

(async () => {
  // Hooks API: without `buffered`, the promise resolves to an empty array
  const res = await globCache.promise({
    include: 'src/*.js',
    cacheLocation: './.cache/awesome-cache',
    glob: globby.stream,
    hooks: {
      changed(ctx) {},
      always(ctx) {},
    },
  });
  console.log(res);

  // with `buffered: true`, the promise resolves to an array of Context objects
  const results = await globCache.promise({
    include: 'src/*.js',
    exclude: 'src/bar.js',
    buffered: true,
  });
  console.log(results);
})();
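To illustrate why the default result is an empty array, here is a rough sketch
of how a Promise API could be layered over an async iterable. It is not the
package's source: `drain` and its `onContext` callback (standing in for the
hooks) are hypothetical names.

```javascript
// Hypothetical sketch: walks an async iterable of Context objects, hands each
// one to a callback, and keeps them in memory only when `buffered` is true.
async function drain(iterable, { onContext, buffered = false } = {}) {
  const results = [];
  for await (const ctx of iterable) {
    if (typeof onContext === 'function') await onContext(ctx);
    if (buffered) results.push(ctx);
  }
  return results; // stays empty unless `buffered: true`
}
```

With this shape, a hooks-only consumer never holds more than one Context at a
time, while `buffered: true` trades memory for a convenient array.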
Context and how it works
Each context contains { file, cacheFile, cacheLocation, cacache } and more
properties. The file one represents the fresh file loaded from the system, and
cacheFile represents the file from the cache. Both have path, size and
integrity properties, plus more.
The cacheFile can be null if it's the first hit (not found in cache); in that
case ctx.notFound will be true, and on subsequent runs it will be false. When
using the Hooks API, options.hooks.notFound() or options.hooks.found() will be
called accordingly.
It is important to note that cacheFile doesn't have a contents property, but it
has a path which points to the location of the cache file on disk.
The interesting one is ctx.changed. It is the reason this module exists. If
both the "source" file and the cache file are the same (based on cacache), e.g.
same size and integrity (which means the contents/shasum are equal), then
ctx.changed === false, otherwise it will be true. Simply put, when you change
file(s) matched by the given glob pattern(s), ctx.changed === true and
options.hooks.changed() will be called. Depending on whether it's the first
call or not, either options.hooks.notFound or options.hooks.found will also
be called.
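The size and integrity comparison described above can be sketched with Node's
built-in crypto module. This is an illustration under assumptions, not the
module's actual code: `sha512Integrity` and `hasChanged` are hypothetical
helpers, and the digest format simply follows the 'sha512-<base64>' strings
shown in the Context examples.

```javascript
const crypto = require('crypto');

// Computes an SRI-style sha512 digest of a Buffer or string.
function sha512Integrity(contents) {
  const digest = crypto.createHash('sha512').update(contents).digest('base64');
  return `sha512-${digest}`;
}

// Hypothetical check: a file counts as changed on the first hit (no cache
// entry yet) or when its size or integrity differs from the cached entry.
function hasChanged(file, cacheFile) {
  if (cacheFile === null) return true;
  return (
    file.size !== cacheFile.size || file.integrity !== cacheFile.integrity
  );
}
```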
If you are using the Hooks API (e.g. globCache.promise plus options.hooks),
there is one more key point: the options.hooks.always hook function. It may be
useful if you want more control, so you can decide what to do or perform
additional checks - for example, check the mtime - or track the dependencies of
the file. Tracking dependencies is something that some test runners may benefit
from.
Because of all that, we also expose cacache on the Context, so you can update
or clean the cache - it's up to you.
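As a sketch of what updating or cleaning the cache from a hook might look like,
the hypothetical helpers below call `put` and `rm.entry` from the cacache API.
They assume that `ctx.cacache` is the cacache module itself and that entries
are keyed by the source file path, as the `key` in the cached Context example
in this document suggests.

```javascript
// Hypothetical hook helpers, assuming `ctx.cacache` exposes the cacache API.

// Stores the fresh file contents under the source file path.
async function refreshEntry(ctx) {
  // cacache.put(cachePath, key, data)
  return ctx.cacache.put(ctx.cacheLocation, ctx.file.path, ctx.file.contents);
}

// Removes the cache index entry for the source file path.
async function removeEntry(ctx) {
  // cacache.rm.entry(cachePath, key)
  return ctx.cacache.rm.entry(ctx.cacheLocation, ctx.file.path);
}

// Usage sketch:
//   await globCache.promise({
//     include: 'src/*.js',
//     hooks: { changed: refreshEntry },
//   });
```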
Example Context (the options.hooks.changed, options.hooks.notFound and
options.hooks.always hooks are called)
{
file: {
path: '/home/charlike/github/tunnckoCore/opensource/packages/glob-cache/test/index.js',
contents: <Buffer 27 75 73 65 20 73 74 72 69 63 74 27 3b 0a 0a 63 6f 6e 73 74 20 70 61 74 68 20 3d 20 72 65 71 75 69 72 65 28 27 70 61 74 68 27 29 3b 0a 63 6f 6e 73 74 ... 350 more bytes>,
size: 427,
integrity: 'sha512-p5daDYwu9vhNNjT9vfRrWHXIwwlPxeqeub4gs3qMZ88J//ONUH7Je2Muu9o+MxjA1Fv3xwbgkBdjcHgdj7ar4A=='
},
cacheFile: null,
cacheLocation: '/home/charlike/github/tunnckoCore/opensource/packages/glob-cache/test/fixture-cache',
cacache: { /* cacache instance */ },
changed: true,
notFound: true
}
And when you run it again (with no changes), the cacheFile will no longer be
null, like so:
{
file: {
path: '/home/charlike/github/tunnckoCore/opensource/packages/glob-cache/test/index.js',
contents: <Buffer 27 75 73 65 20 73 74 72 69 63 74 27 3b 0a 0a 63 6f 6e 73 74 20 70 61 74 68 20 3d 20 72 65 71 75 69 72 65 28 27 70 61 74 68 27 29 3b 0a 63 6f 6e 73 74 ... 350 more bytes>,
size: 427,
integrity: 'sha512-p5daDYwu9vhNNjT9vfRrWHXIwwlPxeqeub4gs3qMZ88J//ONUH7Je2Muu9o+MxjA1Fv3xwbgkBdjcHgdj7ar4A=='
},
cacheFile: {
key: '/home/charlike/github/tunnckoCore/opensource/packages/glob-cache/test/index.js',
integrity: 'sha512-p5daDYwu9vhNNjT9vfRrWHXIwwlPxeqeub4gs3qMZ88J//ONUH7Je2Muu9o+MxjA1Fv3xwbgkBdjcHgdj7ar4A==',
path: '/home/charlike/github/tunnckoCore/opensource/packages/glob-cache/fixture-cache/content-v2/sha512/78/84/a154130fdefee002a708cee1ae570db54b1a278fed9b7a3847c73b2545bd48947c2cd192d365f9d87653f098f80d98b4ee37923ba467dbc314acf0f42e39',
size: 427,
stat: Stat {},
time: 1579561781331,
metadata: undefined
},
cacheLocation: '/home/charlike/github/tunnckoCore/opensource/packages/glob-cache/fixture-cache',
cacache: { /* cacache instance */ },
changed: false,
notFound: false
}
As you can see above, file.integrity and cacheFile.integrity are the same, and
so is the size, so both files are equal (and so ctx.changed: false) -
options.hooks.notChanged will be called.
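The two example Contexts map onto hooks as follows. The sketch below is a
hypothetical helper, `hooksFor`, that lists the hook names matching a Context's
`notFound` and `changed` flags. The order of the names is an assumption made
for illustration.

```javascript
// Hypothetical helper: returns the names of the hooks that apply to a Context.
function hooksFor(ctx) {
  return [
    ctx.notFound ? 'notFound' : 'found',
    ctx.changed ? 'changed' : 'notChanged',
    'always', // called regardless of the state
  ];
}

// First run (no cache entry yet):
//   hooksFor({ notFound: true, changed: true })
// Later run with no changes:
//   hooksFor({ notFound: false, changed: false })
```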
The example below shows usage of the changed hook together with workers.
const globCache = require('glob-cache');
const JestWorker = require('jest-worker');

let worker = null;

(async () => {
  await globCache.promise({
    include: 'packages/*/src/**/*.js',
    hooks: {
      async changed(ctx) {
        // create the worker lazily, only once a changed file is found
        worker =
          worker ||
          new JestWorker(require.resolve('./my-awesome-worker-or-runner.js'), {
            numWorkers: 7,
            forkOptions: { stdio: 'inherit' },
          });

        await worker.default(ctx);
        await worker.end();
      },
    },
  });
})();
The example above is a basic solution similar to what Jest does, with the
difference that Jest can detect changes only in a Git project. At least
--onlyChanged works that way (it requires Git), which isn't a big problem
since almost every project uses Git.
The point is that you can do whatever you want under custom conditions, based
on your preferences and needs.
In the example above you may wonder why we are instantiating JestWorker inside
the changed hook. That's because if you instantiate it before the call to
globCache (where the let worker declaration is), then you have no easy and
meaningful way to end the worker.
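One possible variation on this lazy instantiation, sketched under assumptions
rather than taken from the project: wrap the worker in a small holder that
creates it on first use and ends it at most once, after all hooks have run.
`lazyResource` is a hypothetical helper, and `create` can return any object
with an `end()` method.

```javascript
// Hypothetical helper: creates a resource on first use and ends it at most
// once. Nothing is created (or ended) if `get()` is never called.
function lazyResource(create) {
  let instance = null;
  return {
    get() {
      if (instance === null) instance = create();
      return instance;
    },
    async end() {
      if (instance === null) return false; // never created, nothing to end
      const current = instance;
      instance = null;
      await current.end();
      return true;
    },
  };
}

// Usage sketch (names other than lazyResource come from the example above):
//   const pool = lazyResource(() => new JestWorker(workerPath, workerOptions));
//   await globCache.promise({
//     include: 'packages/*/src/**/*.js',
//     hooks: { changed: (ctx) => pool.get().default(ctx) },
//   });
//   await pool.end();
```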
You can see a similar implementation in the hela-eslint-workers branch, where
we use glob-cache to try to speed up ESLint a bit by putting
eslint.executeOnFiles or eslint.executeOnText inside a worker. The thing is
that it doesn't help much, because ESLint is just slow - for the same reason
even jest-runner-eslint doesn't help much with performance. The complexity in
ESLint is O(n) - the more configs and plugins you have in your config, the
slower it runs, even on a single file - it's inevitable and a huge problem. I'm
not saying all that just to hate. It's just because of the synchronous design
of ESLint and the way it works. A big pain point is not only that it exposes &
uses only sync methods, but also the architecture of resolving a huge amount of
configs and plugins. That may change if RFC#9 is accepted, for which I have big
hopes. Even if it's accepted, it will take a few major releases.
Contributing
Guides and Community
Please read the Contributing Guide and Code of
Conduct documents for advice.
For bug reports and feature requests, please join our community forum and open
a thread there, prefixing the thread title with the name of the project if
there's no separate channel for it.
Consider reading the
Support and Release Policy
guide if you are interested in which Node.js versions are supported and how we
proceed. In short, we support the latest two even-numbered Node.js release
lines.
Support the project
Become a Partner or Sponsor? :dollar: Check the OpenSource
Commision (tier). :tada: You can get your company logo, link & name on this
file. It's also rendered on the package's page on the npmjs.com and
yarnpkg.com sites! :rocket:
No financial support? Okay!
Pull requests,
stars and all kind of
contributions
are always welcome. :sparkles:
Contributors
This project follows the
all-contributors
specification. Contributions of any kind are welcome!
Thanks goes to these wonderful people
(emoji key), consider showing
your support to them:
License
Copyright (c) 2020-present, Charlike Mike Reagent
<opensource@tunnckocore.com>
& contributors.
Released under the MPL-2.0 License.