file-entry-cache
Comparing version 9.1.0 to 10.0.0
{ | ||
"name": "file-entry-cache", | ||
"version": "9.1.0", | ||
"description": "Super simple cache for file metadata, useful for process that work o a given series of files and that only need to repeat the job on the changed ones since the previous run of the process", | ||
"repository": "jaredwray/file-entry-cache", | ||
"version": "10.0.0", | ||
"description": "A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run", | ||
"type": "module", | ||
"main": "./dist/index.cjs", | ||
"module": "./dist/index.js", | ||
"types": "./dist/index.d.ts", | ||
"exports": { | ||
".": { | ||
"require": "./dist/index.cjs", | ||
"import": "./dist/index.js" | ||
} | ||
}, | ||
"repository": "https://github.com/jaredwray/cacheable.git", | ||
"author": "Jared Wray <me@jaredwray.com>", | ||
"license": "MIT", | ||
"author": { | ||
"name": "Jared Wray", | ||
"url": "https://jaredwray.com" | ||
}, | ||
"main": "cache.js", | ||
"files": [ | ||
"cache.js" | ||
], | ||
"engines": { | ||
"node": ">=18" | ||
}, | ||
"scripts": { | ||
"clean": "rimraf ./coverage /node_modules ./package-lock.json ./yarn.lock ./pnpm-lock.yaml", | ||
"test": "xo --fix && c8 mocha -R spec test/specs/cache.js test/relative.js", | ||
"test:relative": "rimraf ./rfixtures ./tfixtures && mocha test/relative.js", | ||
"test:ci": "xo && c8 --reporter=lcov mocha -R spec test/specs/cache.js test/relative.js", | ||
"perf": "node perf.js" | ||
}, | ||
"prepush": [ | ||
"npm run test" | ||
], | ||
"precommit": [ | ||
"npm run test" | ||
], | ||
"private": false, | ||
"keywords": [ | ||
@@ -40,21 +28,23 @@ "file cache", | ||
"devDependencies": { | ||
"c8": "^10.1.2", | ||
"chai": "^4.3.10", | ||
"glob-expand": "^0.2.1", | ||
"mocha": "^10.5.1", | ||
"rimraf": "^5.0.7", | ||
"webpack": "^5.92.1", | ||
"write": "^2.0.0", | ||
"xo": "^0.58.0" | ||
"@types/node": "^22.7.4", | ||
"@vitest/coverage-v8": "^2.1.1", | ||
"rimraf": "^6.0.1", | ||
"tsup": "^8.3.0", | ||
"typescript": "^5.6.2", | ||
"vitest": "^2.1.1", | ||
"xo": "^0.59.3" | ||
}, | ||
"dependencies": { | ||
"flat-cache": "^5.0.0" | ||
"flat-cache": "^6.1.0" | ||
}, | ||
"xo": { | ||
"rules": { | ||
"unicorn/prefer-module": "off", | ||
"n/prefer-global/process": "off", | ||
"unicorn/prevent-abbreviations": "off" | ||
} | ||
"files": [ | ||
"dist", | ||
"license" | ||
], | ||
"scripts": { | ||
"build": "rimraf ./dist && tsup src/index.ts --format cjs,esm --dts --clean", | ||
"test": "xo --fix && vitest run --coverage", | ||
"test:ci": "xo && vitest run", | ||
"clean": "rimraf ./dist ./coverage ./node_modules" | ||
} | ||
} | ||
} |
README.md
@@ -0,116 +1,200 @@ | ||
[<img align="center" src="https://cacheable.org/symbol.svg" alt="Cacheable" />](https://github.com/jaredwray/cacheable) | ||
# file-entry-cache | ||
> Super simple cache for file metadata, useful for process that work on a given series of files and that only need to repeat the job on the changed ones since the previous run of the process | ||
> A lightweight cache for file metadata, ideal for processes that work on a specific set of files and only need to reprocess files that have changed since the last run | ||
[![NPM Version](https://img.shields.io/npm/v/file-entry-cache.svg?style=flat)](https://npmjs.org/package/file-entry-cache) | ||
[![tests](https://github.com/jaredwray/file-entry-cache/actions/workflows/tests.yaml/badge.svg?branch=master)](https://github.com/jaredwray/file-entry-cache/actions/workflows/tests.yaml) | ||
[![codecov](https://codecov.io/github/jaredwray/file-entry-cache/graph/badge.svg?token=37tZMQE0Sy)](https://codecov.io/github/jaredwray/file-entry-cache) | ||
[![npm](https://img.shields.io/npm/dm/file-entry-cache)](https://npmjs.com/package/file-entry-cache) | ||
[![codecov](https://codecov.io/gh/jaredwray/cacheable/graph/badge.svg?token=lWZ9OBQ7GM)](https://codecov.io/gh/jaredwray/cacheable) | ||
[![tests](https://github.com/jaredwray/cacheable/actions/workflows/tests.yml/badge.svg)](https://github.com/jaredwray/cacheable/actions/workflows/tests.yml) | ||
[![npm](https://img.shields.io/npm/dm/flat-cache.svg)](https://www.npmjs.com/package/flat-cache) | ||
[![npm](https://img.shields.io/npm/v/flat-cache)](https://www.npmjs.com/package/flat-cache) | ||
[![GitHub](https://img.shields.io/github/license/jaredwray/cacheable)](https://github.com/jaredwray/cacheable/blob/main/LICENSE) | ||
# Features | ||
## install | ||
- Lightweight cache for file metadata | ||
- Ideal for processes that work on a specific set of files | ||
- Persists cache to Disk via `reconcile()` or `persistInterval` on `cache` options. | ||
- Uses `checksum` to determine if a file has changed | ||
- Supports `relative` and `absolute` paths | ||
- Ability to rename keys in the cache. Useful when renaming directories. | ||
- ESM and CommonJS support with TypeScript | ||
# Table of Contents | ||
- [Installation](#installation) | ||
- [Getting Started](#getting-started) | ||
- [Changes from v9 to v10](#changes-from-v9-to-v10) | ||
- [Global Default Functions](#global-default-functions) | ||
- [FileEntryCache Options (FileEntryCacheOptions)](#fileentrycache-options-fileentrycacheoptions) | ||
- [API](#api) | ||
- [Get File Descriptor](#get-file-descriptor) | ||
- [Using Checksums to Determine if a File has Changed (useCheckSum)](#using-checksums-to-determine-if-a-file-has-changed-usechecksum) | ||
- [Setting Additional Meta Data](#setting-additional-meta-data) | ||
- [How to Contribute](#how-to-contribute) | ||
- [License and Copyright](#license-and-copyright) | ||
# Installation | ||
```bash | ||
npm i --save file-entry-cache | ||
npm install file-entry-cache | ||
``` | ||
## Usage | ||
# Getting Started | ||
The module exposes two functions `create` and `createFromFile`. | ||
```javascript | ||
import fs from 'node:fs'; | ||
import fileEntryCache from 'file-entry-cache'; | ||
const cache = fileEntryCache.create('cache1'); | ||
let fileDescriptor = cache.getFileDescriptor('file.txt'); | ||
console.log(fileDescriptor.changed); // true as it is the first time | ||
fileDescriptor = cache.getFileDescriptor('file.txt'); | ||
console.log(fileDescriptor.changed); // false as it has not changed | ||
// do something to change the file | ||
fs.writeFileSync('file.txt', 'new data foo bar'); | ||
// check if the file has changed | ||
fileDescriptor = cache.getFileDescriptor('file.txt'); | ||
console.log(fileDescriptor.changed); // true | ||
``` | ||
## `create(cacheName, [directory, useCheckSum, currentWorkingDir])` | ||
- **cacheName**: the name of the cache to be created | ||
- **directory**: Optional the directory to load the cache from | ||
- **usecheckSum**: Whether to use md5 checksum to verify if file changed. If false the default will be to use the mtime and size of the file. | ||
- **currentWorkingDir**: Optional the current working directory to use when resolving relative paths | ||
Save the cache to disk and reconcile files that are no longer found: | ||
```javascript | ||
import fileEntryCache from 'file-entry-cache'; | ||
const cache = fileEntryCache.create('cache1'); | ||
let fileDescriptor = cache.getFileDescriptor('file.txt'); | ||
console.log(fileDescriptor.changed); // true as it is the first time | ||
cache.reconcile(); // save the cache to disk and remove files that are no longer found | ||
``` | ||
## `createFromFile(pathToCache, [useCheckSum, currentWorkingDir])` | ||
- **pathToCache**: the path to the cache file (this combines the cache name and directory) | ||
- **useCheckSum**: Whether to use md5 checksum to verify if file changed. If false the default will be to use the mtime and size of the file. | ||
- **currentWorkingDir**: Optional the current working directory to use when resolving relative paths | ||
Load the cache from a file: | ||
```js | ||
// loads the cache, if one does not exists for the given | ||
// Id a new one will be prepared to be created | ||
var fileEntryCache = require('file-entry-cache'); | ||
```javascript | ||
import fileEntryCache from 'file-entry-cache'; | ||
const cache = fileEntryCache.createFromFile('/path/to/cache/file'); | ||
let fileDescriptor = cache.getFileDescriptor('file.txt'); | ||
console.log(fileDescriptor.changed); // false as it has not changed from the saved cache. | ||
``` | ||
var cache = fileEntryCache.create('testCache'); | ||
# Changes from v9 to v10 | ||
var files = expand('../fixtures/*.txt'); | ||
There have been many features added and changes made to the `file-entry-cache` class. Here are the main changes: | ||
- Added `cache` object to the options to allow for more control over the cache | ||
- Added `hashAlgorithm` to the options to allow for different checksum algorithms. Note that loading a cache file that was written with a different algorithm will most likely break. | ||
- Expanded support for relative and absolute paths. Both are now supported by `getFileDescriptor()`. You can read more on this in the `Get File Descriptor` section. | ||
- Migrated to TypeScript with ESM and CommonJS support. This allows for better type checking and support for both ESM and CommonJS. | ||
- Once options are passed in they are assigned as properties such as `hashAlgorithm` and `currentWorkingDirectory`. This allows for better control and access to the options. Cache options are assigned to `cache`, for example `cache.ttl` and `cache.lruSize`. | ||
- Added `cache.persistInterval` to allow for saving the cache to disk at a specific interval. This will save the cache to disk at the interval specified instead of calling `reconcile()` to save. (`off` by default) | ||
- Added `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` to get all the file descriptors that start with the path specified. This is useful when you want to get all the files in a directory or a specific path. | ||
- Added `renameAbsolutePathKeys(oldPath: string, newPath: string): void` will rename the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files. | ||
- Using `flat-cache` v6 which is a major update. This allows for better performance and more control over the cache. | ||
- On `FileEntryDescriptor.meta`, if you are using TypeScript you need to use `meta.data` to set additional information. This allows for better type checking and avoids conflicts with the `meta` object, which was previously `any`. | ||
// the first time this method is called, will return all the files | ||
var oFiles = cache.getUpdatedFiles(files); | ||
# Global Default Functions | ||
- `create(cacheId: string, cacheDirectory?: string, useCheckSum?: boolean, currentWorkingDirectory?: string)` - Creates a new instance of the `FileEntryCache` class | ||
- `createFromFile(cachePath: string, useCheckSum?: boolean, currentWorkingDirectory?: string)` - Creates a new instance of the `FileEntryCache` class and loads the cache from a file. | ||
// this will persist this to disk checking each file stats and | ||
// updating the meta attributes `size` and `mtime`. | ||
// custom fields could also be added to the meta object and will be persisted | ||
// in order to retrieve them later | ||
cache.reconcile(); | ||
# FileEntryCache Options (FileEntryCacheOptions) | ||
- `currentWorkingDirectory?` - The current working directory. Used when resolving relative paths. | ||
- `useCheckSum?` - If `true` it will use a checksum to determine if the file has changed. Default is `false` | ||
- `hashAlgorithm?` - The algorithm to use for the checksum. Default is `md5` but can be any algorithm supported by `crypto.createHash` | ||
- `cache.ttl?` - The time to live for the cache in milliseconds. Default is `0` which means no expiration | ||
- `cache.lruSize?` - The number of items to keep in the cache. Default is `0` which means no limit | ||
- `cache.useClone?` - If `true` it will clone the data before returning it. Default is `false` | ||
- `cache.expirationInterval?` - The interval to check for expired items in the cache. Default is `0` which means no expiration | ||
- `cache.persistInterval?` - The interval to save the data to disk. Default is `0` which means no persistence | ||
- `cache.cacheDir?` - The directory to save the cache files. Default is `./cache` | ||
- `cache.cacheId?` - The id of the cache. Default is `cache1` | ||
- `cache.parse?` - The function to parse the data. Default is `flatted.parse` | ||
- `cache.stringify?` - The function to stringify the data. Default is `flatted.stringify` | ||
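As a rough illustration of how the options above nest, the sketch below builds a plain options object. The property names follow the list above; the values (`sha256`, the `./.cache` directory, the `my-project` id, the five-minute interval) are hypothetical, and the assumption that `cache.persistInterval` is measured in milliseconds like `cache.ttl` is ours and is not stated in the README.

```javascript
// Hypothetical option values; only the property names come from the list above.
const options = {
  currentWorkingDirectory: process.cwd(), // base used to resolve relative paths
  useCheckSum: true,                      // compare content hashes instead of file metadata
  hashAlgorithm: 'sha256',                // any algorithm accepted by crypto.createHash
  cache: {
    cacheId: 'my-project',                // hypothetical cache id
    cacheDir: './.cache',                 // hypothetical cache directory
    ttl: 0,                               // 0 means no expiration
    lruSize: 0,                           // 0 means no limit
    persistInterval: 5 * 60 * 1000,       // ASSUMPTION: milliseconds, like ttl
  },
};

// With the package installed, the object would be passed to the constructor:
// const cache = new FileEntryCache(options);
console.log(Object.keys(options.cache).join(', '));
```

Keeping the flat-cache settings under a single `cache` key separates them from the file-level settings such as `useCheckSum`.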
// use this if you want the non visited file entries to be kept in the cache | ||
// for more than one execution | ||
// | ||
// cache.reconcile( true /* noPrune */) | ||
# API | ||
// on a second run | ||
var cache2 = fileEntryCache.create('testCache'); | ||
- `constructor(options?: FileEntryCacheOptions)` - Creates a new instance of the `FileEntryCache` class | ||
- `useCheckSum: boolean` - If `true` it will use a checksum to determine if the file has changed. Default is `false` | ||
- `hashAlgorithm: string` - The algorithm to use for the checksum. Default is `md5` but can be any algorithm supported by `crypto.createHash` | ||
- `currentWorkingDirectory: string` - The current working directory. Used when resolving relative paths. | ||
- `getHash(buffer: Buffer): string` - Gets the hash of a buffer used for checksums | ||
- `createFileKey(filePath: string): string` - Creates a key for the file path. This is used to store the data in the cache based on relative or absolute paths. | ||
- `deleteCacheFile(filePath: string): void` - Deletes the cache file | ||
- `destroy(): void` - Destroys the cache. This will also delete the cache file. If using cache persistence it will stop the interval. | ||
- `removeEntry(filePath: string): void` - Removes an entry from the cache. This can be `relative` or `absolute` paths. | ||
- `reconcile(): void` - Saves the cache to disk and removes any files that are no longer found. | ||
- `hasFileChanged(filePath: string): boolean` - Checks if the file has changed. This will return `true` if the file has changed. | ||
- `getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor` - Gets the file descriptor for the file. Please refer to the entire section on `Get File Descriptor` for more information. | ||
- `normalizeEntries(entries: FileEntryDescriptor[]): FileEntryDescriptor[]` - Normalizes the entries to have the correct paths. This is used when loading the cache from disk. | ||
- `analyzeFiles(files: string[])` - Returns an `AnalyzedFiles` object with `changedFiles`, `notFoundFiles`, and `notChangedFiles` as FileDescriptor arrays. | ||
- `getUpdatedFiles(files: string[])` - Returns an array of `FileEntryDescriptor` objects that have changed. | ||
- `getFileDescriptorsByPath(filePath: string): FileEntryDescriptor[]` - Returns an array of `FileEntryDescriptor` objects whose keys start with the path specified. | ||
- `renameAbsolutePathKeys(oldPath: string, newPath: string): void` - Renames the keys in the cache from the old path to the new path. This is useful when you rename a directory and want to update the cache without reanalyzing the files. | ||
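The idea behind `getFileDescriptorsByPath` and `renameAbsolutePathKeys` can be sketched with a plain `Map` keyed by absolute path. This is only an illustration of prefix lookup and key renaming, not the library's implementation; `entriesByPath`, `renameKeys`, and the sample paths are hypothetical names chosen for the example.

```javascript
// Stand-in for the cache: keys are absolute paths, values are file metadata.
const entries = new Map([
  ['/project/src/a.js', { size: 10 }],
  ['/project/src/b.js', { size: 20 }],
  ['/project/test/a.test.js', { size: 30 }],
]);

// Return every [key, value] pair whose key starts with the given path.
function entriesByPath(store, prefix) {
  return [...store].filter(([key]) => key.startsWith(prefix));
}

// Move every key under oldPath so that it lives under newPath instead.
// Iterates over a copy so the Map can be modified safely inside the loop.
function renameKeys(store, oldPath, newPath) {
  for (const [key, value] of [...store]) {
    if (key.startsWith(oldPath)) {
      store.delete(key);
      store.set(newPath + key.slice(oldPath.length), value);
    }
  }
}

renameKeys(entries, '/project/src', '/project/lib');
console.log(entriesByPath(entries, '/project/lib').map(([key]) => key));
```

Because only the keys change, the stored metadata survives the rename, which is why a renamed directory does not have to be analyzed again.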
// will return now only the files that were modified or none | ||
// if no files were modified previous to the execution of this function | ||
var oFiles = cache.getUpdatedFiles(files); | ||
# Get File Descriptor | ||
// if you want to prevent a file from being considered non modified | ||
// something useful if a file failed some sort of validation | ||
// you can then remove the entry from the cache doing | ||
cache.removeEntry('path/to/file'); // path to file should be the same path of the file received on `getUpdatedFiles` | ||
// that will effectively make the file to appear again as modified until the validation is passed. In that | ||
// case you should not remove it from the cache | ||
The `getFileDescriptor(filePath: string, options?: { useCheckSum?: boolean, currentWorkingDirectory?: string }): FileEntryDescriptor` function is used to get the file descriptor for the file. This function will return a `FileEntryDescriptor` object that has the following properties: | ||
// if you need all the files, so you can determine what to do with the changed ones | ||
// you can call | ||
var oFiles = cache.normalizeEntries(files); | ||
- `key: string` - The key for the file. This is the relative or absolute path of the file. | ||
- `changed: boolean` - If the file has changed since the last time it was analyzed. | ||
- `notFound: boolean` - If the file was not found. | ||
- `meta: FileEntryMeta` - The meta data for the file. This has the following properties: `size`, `mtime`, `ctime`, `hash`, `data`. Note that `data` is an object that can be used to store additional information. | ||
- `err` - If there was an error analyzing the file. | ||
// oFiles will be an array of objects like the following | ||
entry = { | ||
key: 'some/name/file', the path to the file | ||
changed: true, // if the file was changed since previous run | ||
meta: { | ||
size: 3242, // the size of the file | ||
mtime: 231231231, // the modification time of the file | ||
data: {} // some extra field stored for this file (useful to save the result of a transformation on the file | ||
} | ||
} | ||
We have added the ability to use `relative` or `absolute` paths. If you pass in a `relative` path it will use the `currentWorkingDirectory` to resolve the path. If you pass in an `absolute` path it will use the path as is. This lets you mix `relative` and `absolute` paths in the same cache. | ||
If you do not pass in `currentWorkingDirectory` in the class options or in the `getFileDescriptor` function, it will use `process.cwd()` as the default `currentWorkingDirectory`. | ||
```javascript | ||
import { FileEntryCache } from 'file-entry-cache'; | ||
const fileEntryCache = new FileEntryCache(); | ||
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { currentWorkingDirectory: '/path/to/directory' }); | ||
``` | ||
## Motivation for this module | ||
Since this is a relative path it will use the `currentWorkingDirectory` to resolve the path. If you want to use an absolute path you can do the following: | ||
I needed a super simple and dumb **in-memory cache** with optional disk persistence (write-back cache) in order to make | ||
a script that will beautify files with `esformatter` to execute only on the files that were changed since the last run. | ||
```javascript | ||
import path from 'node:path'; | ||
import { FileEntryCache } from 'file-entry-cache'; | ||
const fileEntryCache = new FileEntryCache(); | ||
const filePath = path.resolve('/path/to/directory', 'file.txt'); | ||
const fileDescriptor = fileEntryCache.getFileDescriptor(filePath); | ||
``` | ||
In doing so the process of beautifying files was reduced from several seconds to a small fraction of a second. | ||
This will save the key as the absolute path. | ||
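The resolution rule described in this section can be sketched with Node's `path` module alone. `toAbsolute` is a hypothetical helper written for this illustration, not a function exported by the package, and the sketch uses CommonJS `require` so that it runs as a plain script.

```javascript
const path = require('node:path');

// Hypothetical helper: absolute paths are used as is, relative paths are
// resolved against the working directory (process.cwd() when none is given).
function toAbsolute(filePath, currentWorkingDirectory = process.cwd()) {
  return path.isAbsolute(filePath)
    ? filePath
    : path.resolve(currentWorkingDirectory, filePath);
}

const base = path.resolve('/path/to/directory');
console.log(toAbsolute('file.txt', base));            // resolved against base
console.log(toAbsolute(path.join(base, 'file.txt'))); // already absolute, unchanged
console.log(toAbsolute('file.txt'));                  // resolved against process.cwd()
```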
This module uses [flat-cache](https://www.npmjs.com/package/flat-cache) a super simple `key/value` cache storage with | ||
optional file persistance. | ||
If there is an error when trying to get the file descriptor, the returned descriptor will have `notFound` set and an `err` property containing the error. | ||
The main idea is to read the files when the task begins, apply the transforms required, and if the process succeed, | ||
then store the new state of the files. The next time this module request for `getChangedFiles` will return only | ||
the files that were modified. Making the process to end faster. | ||
```javascript | ||
const fileEntryCache = new FileEntryCache(); | ||
const fileDescriptor = fileEntryCache.getFileDescriptor('no-file'); | ||
if (fileDescriptor.err) { | ||
console.error(fileDescriptor.err); | ||
} | ||
This module could also be used by processes that modify the files applying a transform, in that case the result of the | ||
transform could be stored in the `meta` field, of the entries. Anything added to the meta field will be persisted. | ||
Those processes won't need to call `getChangedFiles` they will instead call `normalizeEntries` that will return the | ||
entries with a `changed` field that can be used to determine if the file was changed or not. If it was not changed | ||
the transformed stored data could be used instead of actually applying the transformation, saving time in case of only | ||
a few files changed. | ||
if (fileDescriptor.notFound) { | ||
console.error('File not found'); | ||
} | ||
``` | ||
In the worst case scenario all the files will be processed. In the best case scenario only a few of them will be processed. | ||
# Using Checksums to Determine if a File has Changed (useCheckSum) | ||
## Important notes | ||
- The values set on the meta attribute of the entries should be `stringify-able` ones if possible, flat-cache uses `circular-json` to try to persist circular structures, but this should be considered experimental. The best results are always obtained with non circular values | ||
- All the changes to the cache state are done to memory first and only persisted after reconcile. | ||
By default the `useCheckSum` is `false`. This means that the `FileEntryCache` will use the `mtime` and `ctime` to determine if the file has changed. If you set `useCheckSum` to `true` it will use a checksum to determine if the file has changed. This is useful when you want to make sure that the file has not changed at all. | ||
## License | ||
```javascript | ||
const fileEntryCache = new FileEntryCache(); | ||
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: true }); | ||
``` | ||
MIT (c) Jared Wray | ||
You can pass `useCheckSum` in the FileEntryCache options, as a property `.useCheckSum` to make it default for all files, or in the `getFileDescriptor` function. Here is an example where you set it globally but then override it for a specific file: | ||
```javascript | ||
const fileEntryCache = new FileEntryCache({ useCheckSum: true }); | ||
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt', { useCheckSum: false }); | ||
``` | ||
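To make the difference between the two strategies concrete, here is a minimal sketch that uses only Node built-ins. It is not the library's implementation: `checksum` and `hasChanged` are hypothetical helpers, and the metadata branch compares size and modification time as an assumption made for this example. The file is rewritten with identical content and its modification time is moved forward, so the two strategies disagree.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hash the file contents; 'md5' mirrors the default algorithm named above.
function checksum(filePath, algorithm = 'md5') {
  return crypto.createHash(algorithm).update(fs.readFileSync(filePath)).digest('hex');
}

// Hypothetical helper: compare a stored snapshot with the file's current state.
function hasChanged(filePath, previous, useCheckSum) {
  if (useCheckSum) {
    return checksum(filePath) !== previous.hash;
  }
  const current = fs.statSync(filePath);
  return current.size !== previous.size || current.mtimeMs !== previous.mtimeMs;
}

// Work in a throwaway temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'checksum-sketch-'));
const file = path.join(dir, 'file.txt');
fs.writeFileSync(file, 'hello');

const stats = fs.statSync(file);
const snapshot = { size: stats.size, mtimeMs: stats.mtimeMs, hash: checksum(file) };

// Rewrite identical content, then push the modification time one minute ahead.
fs.writeFileSync(file, 'hello');
fs.utimesSync(file, new Date(), new Date(Date.now() + 60_000));

const changedByMetadata = hasChanged(file, snapshot, false); // mtime differs
const changedByChecksum = hasChanged(file, snapshot, true);  // content is identical
console.log({ changedByMetadata, changedByChecksum });

fs.rmSync(dir, { recursive: true, force: true });
```

Hashing reads the whole file, so it costs more than a metadata check; that trade-off is a plausible reason for `useCheckSum` being off by default.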
# Setting Additional Meta Data | ||
In the past we have seen people set arbitrary values directly on the `meta` object, which can conflict with its built-in properties. To avoid this, use the `data` property, which can hold anything. | ||
```javascript | ||
const fileEntryCache = new FileEntryCache(); | ||
const fileDescriptor = fileEntryCache.getFileDescriptor('file.txt'); | ||
fileDescriptor.meta.data = { myData: 'myData' }; //anything you want | ||
``` | ||
# How to Contribute | ||
You can contribute by forking the repo and submitting a pull request. Please make sure to add tests and update the documentation. To learn more about how to contribute go to our main README [https://github.com/jaredwray/cacheable](https://github.com/jaredwray/cacheable). This will talk about how to `Open a Pull Request`, `Ask a Question`, or `Post an Issue`. | ||
# License and Copyright | ||
[MIT © Jared Wray](./LICENSE) |
License Policy Violation
License: This package is not allowed per your license policy. Review the package's license to ensure compliance.
Found 1 instance in 1 package
Major refactor
Supply chain risk: Package has recently undergone a major refactor. It may be unstable or indicate significant internal changes. Use caution when updating to versions that include significant changes.
Found 1 instance in 1 package
Filesystem access
Supply chain risk: Accesses the file system, and could potentially read sensitive data.
Found 1 instance in 1 package
No repository
Supply chain risk: Package does not have a linked source code repository. Without this field, a package will have no reference to the location of the source code used to generate the package.
Found 1 instance in 1 package
+ Added @keyv/serialize@1.0.1 (transitive)
+ Added base64-js@1.5.1 (transitive)
+ Added buffer@6.0.3 (transitive)
+ Added cacheable@1.8.4 (transitive)
+ Added flat-cache@6.1.2 (transitive)
+ Added hookified@1.5.0 (transitive)
+ Added ieee754@1.2.1 (transitive)
+ Added keyv@5.2.1 (transitive)
- Removed flat-cache@5.0.0 (transitive)
- Removed json-buffer@3.0.1 (transitive)
- Removed keyv@4.5.4 (transitive)
Updated flat-cache@^6.1.0