# AWS CLI s3 sync for Node.js

**AWS CLI s3 sync for Node.js** is a modern TypeScript client to perform S3 sync operations between file systems and S3 buckets, in the spirit of the official AWS CLI command.

AWS CLI installation is NOT required by this module.
## Features

- Sync a local file system with a remote Amazon S3 bucket
- Sync a remote Amazon S3 bucket with a local file system (multipart uploads are supported)
- Sync two remote Amazon S3 buckets
- Sync only new and updated objects
- Support AWS CLI options `--delete`, `--dryrun`, `--size-only`, `--include` and `--exclude`
- Support AWS SDK native command input options
- Monitor sync progress
- Sync any number of objects (no 1000-object limit)
- Transfer objects concurrently
- Manage differences in folder structures easily through relocation
## Why should I use this module?

- There is no way to achieve S3 sync using the AWS SDK for JavaScript v3 alone
- AWS CLI installation is NOT required
- The module contains no external dependencies
- The AWS SDK peer dependency is up to date (AWS SDK for JavaScript v3)
- The module overcomes a set of common limitations listed at the bottom of this README
## Table of Contents

- Getting Started
  - Install
  - Quick Start
    - Client initialization
    - Sync a remote S3 bucket with the local file system
    - Sync the local file system with a remote S3 bucket
    - Sync two remote S3 buckets
    - Monitor sync progress
    - Abort sync
    - Use AWS SDK command input options
    - Filter source files
    - Relocate objects during sync
- API Reference
- Change Log
- Benchmark
## Getting Started

### Install

```sh
npm i s3-sync-client
```

### Quick Start

#### Client initialization

`S3SyncClient` is a wrapper for the AWS SDK `S3Client` class.

```ts
import { S3Client } from '@aws-sdk/client-s3';
import { S3SyncClient } from 's3-sync-client';

const s3Client = new S3Client({});
const { sync } = new S3SyncClient({ client: s3Client });
```
#### Sync a remote S3 bucket with the local file system

```ts
// aws s3 sync /path/to/local/dir s3://mybucket2
await sync('/path/to/local/dir', 's3://mybucket2');

// upload with a 100 MB part size for multipart uploads
await sync('/path/to/local/dir', 's3://mybucket2', { partSize: 100 * 1024 * 1024 });

// aws s3 sync /path/to/local/dir s3://mybucket2/zzz --delete
await sync('/path/to/local/dir', 's3://mybucket2/zzz', { del: true });
```
#### Sync the local file system with a remote S3 bucket

```ts
// aws s3 sync s3://mybucket /path/to/some/local --delete
await sync('s3://mybucket', '/path/to/some/local', { del: true });

// aws s3 sync s3://mybucket2 /path/to/local/dir --dryrun
const diff = await sync('s3://mybucket2', '/path/to/local/dir', { dryRun: true });
console.log(diff); // log the operations that would have been performed
```
#### Sync two remote S3 buckets

```ts
// aws s3 sync s3://my-source-bucket s3://my-target-bucket --delete
await sync('s3://my-source-bucket', 's3://my-target-bucket', { del: true });
```
#### Monitor sync progress

```ts
import { TransferMonitor } from 's3-sync-client';

const monitor = new TransferMonitor();
monitor.on('progress', (progress) => console.log(progress));
await sync('s3://mybucket', '/path/to/local/dir', { monitor });

// to pull status info on demand, use getStatus():
const timeout = setInterval(() => console.log(monitor.getStatus()), 2000);
try {
  await sync('s3://mybucket', '/path/to/local/dir', { monitor });
} finally {
  clearInterval(timeout);
}
```
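Under the hood this pattern is plain Node.js event handling. The sketch below is a simplified stand-in, not the module's actual `TransferMonitor` class; the `progress` event name matches the API above, but the payload shape used here is an assumption for illustration:

```typescript
import { EventEmitter } from 'node:events';

// Illustrative only: a minimal progress emitter in the spirit of
// TransferMonitor. The payload shape { current, total } is a simplified
// assumption, not the module's documented format.
class ProgressEmitter extends EventEmitter {
  private current = 0;

  constructor(private readonly total: number) {
    super();
  }

  // record `bytes` transferred and notify listeners
  add(bytes: number): void {
    this.current += bytes;
    this.emit('progress', { current: this.current, total: this.total });
  }
}

const progressEmitter = new ProgressEmitter(300);
const seen: number[] = [];
progressEmitter.on('progress', (p: { current: number; total: number }) =>
  seen.push(p.current)
);
progressEmitter.add(100);
progressEmitter.add(200);
console.log(seen); // [100, 300]
```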
#### Abort sync

```ts
import { AbortController } from '@aws-sdk/abort-controller';

const abortController = new AbortController();
setTimeout(() => abortController.abort(), 30000); // abort after 30 seconds

await sync('s3://mybucket', '/path/to/local/dir', { abortSignal: abortController.signal });
```
#### Use AWS SDK command input options

```ts
import mime from 'mime-types';

// set an ACL on every uploaded object
await sync('/path/to/local/dir', 's3://mybucket', {
  commandInput: {
    ACL: 'aws-exec-read',
  },
});

// compute input options dynamically, e.g. set ContentType per object
await sync('s3://mybucket1', 's3://mybucket2', {
  commandInput: (input) => ({
    ContentType: mime.lookup(input.Key) || 'text/html',
  }),
});
```
#### Filter source files

```ts
// aws s3 sync s3://my-source-bucket s3://my-target-bucket --exclude "*" --include "*.txt" --include "flowers/*"
await sync('s3://my-source-bucket', 's3://my-target-bucket', {
  filters: [
    { exclude: () => true }, // exclude everything...
    { include: (key) => key.endsWith('.txt') }, // ...then re-include *.txt
    { include: (key) => key.startsWith('flowers/') }, // ...and everything under flowers/
  ],
});
```
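Because filters are plain predicates on object keys, their combined effect can be checked without touching S3. The sketch below assumes AWS CLI-style semantics, where later filters take precedence over earlier ones; the helper `isIncluded` is hypothetical, not part of the module's API:

```typescript
// Hypothetical sketch of filter-chain evaluation, assuming AWS CLI
// semantics: filters apply in order and the last matching one decides
// whether a key is kept.
type Filter =
  | { include: (key: string) => boolean }
  | { exclude: (key: string) => boolean };

function isIncluded(key: string, filters: Filter[]): boolean {
  let included = true; // keys are included by default
  for (const filter of filters) {
    if ('exclude' in filter && filter.exclude(key)) included = false;
    else if ('include' in filter && filter.include(key)) included = true;
  }
  return included;
}

// same filter chain as the example above
const filters: Filter[] = [
  { exclude: () => true },
  { include: (key) => key.endsWith('.txt') },
  { include: (key) => key.startsWith('flowers/') },
];

console.log(isIncluded('flowers/rose.jpg', filters)); // true
console.log(isIncluded('notes.txt', filters)); // true
console.log(isIncluded('misc/photo.jpg', filters)); // false
```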
#### Relocate objects during sync

```ts
// move objects under the a/b/ prefix to zzz/ during sync
await sync('s3://my-source-bucket', 's3://my-target-bucket', {
  relocations: [
    (currentPath) =>
      currentPath.startsWith('a/b/')
        ? currentPath.replace('a/b/', 'zzz/')
        : currentPath,
  ],
});

// sync only objects under the flowers/red prefix
await sync('s3://mybucket/flowers/red', '/path/to/local/dir');
```

Note: relocations are applied after every other option, such as filters.
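Because a relocation is a plain function on posix-style key paths, its behavior can be checked in isolation before running a sync. A minimal sketch (the sample paths are made up for illustration):

```typescript
// A relocation maps one posix key path to another.
type Relocation = (currentPath: string) => string;

// Move everything under a/b/ to zzz/; leave other keys untouched.
const relocation: Relocation = (currentPath) =>
  currentPath.startsWith('a/b/')
    ? currentPath.replace('a/b/', 'zzz/')
    : currentPath;

console.log(relocation('a/b/flower.jpg')); // 'zzz/flower.jpg'
console.log(relocation('c/flower.jpg')); // 'c/flower.jpg' (unchanged)
```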
Additional examples are available in the repo test directory.
## API Reference

A complete API reference is available in the repo docs directory.

### Class: S3SyncClient

#### `new S3SyncClient(options)`

- `options` <S3SyncClientConfig>
#### `client.sync(localDir, bucketPrefix[, options])`

Sync a remote S3 bucket with the local file system.
Similar to AWS CLI `aws s3 sync localDir bucketPrefix [options]`.

- `localDir` <string> Local directory
- `bucketPrefix` <string> Remote bucket name which may contain a prefix appended with a `/` separator
- `options` <SyncBucketWithLocalOptions>
  - `dryRun` <boolean> Equivalent to CLI `--dryrun` option
  - `del` <boolean> Equivalent to CLI `--delete` option
  - `sizeOnly` <boolean> Equivalent to CLI `--size-only` option
  - `relocations` <Relocation[]> Allows uploading objects to remote folders without mirroring the source directory structure. Each relocation is a callback taking a string posix path param and returning a relocated string posix path.
  - `filters` <Filter[]> Almost equivalent to CLI `--exclude` and `--include` options. Filters can be specified using plain objects including either an `include` or `exclude` property. The `include` and `exclude` properties are functions that take an object key and return a boolean.
  - `abortSignal` <AbortSignal> Allows aborting the sync
  - `commandInput` <CommandInput<PutObjectCommandInput>> | <CommandInput<CreateMultipartUploadCommandInput>> Set any of the SDK <PutObjectCommandInput> or <CreateMultipartUploadCommandInput> options to uploads
  - `monitor` <TransferMonitor>
    - Attach `progress` event to receive upload progress notifications
    - Call `getStatus()` to retrieve progress info on demand
  - `maxConcurrentTransfers` <number> Each upload generates a Promise which is resolved when a local object is written to the S3 bucket. This parameter sets the maximum number of upload promises that might be running concurrently.
  - `partSize` <number> Set the part size in bytes for multipart uploads. Defaults to 5 MB.
- Returns: <Promise<SyncBucketWithLocalCommandOutput>> Fulfills with sync operations upon success.
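One way to picture `maxConcurrentTransfers` is as a small promise pool: at most `limit` transfer promises are pending at once, and a new one starts as soon as a slot frees up. The sketch below is illustrative only; `runConcurrently` is a hypothetical helper, not the module's implementation:

```typescript
// Illustrative promise pool: run at most `limit` async tasks at a time.
// NOT s3-sync-client's implementation, just a sketch of the idea behind
// maxConcurrentTransfers.
async function runConcurrently<T>(
  tasks: (() => Promise<T>)[],
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // each worker repeatedly claims the next pending task until none remain
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // start up to `limit` workers running in parallel
  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

Each transfer would be one task; results come back in task order regardless of completion order.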
#### `client.sync(bucketPrefix, localDir[, options])`

Sync the local file system with a remote S3 bucket.
Similar to AWS CLI `aws s3 sync bucketPrefix localDir [options]`.

- `bucketPrefix` <string> Remote bucket name which may contain a prefix appended with a `/` separator
- `localDir` <string> Local directory
- `options` <SyncLocalWithBucketOptions>
  - `dryRun` <boolean> Equivalent to CLI `--dryrun` option
  - `del` <boolean> Equivalent to CLI `--delete` option
  - `sizeOnly` <boolean> Equivalent to CLI `--size-only` option
  - `relocations` <Relocation[]> Allows downloading objects to local directories without mirroring the source folder structure. Each relocation is a callback taking a string posix path param and returning a relocated string posix path.
  - `filters` <Filter[]> Almost equivalent to CLI `--exclude` and `--include` options. Filters can be specified using plain objects including either an `include` or `exclude` property. The `include` and `exclude` properties are functions that take an object key and return a boolean.
  - `abortSignal` <AbortSignal> Allows aborting the sync
  - `commandInput` <CommandInput<GetObjectCommandInput>> Set any of the SDK <GetObjectCommandInput> options to downloads
  - `monitor` <TransferMonitor>
    - Attach `progress` event to receive download progress notifications
    - Call `getStatus()` to retrieve progress info on demand
  - `maxConcurrentTransfers` <number> Each download generates a Promise which is resolved when a remote object is written to the local file system. This parameter sets the maximum number of download promises that might be running concurrently.
- Returns: <Promise<SyncLocalWithBucketCommandOutput>> Fulfills with sync operations upon success.
#### `client.sync(sourceBucketPrefix, targetBucketPrefix[, options])`

Sync two remote S3 buckets.
Similar to AWS CLI `aws s3 sync sourceBucketPrefix targetBucketPrefix [options]`.

- `sourceBucketPrefix` <string> Remote reference bucket name which may contain a prefix appended with a `/` separator
- `targetBucketPrefix` <string> Remote bucket name to sync which may contain a prefix appended with a `/` separator
- `options` <SyncBucketWithBucketOptions>
  - `dryRun` <boolean> Equivalent to CLI `--dryrun` option
  - `del` <boolean> Equivalent to CLI `--delete` option
  - `sizeOnly` <boolean> Equivalent to CLI `--size-only` option
  - `relocations` <Relocation[]> Allows copying objects to remote folders without mirroring the source folder structure. Each relocation is a callback taking a string posix path param and returning a relocated string posix path.
  - `filters` <Filter[]> Almost equivalent to CLI `--exclude` and `--include` options. Filters can be specified using plain objects including either an `include` or `exclude` property. The `include` and `exclude` properties are functions that take an object key and return a boolean.
  - `abortSignal` <AbortSignal> Allows aborting the sync
  - `commandInput` <CommandInput<CopyObjectCommandInput>> Set any of the SDK <CopyObjectCommandInput> options to copy operations
  - `monitor` <TransferMonitor>
    - Attach `progress` event to receive copy progress notifications
    - Call `getStatus()` to retrieve progress info on demand
  - `maxConcurrentTransfers` <number> Each copy generates a Promise which is resolved after the object has been copied. This parameter sets the maximum number of copy promises that might be running concurrently.
- Returns: <Promise<SyncBucketWithBucketCommandOutput>> Fulfills with sync operations upon success.
## Change Log

See CHANGELOG.md.

## Benchmark

AWS CLI s3 sync for Node.js has been developed to solve the S3 sync limitations of existing GitHub repos and npm modules.

Most of the existing repos and npm modules suffer from one or more of the following limitations:
- requires AWS CLI to be installed
- uses ETag to perform file comparison (ETag should be considered an opaque field and shouldn't be used)
- limits S3 bucket object listing to 1000 objects
- supports syncing bucket with local, but doesn't support syncing local with bucket
- doesn't support multipart uploads
- uses outdated dependencies
- is unmaintained
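The 1000-object limit above comes from S3's `ListObjectsV2`, which returns at most 1000 keys per page; an exhaustive listing must follow the continuation token across pages. A minimal sketch, where `fetchPage` stands in for an actual `ListObjectsV2Command` round trip and its signature is simplified for illustration:

```typescript
// One page of listing results; `nextToken` mirrors the role of S3's
// NextContinuationToken and is absent on the last page.
type Page = { keys: string[]; nextToken?: string };

// Collect all keys by following continuation tokens until exhausted.
// `fetchPage` is a hypothetical stand-in for a ListObjectsV2Command call.
async function listAllKeys(
  fetchPage: (token?: string) => Promise<Page>
): Promise<string[]> {
  const keys: string[] = [];
  let token: string | undefined;
  do {
    const page = await fetchPage(token); // pass the token back on each request
    keys.push(...page.keys);
    token = page.nextToken;
  } while (token !== undefined);
  return keys;
}
```

Modules that issue a single list request silently truncate buckets larger than one page.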
The following JavaScript modules suffer at least one of these limitations: