dbcp: Dump MySQL, Postgres, SQLServer, and ElasticSearch databases directly to AWS S3, Google Cloud Storage, or Azure.
$ dbcp --help
cli.js [sourceFile] [targetFile]
$ dbcp data.parquet data.jsonl.gz
$ dbcp data.jsonl.gz s3://bucket/data.csv.gz
Copy to or from MySQL, PostgreSQL, SQLServer, LevelDB, MongoDB, and ElasticSearch databases; to or from files on Amazon Web Services (AWS) S3, Google Cloud Storage (GCS), Microsoft Azure, SMB, or HTTP; or directly between databases.
Automatically converts between the supported formats: JSON, ND-JSON, CSV, SQL, Parquet, and TFRecord (with optional gzip compression).
Either --sourceType or --sourceFile, and either --targetType or --targetFile, are required. Other options can be shortened, e.g. --user instead of --sourceUser. Only a database-to-database copy requires both --sourceUser and --targetUser. The file format and compression are inferred from the filename.
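For example, a database-to-database copy (the one case that requires both --sourceUser and --targetUser) might look like the following sketch; the hosts, ports, and credentials are placeholders mirroring the examples later in this document:
$ dbcp \
--sourceType postgresql \
--sourceHost localhost \
--sourcePort 5433 \
--sourceUser postgres \
--sourcePassword postgres \
--sourceName postgres \
--sourceTable foobar \
--targetType mysql \
--targetHost localhost \
--targetPort 8083 \
--targetUser root \
--targetPassword wp \
--targetName mydb \
--targetTable foobar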
dbcp pipes Readable Node.js streams to Writable streams. No intermediate storage is required.
The transformObject/transformObjectStream API can be used for streaming transforms of Big Data.
transformBytes/transformBytesStream can be used to render progress updates.
dbcp supports sharding: it can split or join groups of files.
dbcp can convert files from one format to another.
dbcp supports compound inserts, which can insert groups of associated rows from multiple tables.
dbcp can translate SQL dialects, e.g. dump a Postgres table to a .sql file with SQLServer CREATE and INSERT syntax (a sketch follows this list).
dbcp builds its pipelines on the ReadableStreamTree and WritableStreamTree abstractions from tree-stream.
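Presumably the target dialect is selected with --targetType, the same way the --schemaOnly examples later in this document select Athena or Postgres DDL; a hedged sketch of dumping a Postgres table as SQL Server-flavored SQL (connection details are placeholders):
$ dbcp \
--sourceType postgresql \
--host localhost \
--dbname postgres \
--port 5433 \
--user postgres \
--password postgres \
--table foobar \
--targetType mssql \
--targetFile foobar.sql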
$ npm install -g dbcp
$ dbcp --help
$ npm init
$ npm install dbcp
$ ./node_modules/.bin/dbcp --help
import { AnyFileSystem } from '@wholebuzz/fs/lib/fs'
import { LocalFileSystem } from '@wholebuzz/fs/lib/local'
import { S3FileSystem } from '@wholebuzz/fs/lib/s3'
import { dbcp } from 'dbcp'
import levelIteratorStream from 'level-iterator-stream' // wraps a level iterator in a Readable stream
import StreamTree from 'tree-stream'

const fileSystem = new AnyFileSystem([
  { urlPrefix: 's3://', fs: new S3FileSystem() },
  { urlPrefix: '', fs: new LocalFileSystem() },
])

// leveldb below refers to an already-opened level database.
await dbcp({
  fileSystem,
  targetFile: 's3://foo/bar.jsonl',
  // e.g. from a level (https://www.npmjs.com/package/level) database
  sourceStream: StreamTree.readable(levelIteratorStream(leveldb.iterator())),
})
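The same fileSystem can drive plain file-to-file copies too. A minimal sketch (the bucket and paths are placeholders), converting a local gzipped CSV to gzipped ND-JSON on S3, with both formats inferred from the filenames:
await dbcp({
  fileSystem,
  sourceFiles: [{ url: '/tmp/foobar.csv.gz' }],
  targetFile: 's3://foo/foobar.jsonl.gz',
})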
import { openNullWritable } from '@wholebuzz/fs/lib/stream'
import { dbcp, DatabaseCopyFormat } from 'dbcp'
import { Transform } from 'stream'
import StreamTree from 'tree-stream'

// Supply transformObject and a do-nothing Writable for targetStream.
await dbcp({
  fileSystem,
  sourceFiles: [{ url: '/tmp/foobar.csv.gz' }],
  targetStream: [openNullWritable()],
  transformObject: (x) => { console.log('test', x) },
})

// Or alternatively supply targetStream with targetFormat = object.
await dbcp({
  fileSystem,
  sourceFiles: [{ url: '/tmp/foobar.csv.gz' }],
  // Without targetFormat = object, transform() would receive Buffer chunks.
  targetFormat: DatabaseCopyFormat.object,
  targetStream: [
    StreamTree.writable(
      new Transform({
        objectMode: true,
        transform(data, _, cb) {
          console.log('test', data)
          cb()
        },
      })
    ),
  ],
})
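transformBytes works the same way on the text stream, e.g. for progress reporting. A minimal sketch, assuming transformBytes is called once per decoded chunk (the counting logic and paths are illustrative, not part of dbcp):
let copied = 0
await dbcp({
  fileSystem,
  sourceFiles: [{ url: '/tmp/foobar.csv.gz' }],
  targetFile: '/tmp/foobar.jsonl.gz',
  // Return each chunk unchanged; just track how much text has flowed through.
  transformBytes: (chunk) => {
    copied += chunk.length
    console.log(`copied ${copied} chars`)
    return chunk
  },
})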
PASS src/index.test.ts (85.9 s)
✓ Should hash test data as string
✓ Should hash test data stream
✓ Should copy local file
✓ Should read local directory
✓ Should convert to JSON from ND-JSON and back
✓ Should convert to sharded JSON from ND-JSON and back
✓ Should convert to Parquet from ND-JSON and back
✓ Should convert to TFRecord from ND-JSON and back
✓ Should load to level from ND-JSON and dump to JSON after external sort
✓ Should restore to and dump compound data
✓ Should restore to and dump from Elastic Search to ND-JSON
✓ Should restore to and dump from MongoDB to ND-JSON
✓ Should restore to and dump from Postgres to ND-JSON
✓ Should restore to and dump from Postgres to SQL
✓ Should not hang on error
✓ Should copy from Postgres to Mysql
✓ Should copy from Postgres to SQL Server
✓ Should dump from Postgres to Parquet file
✓ Should dump from MySQL to Parquet file
✓ Should dump from SQL Server to Parquet file
export async function dbcp(args: DatabaseCopyOptions)
export interface DatabaseCopyOptions {
batchSize?: number
columnType?: Record<string, string>
compoundInsert?: boolean
contentType?: string
copySchema?: DatabaseCopySchema
engineOptions?: any
externalSortBy?: string[]
extra?: Record<string, any>
extraOutput?: boolean
fileSystem?: FileSystem
group?: boolean
groupLabels?: boolean
limit?: number
orderBy?: string[]
query?: string
shardBy?: string
schema?: Column[]
schemaFile?: string
sourceConnection?: Record<string, any>
sourceElasticSearch?: Client
sourceFormat?: DatabaseCopyFormat
sourceFiles?: DatabaseCopySourceFile[] | Record<string, DatabaseCopySourceFile>
sourceHost?: string
sourceLevel?: level.LevelDB | LevelUp
sourceName?: string
sourceKnex?: Knex
sourcePassword?: string
sourceShards?: number
sourceStream?: ReadableStreamTree
sourceTable?: string
sourceType?: DatabaseCopySourceType
sourcePort?: number
sourceUser?: string
targetConnection?: Record<string, any>
targetElasticSearch?: Client
targetFormat?: DatabaseCopyFormat
targetFile?: string
targetHost?: string
targetKnex?: Knex
targetLevel?: level.LevelDB | LevelUp
targetName?: string
targetPassword?: string
targetShards?: number
targetStream?: WritableStreamTree[]
targetTable?: string
targetType?: DatabaseCopyTargetType
targetPort?: number
targetUser?: string
tempDirectory?: string
transformObject?: (x: unknown) => unknown
transformObjectStream?: () => Duplex
transformBytes?: (x: string) => string
transformBytesStream?: () => Duplex
where?: Array<string | any[]>
}
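The options above can also drive a database dump directly from code. A hedged sketch, reusing the fileSystem from the earlier examples and assuming DatabaseCopySourceType.postgresql is the exported enum member matching the CLI's "postgresql" choice (the connection details and the exact tuple shape of the where clause are assumptions):
import { dbcp, DatabaseCopySourceType } from 'dbcp'

await dbcp({
  fileSystem,
  sourceType: DatabaseCopySourceType.postgresql, // assumed enum member name
  sourceHost: 'localhost',
  sourcePort: 5433,
  sourceUser: 'postgres',
  sourcePassword: 'postgres',
  sourceName: 'postgres',
  sourceTable: 'foobar',
  where: [['id', '>', 1000]], // Array<string | any[]> per the interface above
  orderBy: ['id'],
  limit: 100,
  targetFile: '/tmp/foobar.jsonl.gz',
})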
$ dbcp --help
cli.js [sourceFile] [targetFile]
Options:
--help Show help [boolean]
--version Show version number [boolean]
--compoundInsert Compound insert mode can insert associated rows from
multiple tables. [boolean]
--contentType Content type [string]
--dataOnly Dump only the data, not the schema (data definitions).
[boolean]
--dbname Database [string]
--externalSortBy Sort data by property(s) with external-sorting [array]
--format
[choices: "csv", "json", "jsonl", "ndjson", "object", "parquet", "tfrecord",
"sql"]
--group Group inputs with equivalent orderBy [boolean]
--host Database host [string]
--limit Database query LIMIT [number]
--orderBy Database query ORDER BY [array]
--password Database password [string]
--port Database port [string]
--query Query [string]
--schemaFile Use schema file if required, instead of schema inspection.
[string]
--schemaOnly Dump only the object definitions (schema), not data.
[boolean]
--shardBy Shard (or split) the data based on key [string]
--shards The number of shards to split or join the data [number]
--sourceFile Source file [array]
--sourceFormat
[choices: "csv", "json", "jsonl", "ndjson", "object", "parquet", "tfrecord",
"sql"]
--sourceHost Source host [string]
--sourceName Source database [string]
--sourcePassword Source database password [string]
--sourcePort Source database port [string]
--sourceShards Source shards [number]
--sourceTable Source database table [string]
--sourceType Source database type
[string] [choices: "athena", "es", "level", "mssql", "mysql", "postgresql",
"sqlite"]
--sourceUser Source database user [string]
--table Database table [string]
--targetFile Target file [string]
--targetFormat
[choices: "csv", "json", "jsonl", "ndjson", "object", "parquet", "tfrecord",
"sql"]
--targetHost Target host [string]
--targetName Target database [string]
--targetPassword Target database password [string]
--targetPort Target database port [string]
--targetShards Target shards [number]
--targetTable Target database table [string]
--targetType Target database type
[string] [choices: "athena", "es", "level", "mssql", "mysql", "postgresql",
"sqlite"]
--targetUser Target database user [string]
--user Database user [string]
--where Database query WHERE [array]
$ dbcp \
--sourceType postgresql \
--host localhost \
--dbname postgres \
--port 5433 \
--user postgres \
--password postgres \
--table foobar \
--targetFile gs://bucket/file.json.gz
$ dbcp \
--sourceType mysql \
--host localhost \
--dbname mydb \
--port 8083 \
--user root \
--password wp \
--table foobar \
--format jsonl \
--targetFile s3://bucket/object.jsonl.gz
$ dbcp \
--targetType mysql \
--host localhost \
--dbname mydb \
--port 8083 \
--user root \
--password wp \
--table foobar \
--sourceFile s3://bucket/object.jsonl.gz
$ dbcp \
--sourceType mssql \
--host localhost \
--dbname mymsdb \
--port 1433 \
--user SA \
--password "MyP@ssw0rd#" \
--table foobar \
--targetFile=-
$ dbcp \
--sourceType mongodb \
--host localhost \
--port 27017 \
--user root \
--password example \
--dbname test_db \
--table dbcptest \
--targetFile output-SSSS-of-NNNN.jsonl.gz \
--targetShards 4 \
--shardBy id
$ ls output*
-rw-r--r-- 1 user staff 782701 Feb 4 10:59 output-0001-of-0004.jsonl.gz
-rw-r--r-- 1 user staff 771980 Feb 4 10:59 output-0003-of-0004.jsonl.gz
-rw-r--r-- 1 user staff 794959 Feb 4 10:59 output-0000-of-0004.jsonl.gz
-rw-r--r-- 1 user staff 788720 Feb 4 10:59 output-0002-of-0004.jsonl.gz
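Joining shards back together presumably uses --sourceShards; a hedged sketch, assuming the same SSSS-of-NNNN pattern selects the input files:
$ dbcp \
--sourceFile output-SSSS-of-NNNN.jsonl.gz \
--sourceShards 4 \
--targetFile combined.jsonl.gz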
$ dbcp gs://bucket/archive.csv.gz | jq . | less
$ dbcp s3://bucket/object.json.gz gs://bucket/file.json.gz
$ dbcp foobar.jsonl bazbat.json
$ dbcp "https://www.w3.org/People/mimasa/test/imgformat/img/w3c_home.png" foo.png
$ dbcp "./foo.png" "http://my.api/upload" --contentType "image/png"
$ dbcp --schemaOnly --sourceFile ./sample.jsonl.gz --targetType athena --targetFile ddl.sql
$ dbcp --schemaOnly --sourceFile ./sample.jsonl.gz --targetType postgresql --targetFile ddl.sql