Security News
Weekly Downloads Now Available in npm Package Search Results
Socket's package search now displays weekly downloads for npm packages, helping developers quickly assess popularity and make more informed decisions.
couchdb-dump
Advanced tools
Tools to dump, modify, and load documents in CouchDB from the command line. (Same basic concept as mysqldump, but much more and for CouchDB)
A set of three command line tools that perform the following functions. One outputs all documents (including any attachments) in a CouchDB database, one lets you provide a function that can modify the documents in that output, and one takes that output as input and loads it back into a CouchDB database.
Reading and writing the data is done via stdin and stdout, respectively. The output of cdbdump
is a JSON document containing a "docs" array element which contains every document in the database. The cdbmorph
command takes the output of cdbdump
and allows you to modify the documents in it by passing each of them through a function that you supply. The cdbload
command takes an input which is exactly the same as the output of cdbdump
or cdbmorph
and writes every document in it into the target database.
Internally, cdbdump and cdbload just calling on CouchDB's _all_docs and _bulk_docs endpoints. To do what it does, this package uses the power of Node.js streams with the help of the following modules...
Many thanks to the authors of those amazing packages for making this possible.
npm install -g couchdb-dump
The following will dump the contents of a CouchDB database called myhugedatabase running on port 5984 on localhost. The output is written to a file called myhugedatabase.json.
cdbdump -d myhugedatabase > myhugedatabase.json
If you are doing this for archiving purposes you could do something cool like this to extract and gzip it in one step ...
cdbdump -d myhugedatabase | gzip > myhugedatabase.json.gz
Both of the following command examples will load all the documents in the myhugedatabase.json file into a CouchDB database called myhugeduplicate.
cdbload -d myhugeduplicate < myhugedatabase.json
OR
cat myhugedatabase.json | cdbload -d myhugeduplicate
You can even do this ...
cdbdump -d myhugedatabase | cdbload -d myhugeduplicate
... which streams all the docs from one CouchDB database into a second one. While this works well, you should probably take a look at using CouchDB's awesome built-in replication features instead.
But suppose you need to manipulate the documents in the cdbdump
output before you load them into CouchDB with cdbload
. You can do that with the included cdbmorph
command.
Lets assume you need to add a new key and value to all documents which meet certain criteria, and you need to delete documents which meet some other criteria. So you write a function as follows and save it in a file called morph.js. For example:
module.exports = function(doc, cb){
if(doc.somekey && doc.somekey === 'somevalue'){
doc.anotherkey = 'anothervalue';
}
if(doc.someotherkey && doc.someotherkey === 'someothervalue'){
doc._deleted = true;
}
cb(null, doc);
}
Now you can run the documents in the myhugedatabase.json file from the example above through this function and feed the output into a file or directly back into a CouchDB database as follows ...
cat myhugedatabase.json | cdbmorph -f ./morph.js | cdbload -d myhugemodified
You can also pipe output directly from cdbdump
to cdbmorph
to cdbload
like this ...
cdbdump -d myhugedatabase | cdbmorph -f ./morph.js | cdbload -d myhugemodified
You can even put modified documents back into a CouchDB database by loading the stream of changed documents back into the source database, simulating the feel of in-place-updates, like this ...
cdbdump -k -d myhugedatabase | cdbmorph -f ./morph.js | cdbload -d myhugedatabase
NOTE: This is not magic and it is not "real" in-place-updates so all the rules of CouchDB document updates with respect to _rev values, etc, still apply. Be mindful of this and other activity on the database if you do something like this. Also, be sure to apply the -k flag on the cdbdump
command so that the documents in the output will include their _rev values.
cdbdump
Full UsageIf you execute the cdbdump
command with no arguments or with --help, the following usage information will be printed on the console and the command will exit.
usage: cdbdump [-u username] [-p password] [-h host] [-P port] [-r protocol] [-s json-stringify-space] [-k dont-strip-revs] [-D design doc] [-v view] -d database
The username and password, if supplied, will be used for authentication via Basic Auth (RFC2617). Currently this is the only supported authentication method.
If you want to constrain the documents exported to only those that belong to a CouchDB View, you can pass the -D and -v options to specify the design document and view function name respectively. This won't work with views that reduce or do other fancy things.
The -s parameter for cdbdump
is used as the third parameter to JSON.stringify() for the amount of white space to use if you want the output to be pretty-printed.
By default, the _rev
element of every document is stripped out of the output of cdbdump
. This allows the list of documents to be used as input to cdbload
. If the -k parameter is given to cdbdump
, then the _rev
elements will not be stripped out and this will cause CouchDB to be unable to easily load these documents through the _bulk_docs
endpoint because every document will error with a mismatched revision message (assuming you are loading into an empty database). See CouchDB _bulk_docs documentation for more details and ways you can manipulate the cdbdump
output to get around that in case you need to keep the _rev
values in your dump.
host = localhost
port = 5984
protocol = http
json-stringify-space = 0
dont-strip-revs = false
cdbload
Full UsageIf you execute the cdbload
command with no arguments or with --help, the following usage information will be printed on the console and the command will exit.
usage: cdbload [-u username] [-p password] [-h host] [-P port] [-r protocol] [-v verbose] -d database
The -v parameter for cdbload
will print CouchDB's response body to the console. That will be an array with one JSON result object for every object loaded in!!
host = localhost
port = 5984
protocol = http
verbose = false
cdbmorph
Full UsageIf you execute the cdbmorph
command with no arguments or with --help, the following usage information will be printed on the console and the command will exit.
usage: cdbmorph [-s json-stringify-space] -f path-to-morphfunction.js
The -f parameter is used to provide the path to the file containing the javascript function that will be applied to every document in the stream. The path can be an absolute path:
cat dump.json | cdbmorph -f ~/morph.js
... or a relative path to the file from current working directory that the cdbmorph
command will be run from:
cat dump.json | cdbmorph -f ../../../morph.js
The file containing the morph function will be required into the script as follows:
var morphfunction = require(path.resolve(process.cwd(), path-to-morphfunction.js));
The morph function takes two arguments which are a CouchDB document and a callback.
doc
- A CouchDB document (json).callback(err, doc)
- A callback function used to return the morphed document or an error. If an error is returned the unchanged document is included in the output stream and the error is printed on error.console. Returning null
as the doc argument causes the original document to be excluded from the output.module.exports = function(doc, cb){
if(doc.dontmod){
return cb(null, doc); //unmodified doc is passed through to the output
}
if(doc.theexclusionkey){
return cb(); //doc will not be in the output
}
if(doc.somekey && doc.somekey === 'somevalue'){
doc.anotherkey = 'anothervalue'; //doc is modified in the output
}
if(doc.someotherkey && doc.someotherkey === 'someothervalue'){
doc._deleted = true; //doc is a 'CouchDB deleted document' in the output
}
cb(null, doc);
}
The -s parameter for cdbmorph
is used as the third parameter to JSON.stringify() for the amount of white space to use if you want the output to be pretty-printed.
cdbmorph
As shown in the usage examples above, it is possible to pass the documents in a CouchDB database through cdbmorph
for modification as follows:
cdbdump -k -d dbname | cdbmorph -f ./morph.js | cdbload -d dbanme
To make this work you must be sure to apply the -k flag on the cdbdump
command so that the documents in the output will include their _rev values. By default those would be stripped out which is only appropriate if you are loading the documents into a database where they don't already exist.
Be mindful of all the normal rules of CouchDB document updates with respect to _rev values, etc when performing this kind of operation.
json-stringify-space = 0
I originally wrote this because I couldn't find a cli dump tool for CouchDB that allowed me to pipe output directly. Since then I also found these other excellent options which you should definitely check out as well:
With the addition of the cdbmorph
command in release 2.2.0, this tool is now more than just a CouchDB dump command. It also provides convenient batch document manipulation features on the command line.
Still no automated tests in place. :\
Fork and PR. Thanks!!
cdbmorph
command that lets you manipulate the documents in the stream between cdbdump
and cdbload
using a function you provide.FAQs
Tools to dump, modify, and load documents in CouchDB from the command line. (Same basic concept as mysqldump, but much more and for CouchDB)
The npm package couchdb-dump receives a total of 7 weekly downloads. As such, couchdb-dump popularity was classified as not popular.
We found that couchdb-dump demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Socket's package search now displays weekly downloads for npm packages, helping developers quickly assess popularity and make more informed decisions.
Security News
A Stanford study reveals 9.5% of engineers contribute almost nothing, costing tech $90B annually, with remote work fueling the rise of "ghost engineers."
Research
Security News
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.