Castor Load
Traverses a directory to build a MongoDB collection from the files it finds, then keeps the directory and the collection synchronised.
Installation
With npm do:
$ npm install castor-load
Tests
Use mocha to run the tests.
$ npm install mocha
$ mocha test
API Documentation
Constructor Loader(String directory, [Object options])
Create a new object to synchronise a directory with a MongoDB collection.
###Options
connexionURI
- string - URL to connect to MongoDB (see documentation). If not specified, the environment variable "MONGO_URL" can be looked up instead. Default: 'mongodb://localhost:27017/test/'
ignore
- array - List of files to ignore (Regex accepted). Default: empty
include
- array - List of node types to handle (directory and/or files). Default: ['files']
collectionName
- string - MongoDB collection name. Default: automatic
concurrency
- number - Number of files/documents that can be processed in parallel. Default: 1
maxFileSize
- string - Maximum file size, beyond which files are rejected. Default: 128mb
delay
- number - Delay before file processing resumes when the stack is full (milliseconds). Default: 1000
writeConcern
- number/string - Write concern level used for insertions (see documentation).
watch
- boolean - Enable tree watching after the initial synchronization. Default: true
dateConfig
- date - (Optional) Arbitrary date appended to file metadata. Files whose dateConfig has changed will be resynchronized, regardless of their modification date.
strictCompare
- boolean - Always check file content when a change is detected, regardless of its modification date. Should be used when files can get multiple changes in a short period of time. Default: false
modifier
- function(baseDoc) - A function to modify the base document of a file upon its creation.
// Assumes the module exports the Loader constructor.
var Loader = require('castor-load');
var options = {
  "connexionURI" : "mongodb://localhost:27017/test/",
  "ignore" : [ "**/.*", "*~", "*.sw?", "*.old", "*.bak", "**/node_modules"]
};
var fr = new Loader(__dirname, options);
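To make the documented defaults easier to see at a glance, here is a minimal sketch of how they could be merged with user-supplied options. The helper `withDefaults` is hypothetical and not part of castor-load; the precedence shown for `connexionURI` (explicit option, then `MONGO_URL`, then the default) is an assumption based on the description above, and the example URL is a placeholder.

```javascript
// Hypothetical helper, not part of castor-load: merges user options with
// the defaults listed in the Options section.
function withDefaults(options, env) {
  options = options || {};
  env = env || process.env;
  var defaults = {
    // Assumed precedence: explicit option, then MONGO_URL, then the default.
    connexionURI: env.MONGO_URL || 'mongodb://localhost:27017/test/',
    ignore: [],
    include: ['files'],
    concurrency: 1,
    maxFileSize: '128mb',
    delay: 1000,
    watch: true,
    strictCompare: false
  };
  var merged = {};
  Object.keys(defaults).forEach(function (key) {
    merged[key] = defaults[key];
  });
  // User-supplied values win over defaults, unless left undefined.
  Object.keys(options).forEach(function (key) {
    if (options[key] !== undefined) { merged[key] = options[key]; }
  });
  return merged;
}

var resolved = withDefaults(
  { ignore: ['**/.*', '*~'] },
  { MONGO_URL: 'mongodb://db.example.org:27017/castor' } // placeholder URL
);
console.log(resolved.connexionURI); // mongodb://db.example.org:27017/castor
console.log(resolved.concurrency);  // 1
```

Passing the environment as a second argument keeps the sketch easy to test without touching `process.env`.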
Loader.use([String pattern,] Function middleware)
Add a middleware to be executed on either all files or those matching the given pattern. The middleware is given the document associated with the file, and a callback that can be called in two ways:
- if the file matches a single document, then it should be called once with a potential error and the final document.
- if the file must be exploded into multiple subdocuments, then it should be called multiple times with either a subdocument or an error, and one last time without any argument when all subdocuments have been submitted.
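var fr = new Loader(__dirname);
// Single-document mode: call submit once with (error, document).
fr.use('**/*.txt', function (doc, submit) {
  doc.name = doc.basename.toUpperCase();
  submit(null, doc);
});
// Multi-document mode: call submit once per subdocument, then once without argument.
fr.use('**/*.csv', function (doc, submit) {
  // Read as UTF-8 so that content is a string rather than a Buffer.
  require('fs').readFile(doc.location, 'utf8', function (err, content) {
    if (err) { return submit(err); }
    content.split('\n').forEach(function (line) {
      // Shallow copy of the base document, one per line.
      var clonedDoc = {};
      for (var p in doc) { clonedDoc[p] = doc[p]; }
      clonedDoc.content = line;
      submit(clonedDoc);
    });
    submit();
  });
});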
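The two calling conventions can be checked without a database by writing the middleware bodies as plain functions and passing them a recording callback. This is only a sketch: `recorder` is a hypothetical stand-in for the callback the loader would supply, the sample documents are made up, and the text is passed in directly to avoid file I/O.

```javascript
// Single-document mode: one call with (error, document).
function upperCaseName(doc, submit) {
  doc.name = doc.basename.toUpperCase();
  submit(null, doc);
}

// Multi-document mode: one call per subdocument, then a final call
// without arguments. `text` is passed in to keep the sketch free of I/O.
function explodeLines(doc, text, submit) {
  text.split('\n').forEach(function (line) {
    if (line === '') { return; } // skipping blank lines is a choice of this sketch
    var clonedDoc = {};
    for (var p in doc) { clonedDoc[p] = doc[p]; }
    clonedDoc.content = line;
    submit(clonedDoc);
  });
  submit();
}

// Hypothetical stand-in for the loader's callback: records every call.
function recorder() {
  var calls = [];
  function submit() { calls.push(Array.prototype.slice.call(arguments)); }
  submit.calls = calls;
  return submit;
}

var single = recorder();
upperCaseName({ basename: 'notes.txt' }, single);
console.log(single.calls.length); // 1

var multi = recorder();
explodeLines({ basename: 'data.csv' }, 'a;1\nb;2\n', multi);
console.log(multi.calls.length); // 3: two subdocuments plus the final empty call
```

Keeping the middleware logic in named functions like these makes it possible to unit-test them before registering them with `use`.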
Loader.sync(Function callback)
Start synchronization between the directory and the MongoDB collection.
The callback is called once the analysis is complete. Its argument is the number of files/directories that were either cancelled or (re)synchronized with the database.
var fr = new Loader(__dirname);
fr.sync(function (processed) {
  console.log('Synchronization done, %d files were either cancelled or checked', processed);
});
Loader.syncr.connect(Function callback)
Open a MongoDB connection, or reuse the existing one. The callback receives a potential error object and a handle to the working collection. Use this if you want to perform some actions on the collection before you start synchronizing.
var fr = new Loader(__dirname);
fr.syncr.connect(function (err, collection) {
  if (err) { throw err; }
  collection.ensureIndex({ 'filename': 1 }, function (err) {
    if (err) { throw err; }
    console.log('Added an index on filename, now starting synchronization');
    fr.sync();
  });
});
Events
Name(arguments) | Description |
---|---|
browseOver(found) | emitted when the tree is entirely browsed, with the number of items that should be synchronized. |
watching() | emitted when the initial synchronization is done and the watcher is ready. |
checked(err, file) | when a file has been (re)synchronized during the initial synchronization |
cancelled(err, file) | when an ignored file has been removed from the DB |
added(err, file) | when a file added in the tree has been synchronized with the DB |
changed(err, file) | when a modified file has been resynchronized with the DB |
dropped(err, file) | when a file has been unlinked and its related documents marked as deleted |
preCheck(file) | when a file is about to be checked. Can be emitted on initial sync, or when a file has been added or changed. |
preCancel(file) | when a file is about to be cancelled. Can be emitted on initial sync, or when a file has been added or changed. |
preDrop(file) | when a file is about to be marked as deleted |
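
As an illustration of consuming the `(err, file)` events listed above, the following sketch tallies them by name. It uses a plain Node.js `EventEmitter` as a stand-in for a Loader instance, on the assumption that a loader exposes the usual `on` method. The helper `tallyEvents` and the shape of the `file` objects are hypothetical.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: counts events by name and collects any errors.
// Only suitable for events whose first argument is a potential error.
function tallyEvents(emitter, names) {
  var tally = { counts: {}, errors: [] };
  names.forEach(function (name) {
    tally.counts[name] = 0;
    emitter.on(name, function (err) {
      tally.counts[name] += 1;
      if (err) { tally.errors.push(err); }
    });
  });
  return tally;
}

// Stand-in for a Loader instance (assumption: it behaves like an EventEmitter).
var fakeLoader = new EventEmitter();
var tally = tallyEvents(fakeLoader, ['checked', 'added', 'changed', 'dropped']);

// Simulated events; the file objects are placeholders.
fakeLoader.emit('checked', null, { filename: 'a.txt' });
fakeLoader.emit('checked', null, { filename: 'b.txt' });
fakeLoader.emit('changed', new Error('disk error'), { filename: 'a.txt' });

console.log(tally.counts.checked); // 2
console.log(tally.errors.length);  // 1
```

With a real loader, the same helper would be attached before calling `sync`, so that no early event is missed.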