data-tide-js

A powerful JavaScript library to process big chunks of data

  • Version: 0.1.1 (latest)
  • Maintainers: 0

DataTide

⚠️ Warning: This library is currently under development and IS NOT suitable for production usage.

DataTide is a high-performance Node.js library for processing large datasets using worker threads. It provides a simple, stream-based API for parallel data processing with built-in error handling and backpressure support.

⚡ Features

  • 🚀 Parallel processing using worker threads
  • 📊 Stream-based processing for handling large datasets
  • 🔄 Automatic backpressure handling
  • ⚡ Support for both synchronous and asynchronous transformations
  • 🎯 Configurable error handling strategies
  • 🔒 Basic security checks for transform functions
  • 📝 TypeScript support with full type definitions

🚨 Security Notice

This library uses eval() to deserialize transform functions in worker threads. Basic security checks are implemented, but they may not protect against all forms of code injection. Use with caution and avoid processing untrusted input.

⚠️ Warning: The methodology used to serialize and deserialize functions is experimental and may change in the future.
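
To illustrate why serialized transforms need to be self-contained, here is a minimal sketch of a function being turned into source text and rebuilt in a fresh scope. It is not DataTide's actual implementation: `roundTrip`, `demo`, and `NumberTransform` are hypothetical names, and `new Function` stands in for whatever evaluation the library performs inside its workers.

```typescript
// Hypothetical illustration of function serialization; not DataTide's actual code.
type NumberTransform = (n: number) => number;

// Turn a function into source text and rebuild it in a fresh scope,
// roughly what has to happen when a function crosses a thread boundary.
function roundTrip(fn: NumberTransform): NumberTransform {
  const source = fn.toString();
  // `new Function` evaluates in the global scope, so captured locals are lost.
  return new Function(`return (${source});`)() as NumberTransform;
}

function demo(): { selfContained: number; closure: string } {
  const localOffset = 10;
  const double: NumberTransform = (n) => n * 2; // uses only its argument
  const addOffset: NumberTransform = (n) => n + localOffset; // captures a local

  let closure: string;
  try {
    closure = String(roundTrip(addOffset)(1));
  } catch (err) {
    closure = err instanceof Error ? err.name : "unknown error";
  }
  return { selfContained: roundTrip(double)(21), closure };
}

console.log(demo()); // { selfContained: 42, closure: 'ReferenceError' }
```

The self-contained function survives the round trip, while the one that captures a local variable fails once it runs outside its original scope.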

📦 Installation

npm install data-tide-js

🚀 Quick Start

import { createReadStream, createWriteStream } from "node:fs";
import DataTide from "data-tide-js";
import type { ProcessStep } from "data-tide-js/types";

// Create a DataTide instance
const dataTide = new DataTide({
  keepOrder: true, // Maintain input order
  failureBehavior: "ignore-row", // Skip failed rows
  concurrency: 4, // Number of worker threads
});

// Define processing steps
const steps: ProcessStep<number, number>[] = [
  {
    name: "double",
    transform: (num: number) => num * 2,
  },
  {
    name: "add-ten",
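    transform: async (num: number) => {
      // Stand-in for real async work. Transforms cannot use imports
      // (see Limitations), so keep the body self-contained.
      await Promise.resolve();
      return num + 10;
    },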
  },
];

// Process array data
const result = await dataTide.process([1, 2, 3, 4, 5], steps);
console.log(result); // [12, 14, 16, 18, 20]

// Or process streams
const inputStream = createReadStream("input.json");
const transformStream = await dataTide.process(inputStream, steps);
transformStream.pipe(createWriteStream("output.json"));

⚙️ Configuration

DataTideOptions

  • keepOrder (boolean, default: false): Maintain the order of processed items
  • failureBehavior ('fail-all' | 'ignore-row' | 'early-return', default: 'fail-all'): How to handle errors
  • concurrency (number, default: CPU cores): Number of worker threads to use
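
As a rough sketch of how these defaults combine with caller-supplied values, the snippet below rebuilds the option types from the list above and merges partial options over the documented defaults. `resolveOptions` is a hypothetical helper, the type declarations are reconstructed rather than copied from the package, and the use of `cpus().length` for "CPU cores" is an assumption.

```typescript
import { cpus } from "node:os";

// Types reconstructed from the option list above; the shipped typings may differ.
type FailureBehavior = "fail-all" | "ignore-row" | "early-return";

interface DataTideOptions {
  keepOrder: boolean;
  failureBehavior: FailureBehavior;
  concurrency: number;
}

// Hypothetical helper: merge caller options over the documented defaults.
function resolveOptions(options: Partial<DataTideOptions> = {}): DataTideOptions {
  return {
    keepOrder: false,
    failureBehavior: "fail-all",
    concurrency: cpus().length, // assumed meaning of "CPU cores"
    ...options,
  };
}

console.log(resolveOptions({ keepOrder: true }));
```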

Error Handling Strategies

  • fail-all: Stop processing and throw an error on the first failure
  • ignore-row: Skip failed items and continue processing
  • early-return: Stop processing on the first failure but return the items processed successfully so far
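
The three strategies can be modelled with a single-threaded loop. This is only a sketch of the semantics described above, not the library's code: `applyStrategy` and `doubleUnlessThree` are hypothetical, and with real worker threads the set of items already processed when a failure occurs may differ.

```typescript
type FailureBehavior = "fail-all" | "ignore-row" | "early-return";

// Hypothetical single-threaded model of the three strategies.
function applyStrategy<T, R>(
  items: T[],
  transform: (item: T) => R,
  behavior: FailureBehavior,
): R[] {
  const results: R[] = [];
  for (const item of items) {
    try {
      results.push(transform(item));
    } catch (err) {
      if (behavior === "fail-all") throw err; // abort and surface the error
      if (behavior === "early-return") break; // stop, keep what succeeded
      // "ignore-row": drop this item and continue with the next one
    }
  }
  return results;
}

const doubleUnlessThree = (n: number): number => {
  if (n === 3) throw new Error("cannot process 3");
  return n * 2;
};

console.log(applyStrategy([1, 2, 3, 4], doubleUnlessThree, "ignore-row")); // [2, 4, 8]
console.log(applyStrategy([1, 2, 3, 4], doubleUnlessThree, "early-return")); // [2, 4]
```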

🔍 API Reference

DataTide

Constructor
constructor(options?: Partial<DataTideOptions>)
Methods
process<T, R>(data: T[] | Readable, steps: ProcessStep<T, R>[]): Promise<R[] | Transform>
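
Because the signature above returns a union, callers written in TypeScript may need to narrow the result before using it. The sketch below shows one way to do that with `Array.isArray`; `describeResult` is a hypothetical helper, and the package's typings may already distinguish the two cases.

```typescript
import { Transform } from "node:stream";

// Hypothetical helper: split the documented union return type into its two cases.
function describeResult<R>(result: R[] | Transform): string {
  if (Array.isArray(result)) {
    // Narrowed to R[]: array input produced an array of results.
    return `array with ${result.length} item(s)`;
  }
  // Narrowed to Transform: stream input produced a stream to pipe onward.
  return "transform stream";
}

console.log(describeResult([12, 14])); // array with 2 item(s)
console.log(describeResult(new Transform())); // transform stream
```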

ProcessStep<T, R>

interface ProcessStep<T = unknown, R = unknown> {
  transform: (data: T) => Promise<R> | R;
  name?: string;
}
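
Since `transform` may return either a value or a promise, a consumer can treat both uniformly with `await`. The following sketch runs one item through a list of same-typed steps in order; `runSteps` is a hypothetical helper for illustration and says nothing about how DataTide schedules steps across workers.

```typescript
interface ProcessStep<T = unknown, R = unknown> {
  transform: (data: T) => Promise<R> | R;
  name?: string;
}

// Hypothetical helper: run one item through same-typed steps in order.
async function runSteps<T>(item: T, steps: ProcessStep<T, T>[]): Promise<T> {
  let current = item;
  for (const step of steps) {
    // `await` accepts both plain values and promises.
    current = await step.transform(current);
  }
  return current;
}

const steps: ProcessStep<number, number>[] = [
  { name: "double", transform: (n) => n * 2 },
  { name: "add-ten", transform: async (n) => n + 10 },
];

runSteps(5, steps).then((value) => console.log(value)); // 20
```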

⚠️ Limitations

  • Transform functions cannot use imports or require statements
  • Node.js globals such as process and require are not allowed in transforms
  • Maximum execution time per step is 30 seconds
  • Worker threads may consume significant memory for large datasets
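
How the 30-second limit is enforced is not documented here. As an assumption-laden sketch, a per-step time limit for asynchronous work can be expressed with `Promise.race`; `withTimeout` and `STEP_TIMEOUT_MS` are hypothetical names. Note that this pattern cannot interrupt a synchronous, CPU-bound transform, which would require terminating the worker instead.

```typescript
// Hypothetical helper: reject if `work` does not settle within `ms` milliseconds.
function withTimeout<R>(work: Promise<R>, ms: number): Promise<R> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`step exceeded ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer so the process can exit.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

const STEP_TIMEOUT_MS = 30_000; // the documented per-step limit

withTimeout(Promise.resolve(42), STEP_TIMEOUT_MS).then((value) => console.log(value)); // 42
```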

🐛 Known Issues

  1. Memory usage may spike with large datasets
  2. Worker creation may fail in restricted environments
  3. Transform function serialization has limitations

🤝 Contributing

Contributions are welcome! Please read our contributing guidelines before submitting pull requests.

📄 License

MIT License - see LICENSE file for details

🐛 Reporting Issues

Please report any issues on our GitHub issue tracker.

Package last updated on 23 Jan 2025