Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More →

jruby_parallel_processing

Package Overview

Dependencies

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

jruby_parallel_processing

0.1.0
Rubygems

Version published: 3 months ago

Maintainers: 1

Created: 3 months ago

Source

JRubyParallelProcessing

Overview

JRubyParallelProcessing is a gem designed for parallel processing tasks in JRuby. It offers efficient data processing and API request handling using multiple threads, along with a task queue system for managing and executing tasks with priority and retries.

Installation

Add this gem to your Gemfile:

gem 'jruby_parallel_processing'

Usage

DataProcessor

DataProcessor is a class designed for parallel data processing. It allows you to split data into chunks and process them across multiple threads, which significantly speeds up task execution. Additionally, it supports middleware hooks for customizing the behavior before and after processing.

Basic Example

require 'jruby_parallel_processing'
data = (1..100).to_a
processor = JRubyParallelProcessing::DataProcessor.new(data_array: data, in_threads: 4)
processor.process do |item|
# Your processing logic here
puts item
end

Features

Efficient Parallel Processing: Process data using multiple threads.
Chunk Management: Automatically splits data into manageable chunks.
Stream Support: Handles data processing from streams (e.g., IO, StringIO).
Middleware: Supports before and after processing middleware for custom logic injection.

Configuration

data_array: Array of data to be processed.
stream: Stream object (e.g., IO, StringIO) for processing data line-by-line.
in_threads: Number of threads for parallel processing (default is 4).
chunk_size: Size of chunks for processing.
logger: Logger object for custom logging.
queue_size: Size of the queue for stream processing (default is 100).
timeout: Timeout for stream processing (default is 10 seconds).

Example Usage

Middleware Example

processor = JRubyParallelProcessing::DataProcessor.new(data_array: data, in_threads: 4)
# Add before processing middleware
processor.add_middleware(:before_process) do
  puts "Starting processing..."
end
# Add after processing middleware
processor.add_middleware(:after_process) do
  puts "Finished processing."
end
processor.process do |item|
  puts item
end

Data Processing from Stream:

require 'stringio'
require 'jruby_parallel_processing'
stream = StringIO.new("line 1\nline 2\nline 3\n")
processor = JRubyParallelProcessing::DataProcessor.new(stream: stream, in_threads: 2)
processor.process do |line|
puts line
end

TaskQueue

TaskQueue is a class for managing and executing tasks with priority, retries, and custom configurations. It allows for efficient task scheduling and error handling.

Basic Example

require 'jruby_parallel_processing'
task_queue = JRubyParallelProcessing::TaskQueue.new(max_retries: 3, retry_delay: 0.1, max_queue_size: 10)
task_queue.add_task(1) do
# Your task logic here
puts "Task executed"
end
task_queue.process_tasks

Features

Adds tasks to the queue with priority.
Retries failed tasks with configurable retry count and delay.
Handles task execution using a fixed thread pool.

Configuration

max_retries: Maximum number of retries for failed tasks (default is 3).
retry_delay: Delay between retries in seconds (default is 0.1).
max_queue_size: Maximum number of tasks in the queue (default is 10).
logger: Logger object for custom logging.

ApiRequestProcessor

ApiRequestProcessor handles parallel API requests with built-in error handling and retries.

Basic Example

urls = ["https://api.example.com/data", "https://api.example.com/other"]
processor = JRubyParallelProcessing::ApiRequestProcessor.new(urls, in_threads: 4)
results = processor.process(http_method: :get) do |response|
puts "Received response: #{response.body}"
end

Features

Parallel API requests with configurable HTTP methods, timeouts, and retries.
Supports custom headers and parameters for requests.
Automatically parses responses in various formats.

Configuration

urls: Array of URLs for API requests.
in_threads: Number of threads for parallel API requests (default is 4).
timeout: Timeout for API requests (default is 5 seconds).
retries: Number of retries for failed requests (default is 3).
logger: Logger object for custom logging.
http_method: HTTP method to use (default is GET).
headers: Custom headers for requests.
params: Parameters for POST/PUT requests.

DistributedWorker

DistributedWorker is a class for managing distributed tasks across multiple worker nodes. It enables task distribution, prioritization, and ensures that workers remain active through a heartbeat mechanism.

Basic Example

require 'jruby_parallel_processing'
# Initialize a DistributedWorker instance
worker = JRubyParallelProcessing::DistributedWorker.new("localhost", 8787)
# Queue a task with a given priority
result = worker.execute_task(-> { puts "Task executed" }, priority: 5)
puts result[:status]  # :queued
# To connect to an existing worker
worker_url = "druby://localhost:8787"
remote_worker = JRubyParallelProcessing::DistributedWorker.connect_to_worker(worker_url)
# Send a heartbeat to the worker
remote_worker.send_heartbeat

Features

Distributed Task Execution: Distribute tasks to different worker nodes.
Task Prioritization: Tasks are queued with priority, ensuring high-priority tasks are processed first.
Heartbeat Mechanism: Ensures worker nodes remain active and can report their status.
Fault-Tolerant Task Queue: Handles errors during task queueing and processing.

Configuration Options

host: Host address for the DistributedWorker service (default is localhost).
port: Port for the DistributedWorker service (default is 8787).
priority: Priority of the task being queued (default is 0).

License

This gem is licensed under the MIT License.

FAQs

What is jruby_parallel_processing?

Is jruby_parallel_processing well maintained?

Package last updated on 06 Sep 2024

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

jruby_parallel_processing

JRubyParallelProcessing

Table of Contents

Overview

Installation

Usage

DataProcessor

Basic Example

Features

Configuration

Example Usage

Middleware Example

Data Processing from Stream:

TaskQueue

Basic Example

Features

Configuration

ApiRequestProcessor

Basic Example

Features

Configuration

DistributedWorker

Basic Example

Features

Configuration Options

License

Related posts

jruby_parallel_processing

JRubyParallelProcessing

Table of Contents

Overview

Installation

Usage

DataProcessor

Basic Example

Features

Configuration

Example Usage

Middleware Example

Data Processing from Stream:

TaskQueue

Basic Example

Features

Configuration

ApiRequestProcessor

Basic Example

Features

Configuration

DistributedWorker

Basic Example

Features

Configuration Options

License

Related posts

Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm

Malicious npm Package Typosquats Popular TypeScript ESLint Plugin, Exfiltrates Data and Enables Remote Exploitation