🚀 Big News: Socket Acquires Coana to Bring Reachability Analysis to Every Appsec Team.Learn more
Socket
Book a DemoInstallSign in
Socket

jruby_parallel_processing

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

jruby_parallel_processing

0.1.0
Rubygems
Version published
Maintainers
1
Created
Source

JRubyParallelProcessing

Table of Contents

Overview

JRubyParallelProcessing is a gem designed for parallel processing tasks in JRuby. It offers efficient data processing and API request handling using multiple threads, along with a task queue system for managing and executing tasks with priority and retries.

Installation

Add this gem to your Gemfile:

gem 'jruby_parallel_processing'

Usage

DataProcessor

DataProcessor is a class designed for parallel data processing. It allows you to split data into chunks and process them across multiple threads, which significantly speeds up task execution. Additionally, it supports middleware hooks for customizing the behavior before and after processing.

Basic Example

require 'jruby_parallel_processing'
data = (1..100).to_a
processor = JRubyParallelProcessing::DataProcessor.new(data_array: data, in_threads: 4)
processor.process do |item|
# Your processing logic here
puts item
end

Features

  • Efficient Parallel Processing: Process data using multiple threads.
  • Chunk Management: Automatically splits data into manageable chunks.
  • Stream Support: Handles data processing from streams (e.g., IO, StringIO).
  • Middleware: Supports before and after processing middleware for custom logic injection.

Configuration

  • data_array: Array of data to be processed.
  • stream: Stream object (e.g., IO, StringIO) for processing data line-by-line.
  • in_threads: Number of threads for parallel processing (default is 4).
  • chunk_size: Size of chunks for processing.
  • logger: Logger object for custom logging.
  • queue_size: Size of the queue for stream processing (default is 100).
  • timeout: Timeout for stream processing (default is 10 seconds).

Example Usage

Middleware Example

processor = JRubyParallelProcessing::DataProcessor.new(data_array: data, in_threads: 4)
# Add before processing middleware
processor.add_middleware(:before_process) do
  puts "Starting processing..."
end
# Add after processing middleware
processor.add_middleware(:after_process) do
  puts "Finished processing."
end
processor.process do |item|
  puts item
end

Data Processing from Stream:

require 'stringio'
require 'jruby_parallel_processing'
stream = StringIO.new("line 1\nline 2\nline 3\n")
processor = JRubyParallelProcessing::DataProcessor.new(stream: stream, in_threads: 2)
processor.process do |line|
puts line
end

TaskQueue

TaskQueue is a class for managing and executing tasks with priority, retries, and custom configurations. It allows for efficient task scheduling and error handling.

Basic Example

require 'jruby_parallel_processing'
task_queue = JRubyParallelProcessing::TaskQueue.new(max_retries: 3, retry_delay: 0.1, max_queue_size: 10)
task_queue.add_task(1) do
# Your task logic here
puts "Task executed"
end
task_queue.process_tasks

Features

  • Adds tasks to the queue with priority.
  • Retries failed tasks with configurable retry count and delay.
  • Handles task execution using a fixed thread pool.

Configuration

  • max_retries: Maximum number of retries for failed tasks (default is 3).
  • retry_delay: Delay between retries in seconds (default is 0.1).
  • max_queue_size: Maximum number of tasks in the queue (default is 10).
  • logger: Logger object for custom logging.

ApiRequestProcessor

ApiRequestProcessor handles parallel API requests with built-in error handling and retries.

Basic Example

urls = ["https://api.example.com/data", "https://api.example.com/other"]
processor = JRubyParallelProcessing::ApiRequestProcessor.new(urls, in_threads: 4)
results = processor.process(http_method: :get) do |response|
puts "Received response: #{response.body}"
end

Features

  • Parallel API requests with configurable HTTP methods, timeouts, and retries.
  • Supports custom headers and parameters for requests.
  • Automatically parses responses in various formats.

Configuration

  • urls: Array of URLs for API requests.
  • in_threads: Number of threads for parallel API requests (default is 4).
  • timeout: Timeout for API requests (default is 5 seconds).
  • retries: Number of retries for failed requests (default is 3).
  • logger: Logger object for custom logging.
  • http_method: HTTP method to use (default is GET).
  • headers: Custom headers for requests.
  • params: Parameters for POST/PUT requests.

DistributedWorker

DistributedWorker is a class for managing distributed tasks across multiple worker nodes. It enables task distribution, prioritization, and ensures that workers remain active through a heartbeat mechanism.

Basic Example

require 'jruby_parallel_processing'
# Initialize a DistributedWorker instance
worker = JRubyParallelProcessing::DistributedWorker.new("localhost", 8787)
# Queue a task with a given priority
result = worker.execute_task(-> { puts "Task executed" }, priority: 5)
puts result[:status]  # :queued
# To connect to an existing worker
worker_url = "druby://localhost:8787"
remote_worker = JRubyParallelProcessing::DistributedWorker.connect_to_worker(worker_url)
# Send a heartbeat to the worker
remote_worker.send_heartbeat

Features

  • Distributed Task Execution: Distribute tasks to different worker nodes.
  • Task Prioritization: Tasks are queued with priority, ensuring high-priority tasks are processed first.
  • Heartbeat Mechanism: Ensures worker nodes remain active and can report their status.
  • Fault-Tolerant Task Queue: Handles errors during task queueing and processing.

Configuration Options

  • host: Host address for the DistributedWorker service (default is localhost).
  • port: Port for the DistributedWorker service (default is 8787).
  • priority: Priority of the task being queued (default is 0).

License

This gem is licensed under the MIT License.

FAQs

Package last updated on 06 Sep 2024

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts