= Grand Central Dispatch for MacRuby
== Introduction
This article explains how to use Grand Central Dispatch (GCD) from MacRuby, and is adapted from {Introducing Blocks and Grand Central Dispatch}[http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html] at the {Apple Developer Connection}[http://developer.apple.com/].
=== About GCD
GCD is a revolutionary approach to multicore computing that is woven throughout the fabric of {Mac OS X}[http://www.apple.com/macosx/] version 10.6 Snow Leopard. GCD combines an easy-to-use programming model with highly-efficient system services to radically simplify the code needed to make best use of multiple processors. The technologies in GCD improve the performance, efficiency, and responsiveness of Snow Leopard out of the box, and will deliver even greater benefits as more developers adopt them.
The central insight of GCD is shifting the responsibility for managing threads and their execution from applications to the operating system. As a result, programmers can write less code to deal with concurrent operations in their applications, and the system can perform more efficiently on single-processor machines, large multiprocessor servers, and everything in between. Without a pervasive approach such as GCD, even the best-written application cannot deliver optimal performance, because it doesn't have full insight into everything else happening in the system.
=== The MacRuby Dispatch module
GCD is natively implemented as a C API and runtime engine. MacRuby 0.5 introduces a new "Dispatch" module, which provides a Ruby wrapper around that API. This allows Ruby blocks to be scheduled on queues for asynchronous and concurrent execution either explicitly or in response to various kinds of events, with GCD automatically mapping queues to threads as needed. The Dispatch module provides four primary abstractions that mirror the C API:
Dispatch::Queue:: The basic unit for organizing blocks. Several queues are created by default, and applications may create additional queues for their own use.
Dispatch::Group:: Allows applications to track the progress of blocks submitted to queues and take action when the blocks complete.
Dispatch::Source:: Monitors and coalesces low-level system events so that they can be responded to asynchronously via simple event handlers.
Dispatch::Semaphore:: Synchronizes threads via a combination of waiting and signalling.
=== What You Need
The examples all assume you run the latest macirb and require the +dispatch+ library:
$ macirb
require 'dispatch'
We also assume that you are already familiar with Ruby, though not necessarily MacRuby. No prior knowledge of C or GCD is assumed or required, but the {dispatch(3) man page}[http://developer.apple.com/mac/library/DOCUMENTATION/Darwin/Reference/ManPages/man3/dispatch.3.html] may be helpful if you wish to better understand the underlying semantics.
== Dispatch::Job: Easy Concurrency
The easiest way to perform concurrent work is via a +Job+ object. Say you have a complex, long-running calculation you want to happen in the background. Create a job by passing Dispatch::Job's initializer the block you want to execute:
job = Dispatch::Job.new { Math.sqrt(10**100) }
This atomically[http://en.wikipedia.org/wiki/Atomic_operation] adds the block to GCD's default concurrent queue, then returns immediately so you don't stall the main thread.
Concurrent queues schedule as many simultaneous blocks as they can on a first-in/first-out (FIFO[http://en.wikipedia.org/wiki/FIFO]) basis, as long as there are threads available. If there are spare CPUs, the system will automatically create more threads -- and reclaim them when idle -- allowing GCD to dynamically scale the number of threads based on the overall system load. Thus (unlike with threads, which choke when you create too many) you can generally create as many jobs as you want, and GCD will do the right thing.
=== Job#value: Asynchronous Return Values
The downside of asynchrony is that you don't know exactly when your job will execute. Fortunately, +Dispatch::Job+ attempts to duck-type +Thread[http://ruby-doc.org/core/classes/Thread.html]+, so you can call +value[http://ruby-doc.org/core/classes/Thread.html#M000460]+ to obtain the result of executing that block:
@result = job.value
puts "value (sync): #{@result} => 1.0e+50"
This will wait until the value has been calculated, allowing it to be used as an {explicit Future}[http://en.wikipedia.org/wiki/Futures_and_promises]. However, this may stall the main thread indefinitely, which reduces the benefits of concurrency.
Wherever possible, you should instead attempt to figure out exactly when and why you need to know the result of asynchronous work. Then, call +value+ with a block to also perform that work asynchronously once the value has been calculated -- all without blocking the main thread:
job.value {|v| puts "value (async): #{v} => 1.0e+50" } # (eventually)
=== Job#join: Job Completion
If you just want to track completion, you can call +join[http://ruby-doc.org/core/classes/Thread.html#M000462]+, which waits without returning the result:
job.join
puts "join done (sync)"
Similarly, call +join+ with a block to run asynchronously once the work has been completed
job.join { puts "join done (async)" }
=== Job#add: Coordinating Work
More commonly, you will have multiple units of work you'd like to perform in parallel. You can add blocks to an existing job using +add:
job.add { Math.sqrt(2**64) }
If there are multiple blocks in a job, +value+ will wait until they all finish then return the last value received:
job.value {|b| puts "value (async): #{b} => 4294967296.0" }
=== Job#values: Returning All Values
Note that values may be received out of order, since they may take differing amounts of time to complete. If you need to force a particular ordering, create a new +Job+ or call +join+ before submitting the block.
Additionally, you can call +values+ to obtain all the values:
@values = job.values
puts "values: #{@values.inspect} => [1.0E50]"
job.join
puts "values: #{@values.inspect} => [1.0E50, 4294967296.0]"
Note that unlike +value+ this will not by itself first +join+ the job, and thus does not have an asynchronous equivalent.
== Dispatch::Proxy: Protecting Shared Data
Concurrency would be easy if everything was {embarrassingly parallel}[http://en.wikipedia.org/wiki/Embarrassingly_parallel], but it becomes tricky when we need to share data between threads. If two threads try to modify the same object at the same time, it could lead to inconsistent (read: corrupt) data. There are well-known techniques for preventing this sort of data corruption (e.g. locks[http://en.wikipedia.org/wiki/Lock_(computer_science)] and mutexes[http://en.wikipedia.org/wiki/Mutual%20eclusion]), but these have their own well-known problems (e.g., deadlock[http://en.wikipedia.org/wiki/Deadlock], and {priority inversion}[http://en.wikipedia.org/wiki/Priority_inversion]).
Because Ruby traditionally had a global VM lock (or GIL[http://en.wikipedia.org/wiki/Global_Interpreter_Lock]), only one thread could modify data at a time, so developers never had to worry about these issues; then again, this also meant they didn't get much benefit from additional threads.
In MacRuby, every thread has its own Virtual Machine, which means all of them can access Ruby objects at the same time -- great for concurrency, not so great for data integrity. Fortunately, GCD provides serial queues for {lock-free synchronization}[http://en.wikipedia.org/wiki/Non-blocking_synchronization], by ensuring that only one thread a time accesses a particular object -- without the complexity and inefficiency of locking. Here we will focus on +Dispatch::Proxy+, a high-level construct that implements the {Actor model}[http://en.wikipedia.org/wiki/Actor_model] by wrapping any arbitrary Ruby object with a +SimpleDelegate+ that only allows execution of one method at a time (i.e., serializes data access on to a private queue).
=== Job#synchronize: Creating Proxies
The easiest way to create a Proxy is to first create an empty Job:
job = Dispatch::Job.new {}
then ask it to wrap the object you want to modify from multiple threads:
@hash = job.synchronize Hash.new
puts "synchronize: #{@hash.class} => Dispatch::Proxy"
This is actually the same type of object used to manage the list of +values+:
puts "values: #{job.values.class} => Dispatch::Proxy"
=== Proxy#method_missing: Using Proxies
The Proxy object can be called just as it if were the delegate object:
@hash[:foo] = :bar
puts "proxy: #{@hash} => {:foo=>:bar}"
@hash.delete :foo
Except that you can use it safely inside Dispatch blocks from multiple threads:
[64, 100].each do |n|
job.add { @hash[n] = Math.sqrt(10**n) }
end
job.join
puts "proxy: #{@hash} => {64 => 1.0E32, 100 => 1.0E50}"
In this case, each block will perform the +sqrt+ asynchronously on the concurrent queue, potentially on multiple threads
As with Dispatch::Job, you can make any invocation asynchronous by passing a block:
@hash.inspect { |s| puts "inspect: #{s} => {64 => 1.0E32, 100 => 1.0E50}" }
=== Proxy#__value__: Returning Delegate
If for any reason you need to retrieve the original (unproxied) object, simply call +value+:
delegate = @hash.__value__
puts "\n__value__: #{delegate.class} => Hash"
This differs from +SimpleDelegate#getobj+ (which Dispatch::Proxy inherits) in that it will first wait until any pending asynchronous blocks have executed.
As elsewhere in Ruby, the "__" namespace implies "internal" methods, in this case meaning they are called directly on the proxy rather than passed to the delegate.
==== Caveat: Local Variables
Because Dispatch blocks may execute after the local context has gone away, you should always store Proxy objects in a non-local variable: instance, class, or global -- anything with a sigil[http://en.wikipedia.org/wiki/Sigil_(computer_programming)].
Note that we can as usual access local variables from inside the block; GCD automatically copies them, which is why this works as expected:
n = 42
job = Dispatch::Job.new { puts "n (during): #{n} => 42" }
job.join
but this doesn't:
n = 0
job = Dispatch::Job.new { n = 21 }
job.join
puts "n (after): #{n} => 0?!?"
The general rule is "do not assign to external variables inside a Dispatch block." Assigning local variables will have no effect (outside that block), and assigning other variables may replace your Proxy object with a non-Proxy version. Remember also that Ruby treats the accumulation operations ("+=", "||=", etc.) as syntactic sugar over assignment, and thus those operations only affect the copy of the variable:
n = 0
job = Dispatch::Job.new { n += 84 }
job.join
puts "n (+=): #{n} => 0?!?"
== Dispatch Enumerable: Parallel Iterations
Jobs are useful when you want to run a single item in the background or to run many different operations at once. But if you want to run the same operation multiple times, you can take advantage of specialized GCD iterators. The Dispatch module defines "p_" variants of common Ruby iterators, making it trivial to parallelize existing operations.
In addition, for simplicity they all are synchronous, meaning they won't return until all the work has completed.
=== Integer#p_times
The simplest iteration is defined on the +Integer+ class, and passes the index that many +times+:
5.times { |i| print "#{10**i}\t" }
puts "times"
becomes
5.p_times { |i| print "#{10**i}\t" }
puts "p_times"
Note that even though the iterator as a whole is synchronous, and blocks are scheduled in the order received, each block runs independently and therefore may complete out of order.
This does add some overhead compared to the non-parallel version, so if you have a large number of relatively cheap iterations you can batch them together by specifying a +stride+:
5.p_times(3) { |i| print "#{10**i}\t" }
puts "p_times(3)"
It doesn't change the result, but schedules fewer blocks thus amortizing the overhead over more work. Note that items within a stride are executed completely in the original order, but no order is guaranteed between strides.
The +p_times+ method is used to implement several convenience methods on +Enumerable+, which are therefore available from any class which mixes that in (e.g, +Array+, +Hash+, etc.). These also can take an optional stride.
=== Enumerable#p_each
Passes each object, like +each+:
DAYS=%w(Mon Tue Wed Thu Fri)
DAYS.each { |day| print "#{day}\t"}
puts "each"
DAYS.p_each { |day| print "#{day}\t"}
puts "p_each"
DAYS.p_each(3) { |day| print "#{day}\t"}
puts "p_each(3)"
=== Enumerable#p_each_with_index
Passes each object and its index, like +each_with_index+:
DAYS.each_with_index { |day, i | print "#{i}:#{day}\t"}
puts "each_with_index"
DAYS.p_each_with_index { |day, i | print "#{i}:#{day}\t"}
puts "p_each_with_index"
DAYS.p_each_with_index(3) { |day, i | print "#{i}:#{day}\t"}
puts "p_each_with_index(3)"
=== Enumerable#p_map
Passes each object and collects the transformed values, like +map+:
print (0..4).map { |i| "#{10**i}\t" }.join
puts "map"
print (0..4).p_map { |i| "#{10**i}\t" }.join
puts "p_map"
print (0..4).p_map(3) { |i| "#{10**i}\t" }.join
puts "p_map(3)"
=== Enumerable#p_mapreduce
Unlike the others, this method does not have a serial equivalent, but you may recognize it from the world of {distributed computing}[http://en.wikipedia.org/wiki/MapReduce]:
mr = (0..4).p_mapreduce(0) { |i| 10**i }
puts "p_mapreduce: #{mr} => 11111"
This uses a parallel +inject+ (formerly known as +reduce+) to return a single value by combining the result of +map+. Unlike +inject+, you must specify an explicit initial value as the first parameter. The default accumulator is ":+", but you can specify a different symbol to +send+:
mr = (0..4).p_mapreduce([], :concat) { |i| [10**i] }
puts "p_mapreduce(:concat): #{mr} => [1, 1000, 10, 100, 10000]"
Because of those parameters, the optional +stride+ is now the third:
mr = (0..4).p_mapreduce([], :concat, 3) { |i| [10**i] }
puts "p_mapreduce(3): #{mr} => [1000, 10000, 1, 10, 100]"
=== Enumerable#p_find_all
Passes each object and collects those for which the block is true, like +find_all+:
puts "find_all | p_find_all | p_find_all(3)"
puts (0..4).find_all { |i| i.odd? }.inspect
puts (0..4).p_find_all { |i| i.odd? }.inspect
puts (0..4).p_find_all(3) { |i| i.odd? }.inspect
=== Enumerable#p_find
Passes each object and returns nil if none match. Similar to +find+, it returns the first object it finds for which the block is true, but unlike +find+ that may not be the actual first object since blocks -- say it with me -- "may complete out of order":
puts "find | p_find | p_find(3)"
puts (0..4).find { |i| i == 5 }.nil? # => nil
puts (0..4).p_find { |i| i == 5 }.nil? # => nil
puts (0..4).p_find(3) { |i| i == 5 }.nil? # => nil
puts "#{(0..4).find { |i| i.odd? }} => 1"
puts "#{(0..4).p_find { |i| i.odd? }} => 1?"
puts "#{(0..4).p_find(3) { |i| i.odd? }} => 3?"
== Queues: Serialization
Most of the time, you can simply use GCD's default concurrent queues or the built-in queues associated with synchronized objects. However, if you want more precise control you can create and use your own queues.
=== Queue::new
The simplest way to create a +new+ queue is by passing it a meaningful name, typically using reverse-DNS naming:
puts
puts q = Dispatch::Queue.new("org.macruby.queue.example")
=== Queue#sync
You can schedule blocks directly on a queue synchronously using +sync+:
q.sync { puts "queue sync" }
=== Queue#async
However, it is usually more useful to schedule them asynchronously using +async+:
q.async { puts "queue async" }
=== Queue#join
A key advantage of scheduling blocks on your own private queue is that you can ensure that all pending blocks have been executed, via a +join+:
puts "queue join"
q.join
== Semaphores: Synchronization
Semaphores provide a powerful mechanism for communicating information across multiple queues. They are another low-level mechanism you can use for synchronizing work.
=== Semaphore::new
First, create a semaphore using +new+:
puts
puts semaphore = Dispatch::Semaphore.new(0)
Semaphores can be used to manage complex interactions, but here will simply use to them signal completion of a single task by passing a +count+ of zero.
=== Semaphore#signal
Next, schedule an asynchronous block that will +signal+ when it is done:
q.async {
puts "semaphore signal"
semaphore.signal
}
=== Semaphore#wait
Finally, +wait+ for that signal to arrive
puts "semaphore wait"
semaphore.wait
== Sources: Asynchronous Events
In addition to scheduling blocks directly, GCD makes it easy to run a block in response to various system events via a Dispatch::Source, which can be a:
- Timer
- Custom event
- Signal
- File descriptor (file or socket)
- Process state change
When the source "fires", GCD will schedule the handler block on the specific queue if it is not currently running, or -- more importantly -- coalesce pending events if it is. This provides excellent responsiveness without the expense of either polling or binding a thread to the event source. Plus, since the handler is never run more than once at a time, the block doesn't even need to be reentrant -- and thus you don't need to +synchronize+ any variables that are only used there.
=== Source.periodic
We'll start with a simple example: a +periodic+ timer that runs every 0.4 seconds and prints out the number of pending events:
puts
timer = Dispatch::Source.periodic(0.4) do |src|
puts "Dispatch::Source.periodic: #{src.data}"
end
sleep 1 # => 1 1 ...
If you're familiar with the C API for GCD, be aware that a +Dispatch::Source+ is fully configured at the time of instantiation, and does not need to be +resume+d. Also, times are in seconds, not nanoseconds.
=== Source#data
As you can see above, the handle gets called with the source itself as a parameter, which allows you query it for the source's +data+. The meaning of the data varies with the type of +Source+, though it is always an integer. Most commonly -- as in this case -- it is a count of the number of events being processed, and thus "1".
=== Source#suspend!
This monotony rapidly gets annoying; to pause, just +suspend!+ the source:
timer.suspend!
puts "suspend!"
sleep 1
You can suspend a source at any time to prevent it from running another block, though this will not affect a block that is already being processed.
=== Source#resume!
If you change your mind, you can always +resume!+ the source:
timer.resume!
puts "resume!"
sleep 1 # => 1 2 1 ...
If the +Source+ has fired one or more times, it will schedule a block containing the coalesced events. In this case, we were suspended for over 2 intervals, so the next block will fire with +data+ being at least 2.
=== Source#cancel!
Finally, you can stop the source entirely by calling +cancel!+:
timer.cancel!
puts "cancel!"
puts
Cancellation is particularly significant in MacRuby's implementation of GCD, since (due to the reliance on garbage collection) there is no other way to explicitly stop using a source.
=== Custom Sources
Next up are custom or application-specific sources, which are fired explicitly by the developer instead of in response to an external event. These simple behaviors are the primitives upon which other sources are built.
==== Source.add
The +add+ source accumulates the sum of the event data (e.g., for a counter) in a thread-safe manner:
@sum = 0
adder = Dispatch::Source.add do |s|
puts "Dispatch::Source.add: #{s.data} (#{@sum += s.data})"
semaphore.signal
end
Note that we use an instance variable (since it is re-assigned), but we don't have to +synchronize+ it -- and can safely re-assign it -- since the event handler does not need to be reentrant.
==== Source#<<
To fire a custom source, we invoke what GCD calls a merge using the shovel operator ('+<<+'):
adder << 1
semaphore.wait
puts "sum: #{@sum} => 1"
Note the use of +Semaphore#wait+ to ensure the asynchronously-scheduled event handler has been run.
The name "merge" makes more sense when you see it coalesce multiple firings into a single handler:
adder.suspend!
adder << 3
adder << 5
puts "sum: #{@sum} => 1"
adder.resume!
semaphore.wait
puts "sum: #{@sum} => 9"
adder.cancel!
Since the source is suspended -- mimicking what would happen if your event handler was busy at the time -- GCD automatically merges the results together using addition. This is useful for tracking cumulative results across multiple threads, e.g. for a progress meter. Notice this is the event coalescing behavior used by +periodic+.
==== Source.or
Similarly, the +or+ source combines events using a logical OR (e.g, for booleans or bitmasks):
@mask = 0
masker = Dispatch::Source.or do |s|
@mask |= s.data
puts "Dispatch::Source.or: #{s.data.to_s(2)} (#{@mask.to_s(2)})"
semaphore.signal
end
masker << 0b0001
semaphore.wait
puts "mask: #{@mask.to_s(2)} => 1"
masker.suspend!
masker << 0b0011
masker << 0b1010
puts "mask: #{@mask.to_s(2)} => 1"
masker.resume!
semaphore.wait
puts "mask: #{@mask.to_s(2)} => 1011"
masker.cancel!
puts
This is primarily useful for flagging what kinds of events have taken place since the last time the handler fired.
=== Process Sources
Let's see how both of use are used for sources which deal with UNIX processes.
==== Source.process
This +or+-style source takes and returns a mask of different events affecting the specified +process+:
exec:: Dispatch::Source.PROC_EXEC
exit:: Dispatch::Source.PROC_EXIT
fork:: Dispatch::Source.PROC_FORK
signal:: Dispatch::Source.PROC_SIGNAL
[NOTE: +Thread#fork+ is not supported by MacRuby]
The underlying API expects and returns integers, e.g.:
@event = 0
mask = Dispatch::Source::PROC_EXIT | Dispatch::Source::PROC_SIGNAL
proc_src = Dispatch::Source.process($$, mask) do |s|
@event |= s.data
puts "Dispatch::Source.process: #{s.data.to_s(2)} (#{@event.to_s(2)})"
semaphore.signal
end
In this case, we are watching the current process ('$$') for +:signal+ and (less usefully :-) +:exit+ events .
==== Source#data2events
Alternatively, you can pass in array of names (symbols or strings) for the mask, and optionally use +data2events+ to convert the returned data into an array of symbols
semaphore2 = Dispatch::Semaphore.new(0)
@events = []
mask2 = [:exit, :fork, :exec, :signal]
proc_src2 = Dispatch::Source.process($$, mask2) do |s|
these = Dispatch::Source.data2events(s.data)
@events += these
puts "Dispatch::Source.process: #{these} (#{@events})"
semaphore2.signal
end
==== Source.process Example
To fire the event, we can, e.g., send a un-trapped signal :
sig_usr1 = Signal.list["USR1"]
Signal.trap(sig_usr1, "IGNORE")
Process.kill(sig_usr1, $$)
Signal.trap(sig_usr1, "DEFAULT")
You can check which flags were set by _and_ing against the bitmask:
semaphore.wait
result = @event & mask
print "@event: #{result.to_s(2)} =>"
puts " #{Dispatch::Source::PROC_SIGNAL.to_s(2)} (Dispatch::Source::PROC_SIGNAL)"
proc_src.cancel!
Or equivalently, intersecting the array:
semaphore2.wait
puts "@events: #{(result2 = @events & mask2)} => [:signal]"
proc_src2.cancel!
==== Source#event2num
You can convert from symbol to int via +event2num+:
puts "event2num: #{Dispatch::Source.event2num(result2[0]).to_s(2)} => #{result.to_s(2)}"
==== Source#data2events
Similarly, use +data2events+ to turn an int into a symbol:
puts "data2events: #{Dispatch::Source.data2events(result)} => #{result2}"
==== Source.signal
This +Source+ overlaps slightly with the previous one, but uses +add+ to track the number of times that a specific +signal+ was fired against the current process:
@signals = 0
sig_usr2 = Signal.list["USR2"]
signal = Dispatch::Source.signal(sig_usr2) do |s|
puts "Dispatch::Source.signal: #{s.data} (#{@signals += s.data})"
semaphore.signal
end
puts "signals: #{@signals} => 0"
signal.suspend!
Signal.trap(sig_usr2, "IGNORE")
3.times { Process.kill(sig_usr2, $$) }
Signal.trap(sig_usr2, "DEFAULT")
signal.resume!
semaphore.wait
puts "signals: #{@signals} => 3"
signal.cancel!
puts
=== File Sources
Next up are sources which deal with file operations -- actually, anything that modifies a vnode, including sockets and pipes.
==== Source.file
This +or+-style source takes and returns a mask of different events affecting the specified +file+:
delete:: Dispatch::Source.VNODE_DELETE
write:: Dispatch::Source.VNODE_WRITE
extend:: Dispatch::Source.VNODE_EXTEND
attrib:: Dispatch::Source.VNODE_ATTRIB
link:: Dispatch::Source.VNODE_LINK
rename:: Dispatch::Source.VNODE_RENAME
revoke:: Dispatch::Source.VNODE_REVOKE
As before, the underlying API expects and returns integers, e.g.:
@fevent = 0
@msg = "#{$$}-#{Time.now.to_s.gsub(' ','_')}"
puts "msg: #{@msg}"
filename = "/tmp/dispatch-#{@msg}"
puts "filename: #{filename}"
file = File.open(filename, "w")
fmask = Dispatch::Source::VNODE_DELETE | Dispatch::Source::VNODE_WRITE
file_src = Dispatch::Source.file(file.fileno, fmask, q) do |s|
@fevent |= s.data
puts "Dispatch::Source.file: #{s.data.to_s(2)} (#{@fevent.to_s(2)})"
semaphore.signal
end
file.print @msg
file.flush
file.close
semaphore.wait(0.1)
print "fevent: #{(@fevent & fmask).to_s(2)} =>"
puts " #{Dispatch::Source::VNODE_WRITE.to_s(2)} (Dispatch::Source::VNODE_WRITE)"
File.delete(filename)
semaphore.wait(0.1)
print "fevent: #{@fevent.to_s(2)} => #{fmask.to_s(2)}"
puts " (Dispatch::Source::VNODE_DELETE | Dispatch::Source::VNODE_WRITE)"
file_src.cancel!
q.join
And of course can also use symbols:
@fevent2 = []
file = File.open(filename, "w")
fmask2 = %w(delete write)
file_src2 = Dispatch::Source.file(file, fmask2) do |s|
@fevent2 += Dispatch::Source.data2events(s.data)
puts "Dispatch::Source.file: #{Dispatch::Source.data2events(s.data)} (#{@fevent2})"
semaphore2.signal
end
file.print @msg
file.flush
semaphore2.wait(0.1)
puts "fevent2: #{@fevent2} => [:write]"
file_src2.cancel!
As a bonus, if you pass in an actual IO object (not just a file descriptor) the Dispatch library will automatically create a handler that closes the file for you when cancelled!
==== Source.read
In contrast to the previous sources, these next two refer to internal state rather than external events. Specifically, this +add+-style source avoids blocking on a +read+ by only calling the handler when it estimates there are +s.data+ unread bytes available in the buffer:
file = File.open(filename, "r")
@input = ""
reader = Dispatch::Source.read(file) do |s|
@input << file.read(s.data)
puts "Dispatch::Source.read: #{s.data}: #{@input}"
end
while (@input.size < @msg.size) do; end
puts "input: #{@input} => #{@msg}" # => e.g., 74323-2010-07-07_15:23:10_-0700
reader.cancel!
Strictly speaking, the count returned by +s.data+ is only an estimate. It would be safer to instead call +@file.read(1)+ each time to avoid any risk of blocking -- but that would lead to many more block invocations, which might not be a net win.
Note that since the block handler may be called many times, we can't wait on a semaphore, but instead test on the shared variable. In a real implementation you should detect end of file instead.
==== Source.write
This +add+-style event is similar to the above, but waits until a +write+ buffer is available:
file = File.open(filename, "w")
@next_char = 0
writer = Dispatch::Source.write(file) do |s|
if @next_char < @msg.size then
char = @msg[@next_char]
file.write(char)
@next_char += 1
puts "Dispatch::Source.write: #{char}|#{@msg[@next_char..-1]}"
end
end
while (@next_char < @msg.size) do; end
puts "output: #{File.read(filename)} => #{@msg}" # e.g., 74323-2010-07-07_15:23:10_-0700
File.delete(filename)
In this case we play it safe by only writing out a single character each time we are called, to avoid risk of blocking (and simplify our algorithm).
= What's Next?
This concludes our introduction to the high-level wrappers available when using +require 'dispatch'+. These allow you to easily write concurrent, asynchronous code using simple Ruby idioms. For additional performance and fine-grained control, you may want to dive down into directly using Queues, Groups, and Semaphores.
At the moment, this is best done by reading the {existing C documentation for GCD}[http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html] and comparing to the relevant RubyDoc.
$ macri Dispatch
$ macri Dispatch::Queue
$ macri Dispatch::Group
$ macri Dispatch::Semaphore
However, feel free to {file bugs}[https://www.macruby.org/auth/login/?next=/trac/newticket] about additional documentation you would like to see.
= Development
Gemified by {Arthur Gunn}[mailto:arthur@gunn.co.nz].
To contribute - please fork, commit, pull request, bonus points for tests. Have a play and report bugs, request features via the issue tracker.