I am currently working on a project developing a JRuby server process. The server is in constant communication with multiple front-end clients and back-end processes. The development team decided to make use of the “Celluloid”:http://celluloid.io/ Actor library so we could use concurrent objects as a way to deal with the complexity that comes with developing a multithreaded program.
Heavily inspired by some “Rich Hickey talks”:http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey and our experiments with Clojure, we have been making a concerted effort to use immutable data structures and to keep the application state from being spread out across many objects. To this end, we have been using “Hamster”:https://github.com/harukizaemon/hamster for its immutable data structures.
But how to manage the immutable application state? Again, turning to Clojure for inspiration, we have been experimenting with using a Celluloid Actor as a kind of poor man’s “Ref”:http://clojure.org/refs:
bq. … transactional references (Refs) ensure safe shared use of mutable storage locations via a software transactional memory (STM) system. Refs are bound to a single storage location for their lifetime, and only allow mutation of that location to occur within a transaction.
We aren’t trying to write an STM system, but using a Celluloid Actor to wrap an immutable data structure can cheaply provide a way to avoid concurrent modification without having to write any locking code yourself.
The implementation we have been using is an Actor with two methods, @current@ and @update@. The @current@ method just returns the current value to the caller. It’s immutable, so no worries about someone modifying the application state unexpectedly or concurrently.
The @update@ method is the transaction. It yields the current value to a block and _commits_ the new value that is returned from the block. The @update@ method is marked as @exclusive@ to prevent the possibility of multiple transactions occurring simultaneously. (Supporting simultaneous transactions is certainly something that could be done, but we haven’t had the need for it yet.)
Here is an example implementation of a Ref Actor:
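require "celluloid"
require "hamster"

# A Celluloid Actor wrapping an immutable value.
class Ref
  include Celluloid

  def initialize(data = Hamster.hash)
    @data = data
  end

  # Yields the current value and commits whatever the block returns.
  # If the block raises, the error is logged and the old value is kept.
  def update
    old_data = @data
    begin
      @data = yield old_data
    rescue => ex
      puts "Error occurred in Ref#update: #{ex.message}"
    end
    @data
  end
  exclusive :update

  # Returns the current (immutable) value.
  def current
    @data
  end
end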
This could be used as follows:
ref = Ref.new
ref.update do |hash|
  hash.put(:temperature, 27).put(:unit, "F")
end
data = ref.current
puts "The current temperature is: #{data[:temperature]} #{data[:unit]}"
If multiple threads try to update the Ref at the same time, their calls are queued in the Actor’s mailbox and processed one at a time, so only one update runs at any moment.
This simple example does not include some of the nicer features of a Clojure Ref like validation and the ability to be notified when the value of the Ref is updated. But it wouldn’t be much work to add either of these things to the Ref Actor.
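As a rough idea of what validation and change notification might look like, here is a minimal sketch. To keep it runnable without any gems, it is not the Celluloid Actor above: a @Mutex@ stands in for the Actor's exclusive @update@ and a frozen @Hash@ stands in for the Hamster hash. The names @WatchedRef@, @validator@ and @add_watch@ are made up for this illustration.

```ruby
# Plain-Ruby sketch (assumption: Mutex replaces the Actor, frozen Hash replaces Hamster).
class WatchedRef
  # validator: hypothetical callable that returns true for acceptable values.
  def initialize(data = {}.freeze, validator: ->(_value) { true })
    @data = data
    @validator = validator
    @watchers = []
    @lock = Mutex.new
  end

  # Register a block that is called as watcher.call(old_value, new_value).
  def add_watch(&watcher)
    @lock.synchronize { @watchers << watcher }
    self
  end

  # Yields the current value; commits the block's result only if it validates.
  def update
    old_data, new_data, watchers = @lock.synchronize do
      candidate = yield(@data).freeze
      unless @validator.call(candidate)
        raise ArgumentError, "invalid value: #{candidate.inspect}"
      end
      previous = @data
      @data = candidate
      [previous, candidate, @watchers.dup]
    end
    # Notify outside the lock so a watcher can safely call #current.
    watchers.each { |watcher| watcher.call(old_data, new_data) }
    new_data
  end

  def current
    @lock.synchronize { @data }
  end
end

ref = WatchedRef.new(validator: ->(h) { h.fetch(:temperature, 0) < 200 })
ref.add_watch { |before, after| puts "changed from #{before} to #{after}" }
ref.update { |h| h.merge(temperature: 27, unit: "F") }
```

A rejected value raises before anything is committed, so readers never see an invalid state.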
So far this experiment has been working very well, and I suspect we will be using more of these Ref-like Actors as the project continues.
Cool stuff! Though to me it sounds like what you’re describing is more or less exactly like a Clojure Agent, rather than a Ref.
Agents provide immediate access to their current state, and you can update their state asynchronously by sending a function. The function will be called with the agent’s current state, and the return value will become the new state. Multiple updates get queued up and are executed serially.
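To make the comparison concrete, the agent behaviour described above can be roughly sketched in plain Ruby with a @Queue@ and a single worker thread. This is only an illustration of the idea, not Clojure's implementation; @SimpleAgent@, @send_update@ and @shutdown@ are invented names.

```ruby
# Sketch of agent-like behaviour: updates are queued and applied serially.
class SimpleAgent
  def initialize(state)
    @state = state
    @queue = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until a function arrives; nil is the stop signal.
      while (fn = @queue.pop)
        begin
          @state = fn.call(@state)
        rescue StandardError => ex
          warn "update failed, state unchanged: #{ex.message}"
        end
      end
    end
  end

  # Immediate read of whatever the state is right now.
  def current
    @state
  end

  # Asynchronous update: enqueues the function and returns immediately.
  def send_update(&fn)
    @queue << fn
    self
  end

  # Hypothetical helper: apply everything already queued, then stop the worker.
  def shutdown
    @queue << nil
    @worker.join
  end
end

agent = SimpleAgent.new(0)
10.times { agent.send_update { |n| n + 1 } }
agent.shutdown
puts agent.current # prints 10
```

The caller never waits for an update to be applied, which is the main difference from the synchronous @update@ in the post.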
Tero – I think you might be right.
One thing that we have been working towards that I didn’t get into in this blog post is the use of a kind of transaction to update multiple independent pieces of state in a consistent manner. I think this is what drew us toward calling it a Ref as opposed to an Agent.
I explored some of the exact same ideas on a JRuby project I am working on. The team members are all enthusiastic Clojure-ists working in Ruby day-to-day. So, we took a brief look at Hamster and it proved to be way too green, but then something struck us – JRuby and Clojure both target the JVM, so it should be easy to add clojure.jar to the classpath and use Clojure’s native data structures directly from JRuby. Sure enough, Clojure’s structures outperformed Hamster’s by a long shot in JRuby. Here’s a simple benchmark for select.
Also, we loved the idea of atoms for state, and thankfully Charles Nutter has created the atomics gem for atoms in JRuby.
At the end of the day, though, we just pushed to use Clojure itself (and eventually got buy in), rather than push JRuby towards being something it isn’t.
Happy hacking. :)
Thanks Mike! I can’t believe I didn’t think of trying to actually use Clojure’s data structures directly. I am definitely going to try that out.
Thanks for the great comment.
Hi!
Great post.
I think this is similar to a Clojure agent if the update operation is called asynchronously with Celluloid’s @!@-suffix, otherwise it’s more like an atom: http://clojure.org/atoms
In either case, Clojure’s agents and atoms are implemented with atomic compare-and-swap operations and retries, while this implementation locks the data for writing during the update operation (through @exclusive@).
This makes me think that this implementation could be simplified by simply using a mutex (like this: https://gist.github.com/qerub/4fe10adbafebfa29c7b7) or using the atomics gem mentioned above (for full feature parity with Clojure’s concurrency primitives).
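For readers unfamiliar with the compare-and-swap style mentioned here, the following is a minimal sketch of the retry loop, as opposed to holding a lock for the whole update. It is an assumption-laden illustration: plain Ruby has no built-in atomic reference, so a tiny @Mutex@ emulates the atomic compare-and-set step, and the @CasAtom@ class and its method names are invented for this example (this is neither the gist's code nor the gem's API).

```ruby
# Sketch of compare-and-swap with retries.
class CasAtom
  def initialize(value)
    @value = value
    @lock = Mutex.new # stand-in for a hardware compare-and-swap instruction
  end

  def current
    @lock.synchronize { @value }
  end

  # Stores new_value only if the held value is still the expected object.
  def compare_and_set(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end

  # Runs the block WITHOUT holding the lock and retries if another thread
  # committed first. The block may run more than once, so it should be
  # free of side effects.
  def swap
    loop do
      old_value = current
      new_value = yield old_value
      return new_value if compare_and_set(old_value, new_value)
    end
  end
end

atom = CasAtom.new({ count: 0 }.freeze)
threads = 4.times.map do
  Thread.new do
    100.times { atom.swap { |h| h.merge(count: h[:count] + 1).freeze } }
  end
end
threads.each(&:join)
puts atom.current[:count] # prints 400
```

The trade-off is that a slow block never blocks other writers, but it may be re-run when it loses the race.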
Thanks for the comment Chris.
You are correct that this implementation is locking, and that a mutex or the atomics gem would be functionally equivalent. My guess is that either of those would also be quite a bit faster, since they drop the overhead of the Celluloid actor message passing.