Article summary
Haskell is my language of choice, and one of the features that I miss the most when working in a different language is algebraic data types. I get particularly frustrated about the lack of algebraic data types (and the associated destructuring) because they are so simple. You don’t need any fancy runtime features to make them work. It would even be feasible to make them work in C, for goodness’ sake.
If you are unfamiliar with algebraic data types, you should check out Rust’s enum data type. Rust’s enums are algebraic data types, and I’ve found them to be the gentlest introduction to their usefulness. The Wikipedia entry on algebraic data types is also decent and a bit more thorough.
Just Simple Data
I’ve recently been working on a pet project in Ruby. I wanted an easy way to package up network messages where each message may have extra data that depends on the message, serialize the data and send it over a socket, and then easily unpack and dispatch on the data on the other side. Manually defining a class for each message was annoyingly tedious. This would be a perfect application for algebraic data types, but Ruby doesn’t have them. :(
After playing around with a few alternative approaches, I broke down and decided to just implement a simple approximation of algebraic data types.
After a little bit of googling I found Ruby’s Struct class, which greatly simplifies the construction of simple data. But I wanted to go a bit further and see if I could get some basic destructuring support as well.
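As a quick illustration of what Struct provides out of the box, here is a minimal sketch. The `Point` type and its fields are hypothetical and exist only for this example.

```ruby
# Struct.new builds a class with a positional constructor,
# field accessors, and field-by-field equality.
Point = Struct.new(:x, :y)   # hypothetical example type

p1 = Point.new(1, 2)
p2 = Point.new(1, 2)

puts p1.x        # prints 1
puts p1 == p2    # prints true
p p1.values      # prints [1, 2] (values is an alias for to_a)
```

The `values` method is the piece that matters later on: it returns the fields in declaration order, which is what makes positional destructuring possible.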
Ruby’s case syntax uses the case equality operator under the hood, and the case equality operator is conveniently defined for procs in a way that I could hijack to get a basic form of destructuring.
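To make the trick concrete: `Proc#===` invokes the proc with the value being matched, so a proc placed in a `when` clause acts as a predicate. The sketch below uses hypothetical names (`is_even`, `describe`) and is not part of the message code.

```ruby
# Proc#=== calls the proc, so these two lines behave like is_even.call(n).
is_even = proc { |n| n.even? }   # hypothetical predicate

puts is_even === 4   # prints true
puts is_even === 3   # prints false

# `case` evaluates `pattern === value` for each `when`, so procs work as patterns.
def describe(n)
  case n
  when proc { |v| v.negative? } then "negative"
  when proc { |v| v.even? }     then "even"
  else "odd"
  end
end

puts describe(-3)   # prints negative
puts describe(4)    # prints even
puts describe(7)    # prints odd
```

Because the proc runs as part of the comparison, it can do more than return true or false, which is the hook used for destructuring.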
Implementation
Here is the solution I came up with:
def data(*fields)
  Class.new(Struct.new(*fields)) do
    def self.match(&blk)
      proc do |b|
        if b.class == self
          blk.call(*b.values)
          true
        else
          false
        end
      end
    end
  end
end
Example Use
Now I can define data types for my messages like so:
Peer_Connected = data :source_id, :ip, :port
Peer_Data_Message = data :source_id, :data
Peer_Ping = data :source_id
Peer_Disconnected = data :disc_id
Note that the data function actually creates a new class. I can then use the new classes to construct messages:
a_msg = Peer_Connected.new 5, "127.0.0.1", 4321
another_msg = Peer_Data_Message.new 12, "some data"
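A small sketch to confirm what `data` returns and what `match` produces. The definition of `data` is repeated from above only so the snippet runs on its own; the `ping` and `matcher` variables are hypothetical.

```ruby
# Definition repeated from the article so this snippet is self-contained.
def data(*fields)
  Class.new(Struct.new(*fields)) do
    def self.match(&blk)
      proc do |b|
        if b.class == self
          blk.call(*b.values)
          true
        else
          false
        end
      end
    end
  end
end

Peer_Ping = data :source_id

ping = Peer_Ping.new(7)
puts Peer_Ping.class                       # prints Class
puts Peer_Ping.ancestors.include?(Struct)  # prints true
puts ping.source_id                        # prints 7

# match returns a proc; calling it via === runs the block only on a class match.
matcher = Peer_Ping.match { |source_id| puts "ping from #{source_id}" }
puts matcher === ping              # prints "ping from 7", then true
puts matcher === "not a message"   # prints false
```

The block runs as a side effect of the `===` comparison, which is why the `when` clauses can carry their handling code directly.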
And then I can match/destructure using Ruby’s case statement:
def do_something(x)
  case x
  when Peer_Connected.match do |source_id, ip, port|
    puts "Source id: #{source_id}, ip: #{ip}, port: #{port}"
  end
  when Peer_Data_Message.match do |source_id, data|
    puts "Source id: #{source_id}, data: #{data}"
  end
  when Peer_Ping.match do |source_id|
  end
  when Peer_Disconnected.match do |disc_id|
  end
  else
    puts "unsupported message"
  end
end
do_something(a_msg)
# > "Source id: 5, ip: 127.0.0.1, port: 4321"
do_something(another_msg)
# > "Source id: 12, data: some data"
do_something(nil)
# > "unsupported message"
That’s still a far cry from the algebraic data types in Haskell, but it’s good enough for the moment.
IMO class-based matching is simpler (and easier to debug):
class Peer_Connected < Struct.new(:source_id, :ip, :port); end
…
def do_something(x)
  case x
  when Peer_Connected
    puts "Source id: #{x.source_id}, ip: #{x.ip}, port: #{x.port}"
  when Peer_Data_Message
    puts "Source id: #{x.source_id}, data: #{x.data}"
  when Peer_Ping
  when Peer_Disconnected
  else
    puts "unsupported message"
  end
end
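For what it’s worth, class-based matching and destructuring can be combined on newer Rubies. The sketch below assumes Ruby 2.7 or later (an assumption, nothing above states a version), where `case`/`in` pattern matching works with Struct through its `deconstruct` method. The `describe` method is hypothetical, and it returns strings rather than printing.

```ruby
# Assumes Ruby 2.7+; Struct instances support array patterns in case/in.
Peer_Connected = Struct.new(:source_id, :ip, :port)
Peer_Data_Message = Struct.new(:source_id, :data)

def describe(x)
  case x
  in Peer_Connected[source_id, ip, port]
    # Matches only Peer_Connected instances and binds the fields by position.
    "Source id: #{source_id}, ip: #{ip}, port: #{port}"
  in Peer_Data_Message[source_id, data]
    "Source id: #{source_id}, data: #{data}"
  else
    "unsupported message"
  end
end

puts describe(Peer_Connected.new(5, "127.0.0.1", 4321))
puts describe(Peer_Data_Message.new(12, "some data"))
puts describe(nil)
```

This keeps plain Struct classes, so no custom `match` helper is needed, at the cost of requiring a newer interpreter.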