Adding Client Connections and Binary Data to em-websocket

A few months ago I started working on a project where we wanted to use WebSockets. The technology seemed well-designed and lightweight, it enjoyed strong support from existing browsers, and we quickly found an implementation in Ruby to use for our server: em-websocket.

The problem was, we also wanted a Ruby-based client. Our client environment could not use a browser or JavaScript, but it did have Ruby. So we looked around and found em-websocket-client, which built on em-websocket and added a client interface. However, we quickly ran into a major snag: em-websocket-client only supported the old Hixie drafts of the WebSocket protocol, which meant that we could not support binary data transfer. Moreover, em-websocket-client was not receiving updates as frequently as em-websocket.

I started exploring the code and thinking about how WebSockets work, and it seemed to me that it should be easy to make em-websocket support client connections as well. WebSockets are designed to be bi-directional, and for any given message it basically doesn’t matter whether it came from the server or the client. The only real differences are the opening handshake and the masking of messages from the client to the server. Since the EventMachine core already supported client connections, I figured it wouldn’t be too hard to get em-websocket to do the same.

So I set out to modify em-websocket to support client connections, with an eye to getting it up and running as quickly as possible. Since we were dealing with a very isolated environment, where we controlled the server and the client, initially I didn’t worry about doing the handshake properly. Additionally, the em-websocket server code did not have support for masking outbound messages, so I did not mask the client’s outbound messages either. The server code did not insist on the inbound messages from the client being masked, so I had no immediate need to deal with the masking. Consequently, with a few hours of work and a small smattering of changes, I could communicate between our client and server using messages conforming to the WebSocket protocol.

Because having the WebSocket connection working with the caveats of false handshakes and no masking was sufficient for our purposes, I moved on to other, more demanding tasks. I returned to em-websocket when I discovered that, despite conforming to the hybi 08 draft and allowing the sending and receiving of binary frame types at a low level, em-websocket did not provide an externally accessible API for sending frame types besides text. So I just added an optional parameter to the send and onmessage functions that indicates the type of message. Then I hit another snag, which pointed me toward a Ruby language feature I had not seen before. em-websocket has users pass in a block to call when a message is received, and provides it with one argument: the message payload. However, with the new functionality we would supply both the payload and the type (text or binary). But how to do this without breaking any code that already passes the block and assumes only one argument? It turns out you can check the arity of a block, which will tell you how many parameters it expects:

if @onmessage.arity == 2
  @onmessage.call msg, type
else
  @onmessage.call msg
end
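
To see the arity check in isolation, here is a minimal, self-contained sketch. The MessageDispatcher class and its dispatch method are hypothetical stand-ins invented for illustration, not em-websocket's actual classes; only the arity test itself comes from the snippet above.

```ruby
# Hypothetical stand-in for a connection object; not em-websocket's real class.
class MessageDispatcher
  # Store the user's handler block for later use.
  def onmessage(&block)
    @onmessage = block
  end

  # Pass the frame type only to handlers that declare a second parameter.
  def dispatch(msg, type)
    return unless @onmessage
    if @onmessage.arity == 2
      @onmessage.call(msg, type)
    else
      @onmessage.call(msg)
    end
  end
end

# An existing handler that only knows about the payload keeps working.
legacy = MessageDispatcher.new
legacy.onmessage { |msg| "got #{msg}" }

# A newer handler opts in to the frame type by taking two parameters.
typed = MessageDispatcher.new
typed.onmessage { |msg, type| "got #{msg} as #{type}" }

legacy_result = legacy.dispatch("hello", :text)
typed_result  = typed.dispatch("data", :binary)
```

One caveat with this approach: a block that declares optional or splat parameters reports a negative arity, so a strict comparison against 2 treats it as a one-argument handler.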

Recently I encountered an issue with the em-websocket code which caused large messages to our client environment to have significant delays. After some digging around in the code, I saw that the unmasking code was iterating over every byte of the incoming message, regardless of whether or not it was masked. I fixed that performance issue quickly, but it renewed my enthusiasm to properly finish the client.
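
For reference, masking in the WebSocket protocol is an XOR of each payload byte with a 4-byte key, and the same operation unmasks. The sketch below is not the code from em-websocket or from my fork; apply_mask is a hypothetical helper that shows the general idea of the fix, which is to return early when a frame carries no masking key instead of walking every byte.

```ruby
# Hypothetical helper, not em-websocket's actual code.
# XORs each payload byte with the 4-byte masking key, cycling through the key.
# When there is no key, the payload is returned untouched, so unmasked
# frames skip the per-byte loop entirely.
def apply_mask(payload, mask_key)
  return payload if mask_key.nil?

  key = mask_key.bytes
  masked_bytes = payload.bytes.each_with_index.map { |byte, i| byte ^ key[i % 4] }
  masked_bytes.pack("C*")
end

key      = [0x37, 0xfa, 0x21, 0x3d].pack("C*")
masked   = apply_mask("Hello".b, key)   # client-to-server direction
unmasked = apply_mask(masked, key)      # applying the key again restores the payload
untouched = apply_mask("Hello".b, nil)  # unmasked frame: no per-byte work
```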

Fortunately, the WebSocket protocol is quite simple, so I re-read how the handshake works and implemented the missing pieces, plus added masking to the outbound messages. For maximum flexibility, whether to mask outbound messages and whether to require inbound messages to be masked are stored as independent settings, continuing to keep most of the code agnostic to whether it is a client or a server.
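
The heart of the handshake is small enough to sketch. The version below follows the key exchange as the RFC describes it: the client sends a random base64 key, and the server must answer with the base64-encoded SHA-1 of that key concatenated with a fixed GUID. The method names here are hypothetical and chosen for illustration; this is not the code in my fork, and earlier drafts of the protocol differ in their details.

```ruby
require "digest/sha1"
require "securerandom"

# Fixed GUID defined by the WebSocket protocol specification.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

# Hypothetical helper: value for the client's Sec-WebSocket-Key header,
# 16 random bytes encoded as base64 ("m0" means base64 without newlines).
def generate_client_key
  [SecureRandom.random_bytes(16)].pack("m0")
end

# Hypothetical helper: the Sec-WebSocket-Accept value the server should
# return for a given client key.
def expected_accept(client_key)
  [Digest::SHA1.digest(client_key + WEBSOCKET_GUID)].pack("m0")
end

# Hypothetical helper: the client checks the server's answer before
# treating the connection as open.
def handshake_valid?(client_key, server_accept)
  expected_accept(client_key) == server_accept
end

client_key = generate_client_key
accept     = expected_accept(client_key)
```

A client would send client_key in its upgrade request and close the connection if handshake_valid? returns false for the header the server sends back.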

I’ve put my fork of the project up on GitHub: https://github.com/willryan/em-websocket. WebSockets are defined by a solid, simple protocol, and I encourage you to experiment with them. em-websocket isn’t updated that often, though I see that an RFC for the protocol has now been posted, so perhaps the technology will soon be stable enough for developer interest to increase.