We learned two major performance lessons very quickly while working on the hottest part of our recent Clojure application:
Laziness
Laziness is amazing. As Debasish Ghosh explained on his blog, laziness lets you effectively reify and compose loops. Code that looks like it’s producing intermediate data structures is actually composing together patterns of iteration. When the lazy structure is finally consumed, execution proceeds as though all of those map, filter, and other lazy calls were a set of nested loops, with a small amount of windowed overhead instead of O(n) intermediate structures. We used laziness heavily in the data processing code of our application. The result was a lot of loosely coupled functional code whose equivalent in a stricter language (without ad hoc laziness) would have been a hopelessly tangled nest of loops.
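To make the composition concrete, here is a toy pipeline (a sketch, not our actual processing code): none of the intermediate sequences is ever fully materialized, and the reduce at the end drives elements through the whole chain a chunk at a time:

(defn sum-even-squares [records]
  ;; Each step below is lazy; no intermediate collection is built.
  (->> records
       (map :value)      ; pull out a field
       (filter even?)    ; keep only what we need
       (map #(* % %))    ; transform
       (reduce +)))      ; consuming here executes the fused "loop"

(sum-even-squares (for [n (range 10)] {:value n}))
;=> 120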
Where we ran into trouble was with very large structures. Laziness introduces a per-element overhead which can consume a lot of memory on large collections. A realized lazy sequence carries two objects of overhead per element – one thunk representing the lazy computation and one cons cell holding the realized value and the pointer to the next thunk. Compare this to a vector, which costs something like a single Object[] array per 32 elements. Our initial naive implementation left our data set as a large realized lazy sequence, and we blew the heap on largish data sets.
Fortunately, the loose coupling laziness enables made it easy to work around this. All we needed to do was dump our lazy sequences into a vector before they were consumed by functions that would fully realize them. We retained all the benefits of compositionality and, in fact, didn’t really need to change any of our core processing code. We simply passed lazy structures through (into []) at key points. The result was a drastic reduction in memory overhead.
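In code, the fix was something like this (a sketch; build-lazy-results is a hypothetical stand-in for our real pipeline):

;; Before: downstream code held onto a huge realized lazy seq,
;; paying ~2 objects of overhead per element.
(def results (build-lazy-results input))

;; After: pour the sequence into a vector at the boundary.
;; Elements are realized one chunk at a time, so only the
;; compact vector survives.
(def results (into [] (build-lazy-results input)))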
Silent but Deadly Reflection
Reflection in Clojure is a known performance problem for hot loops, and there are great tools for finding and fixing sources of reflection. We had set *warn-on-reflection*, but still found we were getting killed by reflection without any warnings pointing us toward a problem. It turns out that type hints in a third-party library don’t necessarily mean that library is reflection-free, and setting *warn-on-reflection* to true doesn’t necessarily warn you of the problem: the warnings are emitted when the offending code is compiled, so reflection buried inside an already-compiled library never surfaces at your own call sites.
For example, executing the following use of clj-time in a REPL produces no reflection warnings (for me, in Clojure 1.3.0), but jvisualvm shows a large number of ephemeral java.lang.reflect.Method objects being created:
(set! *warn-on-reflection* true)
(use 'clj-time.core)
(def t (now))
(loop [t t] (recur (plus t (secs 1))))
Note that the definition of plus includes type hints for both its arguments. clj-time has a bug in that the type hint is for an interface that doesn’t define the method that’s called on it. Even hinting the call to plus as (plus ^org.joda.time.DateTime t ^org.joda.time.Period (secs 1)) doesn’t fix the issue, because the type hints in the definition of plus are simply wrong.
Once we discovered this gotcha, things went very smoothly: we simply left *warn-on-reflection* set to true and used Joda Time directly (which ended up being necessary anyway because of the operations we ultimately needed to perform).
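For comparison, the reflection-free equivalent of the earlier loop looks something like this (a sketch of the direct Joda Time interop; like the original snippet, it loops forever and exists only for profiling):

(set! *warn-on-reflection* true)
(import '(org.joda.time DateTime Seconds))

(loop [^DateTime t (DateTime.)]
  ;; DateTime.plus takes a ReadablePeriod, which Seconds implements,
  ;; so with the hint in place this compiles to a direct method call.
  ;; Remove the hint and the reflection warning appears as expected.
  (recur (.plus t (Seconds/seconds 1))))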
Performance Takeaways
We learned these lessons early on. The rest of our analysis code was written with them in mind, and as a result we ended up with fast, decoupled, parallel data analysis on the first try.