Red Green Performance Testing with The Grinder

No, not that Red Green!

Even a thoroughly tested application can wreak havoc if it hasn’t been tested in the context of a production-like system under production-like conditions.

Tools like Puppet and Chef make it easy to produce a production-like environment for testing, but what about the production-like conditions?

One aspect of these conditions can be approximated with load testing tools like JMeter or The Grinder. I recently used The Grinder to troubleshoot a performance problem with a small web application. Here’s a walkthrough of my process.

Getting Started with The Grinder

Like JMeter, The Grinder is a Java-based load testing framework. It can coordinate the execution of a test plan by distributed worker processes for anything with a Java API. I used it to send requests to a web application with several distinct APIs and components.

The three main components of The Grinder are the Console, Agents, and Workers.

Setup

1. Make sure you have a Java runtime.
2. Download The Grinder from SourceForge.
3. Unzip the archive.

You should see something like this:

$ tree -L 2
.
├── CHANGES
├── LICENSE-HTTPClient
├── README
├── contrib
│   └── mq
├── etc
│   ├── httpToClojureScript.xsl
│   ├── httpToJythonScript.xsl
│   ├── httpToJythonScriptOldInstrumentation.xsl
│   ├── httpToXML.xsl
│   └── tcpproxy-http.xsd
├── examples
│   ├── amazon.py
│   ├── console.py
│   ├── ...
│   └── xml-rpc.py
└── lib
    ├── LICENSE-ASM
    ├── LICENSE-Jetty
    ├── ...
    ├── License-ring-json-params
    ├── asm-3.2.jar
    ├── cheshire-4.0.0.jar
    ├── ...
    └── xmlbeans-2.5.0.jar

Starting The Grinder

You’ll need to create a grinder.properties file. This file is used to configure several properties including the number of worker processes and threads. Here’s a simple one:

# Please refer to
# http://net.grinder.sourceforge.net/g3/properties.html for further
# documentation.

# The file name of the script to run.
#
# Relative paths are evaluated from the directory containing the
# properties file. The default is "grinder.py".
grinder.script = grinder.clj

# The number of worker processes each agent should start. The default
# is 1.
grinder.processes = 1

# The number of worker threads each worker process should start. The
# default is 1.
grinder.threads = 5

# The number of runs each worker process will perform. When using the
# console this is usually set to 0, meaning "run until the console
# sends a stop or reset signal". The default is 1.
grinder.runs = 1

### Logging ###

# The directory in which worker process logs should be created. If not
# specified, the agent's working directory is used.
grinder.logDirectory = log

# The number of archived logs from previous runs that should be kept.
# The default is 1.
grinder.numberOfOldLogs = 2

I also created a couple of bash scripts to set CLASSPATH and launch the different Grinder processes.

Set up environment variables (`grinder_env.sh`):

#!/bin/bash
GRINDERPATH=$HOME/path/to/grinder-3.11
GRINDERPROPERTIES=$GRINDERPATH/grinder.properties
CLASSPATH=$GRINDERPATH/lib/grinder.jar:$CLASSPATH
PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH PATH GRINDERPROPERTIES
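
Before launching anything, it can help to confirm that the pieces these scripts rely on are actually in place. The following is a minimal sketch, not part of The Grinder: `check_grinder_env` is a hypothetical helper that assumes the layout used above (`lib/grinder.jar` and `grinder.properties` under the Grinder directory, and `java` on the PATH).

```shell
# check_grinder_env is a hypothetical helper, not part of The Grinder.
# It takes the Grinder install directory as its only argument.
check_grinder_env() {
  local grinder_path="$1"

  # grinder_env.sh puts lib/grinder.jar on the CLASSPATH.
  if [ ! -f "$grinder_path/lib/grinder.jar" ]; then
    echo "missing: $grinder_path/lib/grinder.jar" >&2
    return 1
  fi

  # grinder_env.sh points GRINDERPROPERTIES at this file.
  if [ ! -f "$grinder_path/grinder.properties" ]; then
    echo "missing: $grinder_path/grinder.properties" >&2
    return 1
  fi

  # The console, agent, and proxy are all started with the java command.
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi

  echo "ok"
}
```

After sourcing `grinder_env.sh`, something like `check_grinder_env "$GRINDERPATH"` should print `ok` or say what is missing.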

Launch the console (`startConsole.sh`):

#!/bin/bash
source ./grinder_env.sh
java -classpath $CLASSPATH net.grinder.Console

Launch the agent (`startAgent.sh`):

#!/bin/bash
source ./grinder_env.sh
java -classpath $CLASSPATH net.grinder.Grinder $GRINDERPROPERTIES

Launch the proxy (`startProxy.sh`):

#!/bin/bash
source ./grinder_env.sh
java -classpath $CLASSPATH net.grinder.TCPProxy -console -http clojure > proxy_session.clj

Recording a Session with TCPProxy

Starting the Proxy

Run the `startProxy.sh` shell script from above. You should see something like this:

[Screenshot: TCPProxy console]

Using the Proxy

Once you’ve started the proxy, configure your browser to send all requests through the proxy. I chose to use Firefox for this because it allowed me to set the proxy at the browser level rather than send all of my HTTP traffic through the proxy.

[Screenshot: Firefox Advanced settings]

[Screenshot: Firefox proxy settings]

Make sure to disable any plugins that might make requests unrelated to the system under test. Then visit the pages that model a typical user’s session. When you’re done, stop the proxy.

A Clojure Test Script

Here is an example of some of the Clojure code generated by The Grinder.

;; The Grinder 3.11
;; HTTP script recorded by TCPProxy at Mar 14, 2013 1:59:26 AM

(ns user
  (:import (net.grinder.script Test Grinder)
           (net.grinder.plugin.http HTTPPluginControl HTTPRequest)
           (HTTPClient NVPair Codecs)))

(def grinder (Grinder/grinder))
(def connectionDefaults (HTTPPluginControl/getConnectionDefaults))
(def httpUtilities (HTTPPluginControl/getHTTPUtilities))

; To use a proxy server, uncomment the next line and set the host and port.
; (.setProxyServer connectionDefaults "localhost" 8001)

; Worker thread state is stored in a map using a dynamic var.
(def ^:dynamic *tokens*)
(defn set-token [k v] (set! *tokens* (assoc *tokens* k v)))
(defn token [k] (*tokens* k))

(defn nvpairs [c] (into-array NVPair
  (map (fn [[k v]] (NVPair. k v)) (partition 2 c))))

(defn httprequest [url & [headers]]
  (doto (HTTPRequest.) (.setUrl url) (.setHeaders (nvpairs headers))))

(defn basic-authorization [u p]
  (str "Basic " (Codecs/base64Encode  (str u ":" p))))

(defn to-bytes [s]
  (letfn [(to-byte[x] (byte (if (> x 0x7f) (- x 0x100) x)))]
    (byte-array (map to-byte s))))

(defmacro defrequest [name test & args]
  `(do
     (def ~name (httprequest ~@args))
     (.record ~test ~name (HTTPRequest/getHttpMethodFilter))))

(defmacro defpage [name description test & rest]
  `(do
     (defn ~name ~description ~@rest)
     (.record ~test ~name)))

; Offline debug
; (use '[clojure.string :only (join)])
; (defmacro .GET [& k] `(.. grinder (getLogger) (debug (str "GET " (join ", " `(~~@k))))))
; (defmacro .POST [& k] `(.. grinder (getLogger) (debug (str "POST " (join ", " `(~~@k))))))


(.setDefaultHeaders connectionDefaults (nvpairs [
  "Accept-Encoding", "gzip, deflate"
  "Accept-Language", "en-US,en;q=0.5"
  "User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:19.0) Gecko/20100101 Firefox/19.0"]))

(def headers0 [
  "Accept", "image/png,image/*;q=0.8,*/*;q=0.5"
  "Referer", "http://www.example.com/"])

(def headers1 [
  "Accept", "*/*"
  "Referer", "http://www.example.com/"])

(def url0 "http://www.example.com:80")
(def url2 "http://ssl.static.example.com:80")

(defrequest request101 (Test. 101 "GET /") url0)

; (defrequest request201 (Test. 201 "POST /") url1) ; url1 is not defined in this excerpt

(defrequest request301 (Test. 301 "GET chrome-48.png") url0 headers0)

(defrequest request302 (Test. 302 "GET logo4w.png") url0 headers0)

(defrequest request401 (Test. 401 "GET rs=AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ") url0 headers1)

(defrequest request501 (Test. 501 "GET rs=AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ") url0 headers1)

(defrequest request502 (Test. 502 "GET tia.png") url0 headers0)

(defrequest request503 (Test. 503 "GET b84c02c3b64bf7ed.js") url0 headers1)

(defrequest request601 (Test. 601 "GET csi") url0 headers0)

(defrequest request602 (Test. 602 "GET nav_logo117.png") url0 headers0)

(defrequest request701 (Test. 701 "GET sem_87e2600bd08d93bebd4d641cad5ffb62.js") url2 headers1)


; A function for each recorded page.
(defpage page1 "GET / (request 101)." (Test. 100 "Page 1") []
  (.GET request101 "/" nil
    (nvpairs [
      "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"]))
)

(defpage page3 "GET chrome-48.png (requests 301-302)." (Test. 300 "Page 3") []
  (.GET request301 "/images/icons/product/chrome-48.png")

  (.GET request302 "/images/srpr/logo4w.png")
)

(defpage page4 "GET rs=AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ (request 401)." (Test. 400 "Page 4") []
  (set-token :token_rt "j")
  (set-token :token_ver "Za8TToM0_vY.en_US.")
  (set-token :token_am "BA")
  (set-token :token_d "1")
  (set-token :token_sv "1")
  (set-token :token_rs "AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ")
  (.GET request401
    (str "/xjs/_/js/s/c,sb,cr,cdos,vm,tbui,mb,hov,wobnm,cfm,abd,klc,kat,aut,bihu,kp,lu,m,rtis,tnv,amcl,erh,hv,lc,ob,rsn,sf,sfa,shb,tbpr,hsm,j,p,pcc,csi/rt=" (token :token_rt)
      "/ver=" (token :token_ver)
      "/am=" (token :token_am)
      "/d=" (token :token_d)
      "/sv=" (token :token_sv)
      "/rs=" (token :token_rs)))
)

(defpage page5 "GET rs=AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ (requests 501-503)." (Test. 500 "Page 5") []
  (set-token :token_d "0")
  (.GET request501
    (str "/xjs/_/js/s/sy9,gf,ifl/rt=" (token :token_rt)
      "/ver=" (token :token_ver)
      "/am=" (token :token_am)
      "/d=" (token :token_d)
      "/sv=" (token :token_sv)
      "/rs=" (token :token_rs)))

  (.GET request502 "/textinputassistant/tia.png")

  (set-token :token_bav "on.2,or.r_qf.")
  (.GET request503
    (str "/extern_chrome/b84c02c3b64bf7ed.js"
      "?bav=" (token :token_bav)))
)

(defpage page6 "GET csi (requests 601-602)." (Test. 600 "Page 6") []
  (set-token :token_v "3")
  (set-token :token_s "webhp")
  (set-token :token_action "")
  (set-token :token_e "17259,18168,39523,4000116,4001569,4001947,4001959,4001975,4002206,4002562,4002734,4002855,4003053,4003178,4003386,4003575,4003638,4003917,4004181,4004213,4004235,4004257,4004334,4004356,4004363,4004364,4004388,4004479,4004488,4004490,4004653,4004754,4004758,4004904")
  (set-token :token_ei "EtE-UcqbHKKEygGIg4CoBw")
  (set-token :token_imc "2")
  (set-token :token_imn "2")
  (set-token :token_imp "2")
  (set-token :token_atyp "csi")
  (set-token :token_adh "")
  (set-token :token_rt "xjsls.504,prt.538,xjses.3445,xjsee.3656,xjs.3659,ol.3969,iml.1089,wsrt.1342,cst.0,dnst.0,rqst.1425,rspt.175")
  (.GET request601
    (str "/csi"
      "?v=" (token :token_v)
      "&s=" (token :token_s)
      "&action=" (token :token_action)
      "&e=" (token :token_e)
      "&ei=" (token :token_ei)
      "&imc=" (token :token_imc)
      "&imn=" (token :token_imn)
      "&imp=" (token :token_imp)
      "&atyp=" (token :token_atyp)
      "&adh=" (token :token_adh)
      "&rt=" (token :token_rt)))

  (.GET request602 "/images/nav_logo117.png")
)

(defpage page7 "GET sem_87e2600bd08d93bebd4d641cad5ffb62.js (request 701)." (Test. 700 "Page 7") []
  (.GET request701 "/gb/js/sem_87e2600bd08d93bebd4d641cad5ffb62.js")
)


(defn run
  "Called for every run performed by the worker thread." [] 
  
  (page1)      ; GET / (request 101)

  (.sleep grinder 246)
  (page3)      ; GET chrome-48.png (requests 301-302)

  (.sleep grinder 32)
  (page4)      ; GET rs=AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ (request 401)

  (.sleep grinder 2899)
  (page5)      ; GET rs=AItRSTPdVT73a8ca8dITXjGUdziGAyC2IQ (requests 501-503)

  (.sleep grinder 249)
  (page6)      ; GET csi (requests 601-602)

  (.sleep grinder 358)
  (page7)      ; GET sem_87e2600bd08d93bebd4d641cad5ffb62.js (request 701)
)

(defn runner-factory
  "Create a run function. Called for each worker thread." []
  (binding [*tokens* {}] (bound-fn* run)))

After recording your session, you can modify this script to eliminate any requests you want to exclude from your test.
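
To decide what to cut, it helps to see every recorded test at a glance. Here is a small sketch that lists the test numbers and descriptions in a recorded script. `list_recorded_tests` is a hypothetical helper, and it assumes each test appears on one line in the `(Test. N "description")` form shown above.

```shell
# list_recorded_tests is a hypothetical helper, not part of The Grinder.
# It prints "<number> <description>" for every (Test. N "...") form found
# in the recorded Clojure script given as its only argument.
list_recorded_tests() {
  grep -o '(Test\. [0-9][0-9]* "[^"]*")' "$1" |
    sed 's/^(Test\. \([0-9][0-9]*\) "\(.*\)")$/\1 \2/'
}
```

Running `list_recorded_tests proxy_session.clj` on the script above would print lines such as `101 GET /` and `100 Page 1`, which makes stray requests to third-party hosts easier to spot.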

Running the Test

First, start the Console:

./startConsole.sh

Then, start the Agent:

./startAgent.sh

From the Console, you can star the grinder.properties file to mark it for use:
[Screenshot: The Grinder Console, starring a file]

Then edit your grinder.properties so that grinder.script points to your test script:
[Screenshot: The Grinder Console, test script]

Depending on what you’ve done, you may need to reset the Agent(s) at this point (I usually want to reset the Console, too):
[Screenshot: The Grinder Console, reset processes]

Then you can distribute files to the Agent(s) – this includes the test script specified in our grinder.properties file:
[Screenshot: The Grinder Console, distribute files]

Make it Fail

Turn up the number of threads and/or worker processes until the load replicates the failure case. As Red Green says, “If it ain’t broke, you’re not trying!”

Make sure to consider the whole system at this point. It’s easy to fool yourself into thinking you’ve crushed the server under heavy load when really you’ve only sapped the resources of your agents or local network.

In my case, I was trying to model the load of approximately 30 roughly concurrent requests for the same set of resources.
Due to interactions between several system components and a broken caching mechanism, this was causing the app to become unresponsive for several minutes.
My test script was able to model this failure quite well.
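
Each agent starts `grinder.processes` worker processes, and each of those starts `grinder.threads` threads, so the number of simulated users is the product of the two, multiplied by the number of agents. The sketch below reads that figure out of a properties file. `prop_value` and `total_threads` are hypothetical helpers, and they assume simple `key = value` lines like the ones in the grinder.properties above.

```shell
# prop_value and total_threads are hypothetical helpers, not part of The Grinder.

# prop_value FILE KEY DEFAULT
# Prints the value of KEY from a "key = value" properties file,
# or DEFAULT when the key is not set. Comment lines are ignored.
prop_value() {
  awk -F= -v key="$2" -v def="$3" '
    /^[ \t]*#/ { next }
    { k = $1; gsub(/[ \t]/, "", k) }
    k == key { v = $2; gsub(/[ \t]/, "", v); found = 1 }
    END { print (found ? v : def) }
  ' "$1"
}

# total_threads FILE [AGENTS]
# Prints agents * processes * threads, using The Grinder defaults of 1
# for any property that is missing. AGENTS defaults to 1.
total_threads() {
  local processes threads agents
  processes=$(prop_value "$1" grinder.processes 1)
  threads=$(prop_value "$1" grinder.threads 1)
  agents="${2:-1}"
  echo $((agents * processes * threads))
}
```

With the grinder.properties shown earlier (1 process, 5 threads), `total_threads grinder.properties` prints `5`. To approximate 30 concurrent users from a single agent, I could set `grinder.threads = 30`, or split the same total across several processes.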

Go Green

Using The Grinder, I was able to model this failure well enough to test several configuration changes as well as a replacement caching mechanism. When the system was able to withstand the load of the test (the test passed), I was confident that the changes were likely to work in production.
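
One way to make "the test passed" less subjective is to reduce a run to an explicit red or green result. The sketch below is an illustration rather than the check I actually used. `check_data_log` is a hypothetical helper, and it assumes a comma-separated worker data log with a header row whose first columns are Thread, Run, Test, Start time, Test time (in milliseconds), and Errors. Check the header of your own data log before relying on those column positions.

```shell
# check_data_log is a hypothetical helper, not part of The Grinder.
# Usage: check_data_log DATA_LOG MAX_MEAN_MS
# Assumes column 5 is the test time in ms and column 6 is the error count.
# Prints a one-line summary and returns 0 (green) or 1 (red).
check_data_log() {
  local file="$1" max_mean_ms="$2"
  awk -F', *' -v limit="$max_mean_ms" '
    NR == 1 { next }                      # skip the header row
    { n++; total += $5; errors += $6 }    # accumulate time and errors
    END {
      if (n == 0) { print "RED: no samples"; exit 1 }
      mean = total / n
      status = (errors == 0 && mean <= limit) ? "GREEN" : "RED"
      printf "%s: %d samples, mean %.1f ms, %d errors\n", status, n, mean, errors
      exit (status == "GREEN" ? 0 : 1)
    }
  ' "$file"
}
```

The threshold is whatever "responsive" means for your application. Note that this averages over every row, so page-level tests that wrap several requests are counted alongside the individual requests.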

Summary

By first creating a failing test for the scenario of a complete system under load, I gained confidence that the configuration changes I deployed to production would solve the problem. This was a relatively rudimentary example. What tools and techniques does your team use to test system integration at this level?