Your mileage may vary, but this worked well for me. I wanted to test one of our sites under load. Our sites generally get a lot of traffic, so when we move from staging to production we have to keep in mind that we’re about to go from 1 or 2 developers hitting the pages to thousands of people.
I’ve previously used Apache Benchmark (ab) to load test, but it’s fairly limited: it can only hit one URL at a time. I did a bit of digging and found siege, which was perfect for what I wanted to do. I wanted to subject my site to a heavy load over a long period of time and see what happened. I also wanted semi-realistic traffic, and with a few well-typed commands I was able to create a file for siege to read that contained *exactly* the traffic from our site.
I used this to read the last 1000 hits from our Apache logs:
tail -1000 /var/log/apache2/access.log | awk '{print "http://mysite.com" $8}' > /tmp/siege-urls.txt
It’s pretty simple, but I’ll break it down. The following gives me the last 1000 lines from the log. Your log might be in a different location, or named something else. If you want fewer lines, or more lines, change the 1000 to something else.
tail -1000 /var/log/apache2/access.log
Then, pipe the output of the above to awk and print out the URL. In our Apache logs, the URL was in the eighth whitespace-separated field; in the stock common and combined log formats it is usually the seventh, so check a line of your own log first. I also prepended the site’s domain to the output, since the Apache log does not contain that information (at least not in the format I wanted).
| awk '{print "http://mysite.com" $8}'
Lastly, save it to a file for later use.
> /tmp/siege-urls.txt
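If counting fields feels fragile, here is a rough sketch of an alternative that pulls the path out of the quoted request line instead. It assumes the request is the first double-quoted field in each log line (true for the common and combined formats, possibly not for a custom one). The log_to_urls name and the BASE_URL variable are made up for this example, and keeping only GET requests is my own choice rather than something the one-liner above does.

```shell
#!/bin/sh
# Hypothetical helper: turn access-log lines on stdin into a siege URL list.
# Assumption: the request line is the first double-quoted field of each line.
BASE_URL="${BASE_URL:-http://mysite.com}"   # placeholder domain from the post

log_to_urls() {
    # Splitting on double quotes makes $2 look like: GET /some/url?x=1 HTTP/1.1
    awk -F'"' -v base="$BASE_URL" '{
        n = split($2, req, " ")
        # Keep only GET requests whose target is an absolute path.
        if (n >= 2 && req[1] == "GET" && substr(req[2], 1, 1) == "/")
            print base req[2]
    }'
}

# Usage, with the same paths as above:
# tail -1000 /var/log/apache2/access.log | log_to_urls > /tmp/siege-urls.txt
```

Skipping POST lines avoids replaying form submissions as plain GETs, which would mostly produce misleading errors.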
Next, I used this file to place the site under load. In the example below, I used -i for “internet” mode, where siege randomly reads a line from the file and requests it from the server. I also used only 4 concurrent users or worker threads. If you ramp up the concurrency, you can really put a lot of strain on the server.
siege -i -c 4 -f /tmp/siege-urls.txt
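For longer runs, a small wrapper along these lines may help. It is only a sketch: the run_siege function and the three variables are names I made up, and the 5-minute default is an arbitrary assumption. The -t flag is siege's time limit (a number followed by S, M, or H), so the run stops on its own instead of waiting for you to interrupt it.

```shell
#!/bin/sh
# Sketch only: a guarded, time-limited siege run.
URL_FILE="${URL_FILE:-/tmp/siege-urls.txt}"   # file built earlier in the post
CONCURRENCY="${CONCURRENCY:-4}"               # same as the -c 4 above
DURATION="${DURATION:-5M}"                    # assumed default: 5 minutes

run_siege() {
    # Refuse to start if the URL list is missing or empty.
    if [ ! -s "$URL_FILE" ]; then
        echo "URL file $URL_FILE is missing or empty" >&2
        return 1
    fi
    siege -i -c "$CONCURRENCY" -t "$DURATION" -f "$URL_FILE"
}

# Usage: CONCURRENCY=25 DURATION=10M run_siege
```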
When the siege is underway, you’ll get output that looks like this:
HTTP/1.1 200 0.53 secs: 6926 bytes ==> /some/url?some=params
HTTP/1.1 200 0.54 secs: 7132 bytes ==> /some/other/url?other=params
HTTP/1.1 500 0.13 secs: 521 bytes ==> /some/url?some=params
HTTP/1.1 200 0.64 secs: 7133 bytes ==> /some/other/url?other=params
HTTP/1.1 500 0.13 secs: 521 bytes ==> /some/other/url?other=params
HTTP/1.1 404 0.09 secs: 431 bytes ==> /some/url?some=params
I paid close attention to the 404 and 500 responses. They indicated that some requests were failing for some reason. Sometimes those errors are simply bots that have got hold of old URLs and keep requesting them. Sometimes they are cause for concern.
Hitting Control-C ends the siege, and you then get output like this:
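Eyeballing a long run for errors gets tedious, so here is a rough sketch for tallying the status codes afterwards. It assumes you captured siege's per-request output to a file and that each line looks like the sample above, with the status code in the second field and the URL last. Newer siege versions may add color codes or extra fields, so treat the field positions as assumptions. The function names are made up.

```shell
#!/bin/sh
# Sketch: summarize captured siege output read from stdin.

# Count responses per HTTP status code, one "code count" pair per line.
tally_status() {
    awk '$1 ~ /^HTTP\// { count[$2]++ } END { for (s in count) print s, count[s] }' | sort
}

# List status/URL pairs for responses with status 400 or above, most frequent first.
failing_urls() {
    awk '$1 ~ /^HTTP\// && $2 >= 400 { print $2, $NF }' | sort | uniq -c | sort -rn
}

# Usage (siege-output.txt is a hypothetical capture file):
# tally_status < siege-output.txt
# failing_urls < siege-output.txt
```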
Lifting the server siege... done.
Transactions: 125 hits
Availability: 88.03 %
Elapsed time: 29.18 secs
Data transferred: 1.18 MB
Response time: 0.41 secs
Transaction rate: 4.28 trans/sec
Throughput: 0.04 MB/sec
Concurrency: 1.77
Successful transactions: 101
Failed transactions: 17
Longest transaction: 3.13
Shortest transaction: 0.08
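As a sanity check on how to read that summary, the figures in this run are consistent with availability being transactions divided by transactions plus failed transactions, and the transaction rate being transactions divided by elapsed time. The sketch below just redoes that arithmetic with the numbers above. The function names are made up, and I am inferring the formulas from this one run rather than quoting siege's documentation.

```shell
#!/bin/sh
# Sketch: recompute two summary figures from the raw counts.

# availability TRANSACTIONS FAILED -> percentage with two decimals
availability() {
    awk -v t="$1" -v f="$2" 'BEGIN { printf "%.2f\n", 100 * t / (t + f) }'
}

# transaction_rate TRANSACTIONS ELAPSED_SECONDS -> transactions per second
transaction_rate() {
    awk -v t="$1" -v e="$2" 'BEGIN { printf "%.2f\n", t / e }'
}

availability 125 17          # prints 88.03
transaction_rate 125 29.18   # prints 4.28
```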
I also found it useful to watch top and a few other tools on the server while the siege was underway. In our case, I was interested in Passenger’s memory consumption, so I used passenger-memory-stats and passenger-status.
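If you would rather not sit and watch, something like the following can take periodic snapshots while the siege runs. It is a minimal sketch: sample_every is a made-up helper, and the interval, sample count, and log path in the usage line are arbitrary assumptions.

```shell
#!/bin/sh
# sample_every INTERVAL COUNT CMD...
# Runs CMD COUNT times, INTERVAL seconds apart, printing a timestamp
# header before each sample.
sample_every() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "=== $(date '+%H:%M:%S') ==="
        "$@"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Usage on the server (hypothetical log path): 30 samples, 10 seconds apart.
# sample_every 10 30 passenger-status >> /tmp/passenger-status.log
```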
Photo Credit: Martin Addison, Demonstrating the Trebuchet