Looking for the snappiest, fastest web server software available on this here internet? So were we. Valid, independent, non-synthetic benchmarks can be difficult to find. Of course, we all know that benchmarks don’t tell us everything we need to know about real-world performance; but what’s the fun of having choices if we can’t pick the best?
Exactly. I decided to do a little research project of my own.
For this exercise I selected recent (as of October 2011) versions of Apache, Nginx, Lighttpd, G-WAN, and IIS — a list that includes the most popular web servers as well as web servers that came recommended for their ability to rapidly serve static content. The test machine was a modern quad-core workstation running CentOS 6.0. For the IIS tests I booted the same machine off a different hard drive running Windows Server 2008 SP2.
Each web server was left in its default configuration to the greatest extent practical — I was willing, for example, to increase Apache’s default connection limit, and to configure Lighttpd to use all four processor cores available on the test machine. I’m willing to revisit any of these tests if I discover that there is a commonly used optimization that for some reason is disabled by default, but these are all mature software packages that should perform well out of the box.
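For illustration, the two departures from the defaults mentioned above look roughly like this. The directive names are real (for the versions tested), but the values shown are illustrative, not a record of the exact settings used in these tests:

```
# lighttpd.conf: spread the load across all four cores
# (lighttpd defaults to a single worker process)
server.max-worker = 4

# httpd.conf (Apache 2.2, prefork MPM): raise the default connection ceiling
ServerLimit  512
MaxClients   512
```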
The results in this article describe the behavior of heavily trafficked web servers delivering static content. I believe that performance metrics of this nature can be useful for understanding the behavior of systems under load, but we should remember that real-world systems are complex and that performance can be constrained by many variables other than the efficiency of any particular software package. In particular, these numbers will not necessarily be meaningful for systems that will be performing functions related to dynamic content, such as maintaining session state or performing searches.
For the realistic test, I downloaded a snapshot of Google’s front landing page and converted it into a very small static site. I chose this page because it represents a real, minimalistic page with a variety of resource sizes and types. Very few organizations offer a slimmer first impression.
Apache, Nginx, Lighttpd, G-WAN and IIS all comfortably saturated the test network at roughly 930 megabits per second (after TCP overhead), but some variation in efficiency at near-saturation is still apparent:
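For reference, that ~930 megabits per second is close to the theoretical ceiling for TCP over gigabit Ethernet. A quick back-of-the-envelope calculation (my own sanity check, not part of the original test harness), assuming standard 1500-byte MTU frames:

```python
# Theoretical TCP goodput over gigabit Ethernet with standard framing.
MTU = 1500                            # bytes of IP payload per Ethernet frame
TCP_IP_HEADERS = 20 + 20              # TCP header + IP header, no options
ETHERNET_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + inter-frame gap

payload = MTU - TCP_IP_HEADERS        # 1460 bytes of application data per frame
wire = MTU + ETHERNET_OVERHEAD        # 1538 bytes occupied on the wire
goodput_mbps = 1000 * payload / wire

print(f"{goodput_mbps:.0f} Mbit/s")   # about 949 Mbit/s, the best TCP can do
```

The observed ~930 Mbit/s, a few percent under that ceiling, is about what you would expect once ACK traffic and imperfect frame packing are accounted for.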
A lower number here implies reduced hardware and energy costs, and also represents excess capacity that could presumably be used for other tasks.
(G-WAN, despite proving a snappy little web server in the benchmark test, eagerly consumed 50% of available CPU resources almost immediately upon receiving requests. As nearly as I could tell given the time I had available, G-WAN uses a different process model from other server software and CPU utilization is not necessarily proportional to load.)
For the benchmark test, I decreased the size of the resources to one kilobyte and increased the number of resources loaded per connection from 5 to 25. The former was a deliberate adjustment of a test variable, but the latter was simply an operational necessity: opening and closing TCP connections too rapidly burns a surprising amount of CPU power, tests the operating system’s TCP implementation as much as, if not more than, the web server, and beyond a certain threshold simply isn’t permitted by the client-side OS, even with tuning.
The goal of the benchmark test was to measure the maximum throughput of each server in requests per second. Each test was considered complete when throughput stopped increasing with respect to load.
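That stopping rule can be sketched as a simple ramp loop. Here `measure_rps` is a hypothetical stand-in for whatever load generator drives the test; this is the shape of the procedure, not the actual test harness:

```python
def find_peak_rps(measure_rps, start=50, step=50, tolerance=0.01):
    """Ramp offered load until throughput stops increasing.

    measure_rps(concurrency) -> requests/sec is a stand-in for a real
    load generator; `tolerance` is the minimum relative gain that still
    counts as "increasing". Returns the plateau throughput.
    """
    best = measure_rps(start)
    concurrency = start
    while True:
        concurrency += step
        rps = measure_rps(concurrency)
        if rps <= best * (1 + tolerance):  # throughput has plateaued
            return best
        best = rps
```

With a synthetic load generator that tops out at 5,000 requests per second, `find_peak_rps` ramps up in steps of 50 concurrent clients and reports that plateau once an additional step no longer helps.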
I also monitored average response times; note that these are average round-trips for a single resource, running over 1 or 2 switches (varying due to the use of multiple load generating engines), such that all systems are within about 50 feet of each other. The measurement is a running average taken during a sampling window between the second and third minutes into each test, when the system would be under only modest load (about 500 to 750 requests per second, actual).
The ability to respond to a request in less than 3 milliseconds is certainly an impressive technological feat. But to put these numbers in perspective, let’s look at how they would affect a typical use case. Let’s assume that the user has a 100 millisecond round-trip (ping) time to your server, that we will need at most 3 sequential round trips (which is about right when loading a typical page with 12-18 resources), and that we need one additional round trip for the SYN/ACK to establish the first TCP connection. Based on these assumptions, I calculated the simulated page load times below. The red bar represents the 1.5 second mark, a page duration goal that Google, in their wisdom, has deemed the dividing line between “fast” and “slow” web pages in their own analysis software.
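The arithmetic behind those simulated load times is simple enough to write down. This is my reconstruction of the model under the stated assumptions, not the original spreadsheet:

```python
rtt = 0.100                  # user-to-server round-trip time, seconds
sequential_round_trips = 3   # resource-fetch rounds for a typical page
handshake_round_trips = 1    # TCP SYN/SYN-ACK before the first request

def page_load(server_response_time):
    """Total time: one handshake round trip plus three request rounds,
    each costing one network RTT plus the server's response time."""
    rounds = handshake_round_trips + sequential_round_trips
    return rounds * rtt + sequential_round_trips * server_response_time

print(f"{page_load(0.003) * 1000:.0f} ms")  # a 3 ms server: ~409 ms total
```

The point the model makes: at a 100 ms ping, the four round trips contribute 400 ms no matter what, so shaving a millisecond off server response time moves the total by only three milliseconds.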
If you’re interested in round-trip times, how they affect the performance of your system, and how to optimize them away, Google has a comprehensive article on the subject.
This data comes with a number of qualifications.
For common use cases with lots of static content, such as a corporate front landing page, a 1 gigabit network card will become saturated long before any other resource. The larger your pages (including all resources, such as CSS and image files) in kilobytes, the more this advice rings true.
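To make that concrete, here is a rough saturation calculation of my own, assuming the ~930 Mbit/s payload ceiling observed above (the page sizes are illustrative):

```python
def pages_per_second_at_saturation(page_kb, link_mbps=930):
    """How many complete pages per second a link can carry before the
    NIC, rather than the web server, becomes the bottleneck."""
    page_bits = page_kb * 1024 * 8
    return link_mbps * 1_000_000 / page_bits

# A lean 100 KB page (HTML plus all CSS and image resources) saturates
# gigabit at roughly 1,135 pages/s; a heavier 500 KB page at only ~227.
print(round(pages_per_second_at_saturation(100)))  # 1135
print(round(pages_per_second_at_saturation(500)))  # 227
```

Those ceilings are typically far below the request rates a tuned static server can sustain, which is why the network card, not the software, tends to give out first.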
Request-per-second numbers matter for systems that serve large numbers of small files. Such systems should be unusual. For example, if you are delivering pages with many small user interface elements, each requiring its own image resource, consider using CSS sprites to combine those resources into one.
As for the simulated page durations, moving your servers closer to your end users is the best way to reduce those numbers. This is where content delivery networks make a strong showing.
These measurements don’t describe the behavior of a system populated with many gigabytes of unique files that are accessed with uniform frequency. Such systems can’t keep all content in cache and can be limited by hard disk speed.
Finally, static content delivery scales almost without limit. No matter how inefficient the system, we can always increase capacity by adding more hardware, and the cost of doing so needs to be weighed carefully against the effort of maintaining an alternative software configuration.
Each test revealed IIS 7 as a clear frontrunner.
IIS administrators may give themselves a big pat on the back and feel free to stop reading now. Our beloved Linux server administrators, however, will need to settle their priorities.
Lighttpd is the platform of choice if you want a lean and fast environment. Its low CPU utilization should help reduce energy and hardware costs, and its response times are within a millisecond of the frontrunner.
G-WAN is marketed on its ability to push requests per second, and it unquestionably delivers, but based on these tests it may not have the all-around efficiency that would recommend it for general-purpose use.
Apache, arguably the most popular software package on the list, yielded the worst results, while Nginx certainly deserves an honorable mention. Oh, but feel free to continue using any web server that suits your organization, as long as it suits you to run something other than . . . the Fastest Webserver.