bradleyboy :: the online home of Brad Daily

Lessons in Scaling: Divide and Conquer

We’ve seen solid growth in our Director hosting offering over the past few months, and with it came all the fun of scaling a new application that handles millions of requests a day. After finally moving the entire Director hosting platform to a dedicated server of its own and switching to the super fast nginx web server, we still saw issues during the busy midday hours. Content served to slideshows was working fairly well, but attempting to log in to Director was horrendously slow and, at times, impossible.

Our initial setup was to pass all PHP requests from nginx back to a single pool of PHP FCGIs running on high ports. We use the fair upstream module for nginx, which essentially passes each request to the least busy PHP backend. This worked well until things got really busy, when the aforementioned problem would start and my blood pressure would go through the roof.
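To make "least busy" concrete, here is a toy model in Python. It is only a sketch of the general idea, not the fair module's actual algorithm, and the class name, method names and ports are all made up for illustration.

```python
class LeastBusyPool:
    """Toy model of a backend pool that picks the least busy server."""

    def __init__(self, backends):
        # Map each backend address to its number of in-flight requests.
        self.in_flight = {backend: 0 for backend in backends}

    def acquire(self):
        # Pick the backend with the fewest in-flight requests
        # (ties go to the backend listed first).
        backend = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[backend] += 1
        return backend

    def release(self, backend):
        # Call when the backend has finished handling the request.
        self.in_flight[backend] -= 1


# Hypothetical ports, for illustration only.
pool = LeastBusyPool(["127.0.0.1:9001", "127.0.0.1:9002"])
first = pool.acquire()    # both idle, so the first listed backend wins
second = pool.acquire()   # the first backend is busy, so the second wins
pool.release(first)
third = pool.acquire()    # the first backend is idle again
```

Note that a scheme like this spreads load evenly but has no notion of request type, so one very common kind of request can occupy every backend in a shared pool.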

As always, tailing the log provided a useful wake-up call. As the requests flew by, I realized that 95% (maybe more) of the requests were for one of two files: images.php and p.php. These are the two files used by SlideShowPro to communicate with Director. One (images.php) sends back the XML file and the other (p.php) handles all image requests for Director’s on-demand image publishing. Most of the time, these files are simply serving caches back to the client, but because of the sheer number of requests they seemed to be hogging (for lack of a better term) all the PHP backends and not leaving much for the other PHP requests.
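If you would rather count than eyeball a scrolling log, something like the following Python sketch would do it. It assumes the log lines use nginx's default "combined" access log format; the function name and the sample lines are invented for illustration and are not taken from our real logs.

```python
import re
from collections import Counter

# Assumption: lines follow nginx's default "combined" access log format,
# where the request appears in quotes as: "METHOD /path?query HTTP/x.y".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def script_share(lines):
    """Return {script_name: fraction_of_requests} for the given log lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that do not look like a request
        path = match.group(1).split("?", 1)[0]  # drop the query string
        counts[path.rsplit("/", 1)[-1]] += 1    # keep only the file name
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# Made-up sample lines, for illustration only.
sample = [
    '10.0.0.1 - - [23/May/2008:12:00:00 +0000] "GET /images.php?album=1 HTTP/1.1" 200 512 "-" "-"',
    '10.0.0.2 - - [23/May/2008:12:00:01 +0000] "GET /p.php?a=abc HTTP/1.1" 200 2048 "-" "-"',
    '10.0.0.3 - - [23/May/2008:12:00:01 +0000] "GET /p.php?a=def HTTP/1.1" 200 2048 "-" "-"',
    '10.0.0.4 - - [23/May/2008:12:00:02 +0000] "POST /index.php HTTP/1.1" 302 0 "-" "-"',
]
shares = script_share(sample)
```

Feeding it the lines of a real access log (for example, an open file object) would give the share of traffic per script, which is the number that matters when deciding how to split the backends.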

The solution was to set up two different pools of PHP FCGIs, one for the images.php and p.php calls and one for everything else. With nginx, this was simple:

upstream main {
    fair;
    server   127.0.0.1:8100;
    # etc...
}

upstream workhorse {
    fair;
    server   127.0.0.1:8000;
    # etc...
}

So, now we have two separate backends to serve the different requests and we can add more servers to each as need be. Then, we use the nginx location and regex features to split up the requests:

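# images.php and p.php go to the dedicated pool. The leading "/" keeps
# other scripts whose names end in "p.php" (e.g. help.php) out of it.
location ~ /(images|p)\.php$ {
    fastcgi_pass   workhorse;
}

# Every other PHP request (note the escaped dot).
location ~ \.php$ {
    fastcgi_pass   main;
}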

Since nginx handles regex location directives in the order they appear in the configuration file, any images.php or p.php file will be captured by the first directive and sent to one of the “workhorse” backends. All other PHP requests are sent to the “main” backends.
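The first-match behaviour is easy to check outside of nginx. The Python sketch below is only a model of it: the patterns are modeled on the two location blocks above, with the leading slash and the escaped dot added as assumptions so that names such as help.php are not caught by accident, and the function name is made up.

```python
import re

# Ordered like the regex location blocks in an nginx config: first match wins.
ROUTES = [
    (re.compile(r"/(images|p)\.php$"), "workhorse"),
    (re.compile(r"\.php$"), "main"),
]


def pick_upstream(uri):
    """Return the upstream name for a URI path, or None if nothing matches."""
    for pattern, upstream in ROUTES:
        if pattern.search(uri):  # like nginx "~": unanchored, case-sensitive
            return upstream
    return None


# A few paths to try, hypothetical apart from images.php and p.php.
for uri in ["/images.php", "/director/p.php", "/index.php", "/help.php"]:
    print(uri, "->", pick_upstream(uri))
```

Swapping the two entries in ROUTES sends everything to "main", which is exactly why the order of the location blocks in the configuration file matters.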

The verdict? After days of struggling with uptime (including a few early morning Pingdom text messages), we are now enjoying solid uptime while still churning through approximately 4 million requests a day.

Written on Friday, May 23rd, 2008.
