WordPress Protips: Go With a Clustered Approach

Motivation

We have been using WordPress for some time now (6 years and counting), and though we’ve had a few bumps in the road, we are generally happy with it (except for the HTML editor, of course).

Back in the day we ran everything on a single server, but growing traffic, traffic spikes after big releases, and the unreliable nature of computer hardware pushed us towards a clustered environment instead. You know, the chaos monkey is always after you!

We made the switch about two years ago, and I think it makes sense to share the approach we took. With the LAMP stack there are a thousand ways to solve any problem, so why not document ours? Also, when you google for WordPress clustering you find plenty of results, but most are general discussions about clustering rather than concrete solutions. The best article we’ve come across so far is Clustering WordPress on Amazon EC2 micro instances by Ashley Croder.

Time to hack! (WordPress doesn’t automatically support clustering)

WordPress doesn’t support clustering out of the box, and there are parts of WordPress that freak out a little once it runs on multiple nodes. Apart from those features, the product itself is well suited for clustering – it’s stateless. Plugins are one more pain point, as they might freak out too. No worries, we’ll document the shortcomings and the solutions. For the cluster we chose nginx as the load balancer and Apache with PHP for the WordPress nodes.

We also chose to use sticky sessions, meaning that a visitor coming from a given remote IP is always answered by the same server in the cluster. We booted up one server for the load balancer (let’s call it LB) and two servers for the WordPress nodes, let’s call them server-1 and server-2. We also booted up an RDS instance (Amazon’s hosted MySQL), which became the shared database for the servers. We configured WordPress on server-1 and server-2 and pointed their database connections at the RDS server. Something like this below:

Architecture
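
The database part of the wiring is plain WordPress configuration: both nodes simply point wp-config.php at the RDS endpoint instead of a local MySQL. A minimal sketch – the endpoint and credentials below are placeholders, not our real values:

<?php
// wp-config.php on server-1 and server-2 (hostname and credentials are placeholders)
define('DB_NAME',     'wordpress');
define('DB_USER',     'wp_user');
define('DB_PASSWORD', 'secret');
// Both nodes talk to the same RDS instance, so they share all posts, pages and settings
define('DB_HOST',     'my-server-db.abc123xyz.eu-west-1.rds.amazonaws.com');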

Technical Details

Let’s start with the configuration of the load balancer. First we defined the two servers and made the sessions sticky (see ip_hash). Together these servers form the upstream servers-frontend.

upstream servers-frontend {
      ip_hash;
      server 10.10.137.100:80; # server-1
      server 10.10.126.101:80; # server-2
}

Secondly we defined the load balancer itself, which proxies requests to servers-frontend.

server {
    listen 80;
    server_name my-server.com www.my-server.com;
 
    error_page 502 503 504 @maintenance;
 
    location / {
        proxy_pass http://servers-frontend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP $remote_addr;
    }
 
    location @maintenance {
        root /etc/nginx/html;
        rewrite ^(.*) /maintenance.html break;
    }
}

HTTPS-only Administration

All WordPress content is managed (i.e. pages/posts created and edited) through a special URL, /wp-admin, where you have to enter your username and password – you don’t want to do that over HTTP, you want HTTPS! There are a couple of plugins that claim to force HTTPS on /wp-admin, but we didn’t find any that actually worked.

Our solution lives in the load balancer: we define a new server block in the configuration file that redirects all /wp-admin requests over HTTPS to one special node, in our case server-2. All other content is served via HTTP. Some plugins and resources under /wp-admin also need access via HTTP, so we added an extra section to the first server block as well. A sketch of the idea is shown below; for the complete nginx configuration file see the configuration gist.
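
The gist is the authoritative version, but the shape of the solution looks roughly like this: a redirect for the admin URLs in the plain-HTTP server block, plus a second server block that terminates TLS and proxies everything to server-2. The certificate paths are placeholders, and the HTTP-only exceptions mentioned above are omitted for brevity.

# Added to the existing port-80 server block: push the login page and dashboard to HTTPS
location ~ ^/(wp-login\.php|wp-admin) {
    return 301 https://$host$request_uri;
}

# New HTTPS server block that sends all admin traffic to server-2 only
server {
    listen 443 ssl;
    server_name my-server.com www.my-server.com;

    ssl_certificate     /etc/nginx/ssl/my-server.com.crt;
    ssl_certificate_key /etc/nginx/ssl/my-server.com.key;

    location / {
        proxy_pass http://10.10.126.101:80; # server-2
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto https;
    }
}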

Handling WordPress file uploads

WordPress lets you upload media files like pictures and videos and build galleries from them. These files are stored on the server’s filesystem, which gets tricky once you are running multiple servers. Our solution is to use a third-party service, Amazon S3, to store the media. At the same time, we use the load balancer to always route uploads to one specific node: one node is more important than the others, and all of /wp-admin is served from it. This master node has a cron job that uploads our media to S3, while all other nodes have a cron job that downloads it. On the master node we have:

# Enable this for master nodes
*/3 * * * * /usr/bin/s3cmd --config=/etc/s3/s3.conf sync /var/www/my-server.com/public_html/wp-content/uploads/* s3://my-server-com-wp-uploads >> /var/log/s3cmd.log

and the other nodes have:

*/3 * * * * /usr/bin/s3cmd --config=/etc/s3/s3.conf sync s3://my-server-com-wp-uploads /var/www/my-server.com/public_html/wp-content/uploads/ >> /var/log/s3cmd.log

This solution has its pros and cons. The main pro is that you can boot up several non-master WordPress nodes, which fetch the images and are up to date fairly quickly after booting. The main con is that when editing a piece there is a delay before an image is actually replicated to all the nodes. Good cron timings help here, especially when the Marketing Droids start moaning that the image is invisible to visitors.

Upgrading WordPress and Plugins

WordPress is actively developed and sees a new release every now and then; with a multi-node setup, however, upgrading is not that easy anymore. Our solution for WordPress and its plugins is to keep all the source code under version control in a Mercurial repository.

We run the upgrade on one node, the master, and then commit and push the changes to the central repository. The other nodes fetch the code from the central repository. In reality we have more servers than just two: we also have development servers where we test our own custom development before it goes to the production system. We do all WordPress upgrades on the development servers, and once we mark a change-set as safe, the production nodes pull the code with Mercurial. A rough sketch of the flow is shown below.
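
As an illustration only – assuming the WordPress docroot is a Mercurial working copy, with the repository URL and paths below as placeholders – the flow looks something like this:

# On the development/master node, after running the upgrade through wp-admin:
cd /var/www/my-server.com/public_html
hg addremove                       # pick up files the upgrade added or removed
hg commit -m "Upgrade WordPress core and plugins"
hg push ssh://hg@hg.internal/wordpress

# On each production node, once the change-set has been marked safe:
cd /var/www/my-server.com/public_html
hg pull ssh://hg@hg.internal/wordpress
hg update --clean                  # update the working copy to the approved change-set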

Conclusions

Running WordPress in a cluster is not straightforward, and there are certain aspects you have to think about before doing it. The main issues we had were:

  • media uploading
  • updating WordPress and our plugins
  • HTTPS editing

Media uploading can be handled by syncing with a third-party location – we use Amazon S3 to synchronise our media. WordPress and plugin upgrades can also be synced and distributed, which is where we chose Mercurial as the distribution mechanism instead of S3. HTTPS editing, on the other hand, was achieved with some load balancer configuration. Of course there are a thousand ways to approach these problems. If you have neat solutions for media uploads, WordPress upgrades, or HTTPS, do let us know in the comments section below!

  • http://twitter.com/kenny Kenneth Younger III

    Are you using the default wp cron, or have you switched that to use crontab as well?

  • http://www.zeroturnaround.com/ Toomas Römer

    We are moving to the crontab.

  • Andrus Viik

    drbd in active-active mode with ocfs2 topping is worth giving a try

  • http://www.zeroturnaround.com/ Toomas Römer

    But probably not available in AWS?

  • Andrus Viik

    For starters, check if drbd module is available:
    sudo modprobe -v drbd && echo GOOD TO GO!
    If getting “FATAL: Module drbd not found.” the kernel module is missing (For ubuntu there’s a package called linux-image-extra-virtual or smth).
    If module loaded fine follow http://www.drbd.org/users-guide-8.3/ch-ocfs2.html.