While developing for high-traffic projects I’ve encountered a common problem not too far into the planning process: how do you do application farming and failover in a simple way?
First of all, I’ll make clear what I mean by farming and failover. Farming, in my mind, means a single entrance point through which users access the service: a domain name that hosts behind itself a “bunch of servers”, “instances” or “resources” that respond to requests in a transparent manner. Failover is the action taken when any of those “resources” fails to respond, by which load is redistributed automagically (auto and magically).
Both of these features are offered by a lot of service providers and software applications, and you always have the option of building one yourself. Since I’ve found paid solutions too expensive and software apps too complicated, I was on the verge of writing my own load+fail balancer… until I started digging into nginx.
If you have been working in web programming and content delivery for the last 5 years you have probably heard about nginx. In my experience it was “like Apache, but faster”, and at the moment when I was ready to try it out (2008) it seemed a bit green or obscure on some features (e.g. geolocation).
On this occasion I was explicitly looking for the terms “nodejs redis architecture” or something like that, when I ran into this great post by the Arg! Team (creators of sillyfacesociety.com).
That post covers the how-to of hosting several node instances behind an nginx web server. In this scenario nginx takes the requests from users, passes them on to the node.js apps, and delivers the responses back.
The following is an example of the configuration I’ve used for my app, very similar to the one from the Arg! Team, but removing a lot that I didn’t need and adding some failover features that are critical in my case. Since my app is a tracker that gets requested without users asking for it, it always needs to return an image; there is no scenario in the plan where the server responds with an error.
The comments on each line explain its purpose. For more details about these nginx configuration statements I recommend looking at the very readable documentation.
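A minimal sketch of that kind of configuration, assuming three local node instances on ports 3000–3002 and a hypothetical failover host (the domain names and ports here are placeholders, not my real setup):

```nginx
http {
    # Pool of node.js instances; by default nginx round-robins between them.
    # After max_fails failed attempts a server is marked down for fail_timeout.
    upstream my_upstream {
        server 127.0.0.1:3000 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:3001 max_fails=3 fail_timeout=30s;
        server 127.0.0.1:3002 backup;   # only tried when the others are down
    }

    server {
        listen 80;
        server_name tracker.example.com;   # hypothetical domain

        location / {
            proxy_pass http://my_upstream;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;

            # On an error or timeout from one instance, retry the next one.
            proxy_next_upstream error timeout http_500 http_502 http_503;

            # Let nginx handle upstream 50x responses instead of passing them on.
            proxy_intercept_errors on;
            error_page 500 502 503 504 = @failover;
        }

        # Last resort: redirect the user to a failover host.
        location @failover {
            return 302 http://failover.example.com$request_uri;
        }
    }
}
```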
Amazing, right? So this happens:
- User requests the file /resource.html
- Nginx takes the request and passes it on to the proxied list defined as “my_upstream”.
Here a few scenarios can take place:
- It’s the 1st server’s turn to take the request; it takes it and answers back. All good.
- It’s the 2nd server’s turn to take the request; it takes it and answers back. All good.
- Servers 1 and 2 are down, so server 3 takes the request and answers back. All good.
- None of the servers is responsive, so a 50x error is generated; nginx intercepts this and answers with a redirect to the failover host.
For me, this means “sleeping peacefully”. Combined with an alert-and-restart system for node and a log file for nginx, you can be sure that no request information will be lost :)
Still to find out and develop:
- How do I log requests to a file when nginx goes to the failover URL (50x error)?
- Try out different load-balancing methods. I would be interested to hear from anyone experienced in this area who has tried what nginx has to offer.
- Nginx’s minimal server specs for it to run smoothly.
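On the load-balancing point above: besides the default round-robin, nginx ships with a couple of alternative methods that can be dropped into the upstream block. A sketch from the docs (I haven’t benchmarked these myself):

```nginx
upstream my_upstream {
    # Default is round-robin. Alternatives (pick at most one):
    ip_hash;        # pin each client IP to the same instance (sticky sessions)
    # least_conn;   # prefer the instance with the fewest active connections

    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}
```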