Caching is a technology that improves website performance, making pages load faster than they would from a traditional server without caching. When used correctly and matched to the application's needs, caching reduces the load on the server, which results in lower server resource consumption.
Some Background on the Need for Caching
RAM is an important resource on any server. A few years ago, companies whose websites handled a large volume of transactions daily, or even hourly, spent considerable time and money purchasing RAM to support the number of users visiting their websites. And believe me when I say lots of users: a single website could be serving 10K user requests per minute.
The e-commerce business model says that if I want to sell my product over the internet, I have to provide my users with a good-looking website that loads pages and processes transactions quickly. Nobody wants to wait for a website that takes ages to load its data.
So, if I were the owner of any such big company (e.g., Walmart, Amazon, or Flipkart), my only option was to spend lots of money buying RAM and CPUs for the server on which my website is hosted, so it loads quickly and keeps users on my website. Because ultimately what I want is for millions of users to visit my website, buy products, complete lots of transactions, and provide me with a profit.
Living in Reality
At this point, what I am doing is spending lots of money to buy hardware for my server (RAM/CPUs) which, in turn, is supporting the thousands of users on my website. Because I’m spending so much to help the website carry that load, the overall profit margin for me is not that much. This is where caching technology comes into the picture, creates a better profit margin for me, and increases the reliability of my website for my millions of users.
How Does Caching Help?
First, let’s understand the basic role of an operating system (OS). The OS does all the calculations for any process running on it using RAM, the disk, and the CPU. In a normal scenario, when we make any request through a web browser, it goes to the server on which the website we are accessing is hosted. The server OS goes to the disk, gathers the data, and sends it back to the browser for us to see.
Now suppose there are millions of such requests coming to the server because of the website's popularity and large audience. Imagine what would happen to a server that is constantly going to and fro to fetch data from the disk. Any guesses? If you think the server will go down, you are right!
The reason for this would be high CPU utilization combined with high network consumption. If I want my server not to go down, what option would I have? In normal circumstances, I might buy additional hardware like another server, additional RAM, and an additional CPU to support the number of website users.
This is where caching comes in to save the day. I simply introduce a caching mechanism on the server, which caches (stores) the responses to incoming requests in an organized way, reducing the number of to-and-fro data fetch operations to the disk by the OS. The only requests that now go to the disk through the OS are those whose data is not already present in the caching database. This reduces CPU usage and network consumption, and ultimately makes the application hosted on my server respond much faster to end users.

On the hardware side, using a caching tool on your server also saves RAM, reducing the amount you have to purchase and the total cost of hosting your servers. You also gain application stability by increasing server uptime. As a result, a large number of users can purchase products through your website and help you achieve your targets.
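The mechanism described above is commonly called the cache-aside pattern: check the cache first, and only go to the disk (or database) on a miss. Here is a minimal sketch in Python. All names here (DISK, read_from_disk, the product key) are illustrative, and a plain dict stands in for a real caching service such as Redis or Memcached.

```python
import time

# Simulated backing store ("the disk"): slow to read from.
DISK = {"product:42": "Blue widget, $19.99"}

def read_from_disk(key):
    """Simulate a slow disk/database read."""
    time.sleep(0.05)
    return DISK.get(key)

# The cache itself: a plain dict here; in production this would be
# Redis or Memcached running as a separate service.
cache = {}

def get(key):
    """Cache-aside lookup: serve from the cache when possible,
    otherwise fetch from disk and populate the cache."""
    if key in cache:
        return cache[key], "hit"
    value = read_from_disk(key)
    cache[key] = value
    return value, "miss"

value1, status1 = get("product:42")   # first request goes to disk
value2, status2 = get("product:42")   # repeat request is served from memory
print(status1, status2)               # miss hit
```

The first lookup pays the slow disk cost once; every subsequent request for the same key is answered from memory, which is exactly how the server's CPU and I/O load drops under real traffic.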
Caching in Action
We at Perficient Linux Hosting, based in Nagpur, India, use the Redis and Memcached caching tools, along with Varnish (if needed), to meet our clients' requirements. Each of our client projects has caching enabled for their websites, and the difference it makes in terms of website speed and overall performance is clearly visible, both when we simply surf the website and when we analyze it with tools. On one particular project, we took the caching mechanism to the next level.
This client had complex architecture requirements. The project included 12 e-commerce sites for different countries (.com, .uk, .br, etc.). Each online store had two web servers and a single database server, and the client was expecting a high volume of traffic in the near future. The challenge in front of us was to handle that traffic with the same set of available servers instead of purchasing new hardware. Given that we had two web servers and one database server for each site, we used Redis in a master/slave replication setup along with HAProxy, a tool we use to balance traffic between a master node and a slave node. We set it up so that one of the two servers acted as the master Redis node and the other as the slave node, with connections to both handled and monitored by HAProxy, which was installed on the database server.
The way this architecture works is that when cached data is needed, HAProxy retrieves it from the master node while, in the background, the cached data is replicated from the master to the slave node. This means that if the master Redis node fails (because of high traffic, load, network congestion, etc.), HAProxy can redirect traffic to the slave node until the master comes back up (either through manual intervention or on its own, after the load on the server returns to normal).
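The failover behavior described above can be sketched as an HAProxy configuration fragment. This is only an illustrative sketch, not the client's actual configuration: the listener port, server names, and IP addresses are assumptions. HAProxy health-checks each Redis node with a PING, sends traffic to the master, and falls back to the slave (marked as backup) when the master stops responding.

```
listen redis
    bind *:6379
    mode tcp
    option tcp-check
    # Health check: a live Redis node answers PING with +PONG.
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    # Primary target; checked every second.
    server redis-master 10.0.0.11:6379 check inter 1s
    # Slave receives traffic only while the master is down.
    server redis-slave  10.0.0.12:6379 check inter 1s backup
```

Note that HAProxy only redirects traffic here; promoting the slave to a writable master is a separate step, done manually or with a tool such as Redis Sentinel.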
This architecture enabled the websites to handle the high traffic the client was expecting and helped our client to exceed its yearly revenue targets for two consecutive years.
This Redis and HAProxy architecture was also the first of its kind for the Perficient Linux hosting team, and we are still providing hosting and support services to that client.
A happy customer? What else could a hosting team want!?