Comparing Distributed Caching Solutions

Preamble

Over the past week or so I’ve been researching possible caching solutions to use with a .NET web service. The existing caching strategy stored cached data in a SQL database so that search results did not have to be pulled from web queries every time a user performed a search. This certainly sped up the call, but it put a heavy load on the database server during peak times, which proved to be a problem. It was therefore time to find a new strategy that caches in memory instead of reading from and writing to disk.

Throughout my research I looked into three of the more well-known caching solutions: Microsoft’s AppFabric Caching, Alachisoft’s NCache Express, and Memcached, an open-source caching solution. In evaluating them I had two specific requirements in mind. First and foremost, we wanted an easily deployable distributed caching solution. This meant being able to easily set up multiple caching servers, and to add or remove servers with minimal disruption to the availability of the caching service. Second, we wanted a two-tiered expiry mechanism, in which each object has both a sliding expiry and an absolute expiry. The sliding expiry would keep memory clear of infrequently used data (by setting a sliding expiry of about 5 minutes, for example), and the absolute expiry would keep the cache clear of stale data that has been there for a few days. In this document I will describe the pros and cons that I found with each of these solutions.
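To make the two-tiered expiry requirement concrete, here is a minimal Python sketch of the behaviour we were after. It is not taken from any of the three products; the class name `TwoTierCache`, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class TwoTierCache:
    """Toy in-memory cache where each entry has a sliding and an absolute expiry."""

    def __init__(self, sliding_seconds, absolute_seconds, clock=time.monotonic):
        self.sliding_seconds = sliding_seconds
        self.absolute_seconds = absolute_seconds
        self.clock = clock  # injectable so the behaviour can be checked without sleeping
        self._entries = {}  # key -> (value, sliding_deadline, absolute_deadline)

    def put(self, key, value):
        now = self.clock()
        self._entries[key] = (
            value,
            now + self.sliding_seconds,
            now + self.absolute_seconds,
        )

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, sliding_deadline, absolute_deadline = entry
        now = self.clock()
        if now >= sliding_deadline or now >= absolute_deadline:
            # Either the item went unused for too long, or it is simply too old.
            del self._entries[key]
            return None
        # A hit pushes the sliding deadline forward; the absolute deadline never moves.
        self._entries[key] = (value, now + self.sliding_seconds, absolute_deadline)
        return value
```

With a sliding expiry of 300 seconds and an absolute expiry of a few days, an item that is read every couple of minutes stays cached until the absolute deadline, while an item nobody reads disappears after five minutes.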

AppFabric Caching

AppFabric Caching is the evolution of a Microsoft project codenamed “Velocity”, and was released as part of Windows Server AppFabric, which also includes a separate hosting solution for web applications. Each part works as a stand-alone solution, so I concentrated on experimenting with the Caching component.

There are three components to AppFabric Caching: Caching Services, Cache Client, and Cache Administration. Caching Services is the component that you need to install on each server that you would like to have as a cache host. In order to manage your cache hosts you need to install the Cache Administration component, which is a PowerShell module that handles all of the administrative duties of adding and removing cache hosts from a cache cluster. Finally, the Cache Client component provides all of the libraries needed to access the cache’s data from an application. The only thing you need to plan before configuring your cluster is where the cluster configuration will be held. With AppFabric you have two options out of the box: (1) an XML file in a shared folder that all of your cache hosts can access, or (2) a SQL database that holds all of the configuration settings.

Configuring a cache cluster with a single node was fairly straightforward. After a little bit of research I was able to set up a cache cluster on my local box, with the configuration saved in a database on an existing SQL server, and have a test application up and running in just a couple of hours. The configuration wizard takes you through all of the steps needed to build a working cluster right out of the box. Once the setup is finished, it is simply a matter of running a few PowerShell commands to start up the cluster and you’re all set. The AppFabric team has also released a package of samples that you can use to test that the cluster is working. However, these samples were created in Visual Studio 2010, so you’ll need to download the Express version if you want to run them. Adding a second cache host to the cluster is just as straightforward. It’s simply a matter of installing the caching services on the second host, choosing the existing configuration location, and starting the new cache host through a PowerShell command.

AppFabric Setup Screen

This is an example of what the settings should look like when you're creating your first node.

I did, however, have a few problems when I first tried to set up a second host. On one particular box I downloaded the specified Windows binary, went through the configuration wizard (with the same settings as on my local box), and received an error when it tried to register the cache host. The error complained about certain “identity references” not being translatable. Searching for the error online turned up only two solutions, both of which involved dropping the box from the domain and re-adding it, which was not something we were able, or willing, to do. At first I thought that the error had something to do with a mismatch of architectures between my local box and the box generating the error. However, applying the same configuration to another, identical box worked just fine. Since this appeared to be a bug with no easy resolution, I decided to move on.

As far as caching functionality goes, it gets the job done. It supports all of the major functions that you usually get with a caching solution: you can get existing items, add new items, remove items, and update existing items. With each item added to the cache you can specify an expiry time based on a TimeSpan object. Although sliding expiry is not built into the API, you can implement it yourself using the ResetObjectTimeout method, which “touches” an object and extends its timeout. As for eviction, it employs a Least Recently Used policy, which removes the least recently used items first once the cache memory quota has been reached.
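For readers unfamiliar with Least Recently Used eviction, the idea can be sketched in a few lines of Python. This is a toy illustration, not AppFabric’s implementation: the `LruCache` class is hypothetical, and it uses a simple item-count quota where a real cache host would track memory.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache: the least recently used key is evicted first."""

    def __init__(self, max_items):
        self.max_items = max_items  # stand-in for a memory quota (assumption)
        self._items = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read marks the key as recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)  # an update also counts as a use
        self._items[key] = value
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # drop the least recently used key

    def keys(self):
        # Least recently used first, most recently used last.
        return list(self._items)
```

The useful property is that frequently read items protect themselves: every read moves the item away from the eviction end of the queue, so no manual bookkeeping is needed to decide what to drop.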

For performance monitoring, AppFabric provides a set of performance counters that can be viewed through the Performance Monitor in Windows. These include total data count, total object count, total client requests, total misses, and so on, so that you can get a general idea of how the cache is performing.

Important Points

  • Relatively easy to set up.
  • Administrative duties are done through a PowerShell module, although a GUI-based administration tool is also available.
  • Supports small clusters with 1 to 5 machines, as well as large clusters with 15+ machines.
  • Sliding expiry is implemented through an extra API call which extends an object’s expiry time.
  • Supports Least-Recently Used eviction policy.

NCache Express

NCache is a well-established commercial distributed caching solution by Alachisoft. As such, you know that you’re getting a product that is proven in production and well supported. However, in order to use the Professional or Enterprise releases you need to pay a pricey licensing fee. Because of this I was limited to doing my research with the Express version that they provide.

Unfortunately, there are two crippling limitations with the Express version of the software. The first is that your choice of cache cluster topology is limited to local caches and two-node replicated caches. This means that you can either have a single cache node on your application server, or a distributed cache with two replicated nodes in it. With a two-node replicated cache, any data that goes into it is replicated on both servers, so the second node adds redundancy but no extra capacity. On top of this, the maximum RAM that you can allocate to a single node in the cache cluster is 500 MB. This puts a hard cap on the scalability of your cache.

The second limitation of NCache Express is that it does not support the Least-Recently Used eviction policy. Instead, it uses priority-based eviction, in which NCache evicts objects according to one of 5 priority levels that YOU assign to each object when you add it to the cache. In practice this gives you very little control over what gets evicted first, because if your data cannot be broken cleanly into 5 priority categories then there is no smart way to prioritize the eviction order.
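A rough Python sketch shows why this is awkward for data like cached search results. It is an illustration only, not NCache’s API: the `PriorityCache` class, the numeric levels 1 to 5, the item-count quota, and the oldest-first tie-break between equal priorities are all assumptions.

```python
class PriorityCache:
    """Toy cache that evicts the lowest caller-assigned priority first."""

    def __init__(self, max_items):
        self.max_items = max_items  # stand-in for a memory quota (assumption)
        self._items = {}  # key -> (priority, insertion_sequence, value)
        self._sequence = 0

    def put(self, key, value, priority):
        if not 1 <= priority <= 5:
            raise ValueError("priority must be between 1 and 5")
        self._sequence += 1
        self._items[key] = (priority, self._sequence, value)
        while len(self._items) > self.max_items:
            # Victim: lowest priority, then oldest insertion (assumed tie-break).
            victim = min(self._items, key=lambda k: self._items[k][:2])
            del self._items[victim]

    def get(self, key):
        # Reads do not influence eviction order in this scheme.
        entry = self._items.get(key)
        return None if entry is None else entry[2]
```

Because reads never change the eviction order, an item that is requested constantly is still evicted ahead of an idle item that happened to be given a higher priority. When every search result is equally important, all items end up at the same level and the priority carries no useful information.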

On the plus side, NCache Express does support sliding expiry in combination with absolute expiry. However, given the two limitations mentioned above, it would be very hard to scale the cache to accommodate peak loads, so it really doesn’t suit our needs.

Important Points

  • Very easy to set up a single node.
  • Administrative duties can only be done through the command prompt.
  • Express version only allows you to create a single-node local cache or a 2-node replicated cache.
  • The maximum RAM that you can allocate to a node is 500 MB.
  • Supports sliding expiry coupled with an absolute expiry.
  • Express version does not support the Least-Recently Used eviction policy.
  • Professional and Enterprise releases are pricey.

Memcached/Membase

Memcached is an open-source caching solution written in C. In its raw form it is a simple key/value storage system with a built-in protocol used to access the data. Given that it’s open-source, there are a number of clients, built by various companies for various application platforms, that hook into the Memcached protocol. These clients are available in many popular languages, such as C/C++, C#, Java, PHP, and many more. This makes Memcached a very versatile solution for pretty much any development situation. Its user base is also impressive, with Wikipedia, YouTube, and Twitter topping the list.

One company even created a commercial product built around Memcached, called Membase. It has Memcached built right in, and it provides a rich GUI for managing the cache cluster and monitoring its performance. Given how easy this software was to use, I used it to configure everything about my cache cluster, so Membase will be the main subject for the rest of this section.

Membase Dashboard

This is a screenshot of what the Membase dashboard looks like.

To set up the cache cluster you simply choose one server, install Membase, and then follow the setup wizard to create a new cluster. The first thing the wizard asks is whether you want to create a Membase server or a Memcached server. The main difference between the two is that a Membase server supports data replication and persistence, whereas a Memcached server does not. I haven’t experimented much with the Membase server, so I won’t cover its configuration in this document. If you choose to create a Memcached server, you will simply be asked for a RAM quota for the server, which is the maximum space that your cache will allocate for data. After this, you set up an administrator password for the console, and you’re done!

One thing to keep in mind with Membase is that the memory quota you specify for your initial cache host is the quota that every additional cache host must have in order to join that cluster. To add another node to your cluster, you install Membase on another server and choose “Join an Existing Cluster” during configuration. It will then ask you for the administrator password and the IP of one of the cache hosts, and that’s it!

In order to access this cache cluster you need to download one of the existing client libraries for Membase (or for Memcached) and include it in your application. I chose the Enyim Memcached client for .NET, which implements all of the functionality that the Memcached protocol provides. You can add new objects to the cache with a specified expiry time, remove existing objects, and update existing objects. Unfortunately, Memcached does not support sliding expiry at the moment. However, absolute expiry coupled with the Least-Recently Used eviction policy should be enough to meet our needs.

Important Points

  • Very easy to set up an entire multi-node cluster.
  • Administrative tasks can all be done through a rich web-based dashboard.
  • Does not support sliding expiry.
  • Supports the Least-Recently Used eviction policy.
  • Versions are available for both Windows and Linux.
  • With the Memcached-based servers you can access the cache from almost any type of application.
