How to avoid the dog-pile effect on your Rails app

Everyone has heard about scalability at least once, and everyone has heard about memcached as well. What not everyone may have heard of is the dog-pile effect and how to avoid it. But before we start, let’s take a look at how to use Rails with memcached.

Rails + Memcached

First, if you have never used memcached with Rails or have not read or heard much about scalability, I recommend checking out the Scaling Rails episodes by Gregg Pollack, especially the episode about memcached.

Assuming that you have memcached installed and want to use it in your application, you just need to add the following to your configuration files (for example production.rb):

config.cache_store = :mem_cache_store

By default, Rails will search for a memcached process running on localhost:11211.

But wait, why would I want to use memcached? Well, imagine that your application has a page where a slow query is executed against the database to generate a ranking of blog posts based on the author’s influence, and this query takes 5 seconds on average. In this case, every time a user accesses this page, the query will be executed and your application will end up with a very high response time.

Since you don’t want users to wait 5 seconds every time they want to see the ranking, what do you do? You store the query results inside memcached. Once your query result is cached, your app’s users do not have to wait for those damn 5 seconds anymore!
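As a minimal sketch of this read-through pattern, the snippet below uses a tiny in-memory stand-in for memcached so that it can run on its own. TinyCache, slow_ranking_query and ranking are hypothetical names, not part of Rails; in a real application the two cache calls would be Rails.cache.read and Rails.cache.write.

```ruby
# Hypothetical in-memory stand-in for a cache store (not a Rails class).
class TinyCache
  def initialize
    @data = {}
  end

  def read(key)
    @data[key]
  end

  def write(key, value)
    @data[key] = value
  end
end

CACHE = TinyCache.new
QUERY_RUNS = [] # records every time the slow query actually runs

# Stands in for the 5-second ranking query (illustrative data only).
def slow_ranking_query
  QUERY_RUNS << :run
  ["post-3", "post-1", "post-2"]
end

# Return the cached ranking, running the slow query only on a cache miss.
def ranking
  cached = CACHE.read("ranking")
  return cached unless cached.nil?

  fresh = slow_ranking_query
  CACHE.write("ranking", fresh)
  fresh
end

ranking # cache miss: runs the slow query and stores the result
ranking # cache hit: served from the cache, the query does not run again
```

Only the first call pays for the query; every later call is answered from the cache until the entry is removed or expires.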

What is the dog-pile effect?

Nice, we started caching our query results, our application is responsive and we can finally sleep at night, right?

That depends. Let’s suppose we are expiring the cache based on a time interval, for example 5 minutes. Let’s see how it will work in two scenarios:

1 user accessing the page after the cache has expired:

In this first case, when the user accesses the page after the cache has expired, the query will be executed again. After 5 seconds the user will be able to see the ranking, your server worked a little and your application is still working.

N users accessing the page after the cache has expired:

Imagine that in a certain hour, this page on your application receives 4 requests per second on average. In this case, between the first request and the query results being returned, 5 seconds will pass and something around 20 requests will hit your server. The problem is, all those 20 requests will miss the cache and your application will try to execute the query in all of them, consuming a lot of CPU and memory resources. This is the dog-pile effect.
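The arithmetic behind that estimate can be sketched in a few lines. The method names below are hypothetical, and the database-time figure assumes that the concurrent queries do not slow each other down, so treat it as a lower bound.

```ruby
# Requests that arrive while the first query is still running. Each of
# them also misses the cache and starts its own copy of the query.
def requests_missing_cache(requests_per_second, query_seconds)
  requests_per_second * query_seconds
end

# Total database time spent on identical queries, assuming each copy
# takes the same time as a single query would on an idle server.
def database_seconds_spent(requests_per_second, query_seconds)
  requests_missing_cache(requests_per_second, query_seconds) * query_seconds
end

# Numbers from the scenario above: 4 requests per second, 5-second query.
puts "requests missing the cache: #{requests_missing_cache(4, 5)}"
puts "database seconds spent:     #{database_seconds_spent(4, 5)}"
```

With these numbers, 20 requests run the same query and together keep the database busy for at least 100 seconds, where a single request would have needed 5.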

Depending on how many requests hit your server and the amount of resources needed to process the query, the dog-pile effect can bring your application down. Holy cow!

Luckily, there are a few solutions to handle this effect. Let’s take a look at one of them.

Dog pile effect working on your application!


How to avoid the dog-pile effect?

The dog-pile effect is triggered because we allowed more than one request to execute the expensive query. So, what if we isolate this operation to just the first request and let the next requests use the old cache until the new one is available? Looks like a good idea, so let’s code it!

Since Rails 2.1, we have an API to access the cache, which is defined by an abstract class called ActiveSupport::Cache::Store. You can read more about it in this post or in this excellent railscast episode.

The code below simply implements a new and smarter memcached store on top of the already existing MemCacheStore:

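module ActiveSupport
  module Cache
    class SmartMemCacheStore < MemCacheStore
      # Keep references to the original methods so that read can call the
      # original write (and vice versa) without going through the overrides.
      alias_method :orig_read, :read
      alias_method :orig_write, :write

      def read(key, options = nil)
        # Lifetime of the lock in seconds (assumed value); it should be
        # longer than the time needed to recalculate the cached value.
        lock_expires_in = 10

        response = orig_read(key, options)
        return nil if response.nil?

        data, expires_at = response
        if Time.now > expires_at && !exist?("lock_#{key}")
          # Stale and nobody is refreshing it yet: take the lock and report
          # a miss so that this request recalculates the value.
          orig_write("lock_#{key}", true, :expires_in => lock_expires_in)
          return nil
        else
          # Fresh, or stale but already being refreshed: serve what we have.
          data
        end
      end

      def write(key, value, options = nil)
        expires_delta = options.delete(:expires_delta) if !options.nil?
        expires_delta ||= 300

        # Store the value together with its own expiration time.
        expires_at = Time.now + expires_delta
        package = [value, expires_at]
        orig_write(key, package, options)
        delete("lock_#{key}")
      end
    end
  end
end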

Here is what the code above does:

  1. Suppose that your query is already cached;
  2. In the first five minutes, all requests will hit the cache;
  3. After those five minutes, the first request will notice that the cache is stale (the expires_at check in read) and will create a lock so that only it recalculates the cache;
  4. During the next 5 seconds, while the new query result is being calculated, all other requests will see the lock and, instead of missing the cache, will read the old cache and return it to the client (the else branch in read);
  5. When the query result is returned, it will overwrite the old cache with the new value and remove the lock (the last two lines of write);
  6. From now on, all new requests in the next five minutes will access the fresh cache and return it.
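To watch those steps without a running memcached, here is a small in-memory model of the same idea. It is only a sketch: SoftExpiryCache, FakeClock and run_scenario are hypothetical names, the 10-second lock lifetime is an assumption, and the clock is faked so that the timeline is deterministic.

```ruby
# Hypothetical clock that the scenario can move forward by hand.
class FakeClock
  attr_accessor :now

  def initialize
    @now = 0
  end
end

# Hypothetical in-memory model of the store; not the real class.
class SoftExpiryCache
  LOCK_SECONDS = 10 # assumed lock lifetime

  def initialize(clock)
    @clock = clock
    @data  = {} # key => [value, soft_expires_at]
    @locks = {} # key => lock_expires_at
  end

  def read(key)
    value, expires_at = @data[key]
    return nil if expires_at.nil?

    if @clock.now > expires_at && !locked?(key)
      # First request to notice the stale entry: take the lock and report
      # a miss so that this caller recalculates the value.
      @locks[key] = @clock.now + LOCK_SECONDS
      nil
    else
      # Fresh entry, or a stale one that is already being refreshed.
      value
    end
  end

  def write(key, value, expires_delta = 300)
    @data[key] = [value, @clock.now + expires_delta]
    @locks.delete(key)
  end

  private

  def locked?(key)
    expiry = @locks[key]
    !expiry.nil? && @clock.now <= expiry
  end
end

# Walks through steps 1 to 6 and returns what each read saw.
def run_scenario
  clock = FakeClock.new
  cache = SoftExpiryCache.new(clock)
  seen  = {}

  cache.write("ranking", "old ranking")        # step 1
  clock.now = 100
  seen[:fresh] = cache.read("ranking")         # step 2
  clock.now = 301
  seen[:first_stale] = cache.read("ranking")   # step 3
  seen[:others] = cache.read("ranking")        # step 4
  cache.write("ranking", "new ranking")        # step 5
  seen[:after] = cache.read("ranking")         # step 6
  seen
end

p run_scenario
```

Only the read in step 3 returns nil, so only that request recalculates the ranking; every other read gets either the old or the new value.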

Caveats and a few things to keep in mind

First, it is not recommended to set the :expires_in value in your cache:

Rails.cache.write('my_key', 'my_value', :expires_in => 300)

With the solution proposed above, you just need to set :expires_delta. This is because our application is now responsible for expiring the cache, not memcached.

Rails.cache.write('my_key', 'my_value', :expires_delta => 300)

However, there are a few cases where memcached can still remove a cache entry on its own. When you start memcached, it allocates 64MB of memory by default. If those 64MB eventually fill up, what will memcached do when you try to save a new object? It uses the LRU algorithm and deletes the least recently used object in memory.

In such cases, where memcached removes a cache on its own, the dog-pile effect can appear again. Suppose that the ranking is not accessed for quite some time and the cached ranking is discarded due to LRU. If suddenly a lot of people access the page during the initial five seconds while the query is being calculated, requests will accumulate and once again the dog-pile effect can bring your application down.

It’s important to have this scenario in mind when you are sizing your memcached, mainly when deciding how much memory will be allocated.
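To make the eviction scenario concrete, here is a deliberately simplified LRU cache. TinyLruCache and lru_demo are hypothetical names, and the capacity is counted in entries rather than bytes; the real memcached eviction is more involved, so this only illustrates the principle.

```ruby
# Hypothetical, simplified LRU cache limited by number of entries.
class TinyLruCache
  def initialize(max_entries)
    @max_entries = max_entries
    @data = {} # Ruby hashes keep insertion order: oldest entry first
  end

  def read(key)
    return nil unless @data.key?(key)

    # Re-insert the entry so it becomes the most recently used one.
    value = @data.delete(key)
    @data[key] = value
  end

  def write(key, value)
    @data.delete(key)
    @data[key] = value
    # Evict the least recently used entry once the cache is over capacity.
    @data.delete(@data.keys.first) if @data.size > @max_entries
    value
  end

  def keys
    @data.keys
  end
end

def lru_demo
  cache = TinyLruCache.new(2)
  cache.write("ranking", "slow query result")
  cache.write("sidebar", "html")
  cache.read("sidebar")         # "ranking" is now the least recently used
  cache.write("footer", "html") # over capacity: "ranking" is evicted
  cache
end

p lru_demo.keys
```

After the third write the ranking is gone even though nobody expired it, so the next burst of visitors finds no old value to fall back on.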

Now I can handle the dog-pile effect and sleep again!

Summarizing, when you are using a cache strategy, you will probably need to expire your cache. In this process, the dog-pile effect can appear and haunt you. Now you have one (more) tool to solve it.

You just need to add the SmartMemCacheStore code above to your application (for example in lib/) and set your production.rb (or any other appropriate environment) to use the :smart_mem_cache_store. If you use the default Rails API to access the cache (Rails.cache.read, Rails.cache.write) and have designed your memcached structure well, you will be protected from the dog-pile effect.

A real dog-pile! =p


23 responses to “How to avoid the dog-pile effect on your Rails app”

  1. Dan Kubb says:

    Hugo, great article. One question though. How would you deal with the situation where there was no previously cached response, which can happen if it is the very first request, or the cache was swept clear? Would it make sense to block the other requests until the cache is ready, or do you have another approach?

    The dogpile effect is pretty common in high traffic sites, should the cache built into Rails handle it automatically?

  2. Priit says:

    It is interesting how playing with memcached keys can put simple key-value based data store on steroids. For example, we managed to simulate “namespaced” cache with memcached: http://www.edicy.com/developer/blog/namespaced-cache-expiring-with-memcached

  6. Robin Ward says:

    While your solution will improve things, it is flawed.

    It is possible that more than one request will get a positive on line 16. They will go through to the next line and will both set the lock.

    One way around this is to use the ADD command of memcached, as is detailed here:

    http://code.google.com/p/memcached/wiki/FAQ#Emulating_locking_with_the_add_command

    Only one process can successfully add the lock, so with another check that will make your code much more successful under a heavy load.

  10. Hugo Baraúna says:

    @Dan We could have some solutions. One of them is the one you said. You could make the other requests sleep for some time and after that, let them try to access the cache again.

    Another one, you can pre-warm the cache yourself instead of wait for a user request for it.

    It depends on your app requirements.

    I’m not sure about putting that solution inside the cache built into Rails, mainly because there are different solutions, that depends on the situation. For example, is common to have another process generating the cache or use a namespaced memcached. All of them have positive and negative aspects. =)

  12. Hugo Baraúna says:

    @Priit, I also find very interesting what we can do with just a simple key-value based data store. Actually, we are using that namespace solution too.

    Maybe you already know, but you can find more about that inside the memcached’s FAQ, here: http://code.google.com/p/memcached/wiki/FAQ#How_to_prevent_clobbering_updates,_stampeding_requests

  14. pete says:

    “You just need to add the SmartMemCacheStore code above to your application (for example in lib/)”

    i use to place this kind of code in config/initializers.

  16. Walter says:

    Is there some reason you don’t just use super.read and super.write instead of aliasing them?

  18. José Valim says:

    @pete we usually lay code on lib/. initializers we use for configuration and for code that should be loaded while rails is loaded (aka monkey patches).

    @walter we cannot use super because we need to access the orig_write inside read.

  20. I get issues trying to start up the application with this in my lib.

    /Workspace/lc_dev/lib/smart_mem_cache_store.rb to define SmartMemCacheStore

    In my development.rb:
    config.cache_store = :smart_mem_cache_store

    In environments.rb:
    cache = ActiveSupport::Cache::MemCacheStore.new(fragment_memcache_servers, shared_memcache_options.merge(:namespace => "fragment"))
    self.action_controller.cache_store = cache, {}

    I tried putting ActiveSupport::Cache::SmartMemCacheStore and I get the same error.

  22. José Valim says:

    @Thomas, for Rails’ auto load to work, you have to use all the namespaces in your class, so you have to put it under: app/lib/active_support/cache/smart_mem_cache_store.rb.

    If you want to have it hanging at app/lib/smart_mem_cache_store.rb, you can require the file manually, instead of relying on Rails autoload.
