Understanding the RAILS_CACHE_ID Environment Variable

Last week, I was looking through my Twitter stream and came across a tweet that referenced ENV["RAILS_CACHE_ID"]. I was unfamiliar with this environment variable and made a note to learn more.

Rails 4 has made a discernible effort to improve view rendering performance with what it calls Russian doll caching. Prior to this, view caching was simple in syntax, but more complicated in practice when cached partials needed busting on subsequent releases.

I didn’t know if this environment variable had anything to do with Rails 4 caching, so I went digging…

The first thing I did was git clone the Rails source code and look for that variable. Outside of the documentation and tests, it showed up in one place, activesupport/lib/active_support/cache.rb:

# Expands out the +key+ argument into a key that can be used for the
# cache store. Optionally accepts a namespace, and all keys will be
# scoped within that namespace.
#
# If the +key+ argument provided is an array, or responds to +to_a+, then
# each of elements in the array will be turned into parameters/keys and
# concatenated into a single key. For example:
#
#   expand_cache_key([:foo, :bar])               # => "foo/bar"
#   expand_cache_key([:foo, :bar], "namespace")  # => "namespace/foo/bar"
#
# The +key+ argument can also respond to +cache_key+ or +to_param+.

def expand_cache_key(key, namespace = nil)
  expanded_cache_key = namespace ? "#{namespace}/" : ""

  if prefix = ENV["RAILS_CACHE_ID"] || ENV["RAILS_APP_VERSION"]
    expanded_cache_key << "#{prefix}/"
  end

  expanded_cache_key << retrieve_cache_key(key)
  expanded_cache_key
end

The comments preceding the method do a good job telling the whole story. The expanded_cache_key variable is a string built up from the key argument. The RAILS_CACHE_ID environment variable (or RAILS_APP_VERSION, if the former is not set) is prefixed to this string, operating similarly to a namespace.

With these assumptions in mind, let’s see if we can prove them. I’ll generate a new Rails 4 app:
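To make the prefixing concrete, here is a minimal standalone sketch that mirrors the logic above without loading Rails. The name simple_expand_cache_key, its env parameter, and the simplified array handling are assumptions made for illustration; the real retrieve_cache_key handles more cases (cache_key, to_param, and so on).

```ruby
# Simplified, standalone imitation of ActiveSupport::Cache.expand_cache_key.
# The method name and the `env` keyword are illustrative, not part of Rails.
def simple_expand_cache_key(key, namespace = nil, env: ENV)
  expanded = +""
  expanded << "#{namespace}/" if namespace

  # Same precedence as the Rails source: RAILS_CACHE_ID wins over RAILS_APP_VERSION.
  if (prefix = env["RAILS_CACHE_ID"] || env["RAILS_APP_VERSION"])
    expanded << "#{prefix}/"
  end

  # Stand-in for retrieve_cache_key: array elements are joined with "/".
  expanded << Array(key).map(&:to_s).join("/")
  expanded
end

puts simple_expand_cache_key([:foo, :bar], env: {})
# foo/bar
puts simple_expand_cache_key([:foo, :bar], "namespace", env: { "RAILS_CACHE_ID" => "octopus" })
# namespace/octopus/foo/bar
```

Passing the environment in as a hash keeps the sketch easy to try out without exporting anything in the shell.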

$ rails new cache_test

We’ll be inspecting model objects, so let’s generate a fake blog model:

$ rails g scaffold post title content:text

Migrate the database to get current:

$ bin/rake db:migrate

I’ll create a new Post and take a look at the default cache_key:

$ rails c
> p = Post.create
> p.cache_key # => "posts/1-20140206201702645196000"

Let’s set the RAILS_CACHE_ID and look at the cache_key of this record again:

$ export RAILS_CACHE_ID=octopus
$ rails c
> p = Post.first
> p.cache_key # => "posts/1-20140206201702645196000"

Hmmm…same thing. Referring back to the Rails source code above, it turns out that not all cache writes use the expand_cache_key method. Searching around the Rails code for expand_cache_key, we find the following results:

ActionController::Caching::Fragments#fragment_cache_key
ActionDispatch::Http::Cache::Response#etag=

From the looks of it, it only applies to fragment caching and manually setting Etag headers for HTTP responses, so let’s dig into those.

Fragment Caching

I can cache the post fragment by wrapping it in a cache block:

<%# app/views/posts/show.html.erb %>
<% cache(@post) do %>
  <p>
    <strong>Title:</strong>
    <%= @post.title %>
  </p>

  <p>
    <strong>Content:</strong>
    <%= @post.content %>
  </p>
<% end %>

Run the server and make a request to http://localhost:3000/posts/1.

Started GET "/posts/1" for 127.0.0.1 at 2014-02-06 16:16:04 -0500
  ActiveRecord::SchemaMigration Load (0.2ms)  SELECT "schema_migrations".* FROM "schema_migrations"
Processing by PostsController#show as HTML
  Parameters: {"id"=>"1"}
  Post Load (0.2ms)  SELECT "posts".* FROM "posts" WHERE "posts"."id" = ? LIMIT 1  [["id", "1"]]
Cache digest for posts/show.html: c39e6bde261c006ffe9ddf27fb9d5318
Read fragment views/octopus/posts/1-20140206210721854446000/c39e6bde261c006ffe9ddf27fb9d5318 (0.2ms)
Write fragment views/octopus/posts/1-20140206210721854446000/c39e6bde261c006ffe9ddf27fb9d5318 (1.6ms)
  Rendered posts/show.html.erb within layouts/application (11.0ms)

Note: Make sure config.action_controller.perform_caching is set to true in config/environments/development.rb (it’s false by default), otherwise caching is disabled.

There it is! It wrote out the key views/octopus/posts/1-20140206210721854446000/c39e6bde261c006ffe9ddf27fb9d5318. Let’s change the environment variable and see if it adjusts accordingly:

$ export RAILS_CACHE_ID=shark

Restart the Rails server and again, request http://localhost:3000/posts/1:

Started GET "/posts/1" for 127.0.0.1 at 2014-02-06 16:35:33 -0500
  ActiveRecord::SchemaMigration Load (0.1ms)  SELECT "schema_migrations".* FROM "schema_migrations"
Processing by PostsController#show as HTML
  Parameters: {"id"=>"1"}
  Post Load (0.2ms)  SELECT "posts".* FROM "posts" WHERE "posts"."id" = ? LIMIT 1  [["id", "1"]]
Cache digest for posts/show.html: c39e6bde261c006ffe9ddf27fb9d5318
Read fragment views/shark/posts/1-20140206210721854446000/c39e6bde261c006ffe9ddf27fb9d5318 (0.3ms)
Write fragment views/shark/posts/1-20140206210721854446000/c39e6bde261c006ffe9ddf27fb9d5318 (1.5ms)
  Rendered posts/show.html.erb within layouts/application (11.1ms)
Completed 200 OK in 82ms (Views: 62.7ms | ActiveRecord: 0.5ms)

Sure enough…the key views/shark/posts/1-20140206210721854446000/c39e6bde261c006ffe9ddf27fb9d5318 was used this time.
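The keys in both logs follow the same layout: views, then the cache ID, then the record’s cache_key, then the template digest. The sketch below only reassembles that layout from the values in the logs; fragment_key is a hypothetical helper, not the way Rails builds the key internally.

```ruby
# Hypothetical helper that reassembles the fragment key seen in the logs.
# It illustrates the layout of the key only; Rails builds it internally.
def fragment_key(record_cache_key, template_digest, cache_id: nil)
  ["views", cache_id, record_cache_key, template_digest].compact.join("/")
end

record_key = "posts/1-20140206210721854446000"
digest     = "c39e6bde261c006ffe9ddf27fb9d5318"

puts fragment_key(record_key, digest, cache_id: "octopus")
puts fragment_key(record_key, digest, cache_id: "shark")
```

Only the second segment differs between the two keys, which is why changing the variable makes every previously written fragment unreachable.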

Etags

An Etag is a value added to the HTTP response headers that lets a browser determine whether its cached copy of a particular piece of content needs to be refreshed. If it doesn’t, the server returns a 304 HTTP status code and the browser uses the cached response from a previous request.
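As a rough illustration of that exchange, here is a framework-free sketch of a conditional GET. The respond method and its hash-shaped return value are made up for this example; in a real app the web framework performs this comparison against the If-None-Match request header.

```ruby
require "digest/md5"

# Hypothetical, framework-free sketch of a conditional GET check.
# `body` stands in for whatever the server would render.
def respond(body, if_none_match: nil)
  etag = %("#{Digest::MD5.hexdigest(body)}")

  if if_none_match == etag
    # The browser already holds this version: send no body.
    { status: 304, headers: { "ETag" => etag }, body: "" }
  else
    { status: 200, headers: { "ETag" => etag }, body: body }
  end
end

first  = respond("<h1>Hello</h1>")
second = respond("<h1>Hello</h1>", if_none_match: first[:headers]["ETag"])
third  = respond("<h1>Changed</h1>", if_none_match: first[:headers]["ETag"])

puts first[:status]   # 200
puts second[:status]  # 304
puts third[:status]   # 200
```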

Rails provides two controller helper methods to help determine whether the content should be refreshed: fresh_when and stale?. The Rails guides provide a decent explanation of these methods and when to use them.

Jumping in to the PostsController, let’s use the fresh_when method in the show action:

class PostsController < ApplicationController

  def show
    @post = Post.find(params[:id])
    fresh_when(@post)
  end

end

Now, when we request the show page for that post, we get an Etag back in the header response of 9c754ae292618570ec43cae8e03a0b13. That doesn’t look very familiar, huh?

It turns out that the Etag is generated using the following method:

Digest::MD5.hexdigest(key)

But the key is more than just the cache_key of the @post. If we refer back to where the expand_cache_key method was used, we see that when the etag attribute is set on the response, the key is expanded using the ActiveSupport::Cache.expand_cache_key method we looked at above:

 def etag=(etag)
   key = ActiveSupport::Cache.expand_cache_key(etag)
   @etag = self[ETAG] = %("#{Digest::MD5.hexdigest(key)}")
 end

Let’s stop the server and change ENV["RAILS_CACHE_ID"]:

$ export RAILS_CACHE_ID=whale

Running the same request, we get an Etag of b389da68ca1b25986edecf349fcf63e6. So as you can see, by updating ENV["RAILS_CACHE_ID"] alone, we get a different Etag in the response without changing the post object itself, which means we can invalidate our browser cache by updating this environment variable.

I, personally, deploy to Heroku most of the time and it’d be nice to be able to take advantage of browser caching in this manner, but also know when a new deploy is released. This article describes a solution that integrates Heroku’s numbered releases.

However, we know that Heroku uses git to get the source code of the application to the application container, so why not leverage git commit IDs for ENV["RAILS_CACHE_ID"]?
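The effect can be reproduced outside of Rails with a small sketch. etag_for is a hypothetical helper that prefixes the key and hashes it in the same spirit as the two methods shown above; it is not expected to reproduce the exact Etag values from this post, since the key Rails actually hashes may contain more than what is assumed here.

```ruby
require "digest/md5"

# Hypothetical helper: prefixes the key the way expand_cache_key does,
# then hashes and quotes it the way etag= does.
def etag_for(cache_key, cache_id: nil)
  key = cache_id ? "#{cache_id}/#{cache_key}" : cache_key
  %("#{Digest::MD5.hexdigest(key)}")
end

post_key = "posts/1-20140206210721854446000"

shark = etag_for(post_key, cache_id: "shark")
whale = etag_for(post_key, cache_id: "whale")

puts shark == whale                                  # false
puts whale == etag_for(post_key, cache_id: "whale")  # true
```

The record is identical in both calls; only the prefix changes, and that alone is enough to produce a different Etag.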

git log is a command to look at…well, git logs! With a few additional options, we can get a shortened version of the commit SHA:

$ git log --pretty=format:%h -n1
40ad584

Using the git commit SHA as the cache ID gives us the perfect opportunity to invalidate the browser cache. If the Etag is different because a different git commit is the most recent on the server, we know there’s a chance the HTML rendered and cached in the browser is outdated. We can implement this by setting the ENV variable in config/application.rb:

ENV['RAILS_CACHE_ID'] = `git log --pretty=format:%h -n1`.strip
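The one-liner above assumes that git is installed and that the app directory is a git checkout when the app boots, which may not hold in every deployment environment. A more defensive variant could look like the sketch below; cache_id_from and the "unknown" fallback are assumptions made for illustration, not something Rails or Heroku provides.

```ruby
require "open3"

# Hypothetical helper: runs a command and returns its trimmed output,
# or the fallback when the command is missing, fails, or prints nothing.
def cache_id_from(command, fallback: "unknown")
  output, status = Open3.capture2e(*command)
  id = output.strip
  status.success? && !id.empty? ? id : fallback
rescue SystemCallError
  # Raised, for example, when the executable cannot be found.
  fallback
end

# Keep an explicitly configured value; otherwise ask git for the short SHA.
ENV["RAILS_CACHE_ID"] ||= cache_id_from(%w[git log --pretty=format:%h -n1])
puts ENV["RAILS_CACHE_ID"]
```

Using ||= instead of = is a deliberate difference from the line above: it lets a value exported in the environment take precedence over the git lookup.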

Now, when a request is made we get an Etag of 6d80b6397347f8de1b8718e7fd9f90e1. And if we add an empty commit and re-request the post, we get an Etag of 6dcc66ddf83613dd0c46406a1e983b38.

Hooray! A browser caching strategy for HTML content that’s dependent on the git release.

Summary

Before Rails 4, it was common to use a version number in the view cache method so that when you changed an enclosed view template, the cache would bust and the changes would be shown. However, now that cache_digests are part of Rails 4, the contents of the template are evaluated and an MD5 string is created from them (the last part of the cache key, c39e6bde261c006ffe9ddf27fb9d5318 in the last example). This ensures that whenever the contents of the view template are changed, the cache is busted and the new template is properly stored and rendered.

Given that the cache key used for fragment caching is dependent on the contents of the template, I have trouble finding value in setting the RAILS_CACHE_ID environment variable for fragment caching alone.

On the other hand, using ENV["RAILS_CACHE_ID"] to control browser caching can be a tremendous asset when the Etag is computed without considering the content of the response body. Caching strategies can be difficult to reason about, and hopefully this gives you another tool in the toolbox to make parts of your Rails applications more performant.
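The idea behind the digest can be sketched in a few lines. This is a simplified stand-in that hashes the template source only; template_digest is a hypothetical name, and the real cache digests also take into account the templates a view renders.

```ruby
require "digest/md5"

# Simplified stand-in for a template digest: hashes the template source only.
def template_digest(source)
  Digest::MD5.hexdigest(source)
end

original = '<p><%= @post.title %></p>'
edited   = '<h2><%= @post.title %></h2>'

puts template_digest(original) == template_digest(original)  # true
puts template_digest(original) == template_digest(edited)    # false
```

Because the digest is part of the fragment key, editing the template yields a new key, so the stale fragment is simply never read again.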
