Mike Perham

On Ruby, software and the Internet

Creating a counter_cache column

December 17th, 2007 · 5 Comments

A counter_cache provides a reasonable way to speed up code which faults in a collection just to get the size.  In my experience, the most painful part is initializing the column with the first value.  Traditionally the Rails blogs use Ruby code to iterate through every instance, calculating the value by performing an individual query and saving the result.  It’s much, much faster (and easier in many cases) to initialize the column value with a single query.  Imagine POSTS and COMMENTS tables where you want a COMMENTS_COUNT column to speed up the lookup of posts.comments.size. Here’s the beef:

update posts, (select post_id, count(*) as the_count from comments group by post_id) as comm set posts.comments_count = comm.the_count where posts.id = comm.post_id;

Note the use of a temporary table to hold the intermediate, calculated size.  This has been tested in MySQL.  YMMV.

Tags: Rails

5 responses so far ↓

  • 1 Josh Owens // Dec 28, 2007 at 11:51 am

    Why use MySQL directly? Just use migrations…

    http://josh.the-owens.com/archives/2007/11/03/rails-edge-change-how-to-add-a-counter-cache-to-an-existing-db-table/

  • 2 mperham // Dec 28, 2007 at 12:46 pm

    Josh, please read my blog posting over again. You are doing exactly what I said: iterating through each Recipe one at a time and initializing its counter value. If you have 2,000,000 recipes, this will require 2,000,000 queries and a lot of memory (since you will be holding 2m recipes in memory after the find(:all)). The query I give above will do all that work in a single query with no memory overhead (all the hard work is pushed to the database layer).

  • 3 Josh Owens // Jan 22, 2008 at 12:32 am

    Mike,

    I see your point, but I would worry about people who *haven’t* implemented a counter cache and have 2 million objects that are associated. I think the speed up a counter cache provides will be found much sooner than 2 million objects.

    I think most people wouldn’t know how to start by creating temp tables and pushing/pulling the data via a migration.
    Just my $0.02

  • 4 Erik Andersson // Feb 17, 2008 at 3:12 pm

    This was exactly what i needed! Thank you!

  • 5 Chip Ramsey // Oct 22, 2008 at 11:51 am

    Same concept for PostgresSQL:

    UPDATE accounts
    SET users_count = comm.the_count
    FROM (select account_id, count(*) as the_count from users where is_active = ‘t’ group by account_id) as comm
    WHERE accounts.id = comm.account_id;

Leave a Comment