Five Common Rails Mistakes

I’ve worked with Rails for quite a while now and in that time I’ve seen a lot of Rails applications and both read and written a lot of bad Ruby code. Here’s five common mistakes that I see in almost every Rails codebase.

1. Migrations with no schema specifics

Your data model is the core of your application. Without schema constraints, your data will slowly corrode due to bugs in your codebase until you can’t depend on any fields being populated. Here’s a Contact schema:

  create_table "contacts" do |t|
    t.integer  "user_id"
    t.string   "name"
    t.string   "phone"
    t.string   "email"
  end

What is required? Presumably a Contact must belong_to a User and have a name — use database constraints to guarantee this. By adding :null => false, we ensure that the model is always consistent even if we have bugs in our validation because the database will not allow a model to be saved if it fails those constraints.

  create_table "contacts" do |t|
    t.integer  "user_id", :null => false
    t.string   "name", :null => false
    t.string   "phone"
    t.string   "email"
  end

Bonus points: use :limit => N to size your string columns appropriately. Strings default to 255 characters and phone probably doesn’t need to be that big, does it?

2. Object-Oriented Programming

Most Rails developers do not write object-oriented Ruby code. They write MVC-oriented Ruby code by putting models and controllers in the expected locations. Most will add utility modules with class-methods in lib/, but that’s it. It takes 2-3 years before developers realize: “Rails is just Ruby. I can create simple objects and compose them in ways that Rails does not explicitly endorse!”

Bonus points: introduce facades for any 3rd-party services you call. Provide a mock facade for use in your tests so that you don’t actually call the 3rd party service in your test suite.

3. Concatenating HTML in helpers

If you are creating helper methods, kudos, at least you trying to keep your view layer clean. But developers often don’t know the basics of creating tags within helpers, leading to messy string concatenation or interpolation:

str = "
  • " str += link_to("#{vehicle.title.upcase} Sale", show_all_styles_path(vehicle.id, vehicle.url_title)) str += "
  • " str.html_safe

    Yikes, it’s ugly and can easily lead to XSS security holes! content_tag is your friend.

    content_tag :li, :class => 'vehicle_list' do
      link_to("#{vehicle.title.upcase} Sale", show_all_styles_path(vehicle.id, vehicle.url_title))
    end
    

    Bonus points: start introducing helper methods that take blocks. Nested blocks are a natural fit when generating nested HTML.

    4. Giant queries loading everything into memory

    You need to fix some data so you’ll just iterate through it all and fix it, right?

    User.has_purchased(true).each do |customer|
      customer.grant_role(:customer)
    end
    

    You have an ecommerce site with a million customers. Let’s say each User object takes 500 bytes. This code will take 500MB of memory at runtime! Better:

    User.has_purchased(true).find_each do |customer|
      customer.grant_role(:customer)
    end
    

    find_each uses find_in_batches to pull in 1000 records at a time, dramatically lowering the runtime memory requirements.

    Bonus points: use update_all or raw SQL to perform the mass update. SQL takes time to learn well but the benefits are even more tremendous: you’ll see a 100x improvement in the performance.

    5. Code review!

    I’m guessing you are using GitHub and I’m also guessing you aren’t using pull requests. If you spend a day or two building a feature, do it on a branch and send a pull request. Your team will be able to review your code, offer suggestions for improvement and possible edge cases that you didn’t consider. I guarantee your code will be higher quality for it. We’ve switched to using pull requests for 90% of our changes at TheClymb and it’s been a 100% positive experience.

    Bonus points: Don’t merge pull requests without tests for at least the happy path. Testing is invaluable to keep your application stable and your sleep peaceful.

    Did I miss any common issues? Leave a comment and let me know!

    Update: use find_each rather than find_in_batches, thanks readers!

    49 thoughts on “Five Common Rails Mistakes”

    1. Great tips. It seems like in particular point 2 is getting a lot of attention recently – as apps are growing larger, developers are realizing that software engineering concerns about code organization actually apply to the work that we’re doing with Rails.

      There are some good books coming out, like Sandi Metz’ Practical Object Oriented Design in Ruby that articulate some of these concepts really well.

      Re: point 5, I tend to use find_each (which calls find_in_batches internally) to gain the benefits of reducing returned result size without having to use a nested block.

      Thanks for writing these up!

    2. find_each is just like find_in_batches but yields a single instance so you don’t need the nested iterator.

    3. True. Especially 1 & 2.

      Regarding migrations I think that :null => false should be the default and :limit should be smaller by default so that people would learn that only sometimes they want a bigger limit. 255 by default is crazy.

    4. You don’t even need to know sql to use update_all, you can use the standard query building methods like where(:purchased => true).update_all(:customer => true)

    5. #2 is one of those things that is driving me crazy lately. Inherited a 2 yr old rails app and there is no semblance of oo.. Sometimes I’m in pain just trying to understand “why”. :)

      Also like Jeremy, curious about foreign keys.. in my very limited rails experience it felt like rails-ists (is that a word?) didn’t believe in foreign keys. But then again the apps I’ve seen didn’t use many constraints either…

    6. #2 – This. I see this so much, but I didn’t have a good way of explaining it to the team. OO != MVCO is a good simple way of putting it.
      As for #5 – I’m curious about whether you are pair programming. Although I like pull requests, (and I think the new Travis CI way of testing pull requests before merging may change the way I work,) I think that explicit code review is somewhat redundant if you are pairing and doing continuous code review.
      However, if I get to replace Jenkins with Travis I’ll probably switch to pull requests just to be able to smoothly run tests before merging.

    7. Bonus point of 3) you could have mentioned HAML helpers, they do pretty much that and they’re extremely flexible.

      Great article.

    8. For #1 in the bonus point section… while the default max size is 255, the database does not allocate 255 characters, it only uses the size of the data you add in that column. So it does not matter what the default length is, so long as you do not exceed it.

      All the other points are great for nubies!

    9. > find_each uses find_in_batches to pull in 1000 records at a time, dramatically lowering the runtime memory requirements.

      find_each and find_in_batches are likely preferable to doing a giant load at once, but one thing to be aware of is when you use it you’re likely trading app memory for database cpu cycles.

      As an example, say you’re doing a gnarly db query, that’s going to return a 1,000,000 rows.

      SELECT * FROM users WHERE unindexed_field = true;

      When you do User.all.each your database will do one giant query, computing all the results once. When you do User.where(…).find_each(:batch_size => 100), your database will have to run that same query 1,000,000 / 100 = 1,000 times. And, if you’re using mysql, in that it recalculates all the results every single loop. Eg to calculate the 100-200 set, it, it calculates the first 200. And for 200-300 it recalculates the results for 0-300. Then for 300-400 it recalculates the results for 0-400, and so on, all the way to 999,900-1,000,000 recalculating all the results all over again.

      So if you’re dealing with a big dataset, it’s likely better to use find_each or find_in_batches, but be aware that it can also get you into trouble.

      A common workaround is to do ids = User.where(…).select(:id).all

      ids.in_groups_of(100) do |id_group| ….

    10. You missed mass_assignment, also calls for exact User must be a kind of this: “current_user.posts.find(params[:id]).destroy ..”

    11. I’ve just started using rails and have missed off a few schema constraints, is there an easy way to add them on afterwards in a migration?

    12. As far as your bonus to #1, and kind of in addition to what Ericson Smith said:

      If you are using PostgreSQL, you might want to check out its documentation (http://www.postgresql.org/docs/8.0/static/datatype-character.html), there is no performance difference between `character varying` (aka `varchar`) and `text` character types; unlike some other databases where declaring a limit to `varchar` columns will give you a performance increase. In fact, with Postgres, unless you know the exact length of what will be stored, you are probably better off always using `text`, as a few extra cycles will be needed to check the length when writing to length-constrained columns (http://www.thebitguru.com/blog/view/302-Performance%20difference%20between%20varchar%20and%20text%20in%20PostgreSQL).

    13. Putting (or leaving) business logic in controllers. Controllers tend to get bloated with business logic. Try to push these down to the model layer and keep the controllers skinny. Putting business logic in the model layer makes it more testable too.

    14. Nice post. It’s hard to reform. It takes small, simple lists for me to reform “good enough” work into something that I can be proud of. Now I’m thinking about a Facade for a third-party API in a current project…thanks.

      I agree with Justin Leitgeb, OO Rails is a thing these days. It seems to be a rational response to all of the pent-up anxiety over maintaining complicated Rails apps. I wrote a book review for Avdi Grimm’s excellent Objects on Rails a while back, in case it’s useful to anyone: http://bit.ly/KPuYm0

    15. With regard to “write OO code”, use presenters. Some people may think that it’s an unnecessary level, but eventually you’re going to have to a) format attributes of a model or b) take form input and convert it to model attributes, or vice versa (classic example: currency). Both of these things belong to the presenter layer, not the controller or model layer (it could be argued case (b) belongs to the controller layer, but I disagree, especially when you have a two-way conversion). I think it’s better if you introduce this sooner rather than later. Plus, presenters are testable.

    16. This is good stuff. You should link to Avdi’s excellent “Objects on Rails” book for #2.

      Some of mine:

      * Inconsistent code formatting. While I prefer two space indentation, I can be ok with just about anything as long as its consistent. Trailing whitespace should never be allowed.

      * Not using the asset pipeline to properly organize JS, CSS, and image assets. 3rd party assets go to /vendor/assets. Internal shared assets should go in /lib or, even better, turned into an asset pipeline gem.

      * Messy or poorly written JavaScript. The asset pipeline makes it so easy to break up The Huge JavaScript File into components.

      * Using JavaScript instead of CoffeeScript. :-)

    17. I remember helping a small company subcontracted to the large company I worked for. They had a mass of code and many subtle bugs. They resisted code reviews until I forced the issue with their two top developers. We found a bunch of bugs that would have been hard to test out, and the company was converted.

    18. Very interesting, I’m using some tips, and Github pull request is awesome!

      Ruby OO is a big gap for a lot of developers
      Did you some post about it? It’s a nice topic

      Cya.

    19. User.has_purchased(true).each do |customer|
      customer.grant_role(:customer)
      end

      Use that with DataMapper and you get:
      UPDATE users
      SET role=customer
      WHERE COUNT(purchases) > 0

      There’s no need to fetch at all, stupid ActiveRecord, even in batches.

    20. I find your wording to be thoroughly confusing. Are these common rails mistakes, or are these good rails practices? Is “object-oriented programming” a common rails mistake? Is “code review”?

    21. This was nice, thanks. However, concerning using SQL directly, isn’t it better to stick with the abstraction and Arel?

      1. Abstractions should be used in the common case or they aren’t useful. But there are always exceptions to the rule, for instance performance might be better if you can do a complex operation all in one SQL statement rather than requiring iteration through a result set in Ruby.

    22. Can you elaborate on your team’s pull request process? Is there one owner of the repo that must approve each pull?

      Bonus points: write a blog post about it :)

    23. This is a very useful list! I am looking forward to implement the code review thing. We are using github for a while now but we don’t do code review a lot due to time constraints. Do you have any tips for this kind of problem?

    24. David Richards: A warning for those who have yet to pick up Avdi’s Objects on Rails book: if you’re a classical TDDer (as opposed to mockist, insisting on a green bar before you go on to the next thing, OoR will drive you absolutely nuts.

      Also a lot of bespoke code in the book is better taken off the shelf now, in my opinion; decorators such as Draper are great for taking the presentational crap out of your model where it doesn’t belong, and if you’re not using domain-service objects like ActiveInteraction, why TF not?!?

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>