Clojure vs Ruby, Part I

I’m a performance guy, I’ll admit it. I love performance tuning and comparing code to see why one thing is slower than another. I’ve recently taken a shine to the Clojure programming language as it seems to combine two good things: an incredibly fast and reliable VM and a functional language designed for concurrency. Here’s my first performance test: the ever-reliable fibonacci method. Please remember this is just a microbenchmark, take it with a grain of salt.

In Clojure:

(defn fib [n]
    (if (<= n 1)
        1
        (+ (fib (- n 1)) (fib (- n 2)))
    )
)

In Ruby:

def fib(n)
  return 1 if n <= 1
  fib(n-1) + fib(n-2)
end

The JVM is Server 1.6.0_07 64-bit. Let's see how long each takes to calculate the 40th Fibonacci number:

C (gcc 4.0.1)         2.6 sec
Java                  1.2 sec
Clojure 20080916     11.3 sec
Ruby 1.9pr1          44.3 sec
JRuby 1.1.5         118.1 sec
Ruby 1.8.6          195.6 sec

Clojure's performance is really spectacular in this one microbenchmark compared to Ruby. There's some language overhead but the native speed is so fast that performance is still better than the fastest Ruby VMs. Rich Hickey has done a great job with Clojure but make no mistake, this is still mostly a hobbyist language done by one person - he can't afford to spend years building his own VM. Running native on the JVM has huge benefits in terms of performance.

35 thoughts on “Clojure vs Ruby, Part I”

  1. It would be interesting to see how Ruby 1.9 does once they enable tail-call elimination. Also, fib is faster in Ruby 1.8.6 when it’s done iteratively, FWIW.

  2. I assume you are using this implementation of fib to test the speed of the various languages, because if you want fib to be fast:

    (defn fib [n]
    (loop [a 1 b 0 count n]
    (if (zero? count)
    b
    (recur (+ a b) a (dec count)))))

    user=> (time (fib 40))
    “Elapsed time: 0.155 msecs”
    102334155

  3. Thanks Paul. Yes, I’m just curious to see how each language performs on the same simple code snippet. My takeaway is that recursion is not cheap in Ruby. As Adam pointed out, I rewrote fib to use iteration and it completes in Ruby 1.8.6 in 0.03 ms.

  4. Mike, if you’re comparing to Java, you should probably add type hints. Otherwise, Clojure version of fib is probably doing much more than Java.

    Try this:

    (defn fib [n]
    (if (<= (int n) (int 1))
    1
    (+ (fib (unchecked-dec (int n)))
    (fib (unchecked-subtract (int n) (int 2))))))

    I’m not very good at these things, but I’ve seen people do magic with type hints.

  5. For scripting languages, Lua appears to be the fastest so far:

    $ time lua fib.lua
    165580141

    real 0m42.064s
    user 0m40.472s
    sys 0m0.380s

    Python 2.5 is ahead of JRuby and stable Ruby, but slower than Ruby 1.9 (is that the release that finally uses a VM?)

    $ time python -OO fib.py
    165580141

    real 1m36.832s
    user 1m32.695s
    sys 0m0.208s

  6. What’s the Java and C code, respectively? I didn’t expect there to be much of a difference – certainly not more than 100%.

    Furthermore, while Clojure is faster than Ruby I don’t see there should be any reason why it should not be able to be some five times faster still. I seem to recall that the fastest Lisp implementations generally ran at about half C speed. If you are willing to code C in Lisp you can often match the speed (exacting the terrible cost of horrible looking code).

  7. Although my machine has a somewhat slower 64 bit CPU (as show by the Java result), here is the result in SBCL. Java is still 2x better.

    CL-USER> (defun fib (n)
    (declare (optimize (speed 3) (debug 0) (safety 0))
    (type fixnum n))
    (if ( (time (fib 40))
    Evaluation took:
    4.689 seconds of real time
    4.668291 seconds of total run time (4.668291 user, 0.000000 system)
    99.55% CPU
    9,139,286,246 processor cycles
    26,368 bytes consed

    165580141

    levy@tsunami:~$ time java fib
    165580141

    real 0m2.273s
    user 0m2.160s
    sys 0m0.040s

  8. Michel: I thought Lua had a reputation of being fast? 20x slower than C is really, really bad for such a simple benchmark.

  9. (int n) is not a type hint – it’s a coercion. A type hint would be (def fib [#^Integer n] …). On my machine, this actually slows things down. by about 10%.

    Using unchecked math without the type hint is even worse – it’s about 18 times slower. (unchecked-dec n) is slightly slower than (unchecked-subtract n 1).

    The only version I’ve come up with that’s faster than the original used both a type hint and unchecked math. This was twice as fast:

    (defn fib [#^Integer n]
    (if (<= n 1)
    1
    (+ (fib (unchecked-subtract n 1)) (fib (unchecked-subtract
    n 2)))))

  10. Hmm, seems like something’s wrong with your JRuby setup. On my MBP core 2 2.5GHz I get 22s to calculate fib(40) with default settings and 16 settings with some experimental perf optz turned on. So it’s still not as fast as clojure, but it’s not an order of magnitude slower. Did you run with -server flag?

  11. Without -O2 GCC yields this here:

    $ time ./fib
    165580141

    real 0m2.397s
    user 0m2.020s
    sys 0m0.008s

    ..with -O2 it’s a bit faster:

    $ time ./fib
    165580141

    real 0m1.435s
    user 0m1.180s
    sys 0m0.012s

    ..and here is SBCL (Steel Bank Common Lisp):

    CL-USER> (time (fib 40))
    165580141
    Evaluation took:
    4.896 seconds of real time
    4.000250 seconds of total run time (3.968248 user, 0.032002 system)
    81.70% CPU
    10,826,651,713 processor cycles
    1,049,568 bytes consed

    165580141

    ..not bad. If I nag someone on sbcl-devel enough I bet they could optimize the generated ASM to match that of GCC; it seems to be doing more than it have to.

  12. Michel S.: yes, Ruby 1.9 is the VM branch. 1.9.1 is expected for January 25th, and for 1.9.2 we might get tail call optimization.

    These are my timings, though:

    Python (2.5, standard Ubuntu 8.10 x64 build):
    real 1m29.732s
    user 1m29.250s
    sys 0m0.076s

    Ruby 1.9 (trunk, compiled with standard settings on same machine):
    real 0m34.521s
    user 0m34.038s
    sys 0m0.028s

    JRuby 1.1.7 (trunk, running on IcedTea6 1.3.1 6b12 in server mode):
    real 0m32.377s
    user 0m29.110s
    sys 0m0.208s

    Interestingly, JRuby beat 1.9 here for fib(40).
    This is a C2D 2 GHz with 2 MB cache and 2 GB RAM, btw.

  13. Compared clojure to yeti.

    $ java -server -jar clojure.jar
    Clojure
    user=> (defn fib [n] (if ( (time (fib 40))
    “Elapsed time: 8991.244384 msecs”
    165580141
    user=> (time (fib 40))
    “Elapsed time: 8743.094273 msecs”
    165580141

    $ java -server -jar yeti.jar
    Yeti REPL.

    > time f a = (t = System#currentTimeMillis(); v = f a; println “Elapsed time: (System#currentTimeMillis() – t) msecs”; v)
    time is (‘a -> ‘b) -> ‘a -> ‘b =
    > fib n = if n number =

    > time fib 40
    Elapsed time: 6953 msecs
    165580141 is number
    > time fib 40
    Elapsed time: 6723 msecs
    165580141 is number

    Similar (not suprising, given that both run on JVM). In case you wonder, Yeti has numeric tower and don't directly use Java primitives therefore.

    JRuby:
    jruby-1.1.5$ time java -server -jar lib/jruby.jar fib.rb
    165580141
    real 0m27.309s

  14. For gits and shiggles I decided to create a Reia version. Reia is a scripting language on the Erlang VM which is not even alpha yet. But I could do the fibonacci number test, so why not?

    The code doesn’t use tail recursion and still does pretty well when stacked up against other scripting languages:

    real 0m46.907s
    user 0m44.931s
    sys 0m0.112s

    module Fibonacci
    def calc(n)
    if n <= 1
    1
    else
    calc(n – 1) + calc(n – 2)

    Fibonacci.calc(40)

  15. Since microbenchmarking is so fun, a couple more numbers :P

    Python 2.7a0 (just-built from trunk; same machine as before, again compiled with the default settings)
    real 1m23.950s
    user 1m23.953s
    sys 0m0.000s
    (I expected something more significant from 2.6+… well, the main opt it got was related to method invocation, not simple function calling, so I guess this test alone should never be used as a basis to say how much faster or slower it is compared to 2.5. Plus I don’t know how exactly the 2.5 package was compiled for Ubuntu, making the comparison even less objective)

    Rubinius (commit 362a70237, once again a fresh build)
    real 0m59.733s
    user 0m59.640s
    sys 0m0.056s

  16. You haven’t shown the command line used to compile the C program but from time relative to the Java program I guess you missed a compiler flag -

    $ /usr/bin/gcc -pipe -Wall -fomit-frame-pointer fib.c -o fib
    $ time ./fib
    real 0m2.255s
    user 0m2.252s
    sys 0m0.000s

    $ /usr/bin/gcc -O3 -pipe -Wall -fomit-frame-pointer fib.c -o fib
    $ time ./fib
    real 0m0.776s
    user 0m0.780s
    sys 0m0.000s

  17. And your point is what, again?

    octave:6> tic;(((1+sqrt(5))/2)**40 – ((1-sqrt(5))/2)**40)/sqrt(5);toc

    Elapsed time is 0.000118971 seconds.

    A better algorithm beats a better implementation almost any day.

  18. I chose to test each environment as default because I don’t pretend to be an expert in each. Isaac wins; his gcc options made a significant improvement:

    C (gcc 4.2, -O3 -pipe -Wall -fomit-frame-pointer) 0.9 seconds

    Thanks to everyone for your comments.

  19. The regular Clojure fib takes about 6.5s to do fib(35) on my machine. Using the ‘unchecked’ version halves that time. However, Scala runs fib(35) in approx. 212ms! What the heck is going on?

  20. I know you’re using a naive fib algorithm for a benchmark, but I just have to post my favorite clojure fibonacci sequence algorithm – it’s very fast and clojure-idiomatic. I didn’t write it, but used it to learn about lazy sequences in clojure.

    (def fibonacci-seq
    (lazy-cat [0 1] (map + fibonacci-seq (rest fibonacci-seq))))

    user=> (time (nth fibonacci-seq 40))
    “Elapsed time: 0.045 msecs”
    102334155

  21. Sam: Sure that is not a cached result? I get “Elapsed time: 0.272 msecs” for the first run and and approx 0.04 for subsequent runs.

  22. Is this just a single run of each? Certainly in the case of the JVM I would expect to see significant speedup on subsequent runs (due to JIT compilation). In general a single run doesn’t tell the whole story.

  23. Googles Go:
    time ./8.out
    165580141
    ./8.out 4,39s user 0,00s system 90% cpu 4,842 total

    —–8<——–
    1 package main
    2 import "fmt";
    3 func fib(n int) int {
    4 if(n<=1){ return 1; }
    5 return fib(n-1) + fib(n-2);
    6 }
    7 func main() {
    8 fmt.Println(fib(40));
    9 }
    —–8<——–

  24. One of the nice features that I like a lot is the memoize in Clojure. That makes Dynamic Programming so easy! I no longer have to manually create an array or Hashmap and then fill it in my algorithm. It is automagically taken care off!

    It think that is important to consider memoization too, because memoization in C or Ruby requires a lot additional work.

  25. For fun, major Ruby implementations, Scala, Python.

    (slowest to fastest)
    - [Python 2.7.2 => 62.70 secs]
    - [Python 2.7.3 => 59.578 secs]
    - [Rubinius 2.0.0rc1 => 32.571 secs]
    - [Ruby 1.9.3p327 => 22.142 secs]
    - [Ruby 2.0.0-preview1 => 20.606 secs]
    - [JRuby 1.7.0 => 9.665 secs]
    - [Macruby 0.12 => 6.301 secs]
    - [Scala 2.9.2 => 1.562 secs]
    - [Scala 2.9.2 compiled => 1.037 secs]

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>