Mike Perham

On Ruby, software and the Internet

Entries Tagged as 'Ruby'

Developing Rubygems with RVM and Bundler

August 3rd, 2010 · 3 Comments

It’s safe to say that RVM and Bundler have completely changed how I interact with my Ruby applications and gems. It’s pretty well understood how to use each by itself, but I didn’t have a good idea how to use them in tandem until recently. Parts of this post are based on Derek Kastner’s great post [...]

[Read more →]

Tags: Ruby

Detecting Duplicate Images with Phashion

May 21st, 2010 · 9 Comments

Recently I was given a ticket to implement a “near-duplicate” image detector. Look at these three images: The original image files have different file sizes and different dimensions, but they show essentially the same thing. This is what we call a “near-duplicate” and the problem was that when displaying an automatically generated image gallery for a [...]

[Read more →]

Tags: Ruby · Software

bayes_motel – Bayesian classification for Ruby

April 28th, 2010 · 6 Comments

Bayesian classification is an algorithm which allows us to categorize documents probabilistically. I recently started playing with Twitter data and realized there was no Ruby gem which would allow me to build a spam detector for tweets. The classifier gem just works on a set of text by figuring out which words appear in a [...]
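To make the word-counting idea concrete, here is a minimal naive Bayes sketch in plain Ruby. It is not the API of bayes_motel or of the classifier gem; the class name `TinyBayes`, its methods, and the training strings are all made up for illustration, and it assumes simple add-one smoothing.

```ruby
# Hypothetical, minimal naive Bayes text classifier (illustration only;
# this is NOT the bayes_motel or classifier gem API).
class TinyBayes
  def initialize
    # category => { word => number of times seen in that category }
    @word_counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
    # category => number of training documents
    @doc_counts = Hash.new(0)
  end

  # Record the words of one training document under a category.
  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each { |word| @word_counts[category][word] += 1 }
  end

  # Return the category with the highest log-probability for the text.
  def classify(text)
    total_docs = @doc_counts.values.sum
    vocabulary = @word_counts.values.flat_map(&:keys).uniq.size
    words = tokenize(text)

    scores = @doc_counts.keys.map do |category|
      total_words = @word_counts[category].values.sum
      # Start from the prior, then add one smoothed term per word.
      score = Math.log(@doc_counts[category].to_f / total_docs)
      words.each do |word|
        count = @word_counts[category][word]
        score += Math.log((count + 1.0) / (total_words + vocabulary))
      end
      [category, score]
    end

    scores.max_by { |_category, score| score }.first
  end

  private

  # Lowercase the text and split it into simple word tokens.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end
end

bayes = TinyBayes.new
bayes.train(:spam, "buy cheap pills now")
bayes.train(:spam, "cheap pills free offer")
bayes.train(:ham,  "lunch at noon tomorrow")
bayes.train(:ham,  "see you at the meeting")

p bayes.classify("cheap pills")
p bayes.classify("meeting at noon")
```

Log-probabilities are summed rather than multiplying raw probabilities, which avoids floating-point underflow on longer documents.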

[Read more →]

Tags: Ruby

Phat News

April 6th, 2010 · No Comments

Gregg and Nathaniel (both of whom are notorious Gowalla cheats, which I would never do, no sir) chat a bit about Phat in the latest episode of Ruby5. The Changelog crew also gave their take on Phat in a recent posting. I’ve spent hundreds of hours working on the technology behind Phat over the last [...]

[Read more →]

Tags: Ruby

Ruby Open Files

March 19th, 2010 · No Comments

Get the number of open files for each of your Ruby processes:

    sudo lsof | grep ruby | ruby -e 'h=Hash.new(0);$<.each_line {|line| h[line.split[1]] += 1};p h'

Example output:

    {"3268"=>808, "4513"=>399, "4795"=>237, "5067"=>178, "5083"=>16, "23751"=>108}
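The one-liner is dense, so here is the same counting logic unpacked into a small method. This is only a sketch: the method name `open_files_by_pid` and the sample lines are made up for illustration, and it assumes the usual lsof layout in which the second whitespace-separated column is the PID.

```ruby
# Count lsof output lines per process ID.
# Assumes the second whitespace-separated column of each line is the PID.
def open_files_by_pid(lines)
  counts = Hash.new(0) # missing PIDs start at zero
  lines.each do |line|
    pid = line.split[1]
    counts[pid] += 1 if pid # skip blank or one-column lines
  end
  counts
end

# Hypothetical sample lines standing in for `sudo lsof | grep ruby` output.
sample = [
  "ruby 3268 mike cwd DIR 8,1 4096 2 /",
  "ruby 3268 mike txt REG 8,1 9999 3 /usr/bin/ruby",
  "ruby 4513 mike cwd DIR 8,1 4096 2 /"
]

p open_files_by_pid(sample)
```

To use it in a pipeline like the one above, pass `$stdin.each_line` instead of the sample array.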

[Read more →]

Tags: Ruby

Touch a File

February 27th, 2010 · 1 Comment

Here’s how to touch a file using Ruby, easy as 1-2-3: File.utime(access_time, mod_time, filename)
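As a runnable sketch of that call: `File.utime` raises an error if the file does not exist, so the hypothetical `touch` helper below first opens the file in append mode to create it when needed. The helper name and the temporary path are assumptions for illustration only.

```ruby
require "tmpdir"

# Hypothetical helper: set the access and modification times of `filename`
# to `time`, creating an empty file first if it does not exist.
def touch(filename, time = Time.now)
  File.open(filename, "a") {} # append mode creates the file without truncating it
  File.utime(time, time, filename)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "example.txt")
  touch(path)
  puts File.mtime(path)
end
```

The standard library's `FileUtils.touch` offers the same create-or-update behavior if you would rather not write the helper yourself.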

[Read more →]

Tags: Ruby

The Trouble with Ruby Finalizers

February 24th, 2010 · 6 Comments

I was test driving Devil, the developer’s image library, recently to see if it would work for us in a long-living daemon. Task #1 to that end is to verify the absence of memory leaks, which seem to be common in image libraries. It was almost immediately apparent that Devil contained a large memory leak. [...]

[Read more →]

Tags: Ruby

Asynchronous DNS Resolution

February 10th, 2010 · 4 Comments

Ruby has a serious scalability problem that most Rubyists are unaware of. When you look up the IP address for a hostname, the entire Ruby process blocks by default. If you have a slow DNS server, your process can grind to a halt waiting for hostname resolution. Ruby comes standard with a fix, resolv-replace, which provides a [...]
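For reference, enabling resolv-replace is a one-line require, assuming it is available in your Ruby installation. It swaps the name lookups done by the socket classes over to Resolv, the resolver written in Ruby. The sketch below resolves an IP literal so that it runs without any network access; in real code you would pass a hostname.

```ruby
require "socket"
require "resolv-replace" # socket name lookups now go through the Resolv library

# An IP literal is used here so the sketch needs no DNS server;
# substitute a real hostname in practice.
address = IPSocket.getaddress("127.0.0.1")
puts address
```

Because the require patches the socket classes globally, it is best done once, early in the program's startup.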

[Read more →]

Tags: Ruby

Cassandra and EventMachine

February 9th, 2010 · 6 Comments

I spent this past weekend adding EventMachine support to the Cassandra gem. We’re using Cassandra at OneSpot as our next-gen data store and need EM support. They were nice enough to pull my changes yesterday so the next release of the thrift_client and cassandra gems should work in EM. You just need to do this: [...]

[Read more →]

Tags: Ruby

Scalable Ruby Processing with EventMachine

January 27th, 2010 · 5 Comments

I gave a talk at Austin On Rails last night on using EventMachine, focused on maximizing concurrency when processing a message queue. There were a lot of questions, mostly revolving around the flow of execution within EventMachine code. To this point, there were two common stumbling points people seemed to have: Ruby developers are not [...]

[Read more →]

Tags: Ruby