Interior_view_of_Stockholm_Public_Library

Library Versioning

It’s time for our annual Semantic Versioning argument/gripefest! This time it was kicked off by Jeremy Ashkenas’s post why he believes Semantic Versioning is wishful thinking. Olivier Lacan chipped in further thoughts on the importance of a changelog.

Yes, Semantic Versioning is wishful thinking. Change cannot be compressed into three version numbers to guarantee safe upgrades. Developers get things wrong and forget changes such that versioning often isn’t correct, even if they wanted to follow SemVer exactly1. I thought I would write down my own versioning policies as another example for people to consider.

Continue reading

cloud2_55171579

Building Systems and The Cloud

If you are building a system to run in the cloud, be prepared to spend much of your time building a resilient system.

Not a fast system. Not a very efficient system. Not a system full of fun, quirky features that users love. A resilient system because you will see performance and network issues at every connection point in your system. I hope that’s what you want.
Continue reading

Use runit!

I’ve been exploring a few new (to me!) technologies recently and runit is one that I’ve come away really impressed with. Linux distros have a few competing init services available: Upstart, systemd, runit or creaky old sysvinit. Having researched all of them and having built lots of server-side systems over the last two decades, I can firmly recommend runit if you want a server-focused, reliable init system based on the traditional Unix philosophy.

Continue reading

photo-5

My Next Chapter

After 2.5 years I’ve decided to move on from The Clymb. I’m incredibly proud of our accomplishments during my time there: the site has dramatically increased in stability and scalability, the GitHub development workflow really improved code quality and we increased the size of the development team from 3 to 15. On the technical side, we moved from manually-configured cloud-based servers to Chef-managed dedicated servers, switched email providers twice, moved warehouses and rewrote our inventory and fulfillment process, integrated our logistics and fulfillment processes with an ERP system, and much more. I’m really proud to have helped grow the business while they helped me grow in management skill.

What am I doing next?
Continue reading

Setting MySQL DATETIME column defaults in Rails

Starting in MySQL 5.6.5, datetime columns can have an actual useful default of CURRENT_TIMESTAMP and MySQL will auto-populate the columns as necessary. This is incredibly handy if you ever do bulk updates in SQL, now you don’t need to remember to set updated_at! Inserting records manually will auto-populate those columns too. Let’s try it:

def up
  create_table :rows do |t|
    t.integer :value
    t.datetime :created_at, null: false, default: "CURRENT_TIMESTAMP"
    t.datetime :updated_at, null: false, default: "CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP"
  end
end

Run that and we’ll see this:

ActiveRecord::StatementInvalid: Mysql2::Error: Invalid default value for 'created_at': CREATE TABLE `rows` (`id` int(11) DEFAULT NULL auto_increment PRIMARY KEY, `value` int(11) NULL, `created_at` datetime DEFAULT 'CURRENT_TIMESTAMP' NOT NULL, `updated_at` datetime DEFAULT 'CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP' NOT NULL) ENGINE=InnoDB

Notice that Rails quotes the default value, making it invalid. We can bypass this by using a custom type to define all the special logic we need and use the generic column definition method:

CREATE_TIMESTAMP = 'DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP'
UPDATE_TIMESTAMP = 'DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'

def up
  create_table :rows do |t|
    t.integer :value
    t.column :created_at, CREATE_TIMESTAMP
    t.column :updated_at, UPDATE_TIMESTAMP
   end
end

Big Caveat: you must make sure your database’s timezone is set correctly. MySQL defaults to the system’s timezone and we set our system timezone to Pacific so everything should work fine for us.

$ mysql
mysql> select @@time_zone;
+-------------+
| @@time_zone |
+-------------+
| SYSTEM      |
+-------------+

Defined like that, those columns will be populated and updated any time rows are touched, not just when Rails does it.

Ruby Performance 2014

Last year I posted a comparison of various Ruby VMs and how fast they could process N empty jobs. This is the equivalent of pumping out “Hello World” responses in an app server: it’s not very useful for application developers but it’s far more useful than a microbenchmark in determining real Ruby VM performance. Let’s take a look at the most popular three versions available today: MRI 2.1.1, MRI 2.0.0 and JRuby 1.7.11.

Time required to process 50,000 empty jobs with a single Sidekiq process running 25 threads.

Version Time With Logging
2.1.1 46 sec 67 sec
2.0.0 50 sec 70 sec
1.7.11 33 sec 51 sec

 

Like last year, JRuby continues to dominate in raw runtime performance. 2.1.1 shows a small performance advantage over 2.0.

“With Logging” shows some interesting data: just logging the start and finish times of the jobs to the global logger proves to be a major performance hit. The reason is that Ruby’s Logger contains an internal Mutex to ensure that data is logged to the stream atomically. This Mutex becomes a source of contention when 25 threads are processing those no-op jobs. Your first impression might be to optimize the Logger but this is a red herring! During normal execution the logger won’t be as heavily contented because your jobs are actually doing work.

Details:

The actual code is here.

Run on a late 2013 MBP retina with 2.8Ghz Core i7 with 2 cores running on battery. Defaults were used for everything.

java version “1.7.0_45″
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)