It’s time for our annual Semantic Versioning argument/gripefest! This time it was kicked off by Jeremy Ashkenas’s post on why he believes Semantic Versioning is wishful thinking. Olivier Lacan chipped in further thoughts on the importance of a changelog.
Yes, Semantic Versioning is wishful thinking. Change cannot be compressed into three version numbers to guarantee safe upgrades. Developers get things wrong and forget changes, so version numbers are often incorrect even when they intend to follow SemVer exactly. I thought I would write down my own versioning policies as another example for people to consider.
If you are building a system to run in the cloud, be prepared to spend much of your time building a resilient system.
Not a fast system. Not a very efficient system. Not a system full of fun, quirky features that users love. A resilient system, because you will see performance and network issues at every connection point. I hope that’s what you want.
I’ve been exploring a few new (to me!) technologies recently and runit is one that I’ve come away really impressed with. Linux distros have a few competing init services available: Upstart, systemd, runit or creaky old sysvinit. Having researched all of them and having built lots of server-side systems over the last two decades, I can firmly recommend runit if you want a server-focused, reliable init system based on the traditional Unix philosophy.
After 2.5 years I’ve decided to move on from The Clymb. I’m incredibly proud of our accomplishments during my time there: the site has dramatically increased in stability and scalability, the GitHub development workflow really improved code quality and we increased the size of the development team from 3 to 15. On the technical side, we moved from manually-configured cloud-based servers to Chef-managed dedicated servers, switched email providers twice, moved warehouses and rewrote our inventory and fulfillment process, integrated our logistics and fulfillment processes with an ERP system, and much more. I’m really proud to have helped grow the business while they helped me grow in management skill.
What am I doing next?
Since the Sidekiq 3.0 release, I’ve been slowly chipping away at some new features in Sidekiq Pro. What’s new and upcoming?
A lot of people ask me “How can I guarantee that a batch of jobs finished successfully?” Here’s the sad fact: you can’t. 99% of the time things go perfectly but there will always be some small percentage that fail for a myriad of reasons: hardware failure, software bug, thunderstorms in the cloud.
I’ve been noticing a theme in certain Rubygems recently that I like: opinionated designs which explicitly don’t allow the user to do certain things. I call these bounded libraries because they draw a functional boundary and won’t go beyond that point.
Starting in MySQL 5.6.5, datetime columns can have an actual useful default of CURRENT_TIMESTAMP and MySQL will auto-populate the columns as necessary. This is incredibly handy if you ever do bulk updates in SQL: now you don’t need to remember to set updated_at! Inserting records manually will auto-populate those columns too. Let’s try it:
create_table :rows do |t|
  t.integer :value
  t.datetime :created_at, null: false, default: "CURRENT_TIMESTAMP"
  t.datetime :updated_at, null: false, default: "CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP"
end
Run that and we’ll see this:
ActiveRecord::StatementInvalid: Mysql2::Error: Invalid default value for 'created_at': CREATE TABLE `rows` (`id` int(11) DEFAULT NULL auto_increment PRIMARY KEY, `value` int(11) NULL, `created_at` datetime DEFAULT 'CURRENT_TIMESTAMP' NOT NULL, `updated_at` datetime DEFAULT 'CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP' NOT NULL) ENGINE=InnoDB
Notice that Rails quotes the default value, making it invalid. We can bypass this by using a custom type to define all the special logic we need and use the generic column definition method:
CREATE_TIMESTAMP = 'DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP'
UPDATE_TIMESTAMP = 'DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP'
create_table :rows do |t|
  t.integer :value
  # Raw SQL type strings are passed through unquoted
  t.column :created_at, CREATE_TIMESTAMP
  t.column :updated_at, UPDATE_TIMESTAMP
end
Big Caveat: you must make sure your database’s timezone is set correctly. MySQL defaults to the system’s timezone and we set our system timezone to Pacific so everything should work fine for us.
mysql> select @@time_zone;
+-------------+
| @@time_zone |
+-------------+
| SYSTEM      |
+-------------+
Defined like that, those columns will be populated and updated any time rows are touched, not just when Rails does it.
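To make the quoting problem concrete, here is a toy model in plain Ruby. Both helper names are hypothetical and this is not how Rails builds its SQL internally; it only imitates the visible effect, assuming a string default gets wrapped in single quotes while a raw type string is passed through untouched.

```ruby
# Toy model only: these helpers are hypothetical and are NOT Rails internals.

# Imitates a typed column with a string default: the default is escaped and
# wrapped in single quotes, so MySQL sees a string literal instead of a function.
def typed_column_sql(name, type, default:)
  quoted = "'" + default.gsub("'", "''") + "'"
  "`#{name}` #{type} DEFAULT #{quoted} NOT NULL"
end

# Imitates a generic column with a raw SQL type string: the text is emitted
# as-is, so CURRENT_TIMESTAMP stays a bare keyword that MySQL can evaluate.
def raw_column_sql(name, raw_type)
  "`#{name}` #{raw_type}"
end

puts typed_column_sql(:created_at, "datetime", default: "CURRENT_TIMESTAMP")
puts raw_column_sql(:created_at, "DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP")
```

The first line printed carries the quoted default that MySQL rejected in the error above; the second carries the unquoted form that the raw column type produces.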
Last year I posted a comparison of various Ruby VMs and how fast they could process N empty jobs. This is the equivalent of pumping out “Hello World” responses in an app server: it’s not very useful for application developers but it’s far more useful than a microbenchmark in determining real Ruby VM performance. Let’s take a look at the three most popular versions available today: MRI 2.1.1, MRI 2.0.0 and JRuby 1.7.11.
Time required to process 50,000 empty jobs with a single Sidekiq process running 25 threads.
Like last year, JRuby continues to dominate in raw runtime performance. MRI 2.1.1 shows a small performance advantage over MRI 2.0.0.
“With Logging” shows some interesting data: just logging the start and finish times of the jobs to the global logger proves to be a major performance hit. The reason is that Ruby’s Logger contains an internal Mutex to ensure that data is logged to the stream atomically. This Mutex becomes a source of contention when 25 threads are processing those no-op jobs. Your first instinct might be to optimize the Logger but this is a red herring! During normal execution the logger won’t be as heavily contended because your jobs are actually doing work.
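The shape of that measurement can be sketched with nothing but the standard library. This is not the benchmark code used for the numbers above and it does not involve Sidekiq; run_empty_jobs is a hypothetical helper that drains a queue of no-op jobs with a pool of threads, optionally logging the start and finish of each job through a shared Logger.

```ruby
require "logger"
require "stringio"
require "benchmark"

# Hypothetical helper, not part of Sidekiq: processes job_count no-op jobs
# with thread_count threads and returns how many jobs were processed.
# When a logger is given, every job logs twice through the shared Logger,
# so all threads compete for its internal lock.
def run_empty_jobs(job_count, thread_count, logger: nil)
  queue = Queue.new
  job_count.times { |i| queue << i }

  workers = Array.new(thread_count) do
    Thread.new do
      count = 0
      loop do
        begin
          job = queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        logger&.info("start job #{job}")
        # the job body is intentionally empty
        logger&.info("done job #{job}")
        count += 1
      end
      count # becomes the thread's value
    end
  end

  workers.sum(&:value) # joins each thread and adds up the per-thread counts
end

if __FILE__ == $PROGRAM_NAME
  quiet  = Benchmark.realtime { run_empty_jobs(50_000, 25) }
  logged = Benchmark.realtime { run_empty_jobs(50_000, 25, logger: Logger.new(StringIO.new)) }
  puts format("without logging: %.3fs, with logging: %.3fs", quiet, logged)
end
```

Timings will vary by machine and Ruby VM, and part of the difference is simply the cost of formatting log lines, so treat the output as a rough illustration of the effect and not as a reproduction of the results above.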
The actual code is here.
Run on a late 2013 Retina MacBook Pro with a 2-core 2.8GHz Core i7, running on battery. Defaults were used for everything.
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
After tons of work on Sidekiq 2.x for the last 18 months, I decided it was time for some deeper refactoring and cleanup necessitating a major version bump.
Sidekiq 3.0 is the result of three months of hacking, cleanup and community suggestions. There’s a huge amount of stuff in here so hang on to your hats…