Positive Change


I’ve been in Portland for a week. So far, it’s amazing. I really don’t want to blather on about how great it is because, to be honest, I’m afraid of boyish optimism. This town, like college, will probably give back whatever I put into it. So I’m pacing myself. I think it will be good though.

Our house is completely empty while we wait for our movers to arrive and that’s ok. I’ve been getting a lot done without all the distractions. One of my favorite pictures of Steve Jobs is where he is sitting in an almost empty room with nothing but books. I’m not trying to be Steve Jobs but I appreciate the minimalism because my house looks very similar to this picture right now.


I went downtown and got through most of the angular.js tutorial. The commute to an amazing spot downtown was vastly less painful than the equivalent trip in DC. I'm floored and excited.

I’ve converted my blog from WordPress to Jekyll and GitHub Pages (sorry for any weird problems). I went to a TechFest event downtown to meet people in the tech scene. It was really great. Puppet Labs folks seem really nice (others as well). The three Ruby shops I talked to are desperate for senior Ruby people. I’m not really looking right now but it’s good to know that nice places are local.

We’ve just been settling in. I just wanted to post something positive since my last DC rant. I’m far too negative about my past geographical location. I need to get over it and not dwell on it or make it part of my identity. You get back what you put in. Any city can be nice, it’s positive thinking and attitude that determines what your experience is. Sorry if that’s corny but I’m not giving in to my typical cynicism this time.

Default DC Tech is Just Bad


The opinions on this blog, and especially in this post, are mine and not my employer's.

I'm done with DC. I need to archive the reasons why for myself. I hope this serves as a free field trip to the DC area for anyone outside the beltway.


If you move to DC for the tech jobs, you are going to have to prune a lot of C-minus government work if you are good. All the while, you will be paying for local benefits you are not taking advantage of. This is the land of politics, military, intelligence, big government and lobbyists. I tried to influence from within but now it's time for me to GTFO and move to Portland to try to find "actual reality" jobs.



On a Ruby Rogues podcast about Passion, Avdi continued to enlighten and entertain me with his insights. I've really been enjoying his speaking style and voice lately through Ruby Tapas and talks. If he reads this, I hope he understands I don't disagree with what he is saying; I thought he would enjoy a related story.

Honestly, this topic is so massive I don't think I can offer much more than the Rogues did on the podcast, so I encourage you to listen to the episode yourself. It has almost nothing to do with programming or Ruby. Philosophies and stories about passion sit so close to the difficult and inevitable goal of "master yourself", which is both complicated and personal, that I can barely approach the topic before a rat's nest of anecdotes and advice explodes all around us.

With that context laid out, here are a few stories.

On the podcast, the rogues talked about two PhDs who would scream at each other in the office on a daily basis. Avdi said that he's had similar experiences personally and said:

I've gotten upset with people for their code because it was so stupid. I've gotten angry and I've said mean things. And you know what? All those instances, those were wastes of my passion. That was wasted emotional energy.

I know what he's trying to say. There are moments when this is true. In fact, I would say in the majority of cases it's better to just "get over it" (a challenge itself). Most of the time I try to run in this mode. Most of the time I fail. It's especially hard when you feel like you need to "represent".

Story Time

Here was my situation. I shared an office room with a "lead architect" I'll call "Bradly". Bradly was not his real name but it will help you remember that Bradly was Bad. He was a lead architect of our project but he couldn't code and he couldn't build servers. He had some very narrow skills in a certain problem domain but those skills weren't general enough for him to be an "architect". He got the title/position through a previous successful project. I'm just setting up the scenario here.

We had an application in the middle tier; it doesn't really matter what it did except that it talked to a database. His bright idea was to install a database on every node to reduce network traffic. We were vendor-locked into Oracle. We had at least 10 servers that this design decision would impact. We really needed 1 database but we would be purchasing 10. I was the only one on the project that could or would argue against the decision.

It was very simple from my point of view:

  • Three-tier architecture is front, middle, back. Normally that's web, app server, database.
  • No one installs databases on their app servers to reduce network traffic.
  • We didn't know that network traffic was the (or a) bottleneck.
  • Oracle database licenses sell for about $20k + $?? annual support.
  • We were going to have a failover site, so this single decision was on the order of $400k (10 licenses × 2 sites × ~$20k).
  • Running 10 databases is hard. Replication is hard. Oracle RDBMS does not "want" this layout.

Bradly's argument was:

  • Networks (gigabit, brand new awesome switches) are slow.
  • Local databases would avoid the network.

Just to be clear, this is a simplified version of what Bradly wanted.


This is what I wanted.


I did my best to remain calm but this is the represent problem. I felt compelled to bring the "outside world" to his mind. I knew that no one did it like this. And I imagined a stadium of my peers agreeing with me if they only knew what he was saying. "What would the Internet think? Oh my god! They would laugh! Imagine our embarrassment! What kind of project is this?!" etc.

I tried to approach it logically. I showed vendor diagrams and documentation of example architectures from the very vendor product we were using and no diagrams showed locally installed databases. Eventually it came down to his argument: "I own the architecture, I'm the architect."

In this case, I was very passionate. I was furious. I had to represent the outside world. I couldn't let it go. I had to teach him that no one does it this way. I had to represent.


We eventually got into the only screaming argument I've ever had in my 14-ish year career. It was bad. We didn't talk anymore. We split the team. We pitched our designs to management separately. Management was trying to side with one of us since money was involved. I moved offices so I didn't have to sit with him anymore. It was no fun, it was a bad situation.

In the end, he moved on to other projects and I was able to influence the project back to normalcy. When new developers joined the project, some artifacts of this debate would surface (like an old diagram). Someone with three-tier experience would say "what the hell is this" and I'd have to explain the whole thing. In those rare cases I felt righteous, but to this day I feel awful about the whole story. My working relationship with him was ruined and I hate the memory of that job because of experiences like this one and others.

I'd rather teach than win. But he wasn't there to be a teacher and I wasn't given that role or power. When some people go to work they optimize for their career. They start off the day wanting to perform a job, move up and at the very least maintain the power they have. Because this is the best they've ever been in their career.

When I wake up, I optimize for experience. I start off the day ready to learn, improve and at the very least make myself or other people better at what we spend our time doing. Because this is the very worst I'll ever be in my career.

So Avdi, in this and other things, I hear you.

DRY up Methods with Ruby Blocks


Let's do something terrible by hand. First, here's our data. It comes from a database.

db_results = [
  { id: 1, login: 'mjay', roles: ['user'], projects: ['muffins'] },
  { id: 2, login: 'rroke', roles: ['admin', 'user'], projects: ['security'] },
  { id: 3, login: 'tpain', roles: ['user'], projects: ['muffins'] },
  { id: 4, login: 'ghaz', roles: ['admin', 'user'], projects: ['muffins', 'cakes'] },
  { id: 5, login: 'bbarker', roles: ['user'], projects: ['pies'] }
]

Now when working with these people, we probably could get away with doing something like this for a while:

# find all admins
admins = db_results.select {|user| user[:roles].include? 'admin' }

Which is fine. Until you want to find out which people are on the Muffin Project:

# find all people working on the muffins project
people_on_muffins = db_results.select {|user| user[:projects].include? 'muffins' }

But as you keep working, you might get a feeling of déjà vu. The two select calls above are very similar. You might be inspired by other Ruby libraries which give you a tiny DSL, or at least allow you to pass blocks into methods to be more expressive.

The Smell

Here’s the complete, smelly example.

db_results = [
  { id: 1, login: 'mjay', roles: ['user'], projects: ['muffins'] },
  { id: 2, login: 'rroke', roles: ['admin', 'user'], projects: ['security'] },
  { id: 3, login: 'tpain', roles: ['user'], projects: ['muffins'] },
  { id: 4, login: 'ghaz', roles: ['admin', 'user'], projects: ['muffins', 'cakes'] },
  { id: 5, login: 'bbarker', roles: ['user'], projects: ['pies'] }
]

admins = db_results.select {|user| user[:roles].include? 'admin' }
people_on_muffins = db_results.select {|user| user[:projects].include? 'muffins' }
meeting = admins + people_on_muffins
meeting_ids = meeting.collect {|user| user[:login] }.uniq

puts meeting_ids
# => rroke ghaz mjay tpain

We're having a meeting between the admins and people who are on the Muffin Project. The only person not matching these rules in this case is Bob Barker (bbarker). He must be busy enjoying retirement eating pie, who knows.


Let's take a look at Faraday, which uses blocks to great effect to communicate intent, just like many Ruby libraries. This is how an HTTP POST is done with Faraday:

conn.post do |req|
  req.url '/nigiri'
  req.headers['Content-Type'] = 'application/json'
  req.body = '{ "name": "Unagi" }'
end
This is kind of nice! You can get more than one thing done at a time and it doesn't require a lot of temporary variables. Let's see if we can use blocks like this. We'll get to blocks in a minute; first, let's refactor a little bit.
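As an aside, the block-taking style above is easy to mimic. Here's a tiny sketch with a made-up Request struct standing in for Faraday's request object (this is not Faraday's actual internals):

```ruby
# A made-up Request struct, just to illustrate the yield-an-object pattern
Request = Struct.new(:url, :body)

# a method that yields a request object to its block, Faraday-style
def post
  req = Request.new
  yield req
  "POST #{req.url} #{req.body}"
end

post do |req|
  req.url = '/nigiri'
  req.body = '{ "name": "Unagi" }'
end
# => 'POST /nigiri { "name": "Unagi" }'
```

The caller gets to configure several things in one place, which is the same quality we want in our refactor below.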

The Fix

There's a certain similarity between the two selects. We really want to get "admins" and "project people" all together, so let's just do that. We'll create two methods that essentially replace the inline selects but can be reused for other rules in the future. We'll call them with_roles and with_projects.

def with_roles(results, role)
  results.select {|user| user[:roles].include? role }
end

def with_projects(results, project)
  results.select {|user| user[:projects].include? project }
end

Next, we'll create a method that takes a block.

def user_ids(results, &block)
  rows = block.call
  rows.collect {|user| user[:login] }.uniq if rows
end

The explicit &block argument is optional. You could also write it like this:

def user_ids(results)
  rows = results.dup
  rows = yield if block_given?
  rows.collect {|user| user[:login] }.uniq
end

But in that case, the block is optional, so you'll want to check for block_given?. For this example, it's easier for us to require a block to make this a shorter post ... err, well I guess it's longer now.
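To see the optional-block pattern in isolation, here's a throwaway method (a hypothetical greet, not part of the post's code) that uses block_given? the same way:

```ruby
# the block is optional thanks to block_given?
def greet(name)
  message = "Hello, #{name}"
  message = yield(message) if block_given?
  message
end

greet("world")                  # => "Hello, world"
greet("world") {|m| m.upcase }  # => "HELLO, WORLD"
```

With no block, the default value passes through; with a block, the caller gets to transform it.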

In any event, this method's job is to filter results (users) with whatever code is passed in. Then it uniques the collected array because user IDs are assumed here to be unique. Finally, it returns just user IDs, like its name implies.

The usage of this user_ids method that takes a block ends up reading very well.

admins = user_ids(db_results) do
  with_roles(db_results, 'admin') +
  with_projects(db_results, 'muffins')
end

puts admins
# => rroke ghaz mjay tpain

Here's the completed, less smelly example.

db_results = [
  { id: 1, login: 'mjay', roles: ['user'], projects: ['muffins'] },
  { id: 2, login: 'rroke', roles: ['admin', 'user'], projects: ['security'] },
  { id: 3, login: 'tpain', roles: ['user'], projects: ['muffins'] },
  { id: 4, login: 'ghaz', roles: ['admin', 'user'], projects: ['muffins', 'cakes'] },
  { id: 5, login: 'bbarker', roles: ['user'], projects: ['pies'] }
]

def with_roles(results, role)
  results.select {|user| user[:roles].include? role }
end

def with_projects(results, project)
  results.select {|user| user[:projects].include? project }
end

def user_ids(results)
  rows = results.dup
  rows = yield if block_given?
  rows.collect {|user| user[:login] }.uniq
end

admins = user_ids(db_results) do
  with_roles(db_results, 'admin') +
  with_projects(db_results, 'muffins')
end

puts admins
# => rroke ghaz mjay tpain

# usage without a block, showing that it's a little more flexible
# puts user_ids(db_results)
# => returns everyone because no filtering block was passed

Wrap Up

This is pretty procedural. I'll leave it to you to put it into a class, maybe add something better than a "plus" operator to combine the user lists. Maybe a UserList abstraction class could help get away from hashes too.
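One direction such a UserList class could take (this is my sketch; every class and method name here is hypothetical):

```ruby
db_results = [
  { id: 1, login: 'mjay', roles: ['user'], projects: ['muffins'] },
  { id: 2, login: 'rroke', roles: ['admin', 'user'], projects: ['security'] },
  { id: 3, login: 'tpain', roles: ['user'], projects: ['muffins'] },
  { id: 4, login: 'ghaz', roles: ['admin', 'user'], projects: ['muffins', 'cakes'] },
  { id: 5, login: 'bbarker', roles: ['user'], projects: ['pies'] }
]

# A hypothetical UserList abstraction over the raw hashes
class UserList
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end

  def with_roles(role)
    UserList.new(select {|user| user[:roles].include? role })
  end

  def with_projects(project)
    UserList.new(select {|user| user[:projects].include? project })
  end

  # set union instead of a plain "plus" so duplicates disappear when combining
  def +(other)
    UserList.new(@rows | other.to_a)
  end

  def ids
    collect {|user| user[:login] }.uniq
  end
end

users = UserList.new(db_results)
meeting = users.with_roles('admin') + users.with_projects('muffins')
meeting.ids  # => ["rroke", "ghaz", "mjay", "tpain"]
```

The filters now chain and combine without passing db_results around, which reads closer to the DSL feel we were after.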

I like going down these paths because you end up with more expressive code that is flexible to change. At the same time, little hints of DSLs come out when using blocks to this effect. This is starting down the path of a Ruby DSL. I'll be posting about that pretty soon.

Problems with "The Cloud"


I've been thinking about the problems with The Cloud beyond its being a raging buzzword. It really comes down to Control and Connectivity. That's the problem, but allow me to elaborate.


Google Wave is a great example of control loss. If you really put a lot of energy, stock and trust into Google Wave as a content store for your team, brain or idea then you might feel deflated by its cancellation. Even as an idea and a disruptive alternative to E-mail or SMTP crappiness, it's a shame it had to die. So what now? Wait for an open source version? Host your own?

The idea was to "put it in the cloud" and forget about it. But when the cloud changes outside your control, you have to be aware of it again. Now you really have to think about the cloud itself. It's not such a vague black box which is what the cloud diagram really means.

Another example of control is YouTube. I use YouTube favorites as a persistent list. I see a cool video, I favorite it and I feel like I sort of own it, or at least it's in a list that I can refer to later. But take a look at this:


What were those things? Who knows! Now, I have to think about "the cloud" again. These are temporary videos that someone else ultimately controls. I'm just adding references to a list. I don't own the clips. They are transient. They are ephemeral. I'm out of control again. I don't even know what media I've lost. Do I mitigate again? Do I suck down a list periodically and do a diff?


I recently got a Roku box for my TV. It's a great box. During registration it does a bunch of sign-up and account creation. But it doesn't work without UPnP enabled on the router. This isn't even a connectivity outage thing; it's a connectivity assumption: that my firewall is the kind that lets holes get punched in it automatically ... or that I'm not capable of punching the holes myself. I don't even really know why Roku does this UPnP thing. All I know is, it wouldn't even finish the setup until I made this change. So here's a device that doesn't work without connectivity or a clear path to connectivity.

Think about how picky that is for a second. If it's not picky then think about how many technical barriers there are to pure or uniform Internet. Everyone brings their own quilted environment and it's a mess.

IPv6 Spike

A spike is when you play around with something for the purposes of learning and then throw the result away. So, let's play around with IPv6. I had read a little bit about it, but essentially my working experience with IPv6 was nothing except for disabling it. Let's learn some stuff!

I'm going to skip over all the history of IPv6 and assume that you agree with me and think that this is important and relevant to the future of the Internet.


First, build 4 Ubuntu VMs. I'm using 13.04 but any current Linux distro should work, just the packages and paths will change. I found the best way is to build a simple VM and then clone it 3 more times (in Fusion this is copy/paste and resetting the MAC address). You'll need four machines to simulate a local network. You won't need any network hardware and VMware will be able to simulate everything we need. You can actually do this whole experiment on one real box (cool stuff)!

The goal of this spike is:

  • Build 4 VMs
  • Make a router, a web server, a dns server and a client
  • Hit a web page between two network boundaries over IPv6 only

Super practical IPv6 primer

Addressing is WEIRD. That's really what I wanted to spike on. Getting comfortable with the addressing length, hexadecimal and understanding the addressing layout.

In IPv4, a network segment might look like this: So a box with an IP on that network might be: ip:, netmask:

IPv6 is a lot different. Private addresses don't start with 192.168., 172. or 10. Private addresses start with fc00 (from what I've read). So I made up two network segments:

fc00:deed:d34d:b33f
fc00:deee:deee:deee

But that's only 4 groups of hex. Full IPv6 addresses have 8 groups of 4 hex digits, like this:

nnnn:nnnn:nnnn:nnnn:hhhh:hhhh:hhhh:hhhh

(where n's are the network parts and h's are the host parts in a /64)

So let's configure a box with an IP. Our boxes are named after onomatopoeias (boing, wap, rawr and piff). Boing's address ends in 10:

boing: fc00:deed:d34d:b33f::10/64

So there's a box that's configured with an IP. Notice the double colons. That's shorthand: it just means "fill in the zeros between". The /64 is the network segment. Compare with IPv4: is a common private IP. The /24 is out of 32 bits, so it means X.X.X.Y where X.X.X is the network part and Y is the host part: 192.168.0.* is the network and .1 is the host. In IPv6, the /64 is out of a total /128.

So my private address space is just like an IPv4 private range. I'm assigning this IPv6 space and I have 18 quintillion (2^64) private addresses for my ONE SUBNET. For a router to work, I need two subnets. So now I have 36 QUINTILLION free private addresses. O_o
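If the shorthand still feels slippery, Ruby's stdlib IPAddr (an aside on my part, not something this spike requires) can expand the double colons and check subnet membership for you:

```ruby
require 'ipaddr'

# boing's address, written with the :: shorthand
addr = IPAddr.new("fc00:deed:d34d:b33f::10")
# expand to the full 8 groups of 4 hex digits
addr.to_string  # => "fc00:deed:d34d:b33f:0000:0000:0000:0010"

# is boing's address inside our /64 segment?
net = IPAddr.new("fc00:deed:d34d:b33f::/64")
net.include?(addr)  # => true

# host addresses available in a single /64 subnet
2 ** 64  # => 18446744073709551616
```

Seeing the expanded form next to the shorthand makes the network/host split in a /64 much more obvious.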

Address Configuration

I'm using IPv4 just for remote admin and installing things through apt, so you'll have to add another IP to each VM's network card. Ubuntu does this in the file /etc/network/interfaces. Here's an example.

boing - dns
	  iface eth0 inet6 static
	  	address fc00:deed:d34d:b33f::10
	  	netmask 64
	  	gateway fc00:deed:d34d:b33f::1

If you type ifconfig or `ip addr` you will see that it has two IPs now. One IPv4 and one IPv6 address. We're not quite done. I drew a picture of the network layout and you'll have to configure all the VMs like this.


Router Configuration

This is really easy. You just need the Rawr box to forward IPv6 packets like a router but not act like a firewall. Linux can do that with a simple kernel setting. But first, you'll need to add a second network card in VMware. So:

  • Shutdown rawr
  • Add a second network card
  • Boot rawr
  • Edit /etc/sysctl.conf, change net.ipv6.conf.all.forwarding = 1
  • Run sysctl -p

Routing through rawr should work at this point. For example, from piff, you should be able to ping boing through ipv6 even though they aren't on the same network segment. Use ping6 and traceroute6 to sanity check.

DNS Configuration

Boing is our DNS server so let's make some changes. First, apt-get install bind9. Then edit these files below. I configured a temporary subdomain on squarism.com called ipv6.squarism.com but this can be anything you want.

zone "ipv6.squarism.com" {
        type master;
        file "/etc/bind/db.ipv6.squarism.com";
};

zone "f.3.3.b.d.4.3.d.d.e.e.d.0.0.c.f.ip6.arpa" {
        type master;
        file "/etc/bind/db.fc00_deed_d34d_b33f";
};

Notice that the reverse zone (ip6.arpa) is super annoying to type out. It needs to be the address nibbles in reverse order (afaik).

$TTL 2D ; zone default 2 days
$ORIGIN ipv6.squarism.com.

@                       IN SOA  ns1.ipv6.squarism.com. hostmaster.squarism.com. (
                                2013062702      ; serial
                                3H              ; refresh
                                15M             ; retry
                                1W              ; expire
                                1D              ; minimum
                                )

                        IN      NS      ns1.ipv6.squarism.com.

ns1                             AAAA    fc00:deed:d34d:b33f::10  ; dns server

; hosts
boing                   IN      AAAA    fc00:deed:d34d:b33f::10
wap                     IN      AAAA    fc00:deed:d34d:b33f::11
rawr                    IN      AAAA    fc00:deed:d34d:b33f::1
piff                    IN      AAAA    fc00:deee:deee:deee::6

; aliases
www                     IN      CNAME   wap

Here we're using the AAAA records for IPv6. In IPv4 these would be A records. Bind has supported AAAA records for a long time. CNAME records don't change. Notice that you could easily run a DNS server that serves both stacks.

; fc00:deed:d34d:b33f::/64
$TTL 1h ; Default TTL
@       IN      SOA     ns1.ipv6.squarism.com.  hostmaster.squarism.com. (
        2013072801      ; serial
        1h              ; slave refresh interval
        15m             ; slave retry interval
        1w              ; slave copy expire time
        1h              ; NXDOMAIN cache time
        )

; domain name servers
@       IN      NS      ns1.ipv6.squarism.com.

; IPv6 PTR entries (names are reversed nibbles, relative to the zone origin)    IN    PTR    boing.ipv6.squarism.com.    IN    PTR    wap.ipv6.squarism.com.     IN    PTR    rawr.ipv6.squarism.com.
; piff (fc00:deee:deee:deee::6) would go in the reverse zone for its own /64

I used a reverse zone generator at rdns6.com for this last file. It’s amazingly annoying to type out. I’m not sure if there’s a more convenient form that could be used.

Restart bind: sudo service bind9 restart

Check the logs in /var/log/syslog for any `named:` errors. I had a few typos I had to chase down. DNS can be tricky to set up, so take your time.

See DNS working

Ok, let's take a quick break from this infinite configuration and see how we are doing so far. At this point we should have routing and DNS working. So that means that Piff should be able to ping a DNS name and it should work.

But first, we need to tell Piff and all the other boxes to use our new dns server. Edit /etc/network/interfaces again and add

dns-search ipv6.squarism.com
dns-nameservers fc00:deed:d34d:b33f::10

For example, your eth0 block will look like this

# The primary network interface
auto eth0
iface eth0 inet dhcp
iface eth0 inet6 static
	address fc00:deed:d34d:b33f::11
	netmask 64
	gateway fc00:deed:d34d:b33f::1
	dns-search ipv6.squarism.com
	dns-nameservers fc00:deed:d34d:b33f::10

Restart networking. Even though your VMs are running DHCP (they are by default), we're just adding a DNS server to the static IPv6 address. In other Linux distros, this file will be different (sorry).

piff:~$ ping6 www.ipv6.squarism.com
PING www.ipv6.squarism.com(fc00:deed:d34d:b33f::11) 56 data bytes
64 bytes from fc00:deed:d34d:b33f::11: icmp_seq=1 ttl=64 time=1.12 ms
64 bytes from fc00:deed:d34d:b33f::11: icmp_seq=2 ttl=64 time=0.553 ms

Ok great. DNS and routing are working. Now we are ready for the final part.

Web Server Configuration

Get dependencies installed on Wap (the web server):

aptitude install zlib1g-dev libssl-dev libpcre3-dev

We need IPv6 support compiled in and I'm not sure the OS packages come with it out of the box. Installing nginx from source is easy, so let's download, configure and compile it.

# download latest stable and untar ...
./configure --with-ipv6 --prefix=/opt/nginx
make
make install
cd /opt/nginx
vi conf/nginx.conf
  # change this line
  listen       [::]:80 default ipv6only=on;

Start nginx: /opt/nginx/sbin/nginx -c /opt/nginx/conf/nginx.conf

Normally, I'd write an init.d script here, but whatever. You can stop nginx like this: sudo /opt/nginx/sbin/nginx -s stop

Sanity check:

netstat -nlp | grep nginx
tcp6       0      0 :::80      :::*       LISTEN      24618/nginx.conf

Let's create a dummy web page on Wap (the nginx box); save it as ipv6.html in nginx's default document root, /opt/nginx/html. We'll test this page in the next section.

<h1>This is only viewable over ipv6!</h1>

End to end test

Ok, everything is set up. So let's see it work end to end. Our goal was to hit an IPv6 web server through DNS and a router.

Here you can see I'm pinging the webserver from the web client (piff -> wap).

piff:~$ ping <wap-ipv4-address>
PING <wap-ipv4-address> 56(84) bytes of data.
64 bytes from <wap-ipv4-address>: icmp_req=1 ttl=64 time=0.341 ms

IPv4 is directly routable because of VMware, so I can ping wap's IPv4 address directly. The IPv6 traffic is different: the client isn't on the same IPv6 segment as the web server. So that's our IPv4 sanity test, but not why we did all this.

You can see that when I try to hit that ipv6.html test page we created earlier over IPv4, it won't work.

piff:~$ curl http://<wap-ipv4-address>/ipv6.html
curl: (7) Failed connect to <wap-ipv4-address>:80; Connection refused

It actually says connection refused, and this makes sense if you look at the netstat information from Wap. It's not listening on the IPv4 side (, it's listening on :::80. Crazy!

If I use IPv6, it works (curl needs the -g flag so it doesn't try to interpret the brackets around the literal address):

piff:~$ curl -g http://[fc00:deed:d34d:b33f::11]/ipv6.html
<h1>This is only viewable over ipv6!</h1>

And ipv6 DNS is working.

piff:~$ curl -g http://wap.ipv6.squarism.com/ipv6.html
<h1>This is only viewable over ipv6!</h1>

You can see it's going through a router:

piff:~$ traceroute6 wap.ipv6.squarism.com
traceroute to wap.ipv6.squarism.com (fc00:deed:d34d:b33f::11)
from fc00:deee:deee:deee::6, 30 hops max, 24 byte packets
 1  fc00:deee:deee:deee::1 (fc00:deee:deee:deee::1)  0.739 ms  0.651 ms  0.185 ms
 2  fc00:deed:d34d:b33f::11 (fc00:deed:d34d:b33f::11)  1.266 ms  0.278 ms  0.262 ms

Wget works too

piff:~$ wget -O- http://wap.ipv6.squarism.com/ipv6.html

SSH needs no special flags; it just works.

piff:~$ ssh wap.ipv6.squarism.com
The authenticity of host 'wap.ipv6.squarism.com (fc00:deed:d34d:b33f::11)' can't be established.
ECDSA key fingerprint is -----.
Are you sure you want to continue connecting (yes/no)? yes

Victory Lap

Even firefox works.


Just to prove that this isn't IPv4, let's use the weird numerical URL format for the IP.

Well, this was a fun spike. I feel like I understand IPv6 a whole lot more, and big scary addressing no longer strikes fear into my heart. I think the key is to actually use DNS instead of fudging it by typing manual addresses or managing crazy hosts files. It will be interesting to see when ISPs and cloud providers start offering serious options for IPv6.

The Best Way to Read CSV in Ruby


CSV is awful. CSV isn't well formed. It isn't hard to use because it's bloated or slow; it's hard to use because it's just a dumb data format. However, sometimes all you have is stupid data, and who cares, let's do this thing and blot out the memories.

I assume you know how to use the CSV module that's built into Ruby. It's pretty easy. You just read a file in and you get some 2D array back. It usually comes out pretty horrible with long methods and little room for nice abstractions.
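As a quick refresher on that 2D-array behavior (a minimal sketch, separate from the dinner-party code below), the stdlib turns CSV text into arrays of arrays:

```ruby
require 'csv'

# parse a string of CSV; CSV.read does the same thing for a file on disk
rows = CSV.parse("a,b\n1,2\n3,4")
rows  # => [["a", "b"], ["1", "2"], ["3", "4"]]
```

Everything comes back as strings in positional arrays, which is exactly why the abstraction layer below is worth building.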

So what if you want to polish it up a little bit? Maybe you aren't just going to kludge this thing again and hate yourself later? What if you aren't just going to load this into a database? What if you want to do some quick CSV analysis but at the same time make it come out sort of readable?

Let's take a look at an abstraction layer and see how we could write a CSV loader for a guest list. We're going to have a dinner party, and Evite gave us a crappy CSV dump of who's responded so far. Well, it's what we have. But how many people are coming, and how many groups aren't allergic to peanuts? We want to know how many peanut M&Ms to buy.

Here's our data:

Name, Plus, RSVP'd, Peanut Allergies
Tom DeLuise, 1, No, Yes
Mel Brooks, 3, Yes, Yes
Lewis Black, 5, Yes, No
Jon Stewart, 3, Yes, Yes
Jim Gaffigan, 0, Yes, No

Supermodel is pretty old and I like it a lot, but it hasn't been updated in a while and has some open pull requests. I took a look at some alternatives but they didn't work out:

  • ActiveModel from Rails 3 is hard to make generic.
  • ActiveModel::Model from Rails 4 is a great upgrade from 3.x. You can make anything look like a database object, but it still doesn't have the concept of a collection. So now I have to make an array variable called table? This is weird.
  • Sequel has a nice interface to an in-memory sqlite3 database. It's probably the most 'real' option I found, but it requires a CREATE TABLE statement even for an in-memory database.

None of these alternatives above are bad but let's take a look and see how nice we can get it with Supermodel.

First, we are going to use a supermodel fork so that we automatically get Rails 3.2.13 instead of 3.0.x. Create a project folder and a Gemfile:

source "https://rubygems.org"
gem 'supermodel', :git => 'https://github.com/amdtech/supermodel.git'

Run bundle.

require 'csv'
require 'supermodel'

class Guest < SuperModel::Base
  validates_presence_of :name
end

class CSVImporter
  def import filename
    csv = CSV.read(File.open(filename))
    remove_headers csv

    csv.each do |row|
      Guest.create attributes_for(row)
    end
  end

  # i know i've gotten this to work more elegantly
  def remove_headers csv
    csv.delete_at 0
  end

  def attributes_for row
    row = strip_row!(row)
    {
      name:             row[0],
      plus:             row[1].to_i,
      rsvp:             row[2] == "Yes",
      peanut_allergies: row[3] == "Yes"
    }
  end

  def strip_row!(row)
    row.collect {|cell| cell.strip }
  end
end

# load the Evite dump (adjust the filename to wherever you saved it)
CSVImporter.new.import 'guests.csv'

puts "How many people are coming?"
puts Guest.all.reduce(0) {|sum, guest| sum += guest.plus + 1}

peanut_eating_guests = Guest.all.select {|guest| guest.peanut_allergies == false }
peanut_guest_count = peanut_eating_guests.inject(0) {|sum, guest| sum += guest.plus + 1 }
puts "Number of guests eating peanut M&Ms:"
puts peanut_guest_count

# How many people are coming?
# 17
# Number of guests eating peanut M&Ms:
# 7

You can see that Guest.all is much more intent revealing than manipulating a 2D array by hand.

Rails Dev Shops in Washington DC


What shops, companies, consultants, startups or other folks are using Ruby or Rails (on any level)? Contact me on twitter if you want to be added or you have corrections: @squarism or leave a comment below.

Radiant CMS: Radiant is a no-fluff, open source content management system designed for small teams.
Triple Dog Dare: Has your Rails (or Ruby) project gone off the tracks? Did you outsource your work on the cheap only to find that your application is bug-ridden and slow? I can help make it better. Wrangling chaos is one of my specialties.
Intridea: We don't just make web apps - we solve problems. At Intridea we write intelligently designed software to help businesses develop strategic solutions and launch new ideas.
CodeSherpas: CodeSherpas is a full life-cycle software development and design firm based in Reston, VA.
LivingSocial: LivingSocial is a deal-of-the-day website that features discounted gift certificates usable at local or national companies. Based in Washington, D.C., LivingSocial now has more than 70 million members around the world.
Bloomberg: Bloomberg, the global business and financial information and news leader, gives influential decision makers a critical edge by connecting them to a dynamic network of information, people and ideas.
Monkeysee.com: Monkey See captures the skill and knowledge of the world's top experts and delivers it to inquisitive audiences everywhere.
Gannett ???
Comcast ???
Sprint ???

The Supermodel Ruby Gem Loses Data


Actually no.

I love Supermodel. It might be overcome by ActiveModel::Model in Rails 4 but until then Supermodel is a fantastic in-memory database for Ruby that has a lot of advantages over using just a plain hash or trying to roll your own.

However, using it with a large amount of data, we noticed it loses data. Sometimes a few records, other times a few more. It was really random. We were confused. Looking at the docs, this is the default class maccman has in his README.

class Test < SuperModel::Base
end

That works no problem. We looked at the IDs that it uses and saw that it's using the Ruby object_id, which is about 14 digits long.

#<Test:0x007f80e41dbd18 @new_record=false, @attributes={"bacon"=>"tasty",
 "id"=>70095779847820}, @changed_attributes={}, @validation_context=nil,
@errors={}, @previously_changed={"bacon"=>[nil, "tasty"]}>

Ok, that ID of 70095779847820 seems good enough right? Let's see!

require 'supermodel'

class FancyPants < SuperModel::Base
end

# create one thousand pairs of fancy pants
1_000.times {
  FancyPants.create(glitter: true)
}

raise "Nooo!  My fancy pants!" if FancyPants.count < 1_000

Run it.

RuntimeError: Nooo! My fancy pants!


Well, I'm no expert but I bet object_ids in Ruby aren't very random. I would hope they wouldn't be, because you're creating objects all the time, right? Ruby is slow enough without some super accurate id field. Should we abandon all hope and scatter our dreams in despair? Nope.

Supermodel has a documented solution for this. Just add this mixin into your class.

include SuperModel::RandomID

This will make the IDs more random and you'll find 1,000 pairs of fancy pants in your class. The oddball thing for me was realizing that Supermodel 'loses data'. But it doesn't. IMHO, this mixin should probably be the default. I find Supermodel an awesome quick and dirty database, but a database shouldn't lose records silently.
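For the curious, the core idea behind the RandomID mixin is simply "use a random value, not object_id". A stdlib-only sketch of why that fixes the collisions (SecureRandom here stands in for whatever the gem actually uses internally):

```ruby
require 'securerandom'

# Random hex IDs: across 1,000 records a collision is vanishingly
# unlikely, unlike recycled object_id values.
ids = Array.new(1_000) { SecureRandom.hex(16) }
puts ids.uniq.size  # 1000
```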

I still love Supermodel. I've played around with other in-memory options (in-memory sqlite3 under Rails 3, Rails 4 and datamapper) and Supermodel works like I want it to.

Super Interesting Talks from RubyConf 2012


Trying to summarize someone's 30-60 minute talk is really hard. So apologies go out to anyone I'm trying to paraphrase here. I took it upon myself to watch every single video from RubyConf 2012, which started airing in November. It's May now. There's a lot of content there and you can't just slurp it down and expect to process it all. So I thought I'd leave little breadcrumbs to myself noting which things were super interesting to me.

Real Time Salami - Aaron Patterson
Any presentation by @tenderlove is great and this one was fun and interesting as expected. Aaron talks about parallelism, streaming and making Salami (actual salami).

Tokaido: Making Ruby Better on OSX - Yehuda Katz
This was an exciting talk about Tokaido which is a work in progress to make a Rails.app one-click super-easy dev tool for Mac. He talks about other platforms too, don't worry. This talk really makes you appreciate how hard this problem is. There are some super interesting low-level OSX details in there.

Why JRuby Works - Charles Nutter, Thomas Enebo
This was a great talk about JRuby and a very convincing presentation. Since watching it, I've been playing with Torquebox and JRuby. Unfortunately the audio and video are a bit weird. For me, I loved the part about garbage collection. It was a great summary of how good the JVM is at garbage collection.

Zero Downtime Deploys Made Easy - Matt Duncan
This talk was great. Matt walks through all the problems you will encounter when trying to reach a large number of nines. He covers a lot of gotchas, like "whoops that database migration locks the entire table and just took your site down". He covers how Yammer does database changes, managing job queues and external services when you are trying to keep uptime at maximum. This was definitely an eye opening talk.

Y Not -- Adventures in Functional Programming - Jim Weirich
OMG. This talk left my brain on the floor. I can't really explain how awesome Jim (he wrote rake) is. If you want to see the best live coding I've ever seen and learn about the y-combinator, watch it. I didn't follow along 100% but I was blown away.
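For reference, the construction Jim builds up to can be sketched in Ruby like this. This is my own condensed version, not his exact live-coded result:

```ruby
# The (applicative-order) Y combinator: recursion built from
# anonymous lambdas alone, no named functions anywhere.
y = ->(f) {
  ->(x) { f.(->(v) { x.(x).(v) }) }.(
  ->(x) { f.(->(v) { x.(x).(v) }) })
}

# Factorial with no explicit self-reference:
fact = y.(->(recurse) { ->(n) { n.zero? ? 1 : n * recurse.(n - 1) } })
puts fact.(5)  # 120
```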

The Celluloid Ecosystem - Tony Arcieri
This was a great intro into everything surrounding the celluloid gems. More importantly though, it was a _reference_ concurrency state of the union talk. If you want to learn why the actor model is the way to go (in Ruby or Scala actually) then watch it.

Ruby vs. the world - Matt Aimonetti
A great overview of languages other than Ruby. His starting point about the Sapir-Whorf Hypothesis - that language influences thought is a great opening to this talk. Matt chooses really interesting topics and does a good job. He covers Clojure, Scala and Go. This is a great talk if you don't know what any of those are or want a quick 'Rosetta Stone'.

Your app is not a black box - Josh Kalderimis
This talk is easy to watch. He does a great job of keeping it interesting. It's basically a talk about DevOps but more importantly about tooling. I found this talk very interesting from an ops, polish and motivation perspective. Please watch this.

How to build, use and grow internal tools - Keavy McMinn
One of my favorites. I forwarded it to a bunch of people. Github is worth emulating and Keavy shares insight about tools, culture and teams.

Asynchronous Processing for Fun and Profit - Mike Perham
A great talk about sidekiq vs resque from the authority on sidekiq.

Change your tools, change your outcome - Dr. Nic Williams
Dr. Nic nailed this talk. Some NSFW language. Hilarious and interesting talk about getting over nice to haves (like fast MRI spin-up time) and making your app more awesome for Ops people. Super great talk.

Grow Your Unix Beard Using Ruby - Jesse Storimer
A reference talk all about Unix. I found this very educational even though I consider myself pretty unix savvy. Jesse is great, he has a book on pragprog.

Boundaries - Gary Bernhardt
Amazing talk by destroyallsoftware's Gary. He talks about an imperative shell vs a functional core which is all the rage right now. Gary is brilliant.

Abstracting Features Into Custom Reverse Proxies - Nick Muerdter
Some great ideas about reverse proxies.

Service Oriented Architecture at Square - Chris Hunt
Chris walks you through creating a web service like they do at Square as if you were working there. He introduced some amazing open source libraries from Square that I need to check out (cane, fdoc, jetpack). For example, they use jetpack to auto-pack up and deploy a Rails app with Jetty. So all you need is a JVM.

I'm pretty sure I picked more than half of the talks as ones that I found super interesting. There were many more but I can't just pick everything. It takes a while to watch all these videos but they are worth your time.

Using Redis as a Database


The Spike

I was spiking on Redis recently. I wanted to use the redis-objects gem to simulate a shopping cart app even though the README specifically says

Just use MySQL, k?

I wanted to see what would happen if I tried it anyway. The README and examples for the redis-objects gem are great, so I'm not going to rehash what's there. I will say though that the example has you hardcode the id field to 1. That detail snuck up on me.

If you don't set an ID then you can't work with a redis-object instance. You get an exception: Redis::Objects::NilObjectId: Attempt to address redis-object :name on class User with nil id (unsaved record?)

It's basically trying to tell you, "hey, save the record first or set an ID". Well, honestly, I don't want to set an id myself. This is where the meat of the README is. Redis-objects really fits organically into an existing ActiveRecord model. That means Rails. In this case though, I don't want an entire Rails app. I can see the value in a plain old Rails app though. Just look at the examples if you want to see more.

Anyway, continuing on with the spiking, I tried to integrate the Supermodel gem with Redis-objects. That sort of worked. You just class User < Supermodel::Base and you can sort of get it to work. This is great because Supermodel gives you finders like User.find_by_email('bob@yahoo.com') to make it act like ActiveRecord but you can't use .create(email: 'bob@yahoo.com') to begin with because of the same errors as I mentioned above. Redis-objects really wants the record to have an ID already. Even using Supermodel's RandomID mixin didn't work. The initialize order and callback hooks don't really work (or at least I couldn't get them to work).

Finally, I tried combining just redis-objects and datamapper redis. That worked. And it's pretty nice. Check it out.

require 'redis-objects'
require 'dm-core'
require 'dm-redis-adapter'

DataMapper.setup(:default, {:adapter  => "redis"})

# you would move this to a common location
Redis.current = Redis.new(:host => '', :port => 6379)

class User
  include Redis::Objects
  include DataMapper::Resource

  # datamapper fields, just used for .create
  property :id, Serial
  property :email, String

  # use redis-objects fields for everything else
  value :disabled
  value :name
  list :cart, :marshal => true
end

# absolutely need this line for dm-redis
DataMapper.finalize

So using this is pretty easy.

u = User.create(email: 'test@test.com')
u.name = 'Testy McTesterson'

When you look at Redis, the keys are already composited for you and magic has happened.

redis> keys *

redis> get user:test@test.com:name
Testy McTesterson


The name field is from redis-objects and the create uses datamapper. This is a really odd pairing but I like the fact that I have no SQL database in the mix yet still have finders similar to an ORM. Something to keep in mind: datamapper's finders are a bit different than the Rails 3 ones (no .where method).

Benchmarking A Million Things

Ok fine. So maybe this works, maybe it doesn't. Maybe it's not the right idea. What about the good stuff? Like, how fast can we load a whole lot of names into MySQL versus Redis using the above code and techniques? Is it even relevant?

(PL = pipelined redis operation)

Loading one million random names (full names like John Smith, Patty Gerbee Sr.):
MySQL:                   06:05
Redis:                   02:45
Redis C ext              01:32
Redis pipelined:         00:56
Redis pipelined C ext:   00:19
Ruby just loading array: 387ms

Loading 10k ecommerce-style data (orders, users, products)
MySQL:    00:09.40
Redis:    00:14.50
Redis PL: 00:02.72

A gist of these test results is here.

A More Complete Example

If you know the ID and don't need something like an auto-incrementing column outside your code/control then you can greatly simplify the code above by getting rid of Datamapper. You can simply use redis-objects to fake an ORM. I had great success using it as long as you USE NATIVE REDIS TYPES. Listen to the redis-objects author, don't try to force the tool into the use case.

# What if we want to use redis-objects as a database but
# try to stick with native redis objects?
# For example, Supermodel is a great gem but using the Redis
# mixin causes Supermodel to serialize to JSON strings in Redis
# which is going to kill performance.  You have to model your
# problem using native Redis objects to keep the speed up.
# At the same time, I miss the finders from ActiveModel
# like: Person.find('Joe')
# Supermodel does give you those finders so you will feel right at
# home coming from Rails.  I tried using ActiveModel mixins with
# redis-objects but it didn't work for me.
# I found the below a nice compromise but it requires a lot of
# custom methods.  :(

require 'redis-objects'

class Person
  attr_reader :name
  alias :id :name

  include Redis::Objects

  def initialize name
    @name = name
  end

  def self.exists? name
    # Here's a big assumption, if the id attribute exists, the entire
    # object exists.  This might not work for your problem.
    self.redis.exists "name:#{name}:id"
  end

  def self.find name
    # new behaves like find when a record exists, so this works like
    # find_or_create()
    self.new name
  end

  # native redis attributes with redis-objects
  value :age
  list :favorite_foods
end

# example usage

joe = Person.new 'Joe'
joe.age = 34
joe.favorite_foods << ['cake', 'pie']  # it will flatten arrays, don't worry
joe.favorite_foods << 'bacon'          # or you can do this

Person.find('Joe').age = 56

# find and initialize
Person.find('Stan').age = 21

# you cannot just .favorite_foods = ['Steak'] because that's not how native
# Redis objects work
Person.find('Stan').favorite_foods << 'Steak'

# deleting a field
Person.find('Stan').favorite_foods.del  # notice it's .del and not .delete (del is the redis cmd)

The Blub Paradox and Delicious Pie


Anything worth doing is worth doing well. “That’s Good Enough” isn’t good enough. There are some cases where you are writing a one-off script but those tasks are few and far between. Most people don’t get paid to write throw-away code. Most developers aren’t sysadmins writing procedural code on the level and complexity of Bash. Even if you are, is this the only style you know? Do you shy away from more complex tasks? Can you not break a problem of any size down into smaller pieces?

A musician practices their art constantly. An athlete trains. And yet, we in the tech community have to go out of our way to find people who work on personal projects on the side or have the will/motivation to learn new things outside of “the job”. If you don’t learn on your own then you have to figure it out as you go. And you are already very busy so I know you’re just skimming over this post anyway. So let’s break this down in a bulleted list.

  • Developers work best when they’re challenged but not overwhelmed. Optimal work is a run, not a jog or a sprint.
  • Developers will step-up or fill-in a task to make it challenging as stated above.
  • Your classic boss doesn’t really care how you solve problems.
  • Culture takes 20 years to change.
  • The Blub Paradox says (among other things) that you can’t make anyone understand the power of a different language because developers become sedentary.
  • Good consultants are already busy so no one is going to save you.
  • Training or bootcamps won’t change your skillset or habits by a large percentage.

Let me put this another way.

My cat will play with a toy on the floor. And after a while, it becomes boring. So she swats the toy under a cabinet. And now it’s exciting! She can barely reach the toy and it’s a challenge. When she gets the toy out, she’ll bat it under the cabinet again. What is she going to do? Be bored again? She’s good at reaching for the toy, using her claws to hook stuff, counter-balancing with her tail to stay in control and seeing in dark places. She needs to be playing under the cabinet and not given a mouse in a barrel. This is what you must do. You must improve your skills by going beyond “good enough”. Otherwise, you will stagnate and have limited tools and approaches to problems.

But how do you even know what to learn? When you are looking up and down the power continuum in a problem domain, how can you grow beyond your own knowledge boundaries to see advantages that other people are enjoying? If you were looking at a menu of pies, how would you choose which one you will like?

Here’s the pie menu. Pick the one that you like!

(all pies come with whipped cream on the side)
Cranston Ermu Supreme Pie
Tagasnackle Mound Blat Pie
Rumination Flip Pie
Rainclouds and Humility Pie
Are You Seeing The Problem Pie
$3.99 per slice

What? This is terrible. This isn’t fair. Certainly technology cannot be like this. This is computer science! How can you choose? How can you know?

Hungry: “Waiter. What’s in humility pie?”

Waiter: “Humility.”
Hungry: “What does that mean?”
Waiter: “Have you ever literally tasted humility?”
Hungry: “No.”
Waiter: “Well it’s kind of like tasting regret but also different than pride. You kind of feel like you should have prepared more or set expectations lower.”
Hungry: “That doesn’t make any sense.”
Waiter: “Of course it doesn’t. It’s an experience. It’s not a definition.”

Given enough questions, the Waiter will quit unless the Hungry customer chooses a pie or leaves.

Now let’s look at some horrifyingly contrived personalities looking down (or not looking down) the pie continuum.

The Eternal Procedural Coder

Let’s say you are a developer who just writes scripts all the time. You piece things together and when they work the first time, you ship it (whatever shipping means to you). If a more complicated task comes along, you don’t change your design, tools or strategy, you just change the amount of time you throw at it. Once it works, you call it done.

  • If you don’t know how to build something in a variety of ways then you are always going to build it the same way.
  • If you don’t pre-learn outside work then you have to run to catch up to change your habits.
  • You can’t train new developers easily on what you are doing because everything you build is “your way”.
  • You can’t learn new techniques, tools or study good design because you don’t have time at work.
  • You don’t feel like learning on the weekend or after work.
  • When you look at another style, it looks like a bunch of crap and you don’t see the value/power/happiness/productivity/peace-of-mind because you haven’t gone through even one example yet.

Exactly what is the exit strategy here? What’s the end game? Who or what is going to save you or change your situation? What would the disruption look like? What would the disruption be called?

Meanwhile, you hate looking at old code. You wonder “what was I thinking?” or “I must have had a lot of coffee that day”. Projects come and go; you hope you never have to go back to old ones. You leave things alone because they are terrible but do the job. You continually start new tasks but they all come out the same: huge single files, huge methods/functions. One mess after another. No wonder they call this “work”, right? Who would do this for fun or learning?

What’s interesting is that The Blub Paradox applies here too. It’s difficult to explain what development is supposed to be like to someone who has never had a success. Success breeds success. Failure is of course crucial to learning and courage is needed to fail when trying to succeed. But experience really is the success side of trying and learning. You can gain experience through failure but in terms of visualizing the ideal, success is very important.

The It Can’t Be Done Guy

Let’s say there’s a website called Searchbox.com that’s a stripped down version of Google. In a meeting, everyone is talking about how Facebook search and Twitter are all hot right now and you feel like you are behind the curve. So you decide you want to add friends, followers and other social features to Searchbox.com. Your rockstar lead developer says “it can’t be done”. No one questions him. The idea dies, jokes are cracked and the grapes were probably sour anyway. Yay team?

Software is supposed to be soft. So why can certain things be so easy and other things be so hard? I understand certain problems just “are hard” (concurrency, cache invalidation). But not even entertaining ideas or experiments could be a sign of bad design. Even with crappy procedural scripts, simple changes mean omfg-lots-of-lines changed. Design isn’t about avoiding change (and change will happen), it’s about reducing the cost of change.

If you want to learn from a master, read POODR. Sandi says:

…successful but undesigned applications carry the seeds of their own destruction; they are easy to write but gradually become impossible to change.

POODR is full of these insightful observations and Sandi doesn’t just give you advice. She walks you through refactoring code and demonstrates the concepts that are related to these insights.

Anyway, back to our example project. If searchbox.com is broken up into modules then there is a user class or object. Surely this class could be forked into a feature branch and worked on in isolation by a few team members to add Friend and Follower traits/mixins/interfaces? Or is the user really a session cookie combined with a database row and “who knows” how to change it?

If there isn’t really a concept of a user but it’s just a bunch of hacks then yes, no one is going to entertain massive changes to this concept of a “user”. It’s too hard (in your system). However, if a system was explicitly designed thoughtfully (not by default or by accident) then features are easy to add. The software stays soft. This is what Sandi is talking about in POODR.

The Waterfall by Default Guy

Let’s say there’s a guy in charge of a project. He hasn’t ever tried Agile of any sort. He’s read about it and he’s confident that he knows all about it. He thinks it sounds crazy, lazy and chaotic. He’s done plenty of projects in the past that have come out under-budget and on-time so why change anything? On the other hand, he’s not technical so his opinion of past projects is completely dependent on the feedback given to him by developers. If developers ever felt like talking to him was a waste of time then his opinion of how good “waterfall” is could be slightly off. The real question is, can you explain how delicious Agile Pie is to him?

Here’s the wheelhouse of agile type stuff.

  • Developers aren’t going to want to work with you again if you burn them out.
  • Project heroes burn out.
  • You can’t hire your way out of a failing project.
  • Most people don’t like the term Waterfall because it’s viewed as old.
  • If you don’t declare a methodology it’s Waterfall by default.
  • You can’t do Agile. Agile is a type of thing. Like Car. You have to buy a type of Car.

So when you try to explain any type of agile techniques to someone who doesn’t have any experience with it, it’s the same thing as the Blub Paradox. The Waterfall Guy is happy enough doing no explicit methodology, why would he change? When asked how to prevent projects from getting off-track he will say, “before you start working on a project, you need to know what you are building”. Even though Agile (and common sense) says that both the customer and the developers are at their stupidest at the beginning of a project. Big Design Up Front (BDUF) doesn’t work. BDUF is the methodology you get when you don’t design one. BDUF is how you put together simple things like grocery lists and Legos. BDUF isn’t how you deal with change and complicated systems where developers will upscale the project difficulty until it becomes interesting.

So even if you discuss these items with Waterfall Guy, he’s not going to get a taste of Agile Pie without eating it. You could talk all day long about the ingredients, how people keep eating it, how there’s these case studies showing it’s delicious. “Meh, whatever. I have my dessert already.” At the same time, Waterfall Guy is always asking for visibility as to what developers are doing through status reports. In his heart he knows that he has a few hero developers and if he loses them, he’s screwed. That’s ok, that’s why they call it work right?

The Non-Tester

There’s a team who are running a production app. They have a testing team. When developers are working on “omg the really important thing”, they tell the testing team to run “whatever test”. It usually involves someone manually going down a list with a pencil. When a problem is found then a tester will tell a developer something went wrong. “It showed me this error 0x2123 when I did the thing you told me to do.”

Except a lot of the time there isn’t anything wrong. A lot of times the problem is the testers aren’t developers. Otherwise, they’d be developers. A lot of times the tester doesn’t really understand the system and the developer has to explain the system to them. So while the developer is working on “omg the next really important thing” the tester is asking questions and the developer is getting annoyed. Jokes are made: “you only come to me when there’s a problem, haha”. Or, “I bug you so much I owe you a beer! Haha.”

Looking down the power continuum, a developer looks at testing libraries. They look annoying and time-consuming. Who needs this hassle? They don’t have time to find out if they do. They don’t understand the different levels of testing (from least to most involved and beneficial):

  • No testing at all - you have to hit refresh in a browser and you have no idea if you broke anything else. Old bugs pop up.
  • Some testing - unit tests but you have a team run through end-to-end scenarios.
  • A lot of testing - you have unit and integration tests and maybe even code coverage reports and CI.
  • Test-first development - you write your tests before code but that’s the end of that methodology.
  • Test-driven development (TDD) - you let your tests completely drive the design of the system.
  • Testing end game - You do red-green-refactor. Your tests run fast. You do UI testing. Your customer requirements map to easily read executable stories. You spend very little time in red. You ping-pong pair with TDD. No code is committed without passing CI. Your testing tools and process are constantly evolving in response to pain points.
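Getting from "no testing at all" to "some testing" is a smaller step than it looks. A minimal sketch with minitest, which ships with Ruby (the Cart class and its methods are made up for illustration):

```ruby
require 'minitest/autorun'

# The object under test: a toy cart.
class Cart
  def initialize
    @items = []
  end

  def add(price)
    @items << price
    self
  end

  def total
    @items.reduce(0, :+)
  end
end

# One unit test: fast, repeatable, no pencil and checklist required.
class CartTest < Minitest::Test
  def test_total_sums_item_prices
    assert_equal 5, Cart.new.add(2).add(3).total
  end
end
```

Running the file executes the test and prints a pass/fail summary.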

The non-tester doesn’t see the value or the constant effort people are making to try to get to the Testing End Game. “I’m not a tester! Why would I test!” Meanwhile, they complain about how little time they have because they have to manually check whether their code works. Or maybe they are embarrassed / mad / frustrated that they broke the development/production build. In essence, the Blub Paradox is telling us that they can’t see their problems through the lens of Blub because they don’t know Blub. Meanwhile, Blub (TDD) practitioners can’t imagine doing it the way the non-tester does it. In fact, TDD practitioners are honing their craft so much that it is a rare event when they’ll take the time to explain TDD or the End Game to a non-tester. Even if they do, both will be frustrated because any insight or evangelizing will be misunderstood.

The Long Time Ignorant

Take any of the above and put it in the context of time. How long has SQL been around? Does someone on your team not know SQL? Is it because they haven’t needed it? What if they are writing flat database files by hand and do need SQL? Have they just been living under a rock or can no one change their mind? What can you do as an individual to change such a blind eye?

How long has JUnit been around? Is there a Java developer you know that hasn’t written a unit test? Or they have written a few but they don’t anymore because they think they don’t have time to do it?

That Seems Hard

Someone who can only write code in a procedural style can’t break down problems into easily solvable bits. Without concepts like Mixins/Traits, Composing Behavior and TDD, a complex problem is just going to seem too hard. “I don’t do that kind of stuff.” Whereas someone with OO or Functional experience might say “here’s how I would do it, who wants to help me?” Of course there are things that are just too big but the difference is in experience. A Bash scripter isn’t ever going to understand ncurses events because they haven’t ever written a desktop GUI. So even though ncurses can be great for turning “scripts” into “programs”, they are going to shy away from ncurses because “wtf that seems hard”.
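For the skeptical, "composing behavior" doesn't have to be exotic. A toy sketch in Ruby, with all module and class names invented for illustration:

```ruby
# Behavior lives in small, focused modules...
module Greppable
  def grep_items(term)
    items.select { |i| i.include?(term) }
  end
end

module Countable
  def count_items
    items.size
  end
end

# ...and a class composes only the behavior it needs.
class Toolbox
  include Greppable
  include Countable

  def items
    ["hammer", "screwdriver", "wrench"]
  end
end

box = Toolbox.new
puts box.grep_items("dr").inspect  # ["screwdriver"]
puts box.count_items               # 3
```

Each piece can be understood, tested and reused on its own, which is exactly what a single giant procedural file can't do.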

  • You can’t explain the benefits of a web framework because of the Blub Paradox.
  • You can’t explain the benefits of an ORM vs raw SQL because of the Blub Paradox.
  • You can’t explain the advantages of tmux to someone because of the Blub Paradox. Even while they keep losing their ssh sessions over unstable wifi. Given a simple enough problem, you might be able to convince them to try it out.

Tmux is a perfect example. No one understands how great it is until they use it. Then and only then do they never want to go back. Tmux isn’t always great though. Sometimes the terminal gets all weird with certain keyboards or maybe you want the native buffer scrollback to work. So it’s not some silver bullet default. However, once you grok tmux, you know when to use it and miss it when you don’t have it. This is true for many things that people evangelize. However, sometimes you need to understand their world view and take that into consideration.

Tmux pie is delicious. Ask anyone who has tried it.


I don’t have all the answers. Ok, see you later. (silence) Well, listening to Ruby Rogues and reading POODR has given me some memorable advice, which I’ll repeat here.

  • Spike on foreign concepts. Ok, you don’t know OOP and breaking down problems. Make up a problem and do it. Don’t try to learn in your main project. Spike on a concept and throw the code away.
  • Pair with someone who knows this stuff and check your ego at the door. When you are lost, don’t play with your iPhone. Ask questions. Pretend to be stupid. Check your ego.
  • Learn a different tech stack. When I write Java (poorly these days), I bring Ruby experiences back with me.
  • Watch conference videos and screencasts. Pre-learn before work. You have no choice. No one is going to pay you to get better.
  • Read the POODR book even if you’re not a Rubyist.
  • If you are resistant to an idea, change sides. Let’s say you don’t like soccer. Take the position of a soccer fan and argue with yourself. You might be amazed at the argument you make just looking from another perspective.

Ruby p385 benchmarks


I was playing around with the falcon p385 patch to see if it's any faster than some of the more recent MRI rubies. TL;DR version: looks like p194 is faster than p385 of any type or tweak.

Here's how to get a p385 Ruby version patched with funny falcon's performance patches using RVM.

mkdir ~/.rvm/patches/ruby/1.9.3/p385
curl https://github.com/funny-falcon/ruby/compare/p385...p385_falcon.diff > \
  ~/.rvm/patches/ruby/1.9.3/p385/falcon.diff
rvm install 1.9.3-p385 -n perf --patch falcon
Then rvm use 1.9.3-p385-perf or set it as your global ruby.

Test Setup

The following benchmarks were run on an i7 server with a RAID5 array. The disk is slow (lack of large cache) but the benchmarks were run on the same box so it should compare apples-to-apples.

From here on out, here are the definitions for the Ruby versions.

p194 = 1.9.3p194 default
p385 = 1.9.3p385 default
falcon = 1.9.3p385 with the above falcon diff patch applied
gcc_tweak = the falcon patches with GCC compile flags tweaked.

So what are these GCC tweaks? Explicitly setting the CFLAGS for your machine's CPU type and recompiling ruby with the Falcon patches applied.

Micro and Macro Benchmarks

I used the ruby-benchmark-suite to run these tests.

Here are some example results. I can't list them all. There are over 100 benchmarks. These are results for the mean times in seconds.

test 1.9.3p194 1.9.3p385 falcon gcc_tweak
bm_sudoku.rb 1.379112226 1.598182153 1.495923579 1.526717563
bm_open_many_files.rb 0.175602996 0.197096826 0.197673286 0.194135045
... etc etc

Here's the winner summary for mean times. This is the number of times the ruby version was the fastest for a particular benchmark.

  • 1.9.3p194 - 64 wins
  • 1.9.3p385 - 29 wins
  • 1.9.3p385-falcon with GCC tweaks - 10 wins
  • 1.9.3p385-falcon - 5 wins

Boot time and IO

Timing rails boot time is a bit more important to me. If you want to know how to really save "rails boot time" see the DAS screencast on not loading Rails at all.

Even when using domain objects and lib tricks, it's nice to have Rails and all I/O boot fast. The main thing that funny falcon's patches do is speed up requires and I/O.

So let's benchmark booting a Rails app.

$ time bundle exec rake environment

Ruby Version                              Seconds
p385 patched with falcon and GCC tweaks   2.481 total
p374 defaults                             3.336 total
2.0.0-rc2                                 2.613 total

In conclusion, p194 looks faster on the macro and micro benchmarks, but the falcon patches boot Rails faster.

Hash Choices


As I've previously talked about, Hashes of Hashes are weird to work with. In the previous post about Captain Planet, I showed how to select, filter and manipulate 2D hashes and arrays but ultimately concluded that a hash of hashes is both weird and unnecessary (most of the time).

If you can control the data, inline your key into the hash data and make an Array of Hashes. It's really where it belongs. If you don't, you'll find yourself doing a few extra iterations or work. Below you'll see a simple example of the two data structures.

# Hash of Hashes
people = {
  :dave => { :age => 1, :height => 2 },
  :tony => { :age => 1, :height => 2 }
}

# Better --> Array of Hashes
people = [
  { :id => 'dave', :age => 1, :height => 2 },
  { :id => 'tony', :age => 1, :height => 2 }
]

In this case of an array of hashes, the data is easier to manipulate with array operations and filters. But before talking about matching on arrays of hashes, I want to talk about matching on dates.

require 'date'

doc = Date.parse "Nov 12 1955"
marty = Date.parse "Oct 27 1985"
future = Date.parse "Jan 13 2094"
doc < marty
=> true

So comparisons with dates work like you might expect, but fuzzy matches do not. Take this example: is Marty between Doc Brown and the distant future? The answer should be true.

We use a Range (x..y) object to make a date range. Then we can use === to check if we get a match.

marty === (doc .. future)  # wrong, wrong, wrong
=> nil

But wait, === on Date is different from === on Range. The method === on Date checks whether its argument is the same Date, whereas === on Range checks whether the argument is within the range. So if you flip it, it returns true.

(doc .. future) === marty
=> true
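As a small aside (not in the original post), Ruby 1.9 also has Range#cover?, which avoids the operand-order trap entirely because the receiver is always the range:

```ruby
require 'date'

doc    = Date.parse "Nov 12 1955"
marty  = Date.parse "Oct 27 1985"
future = Date.parse "Jan 13 2094"

# Range#cover? compares endpoints with <=>, so there's no way to
# accidentally call Date#=== like in the flipped example above
(doc..future).cover?(marty)  # => true
```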
require 'date'

holidays = {
  :halloween => { :date => Date.parse("Oct 31 2012"), :presents => false },
  :christmas => { :date => Date.parse("Dec 25 2012"), :presents => true },
  :july_fourth => { :date => Date.parse("July 4 2013"), :presents => false },
  :valentines_day => { :date => Date.parse("Feb 14 2013"), :presents => true },
  :thanksgiving => { :date => Date.parse("Nov 28 2012"), :presents => false }
}

# turn hash of hashes into array of hashes
holiday_array = []
holidays.keys.each do |key|
  holiday_array << { :id => key }.merge(holidays[key])
end
# find all the holidays with presents
puts "yay presents!"
puts holiday_array.select {|holiday| holiday[:presents] == true }

# find all the holidays within a date range
winter = (Date.parse("Dec 21 2012")..Date.parse("Mar 20 2013"))

puts "Winter holidays"
puts holiday_array.select {|h| winter === h[:date] }

So what we did in the middle there was flatten the hash of hashes into an array of hashes by merging each key with the 'data' part of its hash. Hopefully that makes sense.
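If the each loop feels verbose, the same flattening collapses to a single map call (shown here with a two-holiday subset so the snippet stands alone):

```ruby
require 'date'

holidays = {
  :halloween => { :date => Date.parse("Oct 31 2012"), :presents => false },
  :christmas => { :date => Date.parse("Dec 25 2012"), :presents => true }
}

# map yields each [key, value] pair; inline the key as :id and merge the data
holiday_array = holidays.map { |id, data| { :id => id }.merge(data) }
```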

What a tech stack has to do on our first date

Listen to Dave Thomas

Watch this video. Please. Start at 45:50 and watch at least 10 minutes.

I tell a lot of people about this video. I don't tell 10,000 people from a stage on some world tour, but whenever I get into a good discussion with geeks, I reference this video and tell them to watch it. So I tell a lot of people, at least as a percentage of the conversations I'm enjoying. In other words, I really like the video; it struck a chord (or a brain-thorn) and I want other people to consider his point of view. Specifically, Dave Thomas says you should learn a new language every year. Invariably, I'll mention that Dave Thomas writes books, and a few times people have dismissed his main idea because they think he's selling books.

Dismissing Dave Thomas' opinion that you should learn a new language every year just because he is an author is a crap position to take. First, it's FUD. Second, Dave Thomas doesn't sell books about the languages he recommends in the video. Third, I trust Dave Thomas. This isn't some fucking Larry Ellison power play.

So in terms of following his advice, I might pick up Scala. It's different enough, and anti-hipster-omg-embrace-the-java-enterprise enough, that it will challenge me more than, say, continuing with Python (which to a Rubyist feels like Spanish to Portuguese). I've picked up a Scala book but haven't started it yet. I looked at the Lift docs and was completely deflated. Chad Fowler tweeted about the Play! framework (thanks for the reminder) and thank god, it looks better.

If I get sick of a city, I might move. When I move, I'm looking for something new. Similarly, starting out with Scala is going to be a bit like jumping into a new town. Ok, I need to find the equivalents of the old things that I'm used to. But wait, I moved to a new town to do something new! If I bring my old habits with me, I'll just be writing Ruby code in Scala. And that's crap.

Marge: I've dug myself into a happy little rut here and I'm not about to hoist myself out of it.
Homer: Just bring the rut with ya, Honey.
-- "You Only Move Twice"

What any technology stack is going to have to do on our first date

I need a list of libraries at least searchable by category and popularity. Popular gems are widely used and therefore likely good. This is a great way to reach higher-level operations and gain productivity. For example, ruby-toolbox.com.

I need screencasts ordered by date so I can learn community trends. I'm going to need a way to keep up with the Joneses and know what's falling out of favor in the community because a shiny new library, technique or concept is getting the attention of key people. This also works as curated content. For example, everyone is talking about CORS. I'm sure there will be a Railscast on CORS even though CORS is not specific to Ruby or Rails.

Related to that last point, the community should be of high quality. I know people like to hate on PHP but if you took a core sample of PHP-land, I think you wouldn't be too happy. Of course there are exceptions, for example, Facebook is full of brilliance.

I need an easy, Internet-enabled way to install things. Maven is close. Rubygems is good. NPM is good. Yum beats plain RPM. Pip is fine. CPAN is crap. All these package managers connect to the Internet and resolve dependencies. As a bonus, being able to repackage, bundle, freeze or tar up libs into a lib directory is great for corporate firewalls or for making deployment not suck so hard.

I need a way to separate projects from one another. Gemsets with RVM do this. Virtualenv does this for Python projects. A lib directory implicitly does this in Java (although CLASSPATH issues suck). Explicit project isolation with libs is great.

Give me videos from the community, like Confreaks. Video community talks are necessary. The JavaOne video site sucks, and only because of the content; it doesn't have to be flashy or overly-produced. Oracle publishes 3-minute videos with zero details and zero content. Yay?

I need quick feedback. This means a REPL and/or event-based test suite runners. If I save a file, I want a test to run (Ruby's Guard gem). If I want to figure out how to do something, I want instant feedback. About the farthest thing from this I can think of is deploying to Tomcat, reloading the webapp and hitting refresh in a browser (#fml).

I'll call Scala back

So far, Scala has the REPL thing down (just type $ scala). And the SBT tool does auto-reloading similar to Guard. Of course, it has to compile things, so it's a slightly slower feedback loop, but it's not terrible. SBT also seems to have a manifest thing going on for the equivalent of gemsets. It all goes in project/plugins.sbt, which is great. There's also a concept of a global project in a ~/.sbt folder. This reminds me a lot of NPM.

I haven't discovered the community side of things yet, like screencasts (a Railscasts equivalent) or a news site (a RubyFlow equivalent). On the negative side, there seems to be some Java crosstalk. For example, the Play! framework can generate a Scala or a Java project. So if you are trying to get away from Java, this is a good step away but not a clean break. I get a feeling of Java refugee camp (which is fine).

Something I'm working on right now is understanding the IDE landscape. Eclipse was a bit weird with loading the Scala plugins, and IntelliJ is coming out with a new version of IDEA soon. I also need to grok the sbt console inside the IDEA plugin vs the command line. Are they the same? I had the same challenge when trying RubyMine. At some point you have to get a feel for what the One True Way was intended to be.
