DRY up Methods with Ruby Blocks

Let’s do something terrible by hand. First, here’s our data. It comes from a database.
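
Say it looks something like this (the shape and names here are made up; bbarker is the only one that matters):

PEOPLE = [
  { user_id: 1, username: "bbarker",   roles: ["user"],  projects: ["Plinko"] },
  { user_id: 2, username: "ahamilton", roles: ["admin"], projects: ["Jeopardy"] },
  { user_id: 3, username: "jdoe",      roles: ["user"],  projects: ["Muffin Project"] },
  { user_id: 4, username: "sconnery",  roles: ["admin"], projects: ["Muffin Project"] }
]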

Now when working with these people, we probably could get away with doing something like this for a while:
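
A hypothetical first pass, hardcoded to find admins:

def admins
  PEOPLE.select { |person| person[:roles].include?("admin") }
        .map    { |person| person[:user_id] }
end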

Which is fine. Until you want to find out what people are on the Muffin Project:
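
So you copy, paste and tweak. Another sketch:

def muffin_project_people
  PEOPLE.select { |person| person[:projects].include?("Muffin Project") }
        .map    { |person| person[:user_id] }
end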

But as you keep working, you might get a feeling of déjà vu. The two methods above are very similar. You might be inspired by other Ruby libraries, which give you a tiny DSL or at least let you pass blocks into methods to be more expressive.

The Smell

Here’s the complete, code-smelly example.
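
Roughly, it’s the made-up data and the two near-identical finders from above, combined by hand:

# admins plus Muffin Project people, de-duplicated by hand
meeting_user_ids = (admins + muffin_project_people).uniq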

We’re having a meeting between the admins and people who are on the Muffin Project. The only person not matching these rules in this case is Bob Barker (bbarker). He must be busy enjoying retirement eating pie, who knows.

Inspiration

Let’s take a look at Faraday. Like many Ruby libraries, Faraday uses blocks to great effect to communicate intent.
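
An HTTP POST with Faraday looks something like this (adapted from its README; the endpoint and payload are just examples):

conn = Faraday.new(url: "http://sushi.com")

conn.post "/nigiri" do |req|
  req.headers["Content-Type"] = "application/json"
  req.body = '{"name": "Unagi"}'
end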

This is kind of nice! You can get more than one thing done at a time and it doesn’t require a lot of temporary variables. Let’s see if we can use blocks like this. We’ll get to blocks in a minute; let’s refactor a little bit first.

The Fix

There’s a certain similarity between the two selects. We really want to get “admins” and “project people” all together, so let’s just do that. We’ll create two methods that essentially replace the instance methods but can be used in the future for other rules. We’ll call them .with_roles and .with_projects.
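
A sketch of those two, reusing the hypothetical PEOPLE data from earlier:

def with_roles(*roles)
  PEOPLE.select { |person| (person[:roles] & roles).any? }
end

def with_projects(*projects)
  PEOPLE.select { |person| (person[:projects] & projects).any? }
end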

Next, we’ll create a method that takes a block.
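
Something like this:

def user_ids(&block)
  block.call.map { |person| person[:user_id] }.uniq
end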

The explicit &block argument is optional. You could write this with an implicit block and yield instead:
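
Which would look like:

def user_ids
  yield.map { |person| person[:user_id] }.uniq
end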

But in that case, the block is effectively optional, so you’ll want to check block_given?. For this example, it’s easier to just require a block to keep this a shorter post … err, well I guess it’s longer now.

In any event, this method’s job is to filter results (users) with whatever code is passed in. Then it uniques the collected array, because user IDs are assumed here to be unique. Finally, it returns just the user IDs, like its name implies.

The usage of this user_ids method that takes a block ends up reading very well.
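
For the meeting example, the hypothetical pieces above combine into a one-liner:

meeting_user_ids = user_ids { with_roles("admin") + with_projects("Muffin Project") }
# everyone except bbarker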

Here’s the completed, less smelly example.

Wrap Up

This is pretty procedural. I’ll leave it to you to put it into a class, and maybe add something better than a “plus” operator to combine the user lists. Maybe a UserList abstraction class could help get away from hashes too.

I like going down these paths because you end up with more expressive code that is flexible to change. At the same time, little hints of DSLs come out when using blocks to this effect. This is starting down the path of a Ruby DSL. I’ll be posting about that pretty soon.

Problems with “The Cloud”

I’ve been thinking about the problems with The Cloud, beyond it being a raging buzzword. It really comes down to Control and Connectivity. Those are the problems, but allow me to elaborate.

Control

Google Wave is a great example of control loss. If you really put a lot of energy, stock and trust into Google Wave as a content store for your team, brain or idea then you might feel deflated by its cancellation. Even as an idea and a disruptive alternative to E-mail or SMTP crappiness, it’s a shame it had to die. So what now? Wait for an open source version? Host your own?

The idea was to “put it in the cloud” and forget about it. But when the cloud changes outside your control, you have to be aware of it again. Now you really have to think about the cloud itself. It’s no longer the vague black box that the cloud diagram implies.

Another example of control is YouTube. I use YouTube favorites as a persistent list. I see a cool video, I favorite it and I feel like I sort of own it, or at least it’s in a list that I can refer to later. But take a look at this:

[Screenshot: my YouTube favorites list full of “Deleted video” entries]

What were those things? Who knows! Now, I have to think about “the cloud” again. These are temporary videos that someone else ultimately controls. I’m just adding references to a list. I don’t own the clips. They are transient. They are ephemeral. I’m out of control again. I don’t even know what media I’ve lost. Do I mitigate again? Do I suck down a list periodically and do a diff?

Connectivity

I recently got a Roku box for my TV. It’s a great box. During registration it does a bunch of sign-up and account creation. But it doesn’t work without UPnP enabled on the router. This isn’t even a connectivity outage thing; it’s an assumption about my firewall: that it allows holes to be punched in it via UPnP, or that I’m not capable of punching the holes myself. I don’t even really know why Roku does this UPnP thing. All I know is, it wouldn’t even finish the setup until I made this change. So here’s a device that doesn’t work without connectivity or a clear path to connectivity.

Think about how picky that is for a second. If it’s not picky then think about how many technical barriers there are to pure or uniform Internet. Everyone brings their own quilted environment and it’s a mess.

IPv6 Spike

A spike is when you play around with something and then throw it away for the purposes of learning. So, let’s play around with IPv6. I had read a little bit about it but essentially my working experience with IPv6 was nothing except for disabling it. Let’s learn some stuff!

I’m going to skip over all the history of IPv6 and assume that you agree with me and think that this is important and relevant to the future of the Internet.

Setup

First, build 4 Ubuntu VMs. I’m using 13.04 but any current Linux distro should work, just the packages and paths will change. I found the best way is to build a simple VM and then clone it 3 more times (in Fusion this is copy/paste and resetting the MAC address). You’ll need four machines to simulate a local network. You won’t need any network hardware and VMware will be able to simulate everything we need. You can actually do this whole experiment on one real box (cool stuff)!

The goal of this spike is:

  • Build 4 VMs
  • Make a router, a web server, a dns server and a client
  • Hit a web page between two network boundaries over IPv6 only

Super practical IPv6 primer

Addressing is WEIRD. That’s really what I wanted to spike on. Getting comfortable with the addressing length, hexadecimal and understanding the addressing layout.

In IPv4, a network segment might look like this: 10.0.0.0/24
So a box with an IP on that network might be this: ip: 10.0.0.136 netmask:255.255.255.0

IPv6 is a lot different. Private addresses don’t start with 192.168., 172. or 10. Private addresses start with fc00 (from what I’ve read). So I made up two network segments called
fc00:deed:d34d:b33f
fc00:deee:deee:deee

But that’s only 4 groups of hex. Full IPv6 addresses have 8 groups of 4 hex digits, like this:
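
For example, the long form of an address on the first segment (the ::10 shorthand used below) is:

fc00:deed:d34d:b33f:0000:0000:0000:0010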

So let’s configure a box with an ip. Our boxes are named after onomatopoeias (boing, wap, rawr and piff). Boing’s address is “dot” 10.
boing: fc00:deed:d34d:b33f::10/64

So there’s a box that’s configured with an IP. Notice the double colons: that’s shorthand meaning “fill in the zero groups here”. The /64 is the network prefix. Like in IPv4, where 192.168.0.1/24 is a common private IP: the /24 is out of 32 bits, so it means X.X.X.Y where Y is the host part and X.X.X is the network part. So 192.168.0.* is the network and .1 is the host. In IPv6 it’s /64 out of a total of 128 bits.

So my private address space is just like an IPv4 private range. I’m assigning this IPv6 space and I have roughly 18 quintillion (2^64) private addresses for my ONE SUBNET. For a router to work, I need two subnets. So now I have roughly 36 QUINTILLION free private addresses. O_o

Address Configuration

I’m using IPv4 just for remote admin and installing things through apt, so you’ll have to add another IP to your VM’s network card. Ubuntu does this in the file /etc/network/interfaces. Here’s an example.
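
Something along these lines, using boing’s address (IPv4 stays on DHCP, the IPv6 address is static):

# /etc/network/interfaces on boing
auto eth0
iface eth0 inet dhcp

iface eth0 inet6 static
    address fc00:deed:d34d:b33f::10
    netmask 64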

If you type ifconfig or `ip addr` you will see that it has two IPs now. One IPv4 and one IPv6 address. We’re not quite done. I drew a picture of the network layout and you’ll have to configure all the VMs like this.

[Diagram: the ipv6_lab network layout. boing (DNS, ::10) and wap (web server, ::11) on fc00:deed:d34d:b33f::/64; piff (the client) on fc00:deee:deee:deee::/64; rawr routing between the two segments]

Router Configuration

This is really easy. You just need the rawr box to forward IPv6 packets like a router but not like a firewall. Linux can do that with a simple kernel switch. But first, you’ll need to add a second network card in VMware. So:

  • Shutdown rawr
  • Add a second network card
  • Boot rawr
  • Edit /etc/sysctl.conf, change net.ipv6.conf.all.forwarding = 1
  • Run sysctl -p

Routing through rawr should work at this point. For example, from piff, you should be able to ping boing through ipv6 even though they aren’t on the same network segment. Use ping6 and traceroute6 to sanity check.
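
For example (addresses assume the layout in the diagram):

piff:~$ ping6 -c 2 fc00:deed:d34d:b33f::10
piff:~$ traceroute6 fc00:deed:d34d:b33f::10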

DNS Configuration

Boing is our DNS server so let’s make some changes. First, apt-get install bind9. Then edit these files below. I configured a temporary subdomain on squarism.com called ipv6.squarism.com but this can be anything you want.
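
The zone definitions and forward zone file might look roughly like this; the file names and SOA values are illustrative, and ns/www point at boing and wap per the diagram:

# /etc/bind/named.conf.local
zone "ipv6.squarism.com" {
    type master;
    file "/etc/bind/db.ipv6.squarism.com";
};

zone "f.3.3.b.d.4.3.d.d.e.e.d.0.0.c.f.ip6.arpa" {
    type master;
    file "/etc/bind/db.fc00.deed.d34d.b33f";    # reverse zone file not shown here
};

; /etc/bind/db.ipv6.squarism.com
$TTL 604800
@    IN  SOA   ns.ipv6.squarism.com. admin.ipv6.squarism.com. (
               2013010101 ; serial
               604800     ; refresh
               86400      ; retry
               2419200    ; expire
               604800 )   ; negative cache TTL
@    IN  NS    ns.ipv6.squarism.com.
ns   IN  AAAA  fc00:deed:d34d:b33f::10
www  IN  AAAA  fc00:deed:d34d:b33f::11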

Notice that the reverse zone (ip6.arpa) is super annoying to type out. It needs to be the address nibbles in reverse order (afaik).

Here we’re using the AAAA records for IPv6. In IPv4 these would be A records. Bind has supported AAAA records for a long time. CNAME records don’t change. Notice that you could easily run a DNS server that serves both stacks.

I used a reverse zone generator at rdns6.com for this part. It’s amazingly annoying to type out by hand. I’m not sure if there’s a more convenient form that could be used.

Restart bind: sudo service bind9 restart. Check the logs in /var/log/syslog for any `named:` errors. I had a few typos I had to chase down. DNS can be tricky to set up, so take your time.

See DNS working

Ok, let’s take a quick break from this infinite configuration and see how we are doing so far. At this point we should have routing and DNS working. So that means that Piff should be able to ping a DNS name and it should work.

But first, we need to tell Piff and all the other boxes to use our new dns server. Edit /etc/network/interfaces again and add


dns-search ipv6.squarism.com
dns-nameservers fc00:deed:d34d:b33f::10

For example, your eth0 block will look like this
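
On piff it might end up like this (piff’s own IPv6 address is a made-up example):

auto eth0
iface eth0 inet dhcp

iface eth0 inet6 static
    address fc00:deee:deee:deee::12
    netmask 64
    dns-search ipv6.squarism.com
    dns-nameservers fc00:deed:d34d:b33f::10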

Restart networking. Even if your VMs are running DHCP (they are by default), we’re just adding a DNS server to the static IPv6 address. In other Linux distros, this file will be different (sry).


piff:~$ ping6 www.ipv6.squarism.com
PING www.ipv6.squarism.com(fc00:deed:d34d:b33f::11) 56 data bytes
64 bytes from fc00:deed:d34d:b33f::11: icmp_seq=1 ttl=64 time=1.12 ms
64 bytes from fc00:deed:d34d:b33f::11: icmp_seq=2 ttl=64 time=0.553 ms

Ok great. DNS and routing are working. Now we are ready for the final part.

Web Server Configuration

Get dependencies installed on Wap (the web server).
aptitude install zlib1g-dev libssl-dev libpcre3-dev

We need IPv6 support compiled in and I’m not sure the OS packages come with it out of the box. Installing nginx from source is easy, so let’s download nginx, configure and compile.

# download latest stable and untar ...
./configure --with-ipv6 --prefix=/opt/nginx
make install
cd /opt/nginx
vi conf/nginx.conf
# change this line
listen [::]:80 default ipv6only=on;

Start nginx:
/opt/nginx/sbin/nginx -c /opt/nginx/conf/nginx.conf
Normally, I’d write an init.d script here but whatever. You can stop nginx like this:
sudo /opt/nginx/sbin/nginx -s stop

Sanity check:
netstat -nlp | grep nginx
tcp6 0 0 :::80 :::* LISTEN 24618/nginx.conf

Let’s create a dummy web page on Wap (the nginx box). We’ll test this page in the next section.
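
For example (the path assumes the --prefix from above; the file name matters because we ask for ipv6.html later):

echo '<h1>hello over ipv6 from wap</h1>' | sudo tee /opt/nginx/html/ipv6.html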

End to end test

Ok, everything is set up. So let’s see it work end to end. Our goal was to hit an IPv6 web server through DNS and a router.

Here you can see I’m pinging the webserver from the web client (piff -> wap).

IPv4 is routable directly because of VMware, but the IPv6 traffic is split: the client isn’t on the same IPv6 segment as the web server. So I can ping directly with IPv4. That’s our IPv4 sanity test, but it’s not why we did all this.

You can see when I try to hit that ipv6.html test page we created earlier it won’t work.

It actually says connection refused and this makes sense if you look at the netstat information from Wap. It’s not listening on 0.0.0.0:80, it’s listening on :::80. Crazy!

If I use IPv6 it works, but curl needs some special handling for the URL:
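
The IPv6 literal goes in brackets, and curl wants -g so it doesn’t try to glob them:

piff:~$ curl -g "http://[fc00:deed:d34d:b33f::11]/ipv6.html"
<h1>hello over ipv6 from wap</h1>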

And ipv6 DNS is working.

You can see it’s going through a router:

Wget works too

Ssh has no special flags, it just works.

Victory Lap

Even firefox works.
[Screenshot: Firefox loading http://www.ipv6.squarism.com over IPv6]

Just to prove that this isn’t IPv4, let’s use the weird numerical URL format for the IP.
[Screenshot: Firefox loading the page via the bracketed IPv6 literal address]

Well, this was a fun spike. I feel like I understand IPv6 a whole lot more, and big scary addressing no longer strikes fear into my heart. I think the key is to actually use DNS instead of fudging it by typing addresses manually or managing crazy hosts files. It will be interesting to see when ISPs and cloud providers start offering serious options for IPv6.

The Best Way to Read CSV in Ruby

CSV is awful. CSV isn’t well formed. It isn’t hard to use because it’s bloated and slow. CSV is hard to use because it’s just a dumb data format. However, sometimes all you have is stupid data and who cares, let’s do this thing and blot out the memories.

I assume you know how to use the CSV module that’s built into Ruby. It’s pretty easy. You just read a file in and you get some 2D array back. It usually comes out pretty horrible with long methods and little room for nice abstractions.

So what if you want to polish it up a little bit? Maybe you aren’t just going to kludge this thing again and hate yourself later? What if you aren’t just going to load this into a database? What if you want to do some quick CSV analysis but at the same time make it come out sort of readable?

Let’s take a look at an abstraction layer and see how we could write a CSV loader for a guest list. We’re going to have a dinner party and evite gave us a crappy CSV dump of who’s responded so far. Well, it’s what we have. But how many people are coming and how many groups aren’t allergic to peanuts? We want to know how many peanut M&Ms to buy.

Here’s our data:
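
Say the dump looks something like this (the real Evite columns aren’t shown, so these are a guess):

name,party_size,rsvp,peanut_allergy
"Barker, Bob",2,yes,no
"Hamilton, Alice",1,yes,yes
"Doe, Jan",3,no,no
"Connery, Sean",4,yes,no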

Supermodel is pretty old and I like it a lot, but it hasn’t been updated in a while and has some open pull requests. I took a look at some alternatives but they didn’t work out.
- ActiveModel from Rails 3 is hard to make generic
- ActiveModel::Model from Rails 4 is a great upgrade from 3.x. You can make anything look like a database object but it still doesn’t have the concept of a collection. So now I have to make an array variable called table? This is weird.
- Sequel has a nice interface to an in-memory sqlite3 database. It’s probably the most ‘real’ that I found but it requires you to do a CREATE TABLE statement even for your in-memory database.

None of these alternatives above are bad but let’s take a look and see how nice we can get it with Supermodel.

First, we are going to use a supermodel fork so that we automatically get Rails 3.2.13 instead of 3.0.x. Create a project folder and a Gemfile:
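
A minimal Gemfile could be as small as this; the exact fork isn’t specified here, so point the gem wherever yours lives:

source "https://rubygems.org"

gem "supermodel"   # or: gem "supermodel", github: "<your fork>/supermodel"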

Run bundle.
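
Then a loader might look roughly like this, assuming the guessed guests.csv above and Supermodel’s dynamic attributes:

require "csv"
require "supermodel"

class Guest < SuperModel::Base
end

CSV.foreach("guests.csv", headers: true) do |row|
  Guest.create(
    name:           row["name"],
    party_size:     row["party_size"].to_i,
    rsvp:           row["rsvp"] == "yes",
    peanut_allergy: row["peanut_allergy"] == "yes"
  )
end

coming = Guest.all.select { |g| g.rsvp }
puts "People coming: #{coming.inject(0) { |sum, g| sum + g.party_size }}"
puts "Groups without a peanut allergy: #{coming.reject { |g| g.peanut_allergy }.size}"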

You can see that Guest.all is much more intent revealing than manipulating a 2D array by hand.

Rails Dev Shops in Washington DC

What shops, companies, consultants, startups or other folks are using Ruby or Rails (on any level)? Contact me on twitter if you want to be added or you have corrections: @squarism or leave a comment below.

Radiant CMS – Radiant is a no-fluff, open source content management system designed for small teams.
Triple Dog Dare – Has your Rails (or Ruby) project gone off of the tracks? Did you outsource your work on the cheap only to find that your application is bug-ridden and slow? I can help make it better. Wrangling chaos is one of my specialities.
Intridea – We don’t just make web apps – we solve problems. At Intridea we write intelligently designed software to help businesses develop strategic solutions and launch new ideas.
Code Sherpas – CodeSherpas is a full life-cycle software development and design firm based in Reston, VA.
Living Social – LivingSocial is a deal-of-the-day website that features discounted gift certificates usable at local or national companies. Based in Washington, D.C., LivingSocial now has more than 70 million members around the world.
Bloomberg – Bloomberg, the global business and financial information and news leader, gives influential decision makers a critical edge by connecting them to a dynamic network of information, people and ideas.
Monkeysee.com – Monkey See captures the skill and knowledge of the world’s top experts and delivers it to inquisitive audiences everywhere.
Gannett ???
Comcast ???
Sprint ???

The Supermodel Ruby Gem Loses Data

Actually no.

I love Supermodel. It might be superseded by ActiveModel::Model in Rails 4, but until then Supermodel is a fantastic in-memory database for Ruby that has a lot of advantages over using just a plain hash or trying to roll your own.

However, when using it with a large amount of data, we noticed it loses data. Sometimes, a few records. Other times, a few more. It was really random. We were confused. Looking at the docs, this is the default class maccman has in his README.
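
Paraphrasing, it’s a bare class with no ID strategy specified (the class name here is ours):

class Pants < SuperModel::Base
end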

That works, no problem. We looked at the IDs that it uses and saw that it’s using the Ruby object_id, which is about 14 digits long.
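
For instance:

pants = Pants.create(name: "fancy pants")
pants.id   # => 70095779847820 (just an object_id)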

Ok, that ID of 70095779847820 seems good enough right? Let’s see!
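
The test was something like this: create a pile of records and count what actually made it in (the count and names are ours):

1_000.times { Pants.create(name: "fancy pants") }

puts Pants.all.size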

Run it.

What.

Well, I’m no expert but I bet the object_ids in Ruby aren’t very random. I would hope they wouldn’t be, because you’re creating objects all the time, right? Ruby is slow enough without some super accurate id field. Should we abandon all hope and scatter our dreams in despair? Nope.

Supermodel has a documented solution for this. Just add this mixin into your class.
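
That’s the SuperModel::RandomID module:

class Pants < SuperModel::Base
  include SuperModel::RandomID
end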

This will make the IDs more random and you’ll find all 1,000 pairs of fancy pants in your class. The oddball thing for me was realizing that Supermodel “loses data”. But it doesn’t. IMHO, this mixin should probably be the default. I find Supermodel an awesome quick and dirty database, but a database shouldn’t lose records silently.

I still love Supermodel. I’ve played around with other in-memory approaches such as ActiveModel in Rails 3 and 4, and sqlite3 with DataMapper, and Supermodel works like I want it to.

Super Interesting Talks from RubyConf 2012

Trying to summarize someone’s 30–60 minute talk is really hard, so apologies go out to anyone I’m paraphrasing here. I took it upon myself to watch every single video from RubyConf 2012, which started airing in November. It’s May now. There’s a lot of content there and you can’t just slurp it down and expect to process it all. So I thought I’d leave little breadcrumbs to myself noting which things were super interesting to me.

Real Time Salami – Aaron Patterson
Any presentation by @tenderlove is great and this one was fun and interesting as expected. Aaron talks about parallelism, streaming and making Salami (actual salami).

Tokaido: Making Ruby Better on OSX – Yehuda Katz
This was an exciting talk about Tokaido which is a work in progress to make a Rails.app one-click super-easy dev tool for Mac. He talks about other platforms too, don’t worry. This talk really makes you appreciate how hard this problem is. There are some super interesting low-level OSX details in there.

Why JRuby Works – Charles Nutter, Thomas Enebo
This was a great talk about JRuby and was a very convincing presentation. Since watching it, I’ve been playing with Torquebox and JRuby. Unfortunately the audio and video are a bit weird. For me, I loved the part about garbage collection. It was a great summary about how good the JVM is at garbage collection.

Zero Downtime Deploys Made Easy – Matt Duncan
This talk was great. Matt walks through all the problems you will encounter when trying to reach a large number of nines. He covers a lot of gotchas, like “whoops that database migration locks the entire table and just took your site down”. He covers how Yammer does database changes, managing job queues and external services when you are trying to keep uptime at maximum. This was definitely an eye opening talk.

Y Not — Adventures in Functional Programming – Jim Weirich
OMG. This talk left my brain on the floor. I can’t really explain how awesome Jim (he wrote rake) is. If you want to see the best live coding I’ve ever seen and learn about the y-combinator, watch it. I didn’t follow along 100% but I was blown away.

The Celluloid Ecosystem – Tony Arcieri
This was a great intro into everything surrounding the celluloid gems. More importantly though, it was a _reference_ concurrency state of the union talk. If you want to learn why the actor model is the way to go (in Ruby or Scala actually) then watch it.

Ruby vs. the world – Matt Aimonetti
A great overview of languages other than Ruby. His starting point, the Sapir-Whorf Hypothesis (that language influences thought), is a great opening to this talk. Matt chooses really interesting topics and does a good job. He covers Clojure, Scala and Go. This is a great talk if you don’t know what any of those are or want a quick ‘Rosetta Stone’.

Your app is not a black box – Josh Kalderimis
This talk is easy to watch. He does a great job of keeping it interesting. It’s basically a talk about DevOps but more importantly about tooling. I found this talk very interesting from an ops, polish and motivation perspective. Please watch this.

How to build, use and grow internal tools – Keavy McMinn
One of my favorites. I forwarded it to a bunch of people. GitHub is worth emulating and Keavy shares insight about tools, culture and teams.

Asynchronous Processing for Fun and Profit – Mike Perham
A great talk about Sidekiq and Redis from the authority on Sidekiq.

Change your tools, change your outcome – Dr. Nic Williams
Dr. Nic nailed this talk. Some NSFW language. Hilarious and interesting talk about getting over nice to haves (like fast MRI spin-up time) and making your app more awesome for Ops people. Super great talk.

Grow Your Unix Beard Using Ruby – Jesse Storimer
A reference talk all about Unix. I found this very educational even though I consider myself pretty unix savvy. Jesse is great, he has a book on pragprog.

Boundaries – Gary Bernhardt
Amazing talk by destroyallsoftware’s Gary. He talks about an imperative shell vs a functional core which is all the rage right now. Gary is brilliant.

Abstracting Features Into Custom Reverse Proxies – Nick Muerdter
Some great ideas about reverse proxies.

Service Oriented Architecture at Square – Chris Hunt
Chris walks you through creating a web service like they do at square as if you were working there. He introduced some amazing open source libraries from square that I need to check out (cane, fdoc, jetpack). For example, they use jetpack to auto-pack up and deploy a rails app with Jetty. So all you need is a JVM.

I’m pretty sure I picked more than half of the talks as ones that I found super interesting. There were many more but I can’t just pick everything. It takes a while to watch all these videos but they are worth your time.

Using Redis as a Database

The Spike

I was spiking on Redis recently. I wanted to use the redis-objects gem to simulate a shopping cart app even though the README specifically says

Just use MySQL, k?

I wanted to see what would happen if I tried it anyway. The README and examples for the redis-objects gem are great, so I’m not going to rehash what’s there. I will say, though, that the example has you hardcode the id field to 1. That detail snuck up on me.

If you don’t set an ID then you can’t work with a redis-object instance. You get an exception: Redis::Objects::NilObjectId: Attempt to address redis-object :name on class User with nil id (unsaved record?)

It’s basically trying to tell you, “hey, save the record first or set an ID”. Well, honestly, I don’t want to set an id myself. This is where the meat of the README is. Redis-objects really fits organically in an existing ActiveRecord model. That means Rails. In this case though, I don’t want an entire Rails app. I can see the value though in a plain old Rails app. Just look at the examples if you want to see more.

Anyway, continuing on with the spiking, I tried to integrate the Supermodel gem with redis-objects. That sort of worked. You just class User < SuperModel::Base and you can sort of get it to work. This is great because Supermodel gives you finders like User.find_by_email('bob@yahoo.com') to make it act like ActiveRecord, but you can't use .create(email: 'bob@yahoo.com') to begin with because of the same errors I mentioned above. Redis-objects really wants the record to have an ID already. Even using Supermodel's RandomID mixin didn't work. The initialize order and callback hooks don't really work (or at least I couldn't get them to work).

Finally, I tried combining just redis-objects and the DataMapper Redis adapter. That worked. And it's pretty nice. Check it out.
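
Here's a sketch of that combination, assuming the dm-redis-adapter and redis-objects gems (the model and fields are made up):

require "dm-core"
require "dm-redis-adapter"
require "redis-objects"

DataMapper.setup(:default, adapter: "redis")

class User
  include DataMapper::Resource
  include Redis::Objects

  property :id,    Serial
  property :email, String, index: true   # indexed so the redis adapter can query on it

  value :name                            # a redis-objects field, keyed off the model's id
end

DataMapper.finalize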

So using this is pretty easy.
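
Usage ends up looking something like this (DataMapper finder syntax, not ActiveRecord):

user = User.create(email: "bob@yahoo.com")   # datamapper assigns the id
user.name = "Bob"                            # redis-objects stores this under that id

User.first(email: "bob@yahoo.com")           # datamapper finder, no .where here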

When you look at Redis, the keys are already composited for you and magic has happened.
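
For example, in redis-cli you'd see keys along these lines (the exact layout depends on redis-objects and adapter defaults):

redis 127.0.0.1:6379> keys *
1) "user:1:name"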

Yay!

The name field is from redis-objects and the create uses DataMapper. This is a really odd pairing, but I like the fact that I have no SQL database in the mix and still have finders similar to an ORM. Something to keep in mind: DataMapper's finders are a bit different from the Rails 3 ones (no .where method).

Benchmarking A Million Things

Ok fine. So maybe this works, maybe it doesn't. Maybe it's not the right idea. What about the good stuff? Like, how fast can we load a whole lot of names into MySQL versus Redis using the above code and techniques? Is it even relevant?

A gist of these test results is here.

A More Complete Example

If you know the ID and don't need something like an auto-incrementing column outside your code/control then you can greatly simplify the code above by getting rid of Datamapper. You can simply use redis-objects to fake an ORM. I had great success using it as long as you USE NATIVE REDIS TYPES. Listen to the redis-objects author, don't try to force the tool into the use case.
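
A sketch of that simpler, redis-objects-only approach, using the shopping cart idea from earlier (class and fields are invented):

require "redis-objects"

class Cart
  include Redis::Objects

  list    :items        # a native Redis list
  counter :total_cents  # a native Redis counter

  attr_reader :id
  def initialize(id)
    @id = id            # we supply the id ourselves, no ORM needed
  end
end

cart = Cart.new(42)
cart.items << "sku-123"
cart.total_cents.increment(1999)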

The Blub Paradox and Delicious Pie

Anything worth doing is worth doing well. “That’s Good Enough” isn’t good enough. There are some cases where you are writing a one-off script, but those tasks are few and far between. Most people don’t get paid to write throw-away code. Most developers aren’t sysadmins writing procedural code at the level and complexity of Bash. Even if you are, is this the only style you know? Do you shy away from more complex tasks? Can you not break a problem of any size down into smaller pieces?

A musician practices their art constantly. An athlete trains. And yet, we in the tech community have to go out of our way to find people who work on personal projects on the side or have the will/motivation to learn new things outside of “the job”. If you don’t learn on your own then you have to figure it out as you go. And you are already very busy so I know you’re just skimming over this post anyway. So let’s break this down in a bulleted list.

  • Developers work best when they’re challenged but not overwhelmed. Optimal work is a run, not a jog or a sprint.
  • Developers will step up to a task, or pad it out, to make it challenging, as stated above.
  • Your classic boss doesn’t really care how you solve problems.
  • Culture takes 20 years to change.
  • The Blub Paradox says (among other things) that you can’t make anyone understand the power of a different language because developers become sedentary.
  • Good consultants are already busy so no one is going to save you.
  • Training or bootcamps won’t change your skillset or habits by a large percentage.

Let me put this another way.

My cat will play with a toy on the floor. And after a while, it becomes boring. So she swats the toy under a cabinet. And now it’s exciting! She can barely reach the toy and it’s a challenge. When she gets the toy out, she’ll bat it under the cabinet again. What is she going to do? Be bored again? She’s good at reaching for the toy and using her claws to hook stuff, tail counter-balance to stay in control and seeing in dark places. She needs to be playing under the cabinet and not given a mouse in a barrel. This is what you must do. You must improve your skills by going beyond “Good enough”. Otherwise, you will stagnate and have limited tools and approaches to problems.

But how do you even know what to learn? When you are looking up and down the power continuum in a problem domain, how can you grow beyond your own knowledge boundaries to see advantages that other people are enjoying? If you were looking at a menu of pies, how would you choose which one you will like?

Here’s the pie menu. Pick the one that you like!


PIE MENU
(all pies come with whipped cream on the side)
---------------------------------------------
Cranston Ermu Supreme Pie
Tagasnackle Mound Blat Pie
Rumination Flip Pie
Rainclouds and Humility Pie
Are You Seeing The Problem Pie

$3.99 per slice

What? This is terrible. This isn’t fair. Certainly technology cannot be like this. This is computer science! How can you choose? How can you know?

Hungry: “Waiter. What’s in humility pie?”
Waiter: “Humility.”
Hungry: “What does that mean?”
Waiter: “Have you ever literally tasted humility?”
Hungry: “No.”
Waiter: “Well, it’s kind of like tasting regret but also different from pride. It kind of tastes like you should have prepared more or set your expectations lower.”
Hungry: “That doesn’t make any sense.”
Waiter: “Of course it doesn’t. It’s an experience. It’s not a definition.”

Given enough questions, the Waiter will quit unless the Hungry customer chooses a pie or leaves.

Now let’s look at some horrifyingly contrived personalities looking down (or not looking down) the pie continuum.

The Eternal Procedural Coder

Let’s say you are a developer who just writes scripts all the time. You piece things together and when they work the first time, you ship it (whatever shipping means to you). If a more complicated task comes along, you don’t change your design, tools or strategy, you just change the amount of time you throw at it. Once it works, you call it done.

  • If you don’t know how to build something in a variety of ways then you are always going to build it the same way.
  • If you don’t pre-learn outside work then you have to run to catch up to change your habits.
  • You can’t train new developers easily on what you are doing because everything you build is “your way”.
  • You can’t learn new techniques, tools or study good design because you don’t have time at work.
  • You don’t feel like learning on the weekend or after work.
  • When you look at another style, it looks like a bunch of crap and you don’t see the value/power/happiness/productivity/peace-of-mind because you haven’t gone through even one example yet.
  • Exactly what is the exit strategy here? What’s the end game? Who or what is going to save you or change your situation? What would the disruption look like? What would the disruption be called?

Meanwhile, you hate looking at old code. You wonder “what was I thinking?” or “I must have had a lot of coffee that day”. Projects come and go, and you hope you never have to go back to the old ones. You leave things alone because they are terrible but do the job. You continually start new tasks but they all come out the same. Huge single files, huge methods/functions. One mess after another. No wonder they call this “work”, right? Who would do this for fun or learning?

What’s interesting is that The Blub Paradox applies here too. It’s difficult to explain what development is supposed to be like to someone who has never had a success. Success breeds success. Failure is of course crucial to learning, and courage is needed to fail when trying to succeed. But experience really is the success side of trying and learning. You can gain experience through failure, but in terms of visualizing the ideal, success is very important.

The It Can’t Be Done Guy

Let’s say there’s a website called Searchbox.com that’s a stripped down version of Google. In a meeting, everyone is talking about how Facebook search and Twitter are all hot right now and you feel like you are behind the curve. So you decide you want to add friends, followers and other social features to Searchbox.com. Your rockstar lead developer says “it can’t be done”. No one questions him. The idea dies, jokes are cracked and the grapes were probably sour anyway. Yay team?

Software is supposed to be soft. So why can certain things be so easy and other things be so hard? I understand certain problems just “are hard” (concurrency, cache invalidation). But not even entertaining ideas or experiments could be a sign of bad design. Even with crappy procedural scripts, simple changes mean omfg-lots-of-lines changed. Design isn’t about avoiding change (and change will happen), it’s about reducing the cost of change.

If you want to learn from a master, read POODR. Sandi says:

…successful but undesigned applications carry the seeds of their own destruction; they are easy to write but gradually become impossible to change.

POODR is full of these insightful observations and Sandi doesn’t just give you advice. She walks you through refactoring code and demonstrates the concepts that are related to these insights.

Anyway, back to our example project. If searchbox.com is broken up into modules then there is a user class or object. Surely this class could be forked into a feature branch and worked on in isolation by a few team members to add Friend and Follower traits/mixins/interfaces? Or is the user really a session cookie combined with a database row and “who knows” how to change it?

If there isn’t really a concept of a user but it’s just a bunch of hacks then yes, no one is going to entertain massive changes to this concept of a “user”. It’s too hard (in your system). However, if a system was explicitly designed thoughtfully (not by default or by accident) then features are easy to add. The software stays soft. This is what Sandi is talking about in POODR.

The Waterfall by Default Guy

Let’s say there’s a guy in charge of a project. He hasn’t ever tried Agile of any sort. He’s read about it and he’s confident that he knows all about it. He thinks it sounds crazy, lazy and chaotic. He’s done plenty of projects in the past that have come out under-budget and on-time, so why change anything? On the other hand, he’s not technical so his opinion of past projects is completely dependent on the feedback given to him by developers. If developers ever felt like talking to him was a waste of time then his opinion of how good “waterfall” is could be slightly off. The real question is, can you explain how delicious Agile Pie is to him?

Here’s the wheelhouse of agile type stuff.

  • Developers aren’t going to want to work with you again if you burn them out.
  • Project heroes burn out.
  • You can’t hire your way out of a failing project.
  • Most people don’t like the term Waterfall because it’s viewed as old.
  • If you don’t declare a methodology, it’s Waterfall by default.
  • You can’t do Agile. Agile is a type of thing. Like Car. You have to buy a type of Car.

So when you try to explain any type of agile techniques to someone who doesn’t have any experience with it, it’s the same thing as the Blub Paradox. The Waterfall Guy is happy enough doing no explicit methodology, so why would he change? When asked how to prevent projects from getting off-track he will say, “before you start working on a project, you need to know what you are building”. Even though Agile (and common sense) says that both the customer and the developers are the stupidest at the beginning of a project. Big Design Up Front (BDUF) doesn’t work. BDUF is an undesigned project methodology. BDUF is how you put together simple things like grocery lists and legos. BDUF isn’t how you deal with change and complicated systems where developers will upscale the project difficulty until it becomes interesting.

So even if you discuss these items with Waterfall Guy, he’s not going to get a taste of Agile Pie without eating it. You could talk all day long about the ingredients, how people keep eating it, how there are case studies showing it’s delicious. “Meh, whatever. I have my dessert already.” At the same time, Waterfall Guy is always asking for visibility into what developers are doing through status reports. In his heart he knows that he has a few hero developers and if he loses them, he’s screwed. That’s ok, that’s why they call it work, right?

The Non-Tester

There’s a team running a production app. They have a testing team. When developers are working on “omg the really important thing”, they tell the testing team to run “whatever test”. It usually involves someone manually going down a list with a pencil. When a problem is found, a tester will tell a developer something went wrong. “It showed me this error 0x2123 when I did the thing you told me to do.”

Except a lot of the time there isn’t anything wrong. A lot of times the problem is that the testers aren’t developers. Otherwise, they’d be developers. A lot of times the tester doesn’t really understand the system and the developer has to explain the system to them. So while the developer is working on “omg the next really important thing” the tester is asking questions and the developer is getting annoyed. Jokes are made: “you only come to me when there’s a problem, haha”. Or, “I bug you so much I owe you a beer! Haha.”

Looking down the power continuum, a developer looks at testing libraries. They look annoying and time-consuming. Who needs this hassle? They don’t have time to find out if they do. They don’t understand the different levels of testing (from least to most involved and beneficial):

  • No testing at all – you have to hit refresh in a browser and you have no idea if you broke anything else. Old bugs pop up.
  • Some testing – unit tests, but you have a team run through end-to-end scenarios.
  • A lot of testing – you have unit and integration tests and maybe even code coverage reports and CI.
  • Test-first development – you write your tests before code but that’s the end of that methodology.
  • Test-driven development (TDD) – you let your tests completely drive the design of the system.
  • Testing end game – you do red-green-refactor. Your tests run fast. You do UI testing. Your customer requirements map to easily read executable stories. You spend very little time in red. You ping-pong pair with TDD. No code is committed without passing CI. Your testing tools and process are constantly evolving in response to pain points.

The non-tester doesn’t see the value or the constant effort people are making to try to get to the Testing End Game. “I’m not a tester! Why would I test!” Meanwhile, they complain about how little time they have because they have to manually check whether their code works. Or maybe they are embarrassed / mad / frustrated that they broke the development/production build. In essence, the Blub Paradox is telling us that they can’t see their problems through the lens of Blub because they don’t know Blub. Meanwhile, Blub (TDD) practitioners can’t imagine doing it the way the non-tester does it. In fact, TDD practitioners are honing their craft so much that it is a rare event when they’ll take the time to explain TDD or the End Game to a non-tester. Even if they do, both will be frustrated because any insight or evangelizing will be misunderstood.

The Long Time Ignorant

Take any of the above and put it in the context of time. How long has SQL been around? Does someone on your team not know SQL? Is it because they haven’t needed it? What if they are writing flat database files by hand and do need SQL? Have they just been living under a rock or can no one change their mind? What can you do as an individual to change such a blind eye?

How long has JUnit been around? Is there a Java developer you know that hasn’t written a unit test? Or they have written a few but they don’t anymore because they think they don’t have time to do it?

That Seems Hard

Someone who can only write code in a procedural style can’t break down problems into easily solvable bits. Without concepts like Mixins/Traits, Composing Behavior and TDD, a complex problem is just going to seem too hard. “I don’t do that kind of stuff.” Meanwhile, someone with OO or Functional experience might say “here’s how I would do it, who wants to help me?” Of course there are things that are just too big, but the difference is in experience. A Bash scripter isn’t ever going to understand ncurses events because they haven’t ever written a desktop GUI. So even though ncurses can be great for turning “scripts” into “programs”, they are going to shy away from ncurses because “wtf that seems hard”.

  • You can’t explain the benefits of a web framework because of the Blub Paradox.
  • You can’t explain the benefits of an ORM vs raw SQL because of the Blub Paradox.
  • You can’t explain the advantages of tmux to someone because of the Blub Paradox. Even while they keep losing their ssh sessions over unstable wifi. Given a simple enough problem, you might be able to convince them to try it out.

Tmux is a perfect example. No one understands how great it is until they use it. Then and only then do they never want to go back. Tmux isn’t always great though. Sometimes the terminal gets all weird with certain keyboards or maybe you want the native buffer scrollback to work. So it’s not some silver bullet default. However, once you grok tmux, you know when to use it and miss it when you don’t have it. This is true for many things that people evangelize. However, sometimes you need to understand their world view and take that into consideration.

Tmux pie is delicious. Ask anyone who has tried it.

Answers

I don’t have all the answers. Ok, see you later. *silence* Well, listening to Ruby Rogues and reading POODR has given me some memorable advice which I’ll repeat here.

  • Spike on foreign concepts. Ok, you don’t know OOP and breaking down problems. Make up a problem and do it. Don’t try to learn in your main project. Spike on a concept and throw the code away.
  • Pair with someone who knows this stuff and check your ego at the door. When you are lost, don’t play with your iPhone. Ask questions. Pretend to be stupid. Check your ego.
  • Learn a different tech stack. When I write Java (poorly these days), I bring Ruby experiences back with me.
  • Watch conference videos and screencasts. Pre-learn before work. You have no choice. No one is going to pay you to get better.
  • Read the POODR book even if you’re not a Rubyist.
  • If you are resistant to an idea, change sides. Let’s say you don’t like soccer. Take the position of a soccer fan and argue with yourself. You might be amazed at the argument you make just looking from another perspective.

Ruby p385 benchmarks

I was playing around with the falcon p385 patch to see if it’s any faster than some of the more recent MRI rubies.
TL;DR version: looks like p194 is faster than p385 of any type or tweak.

Here’s how to get a p385 Ruby version patched with funny falcon’s performance patches using RVM.

mkdir -p ~/.rvm/patches/ruby/1.9.3/p385
curl https://github.com/funny-falcon/ruby/compare/p385...p385_falcon.diff > \
$rvm_path/patches/ruby/1.9.3/p385/falcon.patch
rvm install 1.9.3-p385 -n perf --patch falcon

Then rvm use 1.9.3-p385-perf or set it as your global ruby.

Test Setup

The following benchmarks were run on an i7 server with a RAID5 array. The disk is slow (lack of large cache) but the benchmarks were run on the same box so it should compare apples-to-apples.

From here on out, here are the definitions for the Ruby versions.
p194 = 1.9.3p194 default
p385 = 1.9.3p385 default
falcon = 1.9.3p385 with the above falcon diff patch applied
gcc_tweak = the falcon patches with GCC compile flags tweaked

So what are these GCC tweaks? Explicitly setting the CFLAGS for your machine’s CPU type and recompiling ruby with the Falcon patches applied.

Micro and Macro Benchmarks

I used the ruby-benchmark-suite to run these tests.

Here are some example results. I can’t list them all. There are over 100 benchmarks. These are results for the mean times in seconds.

test                    1.9.3p194    1.9.3p385    falcon       gcc_tweak
bm_sudoku.rb            1.379112226  1.598182153  1.495923579  1.526717563
bm_open_many_files.rb   0.175602996  0.197096826  0.197673286  0.194135045
… etc etc

Here’s the winner summary for mean times. This is the number of times the ruby version was the fastest for a particular benchmark.
1.9.3p194 – 64 wins
1.9.3p385 – 29 wins
1.9.3p385-falcon with GCC tweaks – 10 wins
1.9.3p385-falcon – 5 wins

Boot time and IO

Timing rails boot time is a bit more important to me. If you want to know how to really save “rails boot time” see the DAS screencast on not loading Rails at all.

Even when using domain objects and lib tricks, it’s nice to have Rails and all I/O boot fast. The main thing that funny falcon’s patches do is speed up requires and I/O.

So let’s benchmark booting a Rails app.
$ time bundle exec rake environment

Ruby Version                               Seconds
p385 patched with falcon and GCC tweaks    2.481 total
p374 defaults                              3.336 total
2.0.0-rc2                                  2.613 total

In conclusion, p194 looks faster on the macro and micro benchmarks but Falcon patches boot Rails faster.