Ruby Resources

Ruby — Dillon @ 10:50 am

destroyallsoftware.com
$9 for his entire back catalog. You can subscribe monthly or just sign up, download all of them and then cancel.

tryruby.org
15 minute ruby intro. Recently bought by the code school guys.

codeschool.com
Has a free 2 hour rails tutorial called rails for zombies. This is really fun but longer. It’s totally interactive. Very well done. They have non-free classes too. Focuses on rails 3 and some advanced stuff.

railscasts.com
Not very hand-hold-y. I regularly watch this site just to learn new things and watch someone work. These are short in length and very focused.

rubykoans.com
Ruby Koans are practice exercises/meditations on different aspects of Ruby. There are many similar projects (Python Koans etc). It’s like an interactive cookbook. Very valuable to do even if you’re comfortable with Ruby. I did a blog post on it. It’s a bunch of tests set up to fail on purpose: you fix the first failing test case and the suite moves on to the next one. For example, you might change an assert false to assert true, and then the test suite will have 1 test passing and give you a progress bar. Very cool and easy to pick up and put down. Good for reference afterwards.

confreaks.net
Lots of conference videos organized by events. The HTML5 player sucks. Use the flash one, even that one is a bit picky with pausing and other stuff.

quickie_mart 20 min app
A plug! This is a talk that I did. The README is very short but has all the steps to build a store rails app. The repo is the completed app.

MongoDB CSV importing

Ruby — Dillon @ 10:49 am

I’ve been following this tutorial over at mongovue about mapreduce in MongoDB. They have you export data from MySQL to MongoDB using their .NET tool. I’m on a Mac so here’s what I did instead.

If you followed the instructions, you have the .txt files (which are CSVs) in your MySQL data directory (weird but ok). Importing CSV is really easy in MongoDB. Just make sure you are in the directory with the .txt files.


mongoimport -d geobytes -c cities --type csv --file cities.txt --headerline
mongoimport -d geobytes -c regions --type csv --file regions.txt --headerline
mongoimport -d geobytes -c dmas --type csv --file dmas.txt --headerline
mongoimport -d geobytes -c countries --type csv --file countries.txt --headerline
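
The four commands differ only in the collection name, so a small Ruby loop could generate them. This is just a sketch: the helper name mongoimport_command and the COLLECTIONS list are made up for illustration, and it only prints the commands unless you uncomment the system call.

```ruby
# Collections from the tutorial; each one has a matching <name>.txt CSV file.
COLLECTIONS = %w[cities regions dmas countries]

# Hypothetical helper: builds the argument list for one mongoimport call.
def mongoimport_command(collection, db = "geobytes")
  ["mongoimport", "-d", db, "-c", collection,
   "--type", "csv", "--file", "#{collection}.txt", "--headerline"]
end

COLLECTIONS.each do |collection|
  command = mongoimport_command(collection)
  puts command.join(" ")
  # system(*command)  # uncomment to actually run the import
end
```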

Now you can continue with the tutorial.

> db.cities.findOne();
{
"_id" : ObjectId("4fa2a734779f0ea93dd13df6"),
"CityId" : 42231,
"CountryID" : 1,
"RegionID" : 833,
"City" : "Herat",
"Latitude" : 34.333,
"Longitude" : 62.2,
"TimeZone" : "+04:30",
"DmaId" : 0,
"Code" : "HERA"
}

A Spork nil:NilClass fix

Ruby — Dillon @ 7:40 pm


This might not be your issue. I had a really weird problem that was ungoogleable so I thought I’d post it.

With a combination of spork, guard and rspec, none of my testing stuff was working. The spork prefork block was throwing this error.

Using RSpec
Preloading Rails environment
undefined method `gsub' for nil:NilClass (NoMethodError)

The solution was to comment out my test gems one at a time. That’s how I realized that the version of my Gemfile edited on Ubuntu was conflicting with the version of the same file edited on my Mac.

So I made my Gemfile a little smarter.

 
group :test do
  gem 'rspec'
  gem 'rspec-rails'
  gem 'spork'
  # file system watchers are platform specific, so only load the right one
  gem 'rb-inotify' if RUBY_PLATFORM.downcase.include?("linux")
  gem 'rb-fsevent' if RUBY_PLATFORM.downcase.include?("darwin")
  gem 'guard'
  gem 'guard-rspec'
  gem 'guard-spork'
end

Then things started to work again. With any weird bugs in Rails, try commenting out gems one at a time.
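
If you’d rather keep the platform check in one place, something like this might work. It’s a sketch, not what my Gemfile actually does: watcher_gem_for is a made-up helper name, and reading the OS from RbConfig instead of RUBY_PLATFORM is my assumption about a reasonable alternative.

```ruby
require 'rbconfig'

# Hypothetical helper: returns the file watcher gem for a given OS string,
# or nil when neither Linux nor OS X is detected.
def watcher_gem_for(host_os = RbConfig::CONFIG['host_os'])
  case host_os.downcase
  when /linux/  then 'rb-inotify'
  when /darwin/ then 'rb-fsevent'
  end
end

puts watcher_gem_for.inspect

# Inside the :test group of a Gemfile this could be used as:
#   watcher = watcher_gem_for
#   gem watcher if watcher
```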

Upgrading Rbenv

Ruby — Dillon @ 9:53 pm


I’ve recently switched from rvm to rbenv on most of my dev boxes. I loved RVM to death, no offense to all the hard work that Wayne did. He’s a great guy and I listen to him talk on podcasts. I just think RVM is a bit heavy handed in some things and dealing with readline failures (despite doing the same steps as I’ve done many times before) and other things was getting tiresome. I’m not sure how complete all my testing is (such as Textmate or Sublime Text 2 support) so a bit of this is not “I’ve converted!” it’s more of a “I’m currently converting”.

The biggest change is how to pull new rubies in. What I mean is, let’s say a future version comes out. Let’s call it Ruby 14.0 so that this blog post doesn’t look dated for a really long time (#wat). If you wanted to update your global ruby with RVM, it’d go something like this:

  1) rvm get latest
  2) rvm install ruby-14.0

Then you’d migrate your gemsets or play with .rvmrc files.

Rbenv is a bit easier in this regard but it requires slightly more typing. I’ll also show you a little trick on how to incorporate some plugins neatly.


> rbenv versions
1.8.7-p357
* 1.9.3-p0 (set by /home/user/.rbenv/version)

Bah. We already dated this blog post. Anyway. 1.9.3-p125 is currently out. So let’s try to pull it in.


> rbenv install 1.9.3-p125
ruby-build: definition not found: 1.9.3-p125

“What? It’s out! I even just did this on another rbenv install! What is going on?”
[Keep raging, don't ship apps.]

So what’s happening here is that this rbenv setup is so old that it doesn’t have a definition for p125. So let’s update our rbenv install.


> cd ~/.rbenv
> git pull

Great. But that’s not the equivalent of ‘rvm get latest’ because we are using ruby-build to do the ‘rbenv install’. Now the default documentation has you checkout ruby-build in your home directory. So if you copied and pasted (like me) then you have a ~/ruby-build dir. Let’s move that to ~/.rbenv/plugins (make sure plugins is a directory). If you are using rbenv-gemsets to mimic rvm gemsets then you already have a plugin directory.


> ls ~/.rbenv/plugins
rbenv-gemset ruby-build

> cd ~/.rbenv/plugins/ruby-build
> git pull

If you installed ruby-build into /usr/local then you can leave off the PREFIX variable.

> PREFIX=~/local ./install.sh

If you installed it into your home, you’ll need to modify your PATH variable:
# for ruby build and other PREFIX overridden builds

if [ -d "$HOME/local/bin" ] ; then
PATH="$HOME/local/bin:$PATH"
fi

If you want it in /usr/local, then leave off the above if statement etc and just run sudo ./install.sh like the ruby-build docs say.

Now rbenv knows what you’re talking about:

> rbenv install --definitions 2>&1 | grep p125

And now you can install the new ruby and set it to be your default. The good thing here is, unless you’re switching ruby versions, you don’t need to update all your gemset files like you need to in RVM. So for me, there’s less impact when keeping Ruby up-to-date.

Setting Defaults in Ruby

Ruby — Dillon @ 11:52 pm


Let’s start out with a plain old method.

def hello
  puts "Hi!"
end

Now let’s un-hardcode that string in the puts by adding a parameter.

def hello(greeting="Hi!")
  puts greeting
end
 
>> hello
Hi!
>> hello("Hola!")
Hola!

Great. We have a default String. But what about something more complex? What if we want a hash of options? Say we have a little piece of an IRC client.

def connect(options={})
  defaults = {
    :server => "irc.freenode.net"
  }
  options = defaults.merge(options)
  puts "Connecting to #{options[:server]} ..."
end
 
Now when we use it like this, we can connect to a default server or override it.
>> connect
Connecting to irc.freenode.net ...
 
>> connect({:server => "irc.efnet.net"})
Connecting to irc.efnet.net ...
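
For comparison, newer Rubies (2.0 and up) have keyword arguments, which carry their own defaults. This is a different technique from the merge trick above, shown only as a sketch; connect_kw is a made-up name so it doesn’t clash with connect.

```ruby
# Keyword arguments hold the defaults themselves, so no merge is needed.
def connect_kw(server: "irc.freenode.net", channel: "#chat")
  message = "Connecting to #{server} (#{channel}) ..."
  puts message
  message
end

connect_kw
connect_kw(server: "irc.efnet.net")

# An existing options hash with symbol keys can be splatted in.
saved = {:channel => "#meow"}
connect_kw(**saved)
```

One difference worth knowing: a misspelled key raises an ArgumentError here, while the merge version silently ignores it.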

Now a more complicated example. All we’re doing here is loading defaults from a YAML file and doing the same thing as before.

require 'yaml'
 
class Preferences
  def initialize
    if !File.exist?("preferences.yml")
      # example file
      options = {:server => "irc.efnet.net"}
      self.save!(options)
    end
    @values = YAML.load_file("preferences.yml")
  end
 
  def to_hash
    @values
  end
 
  def save!(options)
    File.open("preferences.yml", "w") do |f|
      f.write(options.to_yaml)
    end
  end
end
 
def connect(options={})
  defaults = {
    :server => "irc.freenode.net",
    :username => "CHANGE-USER-NAME, see README.txt",
    :channel => "#chat"
  }
  options = defaults.merge(options)
  puts "Connecting to #{options[:server]} as #{options[:username]}..."
end
 
 
# Main
prefs = Preferences.new
 
options = {:channel => "#meow"}
connect(prefs.to_hash)
 
options[:username] = "Bob"
options[:server] = "irc.efnet.net"
prefs.save!(options)

Go ahead and give it a try and play with it. It’s a good recipe with many uses.
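
The same merge can be chained when options come from more than one place. Here’s a minimal sketch, assuming three made-up layers (defaults, saved and runtime) where the later ones win:

```ruby
defaults = {
  :server   => "irc.freenode.net",
  :username => "guest",
  :channel  => "#chat"
}
saved   = {:server => "irc.efnet.net"}   # pretend this came from preferences.yml
runtime = {:channel => "#meow"}          # pretend this came from the command line

# Each merge returns a new hash; keys on the right override keys on the left.
options = defaults.merge(saved).merge(runtime)

puts "server:   #{options[:server]}"     # taken from saved
puts "username: #{options[:username]}"   # taken from defaults
puts "channel:  #{options[:channel]}"    # taken from runtime
```

Since merge doesn’t modify its receiver, defaults stays untouched and can be reused.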

Rubygems Size, Bad Algorithms and a Bad Data Structure

Ruby — Dillon @ 11:01 pm


I mirrored ruby gems just to see how big it would be. I used the rubygems-mirror gem. It’s pretty simple. Just cd into a directory with a lot of space (e.g. /opt/gems or something) and type `gem mirror`.

After a massive initial load of 155k gems, the size was about 45GB (currently; it grows pretty quickly every week). The gem mirror command is smart enough to download just the deltas when you run it again:


$ gem mirror
Fetching: http://rubygems.org/specs.4.8.gz
Total gems: 170843
Fetching 16176 gems
................................................................

Then I wanted to know the size of all the latest gems only. If I had to do a lazy sneakernet, this might be one method of grabbing a whole bunch of dependencies (of course this would never work). Regardless of that, I still wanted to know what percentage of ruby gems space is old versions.

So I wrote a ruby program to find all the latest versions of the gem files and total up their size. I was not very happy about my experiments with #sort and #sort_by. The biggest problem is that it took 63 HOURS to run. I knew it had lots of problems but I didn’t want to kill it. I wanted to see how badly it really ran.

I’m not going to post the actual code. You can see the old version at this git commit url. The basic gist of the crappy algorithm was something like this:

  • Find all the files in the gem mirror off the filesystem.
  • Get the basename of the file name (ie: strip the path).  /tmp/foo-0.1.gem -> foo-0.1.gem
  • Go through all the basenames (gem names) and find the gem family.

Here’s the problem. I had a massive list of 170k gems and then I’m trying to do a find_all right here to sort the gems into gem families. For example: there might be foo-0.1.gem, foo-0.2.gem and foo-async-0.1.gem. In this example, there are two gem families out of the three gems. Foo-async and foo are two different gems with their own versions. Later on, I would:

  • Do a version compare.
  • Push the latest version name to an array.
  • Delete the gem family name from the gem_names array.

Sounded good on paper. And then it took 63 hours to run (227305.19 seconds) and CPU was absolutely pegged the entire time. This algorithm was easy to come up with in IRB using a small test data set but scaling up in the real use case completely sucked. So I pushed it to github for versioning and rewrote the loop.

The latest version runs in 8.5 seconds and spits out a total size of all the latest ruby gems at 6.5GB. Of course, this information is useless since it’s not going to check compatibility or anything. I was just curious to know how much space is back versions.

The real key to the new version is the fact that I’m using a proper “grouped” data structure (Hash) instead of a massive flat Array. This allows the regexes and other operations to work on a smaller data set. The compound nature of the previous inefficiency is pretty amazing (hours to seconds).

So hopefully you can see from the above that a huge flat array from a Dir.glob makes regexes and grouping operations very time consuming. Ruby’s magic group_by method walks the array once and groups the file names by gem name, and then it’s much easier to regex out versions and do other things.
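
To make the grouping concrete, here’s a tiny sketch using the three example gems from earlier. The variable names are just for illustration; the regex has the same shape as the one in the real script.

```ruby
gems = ["foo-0.1.gem", "foo-0.2.gem", "foo-async-0.1.gem"]
name_and_version = /(.*)-(\d+\.\d+.*)\.gem$/

# One pass over the flat list: each file lands in the bucket for its gem name.
families = gems.group_by { |g| g[name_and_version, 1] }

families.each do |name, files|
  puts "#{name}: #{files.join(', ')}"
end
# prints:
#   foo: foo-0.1.gem, foo-0.2.gem
#   foo-async: foo-async-0.1.gem
```

The old approach ran a find_all over the whole flat array for each family, which is roughly where the hours went.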

See below for the code inline or take a look at the github repo.

# refactored the 63 hour version to a much better 8.5 second version
 
require 'action_view'
include ActionView::Helpers
 
# change to location of rubygems mirror
GEM_DIR = "/opt/rubygems/gems"
 
gems = Dir.glob("#{GEM_DIR}/**/*.gem")
gems = gems.collect {|g| File.basename(g) }
 
class Version
  include Comparable
  attr_reader :major, :feature_group, :feature, :bugfix, :version_string
 
  def initialize(version="")
    @version_string = version
    @major = "0"; @feature_group = "0"; @feature = "0"; @bugfix = "0"
 
    v = version.split(".")
    # puts v.join("|")
 
    if v[0]; @major = v[0]; else; raise "Major number blank."; end
    if v[1]; @feature_group = v[1]; end
    if v[2]; @feature = v[2]; end
    if v[3]; @bugfix = v[3]; end
  end
 
  # strangely enough .to_i works even for
  # >> "6-mswin32".to_i
  # => 6
  def <=>(other)
    # compare numerically, part by part; Array#<=> stops at the first
    # difference and returns 0 when every part is equal
    segments <=> other.segments
  end

  # no custom sort needed: Array#sort and #max use <=> above
  def segments
    [@major, @feature_group, @feature, @bugfix].collect {|part| part.to_i }
  end
 
  def to_s
    @version_string
  end
end
 
# temporary benchmarking (needs the ruby-prof gem and require 'ruby-prof')
# RubyProf.start
 
group_r = Regexp.new(/(.*)-(\d+\.\d+.*)\.gem$/)
gems_grouped = gems.group_by {|g| g.scan(group_r).flatten[0] }
# => {"firewool"=>["firewool-0.1.0.gem", "firewool-0.1.1.gem"], ... }
 
latest_gems = []
 
gems_grouped.each do |name, files|
  # somebody's got some crazy gem naming conventions
  # for example: chill-1.gem
  # those don't match the regex and end up grouped under nil, so skip them
  next if name.nil?

  versions = files.collect {|f| f.scan(group_r).flatten[1] }
  # => ["0.1.0", "0.1.1", "0.1.2"]

  latest = versions.collect {|v| Version.new(v)}.max
  # => "0.1.2"

  latest_gems << "#{name}-#{latest}.gem"
end
 
total = 0
latest_gems.each do |gem|
  begin
    total = total + File.size("#{GEM_DIR}/#{gem}")
  rescue Errno::ENOENT => e
    puts "WTF no #{gem}"
  end
 
end
 
puts "Total size of newest gems in #{GEM_DIR} is #{number_to_human_size(total)}"

Algorithm win. Rubygem mirror size curiosity complete. 6.5GB is current gems out of 45GB (right now).
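
One more note on the version compare: RubyGems ships its own Gem::Version class, which could stand in for a hand-rolled Version. This is a sketch of that alternative, not what my script does, and oddly named versions in a real mirror may make Gem::Version.new raise an ArgumentError.

```ruby
require 'rubygems'

versions = ["0.9.0", "0.10.0", "0.2.1"]

# Plain string comparison goes character by character, so "0.9.0" wins.
latest_as_string = versions.max

# Gem::Version compares segment by segment as numbers, so "0.10.0" wins.
latest_as_version = versions.max_by { |v| Gem::Version.new(v) }

puts "String max:       #{latest_as_string}"
puts "Gem::Version max: #{latest_as_version}"
```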

Quickie Mart

Ruby — Dillon @ 2:38 pm

A 20 minute Rails demo that I used as part of a “What is Ruby on Rails?” talk. The store was not designed or developed from scratch in 20 minutes but serves as a Cooking Show style demo of what is possible in a very short amount of time.

Quickie Mart on Github

Rbenv bash prompt

Ruby — Dillon @ 4:05 pm

Sam Stephenson has a pretty looking bash prompt screenshotted in his rbenv project’s README. All you have to do is:

  • Set your Terminal theme to Basic. Make sure to re-set any preferences you might have (like no audible bell etc).
  • Set your Terminal font to 13pt Inconsolata. This isn’t the exact font he uses but it’s as close as I could find.
  • Set your ANSI Color for Normal White(7) to Tin (Crayons tab in color picker)
  • Put this bash code from this gist at the end of your .profile file.
  • Install rbenv so you don’t get the errors I got below because I’m still on RVM. :)

I covet, I steal.

Even though I don’t show it there, the bash script will do the git repo status magic for you on OSX. You need to brew install git and have the shell completion scripts in /usr/local (homebrew will do this).

Hash of Hashes and Captain Planet

Ruby — Dillon @ 12:09 am

Don Cheadle is the best version of Captain Planet there is. I hope you’ve already seen the video. If not, go now. I’ll wait.

As it typically happens, I was creating some nested data structure and was reminded of all the different combinations that there are. For example:

  • An array of arrays.
    [ [1,2,3], [4,5,6] ]
  • A hash of arrays.
    { :lucky => [77,42], :unlucky => [666,13] }
  • An array of hashes.
    [ {:cat=>"meow"}, {:dog=>"ruff"} ]
  • A hash of hashes.
    { :best_in_life => {:enemies => "crushed"}, :worst => { :meatloaf => "old"} }

And I realized that I hadn’t really played with a hash of hashes much. And now I realize why. It’s really pretty useless. It’s hard to work with and the additional key really isn’t all that useful. I found it much better to just denormalize the key into the data attributes. Anyway, you can see what I mean by reading and running what’s below.

We’re going to create Captain Planet and the planeteers in a 2D hash of hashes and do some searching, iterating and other simple things. This should illustrate also how an array of hashes is a bit better. You’ll see halfway through the program we redefine the planeteers.

# search a 2d hash
 
def spacer(msg)
  puts "\n"
  puts "-" * 50
  puts msg
  puts "-" * 50
end
 
# this is not really a good data structure but we'll use it anyway.
planeteers = {
  :kwame    => { :element => "earth", :from => "Ghana, Africa", :actor => "LeVar Burton" },
  :wheeler  => { :element => "fire", :from => "Brooklyn, NY", :actor => "Joey Dedio" },
  :linka    => { :element => "wind", :from => "Soviet Union", :actor => "Kath Soucie" },
  :gi       => { :element => "water", :from => "Thailand", :actor => "Janice Kawaye" },
  :ma_ti    => { :element => "heart", :from => "Brazil", :actor => "Scott Menville" }
}
 
spacer "Here are our planeteers and their elements:"
puts planeteers.keys.collect {|p| {p => planeteers[p][:element]} }
 
 
puts "\nFind the fire planeteer:\n"
fire = planeteers.keys.collect {|p| {p => planeteers[p][:element]=="fire"} }
# => [{:kwame=>false}, {:wheeler=>true}, {:linka=>false}, {:gi=>false}, {:ma_ti=>false}]
 
only_fire = fire.select{|h| h.values[0] == true}
# => [{:wheeler=>true}]
 
# print just one
puts only_fire.first.keys.first
 
 
spacer "Let's do this a bit cleaner with a better data structure."
planeteers = [
  { :name => "kwame", :element => "earth", :from => "Ghana, Africa", :actor => "LeVar Burton" },
  { :name => "wheeler", :element => "fire", :from => "Brooklyn, NY", :actor => "Joey Dedio" },
  { :name => "linka", :element => "wind", :from => "Soviet Union", :actor => "Kath Soucie" },
  { :name => "gi", :element => "water", :from => "Thailand", :actor => "Janice Kawaye" },
  { :name => "ma_ti", :element => "heart", :from => "Brazil", :actor => "Scott Menville" }
]
 
planeteers.max_by {|p| p[:name]}
# => {:name=>"wheeler", :element=>"fire", :from=>"Brooklyn, NY", :actor=>"Joey Dedio"}
 
planeteers.max_by {|p| p[:element]}
# => {:name=>"linka", :element=>"wind", :from=>"Soviet Union", :actor=>"Kath Soucie"}
 
puts "Find the heart planeteer:"
puts planeteers.select {|p| p[:element] == "heart" }.first[:name]
 
# first we'll put a fake planeteer on the end for cpt planet
planeteers << { :name => "all", :element => "go planet" }
 
spacer "Let's summon Captain Planet!"
planeteers.each do |planeteer|
  puts "#{planeteer[:name].capitalize}: #{planeteer[:element].capitalize}!"
end
 
# pop off fake guy
planeteers.pop

Here’s what it spits out:

--------------------------------------------------
Here are our planeteers and their elements:
--------------------------------------------------
{:kwame=>"earth"}
{:wheeler=>"fire"}
{:linka=>"wind"}
{:gi=>"water"}
{:ma_ti=>"heart"}

Find the fire planeteer:
wheeler

--------------------------------------------------
Let's do this a bit cleaner with a better data structure.
--------------------------------------------------
Find the heart planeteer:
ma_ti

--------------------------------------------------
Let's summon Captain Planet!
--------------------------------------------------
Kwame: Earth!
Wheeler: Fire!
Linka: Wind!
Gi: Water!
Ma_ti: Heart!
All: Go planet!
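
To be fair to the hash of hashes, the search can be shortened with Hash#find, and the whole thing can be converted into the array of hashes in one line. A minimal sketch with a trimmed-down copy of the data:

```ruby
planeteers = {
  :kwame   => { :element => "earth" },
  :wheeler => { :element => "fire" },
  :linka   => { :element => "wind" }
}

# Hash#find yields key/value pairs and returns the first matching pair.
name, attrs = planeteers.find { |_name, a| a[:element] == "fire" }
puts name
# prints: wheeler

# Denormalize: fold each key into its attributes to get an array of hashes.
as_array = planeteers.map { |n, a| a.merge(:name => n.to_s) }
puts as_array.find { |p| p[:element] == "wind" }[:name]
# prints: linka
```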

HBase Shell Color

Ruby,Systems — Dillon @ 7:59 pm


Since the hbase shell is irb, I wanted to get color output because that’s what I’m used to. The appropriate place to put this would normally be an .irbrc file, but that would conflict with any ruby development environment already on the system. Luckily, jruby and hbase don’t seem to invoke it anyway.

First find a copy of wirble. If you don’t have it anywhere, download it from github:


cd ${hbase_home}/lib/ruby
wget https://raw.github.com/blackwinter/wirble/master/lib/wirble.rb

Now edit ${hbase_home}/bin/hirb.rb. Add to the end but above IRB.start

begin
  # load wirble
  require 'wirble'
 
  # start wirble (with color)
  Wirble.init
  Wirble.colorize
rescue LoadError => err
  warn "Couldn't load Wirble: #{err}"
end
 
IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.hbase-history"
 
# add right before end but above this line
IRB.start

Now when you start hbase shell, you’ll have lovely color output. Why would you want this? I don’t know. You probably don’t want it. But I was happy to understand how the hbase shell works. It’s just jruby irb that loads hirb automatically.

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
(c) 2012 SQUARISM | powered by WordPress with Barecity