Ruby array sort and uniq bug

Ruby — Dillon @ 5:45 pm

I found a ruby bug, reported it and it got fixed. I’m posting this because I thought the whole process was pretty cool. I very rarely find bugs but it’s always fun feeling like you’re giving back to the community you’ve lurked in for a long time.

First, the bug.

a = [
  { :color => "blue",  :name => "water" },
  { :color => "red",   :name => "fire" },
  { :color => "white", :name => "wind" },
  { :color => "green", :name => "earth" },
  { :color => "green", :name => "moss" },
  { :color => "white", :name => "snow" }
]
 
# taking out the sort_by solves the problem
a.sort_by! { |e| e[:color] }
a.uniq! {|e| e[:color]}
 
puts a

Now it’s supposed to print this:

{:color=>"blue", :name=>"water"}
{:color=>"green", :name=>"moss"}
{:color=>"red", :name=>"fire"}
{:color=>"white", :name=>"wind"}

But instead ruby crashes. On OSX, it’s a BAD_EXEC error. On Linux, it drops core with another error. You can read my whole bug report here. I wanted to test it a bunch and I found that the sort_by! is what causes it. There were many workarounds possible but ruby should handle this case.
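For anyone reproducing this on a ruby that already has the fix, here is a small sketch of the same two calls with a check on the result. It compares only the colors, because sort_by is not guaranteed to be a stable sort, so which of the two green (or white) entries survives uniq! is not something I would rely on.

```ruby
a = [
  { :color => "blue",  :name => "water" },
  { :color => "red",   :name => "fire" },
  { :color => "white", :name => "wind" },
  { :color => "green", :name => "earth" },
  { :color => "green", :name => "moss" },
  { :color => "white", :name => "snow" }
]

# same destructive calls as in the bug report
a.sort_by! { |e| e[:color] }
a.uniq! { |e| e[:color] }

# one entry per color, in sorted order
colors = a.map { |e| e[:color] }
p colors  # => ["blue", "green", "red", "white"]
```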

Anyway, I submitted a bug on the redmine site. I knew redmine from work so this was easy. Some time later, Yui Naruse committed a fix. Now, I had attempted to trace the issue myself. But it’s all C and ruby is huge. So I was completely lost. I can’t even tell what the solution actually is even when looking at it. :(

So the issue was closed, revision 30739 had the fix. So I tried to update ruby-1.9.2-head using RVM but it kept pulling an older version. I tried doing rvm cleanup all 1.9.2-head but it kept pulling and building an older revision. So I just checked out ruby from SVN and built it:

cd ~/tmp
svn co http://svn.ruby-lang.org/repos/ruby/trunk ruby
mv ruby/ ruby_svn_30739
cd ruby_svn_30739/
autoconf ./configure.in > configure
chmod u+x ./configure
./configure
make

This was on a mac, so you have to have autoconf (I think I’m using the homebrew version). Anyway, ruby is built but I didn’t want to install it if I couldn’t build it with RVM (because it’d be hard to tear it out, or at least I didn’t know how). So I was able to run the built ruby without installing like this:

user@box:ruby_svn_30739$ ./ruby -I lib:. bug.rb

Where bug.rb is the code from above that crashes ruby. When you run it against the fixed build, it prints the array with the duplicates removed based on a hash key.

Pretty awesome day today. And I can always check if it’s in the Ruby interpreter by doing this:

 wget --no-check-certificate -O - https://github.com/ruby/ruby/raw/trunk/array.c \
| grep -A 4 ARY_SHARED_P |grep -B 4 ary_resize

Grep will return this.

if (ARY_SHARED_P(ary) && !ARY_EMBED_P(ary)) {
  rb_ary_unshare(ary);
  FL_SET_EMBED(ary);
}
ary_resize_capa(ary, i);

Of course the better way is to run the included ruby tests that Yui Naruse wrote. :)

Setting the default ruby with Pik

Ruby — Dillon @ 10:08 pm

Pik is a nice alternative to RVM if you’re on Windows. RVM has a few more features than pik but all in all, pik does exactly what I want with very similar commands as RVM so it was a really nice transition. I’m extremely impressed that they could get the whole thing to work actually.

However, there were a few gotchas (all detailed below).

  1. You need some version of ruby installed to get pik up and running.
  2. After installing any ruby using pik, you can switch your default and uninstall the bootstrap version if you wish.
  3. Pik supports proxies using the http_proxy environment variable.
  4. Installing a specific version has slightly different syntax than RVM.

First, you need some version of ruby installed (#1 up there). I used JRuby for Windows (jruby_windows_x64_1_6_0_RC1.exe — ymmv) because it had no dependencies. JRuby gets installed to C:\jruby-1.6.0.RC1 (ymmv based on version) and pik picks it up and adds it to its list (very nice). If you don’t have a version of ruby installed you’ll get:
error: can't dup nilclass
when you try to run pik.

One possible snag is that jruby’s binary is jruby and not ruby, but pik handled it, which is really nice. I just ran both installers and then I had jruby in my “pik list”.

What do I mean by pik list? It’s just like RVM.
C:\Users\you>pik list
160: jruby 1.6.0.RC1 (ruby 1.8.7 patchlevel 330) (2011-01-10 769f847) (Ja...
187: ruby 1.8.7 (2010-12-23 patchlevel 330) [i386-mingw32]
* 192: ruby 1.9.2p136 (2010-12-25) [i386-mingw32]

Second, I wanted to set the default ruby with pik (why you are here reading this). This was a bit odd and different than RVM. RVM loads in the .bashrc so it makes sense that pik can’t override the Windows cmd lifecycle. The PATH variable sets which ruby is the default. So just go to your Windows System Properties and set your user’s %PATH% variable to the bin path of whichever ruby you want to use. You can get the path like this:

C:\Users\you>pik list -v
160: jruby 1.6.0.RC1 (ruby 1.8.7 patchlevel 330) (2011-01-10 769f847) (Ja...
...va HotSpot(TM) 64-Bit Server VM 1.6.0_22) [Windows 7-amd64-java]
path: C:\jruby-1.6.0.RC1\bin

187: ruby 1.8.7 (2010-12-23 patchlevel 330) [i386-mingw32]
path: C:\Users\you\.pik\rubies\Ruby-187-p330\bin

* 192: ruby 1.9.2p136 (2010-12-25) [i386-mingw32]
path: C:\Users\you\.pik\rubies\Ruby-192-p136\bin

So for example, if you wanted 1.9.2 to be your default ruby, just add C:\Users\you\.pik\rubies\Ruby-192-p136\bin to the beginning of your user defined %PATH% variable in System Properties. When you fire another cmd.exe, ruby should be all set. Apparently this is the equivalent of the rvm use 1.9.2 --default on a ‘nix system.
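To double-check which interpreter a fresh cmd.exe actually picks up, a tiny script like this can help. It is only a sketch: the file name which_ruby.rb is made up, and it just prints what the running interpreter reports about itself.

```ruby
# which_ruby.rb (hypothetical file name)
require 'rbconfig'

# version of the interpreter that is running this script
puts "version: #{RUBY_VERSION}"
# directory the interpreter was installed into
puts "bin dir: #{RbConfig::CONFIG['bindir']}"
```

Run it with ruby which_ruby.rb in a new cmd.exe. The bin dir it prints should match the path you put at the front of %PATH%.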

Pik supports proxies (phew). Just do:
set http_proxy=http://yourproxy

You can test with viewing all the remote rubies:
pik list -r

Also, add this environment variable to your user’s variable list just like you did with the %PATH% variable.

Installing a specific version is a bit different than RVM: you specify the version with a space. In RVM, this would be “rvm install ruby-1.9.2”. With pik it is:
pik install ruby 1.9.2-rc1

Update:
If you want to install a new ruby and move your gems do this:
pik install ruby 1.9.2-p180
pik use 1.9.2-p180
(set your %PATH% to default as described above if you want)
pik gemsync p136 (imports from p136 into current which is p180)

Watchr Continuous Testing with Growl

Ruby — Dillon @ 6:45 pm

An improvement to the post I did about the doom faces for growl status and watchr config with ruby testing. This improvement will give you a pass screen that is a bit easier to read because it will have a slight green or red tint to it.

First, it requires a bit of Growl set up. You just need to change the severity colors from all black to look like this. You can keep the normal level as black and hopefully this will keep your normal growl messages from other programs as normal black.

Note that the high severity color in that screenshot is a really dark red. This is important to get that slight red tint from the dead doomguy shot above.

This also requires a new and improved watchr.rb file to go along with it.

def growl(message)
  growlnotify = `which growlnotify`.chomp
  title = "Watchr Test Results"
  passed = message.include?('0 failures, 0 errors')
  image = passed ? "~/.watchr_images/passed.png" : "~/.watchr_images/failed.png"
  severity = passed ? "-1" : "1"
  options = "-w -n Watchr --image '#{File.expand_path(image)}'"
  options << " -m '#{message}' '#{title}' -p #{severity}"
  system %(#{growlnotify} #{options} &)
end

Optional Bit

My previous post was inside a rails project, which is fine and good until you want to write a command-line ruby program or something outside of rails. First, you’ll need some tests and then you’ll need a Rakefile that invokes your tests. I have an example project here. There’s also a zip archive here. It’s a non-rails app with a rake file that you can run for testing:
combo-guesser$ rake
(in ./combo_guesser)
/bin/ruby ./test/combo_guesser_test.rb
Loaded suite ./test/combo_guesser_test
Started
.
Finished in 0.000755 seconds.

1 tests, 10 assertions, 0 failures, 0 errors, 0 skips

All this makes for a really fast testing feedback loop.
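If you are starting a non-rails project from scratch, a Rakefile along these lines should be enough to make a bare rake run the tests. This is a minimal sketch and not the Rakefile from the example project. The lib directory and the test/*_test.rb naming are assumptions.

```ruby
# Rakefile (sketch, directory layout is assumed)
require 'rake/testtask'

# defines a "test" task that runs every file matching the pattern
Rake::TestTask.new(:test) do |t|
  t.libs << 'lib'
  t.pattern = 'test/*_test.rb'
end

# a bare "rake" runs the tests
task :default => :test
```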

Ruby Koans Notes and Solutions

Ruby — Dillon @ 12:04 am

Just finished EdgeCase UK’s rubykoans test/project. It’s a test driven development style learning session that you should really check out even if you’re only partially curious about what ruby is. If you’ve got some ruby experience under your belt, you should go through the exercise too because it’s good practice, a good reference, and you’ll probably learn something new.

I’m not posting this so someone can cheat. I’m posting this for the me out there who wants to see if they did it the same. There were many “THINK ABOUT IT” sections that were open-ended or tricky. The whole thing took me about 4 days while on vacation. Maybe 16-20 hours of solid work. It’s good, just do it!

I used auto_enlighten.rb which I nabbed from davesquared.net. It was really handy. Using watchr, all I had to do was save an .rb file and the test ran again. So I could just try my answer and continually monitor my progress and get feedback on errors. This was very much the #1 bullet point in the Ted talk 7 ways games reward the brain. The whole exercise is one long experience bar with continuous feedback. Excellent! Books should be this way!

I definitely had some learning experiences along the way and I’ve posted my solutions with notes on gotchas to github. And there were many gotchas. Sometimes, I would nail a solution, first shot, very elegantly. Other times, I would churn on a test for many minutes and research for more information. There were some very, very deep and specific issues that I would read more about.

For example, take this excerpt from one of the early tests: test_slicing_arrays. For the ruby koans, they created a special __ method that is equal to “FILL ME IN” (slick move EdgeCase). So you just have to change the __ for the test to pass. In this case, I expected it to be nil. So I filled in nil and it failed.

def test_slicing_arrays
    array = [:peanut, :butter, :and, :jelly]
    # snip for blog post
    # Learned: this is crazy!  empty when beyond range but at .size, nil when beyond .size
    assert_equal __, array[4,0]
    # snip for blog post
end

Notice my Learned: note there in the middle. This was surprising. If you open up irb and follow along, you’ll find that slicing an array of 4 things starting at index 4 with a length of 0 gives you an empty array, while starting one index further out gives you nil. Wow. That’s a bug if you ask me. There’s a bunch of notes marked “Learned:” like this in the github project. The README has a grep example you can use to find all the things I thought were gotchas. But do them yourself first! It’s way more fun to do this interactively.
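Here is the behavior in isolation, as a sketch you can paste into irb. One way to read it (my interpretation, not something the koans spell out): a start index equal to the array’s size still points at a valid spot, the position just past the last element, so a slice that starts there is empty. Anything further out is nil.

```ruby
array = [:peanut, :butter, :and, :jelly]

at_size     = array[4, 0]  # start == array.size
beyond_size = array[5, 0]  # start > array.size
single      = array[4]     # plain index lookup, no length given

p at_size      # => []
p beyond_size  # => nil
p single       # => nil
```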

BTW, Internet, I would love a node.js koans project.

Reconnect a Mac mouse without a mouse

Mac — Dillon @ 7:23 pm

So you’ve replaced the batteries, restarted your mouse or whatever. You have a bluetooth mouse that you need to reconnect or re-pair but you don’t have a mouse to get to the bluetooth icon in the notification area or status bar location.

Here’s what you do:

  1. Command Key + Shift + / – gets you into the help area in any menu
  2. Right arrow – gets you to the Apple menu
  3. Down arrow until System Preferences
  4. Enter – opens System Preferences
  5. Type mouse in the search box and hit Enter – opens the Mouse pane
  6. Hit enter at the “no mouse found” screen – this will pair your mouse if it’s powered on and trying to pair

Problem solved. Leave a comment if it works or a similar problem but different solution worked for you. If you can’t click in the comment box, just scream really loud towards my email address. :)

Biggest object_id value in Ruby

Ruby — Dillon @ 4:38 pm

0 = 1, 1 = 3, 2 = 5, 100 = 201. What is going on? Ruby has object_ids on objects but they are shorter on small integers. Fire up irb and follow along:

> obj = Object.new
 => #<Object:0x100362a30> 
> obj.object_id
 => 2149258520

That object_id looks like the normal ids I’ve seen in the past. But then I did this:

> 0.object_id
 => 1 
> 1.object_id
 => 3 
> 2.object_id
 => 5 
> 100.object_id
 => 201

Ok, why the values? Why is 0’s id 1 and 100’s id 201? It looks like it’s doubling the value and adding one but that seems a little random and probably not what it’s actually doing. So I read a bit by Caleb Tennis on O’Reilly and found a tip off that it’s related to binary:

Thus 0x0101 (5) becomes 0x1011 (11).

Notice that Caleb is saying that “101” shifts left by one bit and gets a 1 added at the lowest bit. So then this should work:

x = 0
puts x.object_id == ( (x << 1) + 1 )
true
=> nil

And it does. We can set a bigger number and run it again and it still works.

x = 65536
puts x.object_id == ( (x << 1) + 1 )
true
=> nil

I will call this predictable behavior later. x.object_id is 131073 on both sides because we know how object_id is generated:

> x = 65536
 => 65536 
> puts x.object_id
131073
> puts ( (x << 1) + 1 )
131073

So what is going on? Well first, let’s print out the binary. Taken from icfun.blogspot.com, we’ll define a number to binary string method in irb.

# convert a non-negative integer to its binary string
def dec2bin(number)
  number = Integer(number)
  # zero never enters the loop below, so handle it up front
  return "0" if number == 0
  ret_bin = ""
  # until number is zero, prepend the lowest bit and halve the number
  while number != 0
    ret_bin = String(number % 2) + ret_bin
    number = number / 2
  end
  ret_bin
end

We can test that it works with any integer. Let’s test with the number 5 like this: puts dec2bin(5). Prints out 101. If we shift left by one bit like this: dec2bin((5 << 1)) then we'll get 1010. Add 1 bit and we get the binary of eleven: 1011. We can see if this is eleven with dec2bin(11) and we get "1011" which is what we expected. It makes sense.
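Ruby can also do this conversion without a helper: Integer#to_s takes a base, and String#to_i goes back the other way. A short sketch of the same shift-and-add steps:

```ruby
shifted = 5 << 1       # shift left by one bit
tagged  = shifted + 1  # set the lowest bit

puts 5.to_s(2)        # 101
puts shifted.to_s(2)  # 1010
puts tagged.to_s(2)   # 1011
puts "1011".to_i(2)   # 11
```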

So wtf. Why know this stuff? Well, when doing object comparisons, I always expected an object_id to be some garbage or randomness with a length similar to a hash, like what you get from (Object.new).object_id. That expectation doesn't hold when getting the object_id of a Fixnum. And even the Fixnum behavior is not consistent. The examples above are using small numbers. When we try this:

x = 4; puts x.object_id == ( (x << 1) + 1 )
true

It works fine. 4 is small. Let's call this "predictable". We can bit shift and add 1 to get the same value as ruby does with object_id.

But a big enough number like 5000000000000000000 is false. So what's the inflection point? Maybe the inflection point is 2^32 because it's memory related:

x = 4294967296; puts x.object_id == ( (x << 1) + 1 )

Nope, that still prints true. Strange.

So I played around manually and started finding that 4 quintillion is true but 5 quintillion is false. Weird. If it's memory related, it's no number I recognize. So we know that there's some inflection point between 4 quintillion and 5 quintillion. I'm not about to figure this thing out by hand. Let's write a program to find our magic number.

Note: the current version (if any) is on github.

# find the inflection point of how ruby calculates object_ids predictably
# for example:
# x = 4; x.object_id == ( (x << 1) + 1 )
# => true
# however,
# x = 5000000000000000000; x.object_id == ( (x << 1) + 1 )
# => false
 
# answer = 4611686018427387903
 
# we'll start with a number that's close
starting_number = 4000000000000000000
 
# this will be the number we'll try with
current_number = starting_number
 
# our decimal position that will be used for the loop
# we don't need to start with 4, which is starting_number[0..0]
index = 1
digit_at_index = current_number.to_s[index..index].to_i
 
# digit state
digit_second = 0
digit_third = 0
 
# vector for current_number
# true = up, false = down
direction = true
last_direction = direction
changed_directions = 0
 
# have we exhausted all digits for the current rank/position
# if so, we move on to the next position
digit_done = false
 
jump = 5
 
 
# is our result the same as shift left plus one?
def predictable?(number)
  number.object_id == ( (number << 1) + 1 )
end
 
 
# go until we've iterated along the length of starting_number
while (index <= starting_number.to_s.length - 1)
  digit_done = false
 
  digit_at_index = current_number.to_s[index..index].to_i
 
  # shift our cheap history variables
  digit_third = digit_second
  digit_second = digit_at_index
 
  # this tests whether we went over our solution
  if predictable?(current_number.to_i)
    # if true, try incrementing but only if we can later
    direction = true
  else
    # if false, number is too high
    direction = false
  end
 
  #puts "index: #{index}"
  if last_direction != direction
    jump -= jump / 2
    changed_directions += 1
  else
    jump -= 1 unless jump == 1
  end
 
  last_direction = direction
 
 
  # split the distance
  # if we start with 0, this becomes 5 if going up
  # if we start with 5, this becomes 3 if going down
  # it's a half to target number
 
  # increase
  if direction
    digit_at_index += jump unless digit_at_index == 9
 
    #puts digit_at_index
    if digit_at_index == 9 && digit_second == 9
      digit_done = true
    else
      digit_done = false
    end
 
  # decrease
  else
    digit_at_index -= jump unless digit_at_index == 0
 
    if digit_at_index == 0 && digit_second == 0
      digit_done = true
    else
      digit_done = false
    end
 
  end
 
  # if we flip-flop back and forth between finding our number is too high
  # or too low then our number is probably in the middle
  # 4 swaps will leave us with the lower number
  # which then the next digit will need to increase
  # TODO: if this flops the wrong way, the algorithm breaks.
  if changed_directions == 4
    digit_done = true
    digit_at_index = digit_second
  end
 
 
  # substitute our done digit in place
  current_number_string = current_number.to_s
  current_number_array = current_number_string.chars.to_a
  current_number_array[index] = digit_at_index
  current_number = current_number_array.join.to_i
 
  # move on to the next digit
  if digit_done
    index += 1
    digit_at_index = current_number.to_s[index..index].to_i
    digit_ceiling = 10
    digit_floor = 0
    jump = 5
 
    changed_directions = 0
    direction = true
 
    digit_second = 0
    digit_third = 0
  end
 
  puts current_number
 
  # bug avoid
  if current_number.to_s.length != starting_number.to_s.length
   exit
  end
 
  # not needed but useful for watching how the algorithm works
  sleep 0.1
 
end

When we run this little number searcher we find our overflow (or inflection) point. It will run for a while, finding each digit like this:
4400000000000000000
4700000000000000000
4500000000000000000
.
.
.
4611686018427387900
4611686018427387904
4611686018427387902
4611686018427387903
4611686018427387904
4611686018427387903
4611686018427387903

Eventually we'll have a final value of 4611686018427387903 which we can test like this:

4611686018427387903.object_id => 9223372036854775807
4611686018427387904.object_id => 2152612560

You can see it overflow, making 9223372036854775807 the largest predictable object_id on this (64-bit) Ruby.
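The magic number turns out to be a power of two after all, just not 2^32. Assuming a 64-bit MRI build, where (as far as I understand it) a Fixnum lives in the machine word with the lowest bit used as a tag and one bit kept for the sign, the arithmetic lines up exactly:

```ruby
# largest Fixnum on a 64-bit build (assumption: 62 bits of magnitude)
largest_fixnum = 2**62 - 1
# shift left and add one, the same rule as predictable? above
tagged = (largest_fixnum << 1) + 1

puts largest_fixnum      # 4611686018427387903
puts tagged              # 9223372036854775807
puts tagged == 2**63 - 1 # true
```

So the inflection point is simply where the integer no longer fits next to its tag bit in a signed 64-bit word.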

This program is not very efficient or well written. There's a better way to find a number but I decided to go with a digit-by-digit algorithm. It was neither easy nor quick to write. I nearly scrapped it and did it the right way but managed to get it working around midnight one night. I hope it makes for a good (or anti-pattern) example.
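For comparison, here is a sketch of what the "right way" could look like: a plain binary search between a number known to be predictable and one known not to be. The largest_true name is made up for this example, and it assumes the answer flips from true to false exactly once inside the range.

```ruby
# Binary search for the largest integer in lo..hi for which the block is true.
# Assumes the block is true at lo, false at hi, and flips only once in between.
def largest_true(lo, hi)
  while hi - lo > 1
    mid = (lo + hi) / 2
    if yield(mid)
      lo = mid   # mid is still predictable, look higher
    else
      hi = mid   # mid is too high, look lower
    end
  end
  lo
end

# same check as in the program above
def predictable?(number)
  number.object_id == ((number << 1) + 1)
end

puts largest_true(4_000_000_000_000_000_000, 5_000_000_000_000_000_000) { |n| predictable?(n) }
```

On a 64-bit MRI this should land on the same 4611686018427387903 in about 60 steps and with no sleep calls. On other builds or interpreters the assumptions may not hold.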

Nmap won’t compile in homebrew solution

Development,Mac,Unix — Dillon @ 1:11 am

I recently moved my Mac Pro dev box to homebrew from macports. Doing the mysql move (the lazy way — moving the data directory) was easy enough but doing a massive `brew install` of a bunch of packages didn’t work when I got to nmap (the network port scanner).

Specifically the linker error (maybe the linker … it looks like a stack trace to a Ruby and Java guy) was this:

Undefined symbols:
  "ScriptResult::get_id() const", referenced from:
      formatScriptOutput(ScriptResult)       in output.o
      printhostscriptresults(Target*)     in output.o
      printportoutput(Target*, PortList*)    in output.o
  "ScriptResult::get_output() const", referenced from:
      formatScriptOutput(ScriptResult)       in output.o
      printhostscriptresults(Target*)     in output.o
      printportoutput(Target*, PortList*)    in output.o
  "open_nse()", referenced from:
      nmap_main(int, char**)in nmap.o
  "close_nse()", referenced from:
      nmap_free_mem()     in nmap.o
  "script_scan(std::vector >&)", referenced from:
      nmap_main(int, char**)in nmap.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
make[1]: *** [nmap] Error 1
make: *** [all] Error 2
Exit status: 2

http://github.com/mxcl/homebrew/blob/master/Library/Formula/nmap.rb#L1

Error: Failure while executing: make
Please report this bug at http://github.com/mxcl/homebrew/issues
These existing issues may help you:

http://github.com/mxcl/homebrew/issues/#issue/3128

I love the little URL for more help at the bottom. Unfortunately a redirect killed the help. The correct URL is here. Maybe an anchor tag in the wrong place on the URL, who knows. Anyway, I posted a note on the issue site as well. I got around the issue by doing this first:
brew install lua

Then just install nmap like before:
brew install nmap

Worked for me.

Macports upgrade breaks ImageMagick

Mac,Ruby — Dillon @ 1:55 pm

A blind upgrade of all macports packages broke my ruby project because it uses imagemagick.


dlopen(/opt/local/lib/ruby/gems/1.8/gems/rmagick-2.13.1/lib/RMagick2.bundle, 9): Library not loaded: /opt/local/lib/libMagickCore.3.dylib
Referenced from: /opt/local/lib/ruby/gems/1.8/gems/rmagick-2.13.1/lib/RMagick2.bundle
Reason: image not found - /opt/local/lib/ruby/gems/1.8/gems/rmagick-2.13.1/lib/RMagick2.bundle

Massive hack incoming:

cd /opt/local/lib
sudo ln -s libMagickCore.4.dylib libMagickCore.3.dylib

There should be a better solution but I can’t find one until macports gets its goddamn *bEEp* straight. I swear to deity, every time I do a sudo port upgrade outdated, something breaks. I know it’s not macports’ fault but rather the nature of native ruby gem extensions, but I’m just going to avoid upgrading.

Something else to consider is my recent switch to homebrew. I haven’t made the switch on all my boxes and the ImageMagick problem isn’t related to the package manager. I do prefer brew’s command structure. Moving MySQL wasn’t insanely difficult but I did cheat by just doing a stupid copy of the MySQL data files.

Ubuntu One True Way

Ruby,Unix — Dillon @ 11:38 am

For a Linux development box, I typically install the same packages over and over again. For any new OS install, I call this the “One True Way” (OTW taken from a now defunct source online). Almost every box needs development libraries at some point. There’s a lot of extra stuff thrown in here too. These packages are circa ubuntu 10.04. Keep this in mind.

# aptitude install ruby ruby1.8-dev mysql-server mysql-client libmysql-ruby libmysqlclient-dev nmap htop nginx git-core postgresql nethack-console irssi tightvncserver xtightvncviewer distcc build-essential linux-kernel-headers linux-source couchdb solr-common solr-tomcat tomcat6 rar unrar p7zip k3b libk3b6-extracodecs gparted ubuntu-restricted-extras libdvdcss libxml2-dev libxslt-dev curl libcurl4-openssl-dev lua50 liblua50-dev libsqlite3-ruby libmagick9-dev

Download rubygems and install.

Add github to our gems list. # gem sources -a http://gems.github.com

Install some gems I like. # gem install mysql RedCloth capistrano rmagick termios warbler sinatra feedzirra

Install Chrome. Download .deb 64 package (google-chrome-beta_current_amd64.deb or newer) and double-click on the deb file. Click install.

Grab proggy fonts: clean, square, tiny, small. Set terminal font to proggy font. Set scrollback to 9000 lines, hidden scrollbar (shift+page up works instead), turn off menu, turn off terminal bell.

Install dropbox.

Go to town.

Homebrew behind a proxy

Blog — Dillon @ 10:12 am

Git doesn’t work behind a proxy with homebrew (the new hotness replacing macports) because git:// is blocked at my office. There’s a patch here. Unfortunately, the drop-in replacement didn’t work for me (it’s an old commit).

Instead I made the modifications myself. Be warned that this file will probably only work for the version I’m using (0.7.1). For any other version you’ll have to look at the commit yourself. :(


cd /usr/local/Library/Homebrew
cp download_strategy.rb download_strategy.rb.orig
wget http://squarism.com/files/download_strategy_proxy_fix.rb -O download_strategy.rb
export HOMEBREW_GIT_VIA_HTTP=1
brew install [something]

Any brew installs that use git should work now.

But then there’s curl. Curl doesn’t quite use the same env that others do. So solve it like this:

export http_proxy=http://proxy:80
export ALL_PROXY=$http_proxy

Curl likes that ALL_PROXY env for some reason.

After that, I was able to get my favorite homebrew apps installing:
brew install couchdb irssi git mysql watch lua p7zip htop openssl node npm nmap netcat

Update: If the above still doesn’t work for you (I had problems with Git URLs), try this:
cd /usr/local/Library/Homebrew
vi download_strategy.rb

Find the line when %r[^git://] then GitDownloadStrategy and replace it (or comment it out) with this:
when %r[^git://]
url.gsub!(/^git\:\/\//, 'http://')
GitDownloadStrategy

Git should use the http:// method of downloading code and brew install should work. Just to be clear, the relevant part of download_strategy.rb looks like this:

  when %r[^bzr://] then BazaarDownloadStrategy
  #when %r[^git://] then GitDownloadStrategy
  when %r[^git://]
        url.gsub!(/^git\:\/\//, 'http://')
        GitDownloadStrategy
  when %r[^hg://] then MercurialDownloadStrategy
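
To see what that gsub does to a URL on its own, here is a small sketch you can run in irb. The repository URLs are made up for illustration.

```ruby
# hypothetical formula URLs, only for illustration
git_url   = "git://github.com/example/project.git"
https_url = "https://github.com/example/project.git"

# same pattern as in download_strategy.rb, using the non-destructive gsub
rewritten = git_url.gsub(/^git\:\/\//, 'http://')
untouched = https_url.gsub(/^git\:\/\//, 'http://')

puts rewritten  # http://github.com/example/project.git
puts untouched  # https://github.com/example/project.git
```

Only URLs that start with git:// are rewritten. Anything else passes through unchanged.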

Another thing to try is to edit ~/.curlrc to enable a SOCKS5 proxy if you have one.
socks5 = "yourserver:port"

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
(c) 2012 SQUARISM | powered by WordPress with Barecity