JSON and the Args (or Not)

Posted by code_monkey_steve, Thu Dec 31 17:51:00 UTC 2009

(Admittedly not my best work, but it’s hard coming up with JSON puns)

Playing with JSON Document Stores has led me down the path to a few other exciting JSON toys. The first is …

JSON Schema

Just like XML Schema, JSON Schema allows you to specify (in JSON form) the semantics of a particular JSON data structure. While in theory the schema is useful for validation, in practice validation just sucks up too much CPU time to be worth the trouble. Where things get really interesting is the possibility of automatically generating user interfaces that compose JSON-based messages behind the scenes. There’s just one thing missing …

Service Mapping Description

Ta-da! SMD allows you to describe all of the methods of a web service using JSON Schema to describe each method’s parameters. It supports a variety of transports and envelopes from simple GETs or POSTs up through RESTful resources and JSON-RPC.
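To make that a bit more concrete, here is a rough sketch of what a service description might look like, built as a Ruby hash and dumped to JSON. The thermostat service, its method, and its parameters are all made up for illustration, and the field names (transport, envelope, target, services, parameters, returns) are my reading of the SMD proposal, so treat them as assumptions rather than gospel.

```ruby
require 'json'

# Hypothetical service description. The service and method names are invented,
# and the field names are assumed from the SMD proposal.
smd = {
  'transport' => 'POST',
  'envelope'  => 'JSON-RPC-2.0',
  'target'    => '/thermostat',
  'services'  => {
    'set_temperature' => {
      # Each parameter is described with a small JSON Schema fragment.
      'parameters' => [
        { 'name' => 'degrees', 'type' => 'number' },
        { 'name' => 'unit',    'type' => 'string', 'optional' => true }
      ],
      'returns' => { 'type' => 'boolean' }
    }
  }
}

# Render each advertised method as a human-readable signature, the sort of
# thing a generated user interface could start from.
def signatures(smd)
  smd['services'].map do |name, service|
    params = service['parameters'].map { |p| "#{p['name']}: #{p['type']}" }
    "#{name}(#{params.join(', ')})"
  end
end

puts JSON.pretty_generate(smd)
puts signatures(smd)
```

The point is only that the same document serves both audiences: a client library can read it to build requests, and a UI generator can read it to build a form.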

The beauty of all of this is that it allows services (web and otherwise) to advertise their functionality in user- and computer-readable ways. It has the potential to be a generic communication mechanism between disparate bits of software (potentially controlling hardware), allowing us to virtually wire up appliance X to service Y through user interface Z, without any of them having prior knowledge of each other. Pretty sweet.

You Are 'Here':{} ...

One last bit of JSON goodness: GeoJSON. As you might expect, it’s a standard for representing geographic information in JSON. It’s already the de facto standard, and supported by just about everybody.
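For the curious, a minimal sketch of a GeoJSON point looks something like this. The helper name and the coordinates (roughly Honolulu) are just for illustration; the one thing to remember is that GeoJSON positions are written longitude first, then latitude.

```ruby
require 'json'

# Build a GeoJSON Feature for a single point.
# GeoJSON positions are [longitude, latitude], in that order.
def point_feature(longitude, latitude, properties = {})
  {
    'type'       => 'Feature',
    'geometry'   => { 'type' => 'Point', 'coordinates' => [longitude, latitude] },
    'properties' => properties
  }
end

# Approximate coordinates, for illustration only.
here = point_feature(-157.8583, 21.3069, 'name' => 'Honolulu (approximate)')
puts JSON.generate(here)
```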

0 comments | Filed Under: | Tags: json

Long Time, No Blog

Posted by code_monkey_steve, Mon Nov 23 12:13:00 UTC 2009

It seems to work out that when I’m most busy coding, I have the least time to blog (and vice-versa). That’s my official excuse for it being two months since my last update: work has been busy.

First, at the beginning of October, me and the Kashless Krew went to AlohaOnRails, my first Ruby/Rails conference, in sunny (read: “sweltering”) O’ahu, Hawai’i.

The conference was a lot of fun: got to hear about some neat Rails tech and meet some of the rock stars in the Ruby world.

I’ve also been continuing my exploration of NoSQL databases. While I still love the design of CouchDB, I’ve been playing with MongoDB and MongoMapper. Mongo is a bit more mature, even if it does still have a whiff of the SQL smell about it. I’ll have more Useful Information about that later.

The other thing I’ve been learning is the RSpec testing framework. When I first looked into it a few years back, I found the RSpec syntax unwieldy compared with Test::Unit. But after having to do various hacky things to make Shoulda run decently (e.g. fast_context) and even contemplating writing my own framework (the now defunct Mustard), I’ve come to appreciate the RSpec Way and can’t see myself ever going back.

0 comments | Filed Under: | Tags:

Safe Mailing with mail_safe

Posted by code_monkey_steve, Mon Sep 21 10:50:00 UTC 2009

Working on a production website can be a bit nerve-wracking, especially when it comes to testing features that send email as a side effect: one little bug could wind up spamming all of your precious users. Of course, Rails has the basic safety feature of simply disabling mail delivery in certain environments (i.e. test and development), but that’s no good because sometimes you do want to test that mail is actually delivered, just without having to worry that it’s delivered to live users.

Enter mail_safe, a handy little gem written by my co-worker Myron. Instead of disabling mail delivery environment-wide, mail_safe allows you to define one or more domains for which mail should be delivered, and a catch-all address for those that shouldn’t. This allows for testing with a real account (if it’s in the appropriate domain), while still keeping you secure against unintentional spam.
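The idea is simple enough to sketch in a few lines of plain Ruby. To be clear, this is not mail_safe’s actual API or implementation, just an illustration of the rerouting rule; the constant names, the helper methods, and the example.com addresses are all hypothetical.

```ruby
# Hypothetical settings, for illustration only (not mail_safe's real config).
SAFE_DOMAINS = ['example.com'].freeze
CATCH_ALL    = 'dev-inbox@example.com'

# Deliver to the real address only if its domain is on the safe list;
# otherwise reroute to the catch-all inbox.
def safe_recipient(address)
  domain = address.split('@').last.to_s.downcase
  SAFE_DOMAINS.include?(domain) ? address : CATCH_ALL
end

# Apply the rule to a whole recipient list, collapsing duplicate catch-alls.
def safe_recipients(addresses)
  addresses.map { |address| safe_recipient(address) }.uniq
end

p safe_recipients(['me@example.com', 'customer@gmail.com', 'other@yahoo.com'])
```

Internal addresses pass through untouched, and everyone else collapses into the one inbox you control.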

Once you’ve started using it, you’ll wonder how you ever slept soundly without it. It’s the sort of thing that should probably be included in the Rails core (IMNSHO).

0 comments | Filed Under: | Tags:

String Conversions in Ruby

Posted by code_monkey_steve, Thu Sep 17 12:53:00 UTC 2009

Here’s a tip so simple and elegant it’s amazing I ever got along without it. Consider String#to_f:
  --> '12.34'.to_f
  ==> 12.34
Which works great, so long as the string is a valid number, but if not:
  --> 'splat'.to_f
  ==> 0.0

#to_f doesn’t throw an exception with invalid input, making it dangerous to use with any user-supplied data. I’ve seen various solutions that involve regular expressions, but these are kludgy and don’t handle all proper numeric representations.

Fortunately, Ruby provides type-cast-style methods to do proper conversions with validation:
  --> Float('12.34')
  ==> 12.34
  --> Float('splat')
  ArgumentError: invalid value for Float(): "splat" 

It might look weird, until you realize that Float is both a class (::Float) and a kernel method (Kernel#Float). There are also methods for converting to an Integer, Array, or even back to a String (which is essentially just #to_s).
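For user-supplied data, a small wrapper along these lines is handy. The method name parse_float is my own invention; the only real API here is Kernel#Float and the exceptions it raises.

```ruby
# Return the Float value of str, or nil if str is not a valid number.
# Float() raises ArgumentError for garbage strings and TypeError for nil.
def parse_float(str)
  Float(str)
rescue ArgumentError, TypeError
  nil
end

p parse_float('12.34')   # a valid number converts normally
p parse_float('splat')   # invalid input gives nil instead of a silent 0.0
p parse_float(nil)       # nil input also gives nil

# The same contrast applies to integers: #to_i quietly stops at the first
# non-digit, while Integer() refuses the whole string.
p '42abc'.to_i
begin
  Integer('42abc')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```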

0 comments | Filed Under: | Tags: ruby

I Hate Shoulda (But I Blame Test::Unit)

Posted by code_monkey_steve, Sun Jul 26 17:31:00 UTC 2009

I hate Shoulda.

But I blame Test::Unit.

Because Test::Unit can’t scale.

There, I’ve said it. It wasn’t easy: I drank the TDD Kool-Aid a long time ago and never looked back. But the fact of the matter is that Test::Unit is rotten to the very core, and it makes the seductive Shoulda features nothing but bitter lies. Let me demonstrate:

Hopeful Optimism

Take a sufficiently-contrived test, where you create some object and verify some of its properties:
require 'test/unit'
require 'shoulda'

class Numeric
  def even? ; (self % 2).zero?    ; end
  def odd?  ; (self % 2).nonzero? ; end
end

class OddNumberTest < Test::Unit::TestCase
  context 'an odd number'  do
    setup do
      # create object
      @n = 97 ; sleep 1
    end

    should('be true')     {  assert  @n        }
    should('be odd')      {  assert  @n.odd?   }
    should('not be even') {  assert !@n.even?  }
  end
end
Now, someone naive in the ways of Test::Unit might expect this test to take approximately one second to execute, right? After all, it’s only the object creation that takes any time (in these examples, the sleep 1 represents some non-trivial database or network operation). So we run it, and …
$ ruby -rubygems ./why_shoulda_sucks.rb
Loaded suite why_shoulda_sucks
Started
...
Finished in 3.006617 seconds.
3 tests, 3 assertions, 0 failures, 0 errors

“Three seconds? Why did it take so long?!” our poor naive tester cries. Because, expecting Shoulda to act like a Domain-Specific Language (as all Right-Thinking Rubyists would), he doesn’t realize that under the covers it’s just creating three different Test::Unit tests. So what’s so bad about Test::Unit?

A Sense of Dread

What’s so bad about Test::Unit is that it makes the following assumptions:
  • Each test may be run in any order, not necessarily the order defined (in fact, it’s in sorted order by test name)
  • Each test may modify the state at any time, not just in the setup function

Therefore, the setup (and teardown) functions must be called for every test, whether they really need it or not.

So instead of:
  setup
  should be true
  should be odd
  should not be even
  (teardown)
We get:
  setup
  should be true
  (teardown)
  (setup)
  should be odd
  (teardown)
  (setup)
  should not be even
  (teardown)

That’s a lot of extraneous setting-up and tearing-down, and since those are the parts that actually do stuff (as opposed to the assertions themselves), that’s the slowest part of the test.
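You can watch the lifecycle happen with a toy runner. This is emphatically not Test::Unit’s code, just a minimal imitation of the behaviour described above (sorted order, fresh instance, setup and teardown wrapped around every single test); the ToyCase class and run_all method are made up for the demonstration.

```ruby
# A toy test case that records every lifecycle step in a shared log.
class ToyCase
  attr_reader :log

  def initialize(log)
    @log = log
  end

  def setup
    log << :setup
    @n = 97
  end

  def teardown
    log << :teardown
  end

  def test_is_odd
    log << :test
    raise 'not odd' unless @n.odd?
  end

  def test_is_not_even
    log << :test
    raise 'even' if @n.even?
  end

  def test_is_truthy
    log << :test
    raise 'falsy' unless @n
  end
end

# Run every test_* method in sorted name order, each on a fresh instance
# wrapped in its own setup and teardown.
def run_all(klass)
  log = []
  klass.instance_methods(false).grep(/\Atest_/).sort.each do |name|
    test_case = klass.new(log)
    test_case.setup
    begin
      test_case.send(name)
    ensure
      test_case.teardown
    end
  end
  log
end

log = run_all(ToyCase)
puts log.join(' ')
puts "setup ran #{log.count(:setup)} times for #{log.count(:test)} tests"
```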

Crushing Disappointment

The greatest feature in Shoulda (as opposed to, say, RSpec) is the ability to use nested contexts. This lets us do sub-tests that inherit their parent context’s state, but roll back their own changes. So let’s add one:
class OddNumberTest < Test::Unit::TestCase
  context 'an odd number'  do
    setup do
      # create object
      @n = 97 ; sleep 1
    end
    should('be true')     {  assert  @n        }
    should('be odd')      {  assert  @n.odd?   }
    should('not be even') {  assert !@n.even?  }

    context 'add one' do
      setup do
        # modify object
        @n += 1 ; sleep 1
      end
      should('be true')    {  assert  @n        }
      should('be even')    {  assert  @n.even?  }
      should('not be odd') {  assert !@n.odd?   }
    end

    should('still be odd')  {  assert @n.odd?  }
  end
end
Again, at first glance you might expect this test to take two seconds, but actually:
$ ruby -rubygems ./why_shoulda_sucks.rb
Loaded suite why_shoulda_sucks
Started
.......
Finished in 10.014666 seconds.
7 tests, 7 assertions, 0 failures, 0 errors

Ten seconds: five times what it really should be, if Test::Unit were just smart enough to execute the tests in the order given and perform the setup and teardown appropriately.

The Test::Unit Fail Whale

And that’s just with one level of nesting. If we try an even slightly complicated test, with several contexts nested only a few deep:
class OddNumberTest < Test::Unit::TestCase
  context 'an odd number'  do
    setup do
      # create object
      @n = 97 ; sleep 1
    end
    should('be true')     {  assert  @n        }
    should('be odd')      {  assert  @n.odd?   }
    should('not be even') {  assert !@n.even?  }

    context 'add one' do
      setup do
        # modify object
        @n += 1 ; sleep 1
      end
      should('be true')    {  assert  @n        }
      should('be even')    {  assert  @n.even?   }
      should('not be odd') {  assert !@n.odd?  }

      context 'subtract one' do
        setup do
          # modify object
          @n -= 1 ; sleep 1
        end
        should('be true')     {  assert  @n        }
        should('be odd')      {  assert  @n.odd?   }
        should('not be even') {  assert !@n.even?  }
      end
    end

    should('still be odd')  {  assert @n.odd?  }

    context 'multiply by two' do
      setup do
        # modify object
        @n *= 2 ; sleep 1
      end
      should 'be true'    do  assert  @n        end
      should 'be even'    do  assert  @n.even?  end
      should 'not be odd' do  assert !@n.odd?   end
    end

    should('even still be odd')  {  assert @n.odd?  }
  end
end
Can you say “exponential growth”?
$ ruby -rubygems ./why_shoulda_sucks.rb
Loaded suite why_shoulda_sucks
Started
..............
Finished in 26.048316 seconds.
14 tests, 14 assertions, 0 failures, 0 errors

26 seconds: more than six times longer than it should take.
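If you want to check my arithmetic, here is a back-of-the-envelope model. It assumes every setup block costs exactly one second (as in the examples) and that assertions are free; the Context struct and both method names are invented for the calculation.

```ruby
# A context has a name, a number of direct tests, and nested child contexts.
Context = Struct.new(:name, :tests, :children)

# Cost when every test re-runs all enclosing setups: a test nested
# `depth` contexts deep pays for `depth` one-second setups.
def per_test_cost(context, depth = 1)
  own = context.tests * depth
  own + context.children.sum { |child| per_test_cost(child, depth + 1) }
end

# Cost if each context's setup ran exactly once.
def once_per_context_cost(context)
  1 + context.children.sum { |child| once_per_context_cost(child) }
end

# The last example above: five direct tests in the outer context,
# three tests in each of the nested contexts.
tree = Context.new('an odd number', 5, [
  Context.new('add one', 3, [Context.new('subtract one', 3, [])]),
  Context.new('multiply by two', 3, [])
])

puts "setup per test:     #{per_test_cost(tree)} seconds"
puts "setup once per ctx: #{once_per_context_cost(tree)} seconds"
```

The first number matches the 26 seconds measured above, and the second is the figure I keep wishing for.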

You Think That’s Bad?

Now that I’m working at a Real Ruby Shop, I’ve gotten to experience the joy of having thousands of tests to make sure I haven’t done something stupid. But I also get to experience the pain of running all these tests under this profoundly inefficient framework:
$ rake test:units test:functionals test:integration
Finished in 1042.379506 seconds.
2141 tests, 4576 assertions, 0 failures, 0 errors

Finished in 578.284529 seconds.
613 tests, 896 assertions, 0 failures, 0 errors

Finished in 34.538012 seconds.
22 tests, 65 assertions, 0 failures, 0 errors

Almost half an hour on a decent system, over 50 minutes on our Continuous Integration server. That’s an awful lot of waiting.

Screw You Guys, I’m Going Home

Fork You then, I’ll make my own testing framework that keeps track of dependencies and instantiates them in the most efficient way (and blackjack, and hookers!). And while I’m at it:
  • “Should” is not the correct word. “Must” is the correct word (plus, less typing).
  • If a test fails, it should not run any other tests in that context. They’ll almost certainly also fail and unhelpfully spam you with error messages.
  • Autotest should be baked right in, so that if a subset of the tests fails, I can re-run just the failing tests, which will in turn instantiate only the necessary prerequisites, and in the most efficient order.

So far I have a proof-of-concept project on GitHub called Mustard. I’m going to start migrating my other projects to it from Shoulda and will write more on the subject later. Watch this space …

0 comments | Filed Under: | Tags:

All Your (Data)Base ...

Posted by code_monkey_steve, Mon Jun 15 22:26:00 UTC 2009

After failing to make CouchDB do anything useful, and being completely unwilling to go back to 1974, I decided to go back and revisit my assumptions. Both of my current home projects are essentially attempts to treat real-world interactions as Routing Problems, but after doing some research, I decided that was one wheel I didn’t even want to attempt to reinvent (graph theory is not my specialty).

Somewhere along the way, I discovered what I really needed was a Graph Database. That led me to apparently the only significant implementation: Neo4j, an embedded Java graph database. But I’d rather juggle flaming porcupines than touch Java again … and thanks to JRuby and the neo4j gem, I don’t have to! Yay!
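To show what I mean by “graph-like”, here is the sort of routing question involved, sketched in plain Ruby with a breadth-first search. This has nothing to do with Neo4j’s or the neo4j gem’s API; the place names and the shortest_route method are hypothetical, and a real graph database does this kind of traversal for you over data far too big for a hash.

```ruby
# Hypothetical map as an adjacency list: each place lists its direct neighbors.
ROUTES = {
  'home'    => ['market', 'school'],
  'market'  => ['home', 'office'],
  'school'  => ['home', 'office'],
  'office'  => ['market', 'school', 'airport'],
  'airport' => ['office']
}.freeze

# Breadth-first search: returns the route with the fewest hops,
# or nil if the destination cannot be reached.
def shortest_route(graph, from, to)
  previous = { from => nil }   # remembers how each node was first reached
  queue = [from]

  until queue.empty?
    node = queue.shift
    break if node == to

    graph.fetch(node, []).each do |neighbor|
      next if previous.key?(neighbor)   # already visited
      previous[neighbor] = node
      queue << neighbor
    end
  end

  return nil unless previous.key?(to)

  # Walk backwards from the destination to rebuild the route.
  path = [to]
  path.unshift(previous[path.first]) while previous[path.first]
  path
end

p shortest_route(ROUTES, 'home', 'airport')
```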

So if your problem domain is graph-like, you should definitely check out neo4j; it’s looking like a seriously sweet storage solution.

0 comments | Filed Under: | Tags: db neo4j

Couch Trouble

Posted by code_monkey_steve, Tue May 26 11:12:00 UTC 2009

I’ve been wrangling with CouchDB for a few weeks now, and it’s starting to feel a bit like this:

(Ooh, my first embedded video. Feel the Web 2.0 Awesome-ness-age-ality).

First, let me say that I can’t really blame CouchDB for any of my troubles, which are essentially:
  • There is an excess of Ruby/Rails gems for accessing CouchDB, all of which have different dependencies and do things in slightly different ways. I’m sure that eventually a consensus will emerge on the best Ruby/CouchDB way of doing things, but it hasn’t happened yet.
  • CouchDB is not yet 1.0, so the design can support lots of spiffy features that don’t actually exist yet. Specifically, the lack of partial replication stalled my attempts at using Couch for a distributed media server project.
  • CouchDB doesn’t work perfectly for absolutely everything (whoda thunkit?). My other big project (more on that later) isn’t really Document-Oriented, no matter how much I try to beat it flat. I’m now thinking Git is actually the best storage solution, and if you understand Git’s internals well enough, you’ll see how mind-warping that concept is.

So I think I’ll put CouchDB down for a while, at least until 1.0, or until I run across a project where it’s appropriate. Of course, it’s still a gazillion times better than any RDBMS.

0 comments | Filed Under: | Tags: couchdb

CouchDB Testing Tip

Posted by code_monkey_steve, Wed May 13 13:52:00 UTC 2009

Finished my first stab at converting my current toy project from AR to CouchDB, and so far so good. I ran into an issue where associations aren’t getting saved, but I’m most likely just doing something stupid.

One minor annoyance is that, unlike ActiveRecord, the test database doesn’t get purged and after a while can get cluttered with randomly-generated fixtures [1]. No problem, just drop in this little Rake task to recreate the DB on each run:

# lib/tasks/couchdb.rake
require File.expand_path( RAILS_ROOT + '/config/environment' )
require 'couch_potato'

task 'couchdb:test:purge'  do
  CouchPotato::Config.database_name = 
    YAML::load(File.read(Rails.root.to_s + '/config/couchdb.yml'))['test']
  CouchPotato.couchrest_database.recreate!
end
task 'db:test:purge' => 'couchdb:test:purge'

[1] BTW, have I mentioned how cool factory_girl is? Another new tool for my bag of tricks.

2 comments | Filed Under: | Tags: couchdb

CouchDB: Frankie Says Relax

Posted by code_monkey_steve, Tue May 12 11:49:00 UTC 2009

I’m falling in love (or at least lust) with CouchDB, especially after seeing this presentation at AAC and this presentation for the BBC. My summary: JSON document storage, sliced-and-diced with JavaScript MapReduce, all served on a RESTful platter.

As a long-time XML fanboy, the lack of schema in JSON makes me a bit twitchy, and using JavaScript as a query language just looks a little wrong. But I see the advantages to the document-centric model (versioning, replication, access control) and MapReduce is definitely the Wave of the Future Present. It looks like you can encapsulate all of your model logic in views, so I’m not sure if an explicit schema is really even necessary. The more I learn about That Way of doing things, the more it grows on me.
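If MapReduce is new to you, the shape of it fits in a few lines. Real CouchDB views are written in JavaScript and run inside the database; this is only a Ruby analogue of the same idea, with made-up documents and made-up function names, counting how many documents carry each tag.

```ruby
require 'json'

# A few hypothetical documents, as they might come back from a document store.
docs = [
  '{"_id": "1", "type": "post", "tags": ["couchdb", "rails"]}',
  '{"_id": "2", "type": "post", "tags": ["couchdb"]}',
  '{"_id": "3", "type": "post", "tags": ["ruby"]}'
].map { |json| JSON.parse(json) }

# Map step: look at one document at a time and emit [key, value] pairs.
def map_tags(doc)
  (doc['tags'] || []).map { |tag| [tag, 1] }
end

# Reduce step: combine all the values emitted for each key.
def reduce_counts(pairs)
  pairs.group_by(&:first).transform_values { |group| group.sum { |_, value| value } }
end

pairs  = docs.flat_map { |doc| map_tags(doc) }
counts = reduce_counts(pairs)
p counts
```

Note that the map function never needs a schema: it just ignores documents that don’t have the field it cares about.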

So how do we make CouchDB play nicely with Rails? I first tried activecouch, but found its lack of Ruby-type casts and one-database-per-model scheme irritating. couch_potato definitely looks slicker, but there seem to be quite a few other CouchDB interfaces out there that might be just as good or better. I see this as a good sign that many others also see CouchDB’s potential, and are experimenting with ways to deal with it in a Ruby Way.

0 comments | Filed Under: | Tags: couchdb rails