Implementing define_method

September 17th, 2007

A walkthrough of how 'define_method' is implemented in Rubinius

This week I participated in the first Rubinius sprint in Denver, CO. A good time was had by all, and quite a lot of useful code was cranked out.

Brian has covered the basics rather well at the above link, so I won't bother to repeat them here, other than to thank Sun for sponsoring the travel expenses. Sun is showing a lot of class in the Ruby community by paying attention to more than their own JRuby project.

Prior to the sprint, one Ruby feature that was horribly broken in Rubinius was Module#define_method. In its most commonly encountered form, this feature takes a block and 'promotes' it into an actual method. While it has some unfortunate limitations in Ruby 1.8, this is still a very mainstream feature, and it needs to work.

I won't torture you by showing you the code as it existed prior to the sprint, but basically it:
  1. Made a copy of the method object that called define_method
  2. Surgically removed the compiled code from said method
  3. Injected the bytecodes representing the block into the method
  4. Placed the newly-built method into the MethodTable of the appropriate class

While it is a testament to the incredible dynamism of Rubinius that this approach was even possible, it turns out that define_method has some unique requirements that weren't obvious to me at first.

Let's say we have a class like this:
class SomeClass
  def to_s
    "someclass"
  end
end

Now we want to define a new method on it called 'some_method'. Generally, define_method is used when you need to create a method that 'encloses' variables that are available to the caller, just like a block or a Proc.

class SomeClass
  x = 5
  define_method(:some_method) { x }
end

In this case, we've got 'x', a local variable, that we want to be able to access when we call the newly-defined method.

Simply doing "def some_method" would prevent us from accessing this variable, because "def" starts a fresh scope that cannot see the locals around it.

SomeClass.new.some_method # => 5

As expected, this returns '5'.
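
For contrast, here is a minimal sketch of the same idea with both styles side by side. The class name ScopeDemo and its method names are made up for illustration; the point is only that the "def" body cannot see 'x' while the block can.

```ruby
class ScopeDemo
  x = 5

  # 'def' starts a fresh scope, so the local 'x' above is invisible here.
  def with_def
    x
  end

  # The block is a closure, so it still sees 'x'.
  define_method(:with_define_method) { x }
end

demo = ScopeDemo.new
demo.with_define_method # => 5

begin
  demo.with_def
rescue NameError => e
  # Inside 'with_def', 'x' is treated as a call to a method that does not exist.
  puts "with_def failed: #{e.class}"
end
```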

So far this is looking pretty straightforward. We can access the calling scope at runtime, when the defined method is invoked.
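
The captured scope is live, not a snapshot. As a small sketch (the Counter class is hypothetical and not part of the original example), two methods defined from blocks in the same class body share one local variable, and every instance sees the same value:

```ruby
class Counter
  count = 0 # local to the class body, captured by both blocks below

  define_method(:increment) { count += 1 }
  define_method(:current)   { count }
end

a = Counter.new
b = Counter.new
a.increment
b.increment
a.current # => 2, both instances updated the same captured variable
```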

How about this, though?

class SomeClass
  define_method(:some_method) { self }
end

If, based on the earlier code, you expected 'self' to be the caller of define_method, you would be wrong.

puts SomeClass.new.some_method # prints "someclass"

If you were fooled, don't feel bad. Evan and I guessed wrong too.
If you weren't, you are smart and should come contribute to Rubinius.

As you can see, 'self' is what it would be if you had defined the method normally. Calling the new method seems to behave more like 'instance_eval' than 'call'. Even more importantly, self is known only when the method is called, not when it is defined. Each invocation might give a different result, just like a normal method.
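
A short sketch of that difference in standard Ruby (the Widget class is made up for illustration): the same kind of block keeps its original 'self' when invoked with 'call', but picks up the receiver when it runs through define_method or instance_eval.

```ruby
class Widget
  define_method(:me) { self }
end

w1 = Widget.new
w2 = Widget.new
w1.me.equal?(w1) # => true
w2.me.equal?(w2) # => true, 'self' is decided per call

blk = proc { self }
blk.call.equal?(self)             # => true, 'call' keeps the self the block was created with
w1.instance_eval(&blk).equal?(w1) # => true, instance_eval swaps in the receiver
```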

To implement this, I added a new Rubinius 'primitive' that implements what we are calling a 'Delegated Method'. A delegated method is a placeholder in the method table that, when called, executes the necessary code in the correct context.

t1 = NTH_FIELD(mo, 4); // Which method to call
t2 = NTH_FIELD(mo, 5); // What are we calling this method on
t3 = NTH_FIELD(mo, 6); // Do we need 'self' to be available?
if(Qtrue == t3) {
  num_args++; // Method expects 'self' as an argument
} else {
  stack_pop(); // Discard self
}
cpu_send_method2(state, c, t2, t1, num_args, Qnil); // Invoke it

This code is fairly typical of the parts of Rubinius that are implemented in C.
In other words, it is pretty easy to understand at first glance, and very short.
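
The dispatch logic in the C snippet above can be modeled in plain Ruby. This is only a rough sketch under stated assumptions: ToyDelegatedMethod and ToyBlock are hypothetical names, not Rubinius classes, and I am assuming that 'self' is passed as the first argument when the flag is set.

```ruby
# Hypothetical stand-in for the placeholder stored in the method table.
ToyDelegatedMethod = Struct.new(:message, :target, :pass_self) do
  # receiver: the object the method was called on
  def invoke(receiver, *args)
    args = [receiver, *args] if pass_self # mirrors num_args++
    target.__send__(message, *args)       # mirrors cpu_send_method2
  end
end

# Hypothetical wrapper that runs a block with a chosen 'self'.
class ToyBlock
  def initialize(&body)
    @body = body
  end

  def call_on_instance(obj, *args)
    obj.instance_exec(*args, &@body)
  end
end

block = ToyBlock.new { |n| "#{self}:#{n}" }
with_self = ToyDelegatedMethod.new(:call_on_instance, block, true)
with_self.invoke("receiver", 3) # => "receiver:3"

without_self = ToyDelegatedMethod.new(:call, proc { |n| n * 2 }, false)
without_self.invoke("ignored", 21) # => 42
```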

This approach suddenly makes implementing define_method straightforward:

def define_method(name, meth = nil, &prc)
  meth ||= prc

  if meth.kind_of?(Proc)
    block_env = meth.block
    cm = DelegatedMethod.build(:call_on_instance, block_env, true)
  elsif meth.kind_of?(Method)
    cm = DelegatedMethod.build(:call, meth, false)
  elsif meth.kind_of?(UnboundMethod)
    cm = DelegatedMethod.build(:call_on_instance, meth, true)
  else
    raise TypeError, "wrong argument type #{meth.class} (expected Proc/Method)"
  end

  self.method_table[name.to_sym] = cm
  VM.reset_method_cache(name.to_sym)
  meth
end

In the case of our example code, execution will follow the "if meth.kind_of?(Proc)" path. define_method can also take a Method or UnboundMethod object in Ruby 1.8, hence the other code branches.
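
For reference, this is roughly what the non-block forms look like from the caller's side in standard Ruby. The Greeter class is invented for this sketch; it shows an UnboundMethod being reused under a new name and an unsupported argument being rejected.

```ruby
class Greeter
  def hello
    "hello from #{self.class}"
  end

  # Reuse an existing method body under a new name via an UnboundMethod.
  define_method(:hi, instance_method(:hello))
end

Greeter.new.hi # => "hello from Greeter"

begin
  # Anything that is not a Proc, Method or UnboundMethod is rejected.
  Greeter.send(:define_method, :bad, 42)
rescue TypeError => e
  puts "rejected: #{e.class}"
end
```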

This code:

  1. Fetches the block that was given to define_method
  2. Makes a new DelegatedMethod that wraps up the block
  3. Adds the new method to the method table
  4. Resets the method cache so that any older versions of this method are discarded
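
Steps 3 and 4 above can be illustrated with a deliberately simplified toy. ToyMethodTable and its methods are hypothetical and say nothing about how Rubinius actually stores or caches methods; the sketch only shows why a cached lookup has to be dropped when a name is redefined.

```ruby
# Toy model only: not a Rubinius API.
class ToyMethodTable
  def initialize
    @table = {} # name => callable, the source of truth
    @cache = {} # name => callable, remembered from earlier lookups
  end

  def define(name, callable)
    @table[name.to_sym] = callable
    @cache.delete(name.to_sym) # drop any stale entry for this name
    callable
  end

  def lookup(name)
    @cache[name.to_sym] ||= @table.fetch(name.to_sym)
  end
end

table = ToyMethodTable.new
table.define(:greet, proc { "old" })
table.lookup(:greet).call # => "old", and the entry is now cached
table.define(:greet, proc { "new" })
table.lookup(:greet).call # => "new", because the cached entry was dropped
```

Without the delete in 'define', the second lookup would keep returning the old proc.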

A nice side effect of this approach is that the newly defined method is almost as fast as a normal one, avoiding the extremely large slowdown experienced in Ruby 1.8.

Implementing a Ruby VM in Ruby turns out to feel pretty natural. Next up on the chopping block, eval.
