RubyInline rules.
Written on May 2, 2007 by Chris Heald
Yesterday, I discovered RubyInline. In short, it’s a gem that lets you embed C code directly in your Ruby source, where it gets compiled and loaded as an extension.
I decided to see what I could do with it.
I profiled a few complex model finder methods I have, and came up with something like this:
[chris@polaris jor]$ script/performance/profiler "Conflict.search \"girl\"" 50 2> c_out
Loading Rails...
Using the standard Ruby profiler.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 16.55     4.07      4.07     8312     0.49     0.92  Mysql#get_length
  0.00    24.59      0.00        1     0.00 24590.00  #toplevel
Ouch. 16% of my time spent in Mysql#get_length! For the execution of a single finder action, it looks like we’re calling this an awful lot. Maybe it can be optimized.
I took a look at what the method actually does:
def get_length(data, longlong=nil)
  return if data.length == 0
  c = data.slice!(0)
  case c
  when 251
    return nil
  when 252
    a = data.slice!(0,2)
    return a[0]+a[1]*256
  when 253
    a = data.slice!(0,3)
    return a[0]+a[1]*256+a[2]*256**2
  when 254
    a = data.slice!(0,8)
    if longlong then
      return a[0]+a[1]*256+a[2]*256**2+a[3]*256**3+
        a[4]*256**4+a[5]*256**5+a[6]*256**6+a[7]*256**7
    else
      return a[0]+a[1]*256+a[2]*256**2+a[3]*256**3
    end
  else
    c
  end
end
It takes an incoming string, looks at the first byte, and then returns either that byte, nil, or an integer calculated from the bytes that follow, depending on the value of that first byte. Along the way it consumes the bytes it reads from the string.
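To make the byte layout concrete, here is a minimal sketch of the same decoding for a modern Ruby (1.9 or later), where `String#slice!(0)` returns a one-character String instead of an Integer. The name `decode_length` is hypothetical and not part of the MySQL driver, and unlike the driver's method it does not consume bytes from its input.

```ruby
# Hypothetical helper: decodes the length value at the start of a binary string.
# Written for modern Ruby, so bytes are read explicitly with String#bytes.
def decode_length(data, longlong = false)
  bytes = data.bytes
  return nil if bytes.empty?

  marker = bytes[0]
  width =
    case marker
    when 251 then return nil          # marker for "no value"
    when 252 then 2                   # value stored in the next 2 bytes
    when 253 then 3                   # value stored in the next 3 bytes
    when 254 then longlong ? 8 : 4    # 8 bytes follow; only 4 are used unless longlong
    else return marker                # the first byte is the value itself
    end

  # Little-endian sum: the byte at offset i contributes byte * 256**i.
  bytes[1, width].each_with_index.inject(0) { |acc, (b, i)| acc + b * 256**i }
end

p decode_length([5].pack("C*"))
p decode_length([251].pack("C*"))
p decode_length([252, 1, 2].pack("C*"))
p decode_length([253, 1, 0, 1].pack("C*"))
```

The last two calls exercise the two-byte and three-byte branches: 1 + 2*256 and 1 + 1*65536.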
First thing to do is to get rid of those exponents. They’re static, so we can just pre-calculate them and save the CPU the effort. Next, I went ahead and replaced get_length with a method that would let me inspect what’s going in and out:
require 'active_record/vendor/mysql'
class Mysql
  def get_length(data, longlong = false)
    puts "Input: #{data.inspect}"
    x = get_length_c(data, longlong)
    puts "Output: #{x}"
    return x
  end
  def get_length_c(data, longlong=nil)
    return if data.length == 0
    c = data.slice!(0)
    case c
    when 251
      return nil
    when 252
      a = data.slice!(0,2)
      return a[0]+a[1]*256
    when 253
      a = data.slice!(0,3)
      return a[0]+a[1]*256+a[2]*65536
    when 254
      a = data.slice!(0,8)
      if longlong then
        return a[0]+a[1]*256+a[2]*65536+a[3]*16777216+a[4]*4294967296+a[5]*1099511627776+a[6]*281474976710656+a[7]*72057594037927936
      else
        return a[0]+a[1]*256+a[2]*65536+a[3]*16777216
      end
    else
      c
    end
  end
end
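The literals that replaced the exponent expressions in the listing above are just the powers of 256 from 256**0 through 256**7. A quick way to double-check hand-typed constants like these is to generate them; the constant name `BYTE_WEIGHTS` is only for this sketch.

```ruby
# Weight of each byte position in a little-endian integer, computed once.
BYTE_WEIGHTS = (0..7).map { |n| 256**n }.freeze

BYTE_WEIGHTS.each_with_index do |weight, n|
  puts "256**#{n} = #{weight}"
end
```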
The results of this inspection showed me that the most common case was the “else”: “c” was getting returned a lot. We’re checking for 251-254 as special cases, and anything below 251 is simply returned as-is (255 isn’t one of the special cases either, so it falls through to the same plain return). So, since “c < 251” is the most common case, I went ahead and put it first.
def get_length(data, longlong = false)
  return if data.length == 0
  c = data.slice!(0)
  return c if c < 251
  return nil if c == 251
  # the 252-254 cases still need handling here
end
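The claim that the plain-return branch dominates could also be checked with a tally instead of reading `puts` output by eye. The sketch below assumes the leading bytes have already been collected into an array; `branch_for`, `tally_branches`, and the sample values are all made up for illustration and are not measurements from the original run.

```ruby
# Classify a leading byte by the branch get_length would take for it.
def branch_for(marker)
  case marker
  when 251 then :null
  when 252 then :two_bytes
  when 253 then :three_bytes
  when 254 then :eight_bytes
  else :literal
  end
end

# Count how often each branch is taken for a list of leading bytes.
def tally_branches(markers)
  counts = Hash.new(0)
  markers.each { |m| counts[branch_for(m)] += 1 }
  counts
end

# Illustrative input only; real values would come from the driver.
sample_markers = [3, 12, 0, 251, 7, 252, 40, 5, 1, 254]

tally_branches(sample_markers)
  .sort_by { |_, count| -count }
  .each { |branch, count| puts "#{branch}: #{count}" }
```

If one branch clearly dominates in a tally like this, testing for it first saves the remaining comparisons on most calls.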
I then used RubyInline to build an inline C method to do the actual computation in my 252-254 cases, and changed the case statement in the Ruby get_length to just call into the C method:
require 'active_record/vendor/mysql'
require 'inline'
class Mysql
  def get_length(data, longlong = false)
    return if data.length == 0
    c = data.slice!(0)
    return c if c < 251
    l = longlong ? 1 : 0
    case c
    when 251
      return nil
    when 252
      a = data.slice!(0,2)
      return get_length_c(c, a, l)
    when 253
      a = data.slice!(0,3)
      return get_length_c(c, a, l)
    when 254
      a = data.slice!(0,8)
      return get_length_c(c, a, l)
    end
    return c # Shouldn't ever get here
  end
  inline do |builder|
    builder.c_raw '
      static VALUE get_length_c(int argc, VALUE *argv, VALUE self) {
        int len = NUM2INT(argv[0]);
        int longlong = NUM2INT(argv[2]);
        /* Read the bytes as unsigned. Plain char is signed on many
           platforms, which would turn bytes >= 128 into negative numbers. */
        unsigned char *p = (unsigned char *) RSTRING(argv[1])->ptr;

        switch(len) {
          case 252:
            return INT2FIX(p[0] + p[1] * 256);
          case 253:
            return INT2FIX(p[0] + p[1] * 256 + p[2] * 65536);
          case 254:
            if(longlong == 1) {
              /* Widen each byte before multiplying so the high bytes
                 do not overflow. */
              return ULL2NUM(
                (unsigned long long) p[0] +
                (unsigned long long) p[1] * 256 +
                (unsigned long long) p[2] * 65536 +
                (unsigned long long) p[3] * 16777216 +
                (unsigned long long) p[4] * 4294967296ULL +
                (unsigned long long) p[5] * 1099511627776ULL +
                (unsigned long long) p[6] * 281474976710656ULL +
                (unsigned long long) p[7] * 72057594037927936ULL);
            } else {
              return ULONG2NUM(
                (unsigned long) p[0] +
                (unsigned long) p[1] * 256 +
                (unsigned long) p[2] * 65536 +
                (unsigned long) p[3] * 16777216);
            }
          default:
            return INT2NUM(len);
        }
      }
    '
  end
end
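One thing worth keeping in mind with C code that reads string bytes through a `char` pointer: on many platforms `char` is signed, so a byte of 128 or more reads as a negative number unless it is treated as `unsigned char`. The effect can be reproduced from plain Ruby with `unpack`, where `c` means signed and `C` means unsigned; the helper name `little_endian_16` is just for this sketch.

```ruby
# Combine two bytes into a little-endian 16-bit value.
def little_endian_16(bytes)
  bytes[0] + bytes[1] * 256
end

raw = [200, 1].pack("C*")      # two bytes: 0xC8, 0x01

signed   = raw.unpack("c*")    # the bytes as a signed char would read them
unsigned = raw.unpack("C*")    # the bytes as an unsigned char would read them

puts "signed read:   #{signed.inspect} gives #{little_endian_16(signed)}"
puts "unsigned read: #{unsigned.inspect} gives #{little_endian_16(unsigned)}"
```

The Ruby 1.8 version of get_length always saw byte values in the 0 to 255 range, so only the unsigned reading matches it.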
Let’s run it and see what we get.
[chris@polaris jor]$ script/performance/profiler "Conflict.search \"girl\"" 50 2> c_out
Loading Rails...
Using the standard Ruby profiler.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 12.03     2.32      2.32     8312     0.28     0.44  Mysql#get_length
Wow! We went from 0.49 ms/call to 0.28 ms/call, and from 4.07 seconds spent in the method to 2.32 seconds. That’s 43% less time in the method, or put another way, 75.4% more calls handled in the same amount of time!
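The profiler figures can be read either as a speedup ratio or as a reduction in time spent, and the two give quite different-looking percentages. A quick check using the self-seconds numbers from the two profiler runs (the method names are just for this sketch):

```ruby
# How many times faster the new version is (old time divided by new time).
def speedup_ratio(before, after)
  before / after
end

# Fraction of the original time that was saved.
def time_saved(before, after)
  (before - after) / before
end

before_seconds = 4.07   # self seconds in Mysql#get_length, pure Ruby
after_seconds  = 2.32   # self seconds in Mysql#get_length, with the C helper

puts format("speedup:    %.3fx", speedup_ratio(before_seconds, after_seconds))
puts format("time saved: %.1f%%", time_saved(before_seconds, after_seconds) * 100)
```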
Using ruby-prof to profile one of my query heavy pages produced the following:
Before:
 %Total   %Self   Total   Self   Children          Calls  Name
                   0.00   0.00       0.00        30/2597  Mysql#read_query_result
 27.03%   5.41%    0.10   0.02       0.08           2597  Mysql#get_length
                   0.00   0.00       0.00      2591/5099  String#slice!
                   0.00   0.00       0.00        18/4278  String#[]
                   0.00   0.00       0.00         9/2047  Fixnum#*
                   0.00   0.00       0.00         9/1976  Fixnum#+
                   0.06   0.06       0.00    10133/10500  Kernel#===
                   0.01   0.01       0.00      2597/2722  String#length
                   0.01   0.01       0.00     2597/17811  Fixnum#==
After:
                   0.00   0.00       0.00        40/2805  Mysql#read_query_result
  6.38%   6.38%    0.03   0.03       0.00           2805  Mysql#get_length
                   0.00   0.00       0.00      2796/5504  String#slice!
                   0.00   0.00       0.00        80/3030  Kernel#===
                   0.00   0.00       0.00      2805/2957  String#length
                   0.00   0.00       0.00      2805/8523  Fixnum#==
                   0.00   0.00       0.00            9/9  Mysql#get_length_c
                   0.00   0.00       0.00      2787/2837  Fixnum#<
Not too shabby!
If you want to get the extension, it’s licensed under the MIT license, and can be found at:
http://svn.digitalsentience.com/svn/rails/extensions/
Enjoy!