In my two previous posts, we saw how to write C extensions and how to package them. The time has now come to look at the Ruby C API itself.
This post deals with the basics; the next ones will go deeper.
Normally a little C knowledge will be enough to understand most of it.
The goal of these posts is not to be exhaustive, but rather to quickly learn the most useful techniques through simple examples.
In all the examples below, I give both the Ruby and the C versions, so that you can better understand the behavior.
- C API usage
- Ruby objects
- Declaring modules, classes and methods
- Instantiating objects of a given Ruby class
- Calling methods of existing Ruby objects
- Numeric types conversions between C and Ruby
- Strings
- Arrays
- Hashes
- Next step
C API usage
The Ruby C API can be accessed by including a single file: ruby.h.
#include <ruby.h>
This file declares the whole Ruby C API, types…
Then every C extension has to implement a function named Init_xxxxx (xxxxx being your extension’s name), which is executed during the “require” of your extension.
void Init_xxxxx() {
// Code executed during "require"
}
Ruby objects
The type system is soooooo easy: every Ruby object, whatever its class, is handled in C as a VALUE. In fact VALUE is just an unsigned integer type the size of a pointer (uintptr_t on most platforms), or void* to simplify things. This is how Ruby’s loose typing is implemented in C.
There is also another type, used for Ruby symbols (like :my_symbol): the C type ID. An ID is the interned form of a name. To get it from a name given as a C string, we use the rb_intern function (the Symbol object itself is a VALUE, which can be obtained from an ID with ID2SYM):
ID sym_myclass = rb_intern("MyClass");
Therefore, each time we use the Ruby C API, we have to use VALUE types to handle Ruby objects, and ID types to handle symbols.
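The interning behind rb_intern can be observed from the Ruby side. This is only a small sketch in plain Ruby, not the C API itself, and the two helper names are mine: it illustrates that a given name always maps to one and the same symbol, which is why C code can look an ID up once and reuse it.

```ruby
# Returns true when interning the same name twice yields the very same object,
# much like two rb_intern("MyClass") calls yield the same ID.
def interned_once?(name)
  name.to_sym.equal?(String.new(name).to_sym)
end

# By contrast, two strings with the same content are distinct objects.
def same_string_object?(name)
  String.new(name).equal?(String.new(name))
end

puts interned_once?("MyClass")
puts same_string_object?("MyClass")
```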
Then the Ruby C API defines a handful of C constants for standard Ruby objects:
| C | Ruby |
|---|---|
| Qnil | nil |
| Qtrue | true |
| Qfalse | false |
| rb_cObject | Object |
| rb_mKernel | Kernel |
| rb_cString | String |
You guessed it: most of Ruby’s standard modules are named rb_mModuleName and standard classes are named rb_cClassName.
Declaring modules, classes and methods
A good example is worth a thousand words.
What we want to achieve in Ruby:
module MyModule
class MyClass
def my_method(param1, param2, param3)
end
end
end
How we can get it in C:
static VALUE myclass_mymethod(
VALUE rb_self,
VALUE rb_param1,
VALUE rb_param2,
VALUE rb_param3)
{
// Code executed when calling my_method on an object of class MyModule::MyClass
// A method body must return a VALUE: return nil when there is nothing else to return
return Qnil;
}
void Init_xxxxx()
{
// Define a new module
VALUE mymodule = rb_define_module("MyModule");
// Define a new class (inheriting Object) in this module
VALUE myclass = rb_define_class_under(mymodule, "MyClass", rb_cObject);
// Define a method in this class, taking 3 arguments, and using the C function "myclass_mymethod" as its body
rb_define_method(myclass, "my_method", myclass_mymethod, 3);
}
We can notice the following:
- The use of the VALUE type for every Ruby object (modules and classes are Ruby objects).
- The use of the rb_define_module and rb_define_class_under API functions to define new Ruby modules and classes.
- The definition of the body of our method as a C function.
- The argument rb_self, always passed as the first argument of a C function implementing a method. It contains the Ruby self object.
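To see what these three calls amount to, here is a rough Ruby-side sketch that builds the same module, class and method by reflection instead of with the module/class/def keywords. It is an approximation of mine rather than what the C API does internally, and the method body (returning its arguments) is only there to have something to observe.

```ruby
# Roughly rb_define_module("MyModule")
mymodule = Object.const_set(:MyModule, Module.new)

# Roughly rb_define_class_under(mymodule, "MyClass", rb_cObject)
myclass = mymodule.const_set(:MyClass, Class.new(Object))

# Roughly rb_define_method(myclass, "my_method", ..., 3):
# the block plays the part of the C function, and self is implicit.
myclass.send(:define_method, :my_method) do |param1, param2, param3|
  [param1, param2, param3]
end

puts MyModule::MyClass.name
puts MyModule::MyClass.instance_method(:my_method).arity
```

As with the C version declared with 3 arguments, calling my_method with any other number of arguments raises an ArgumentError.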
Instantiating objects of a given Ruby class
Ruby way:
myobject = MyModule::MyClass.new
mystring = String.new("With argument")
C way:
// Get symbol for our module's name
ID sym_mymodule = rb_intern("MyModule");
// Get the module
VALUE mymodule = rb_const_get(rb_cObject, sym_mymodule);
// Get symbol for our class' name
ID sym_myclass = rb_intern("MyClass");
// Get the class
VALUE myclass = rb_const_get(mymodule, sym_myclass);
// Create a new object, using the default initializer, which takes no arguments
VALUE myobject = rb_class_new_instance(0, NULL, myclass);
// Use String initializer with 1 argument
VALUE strargv[1];
strargv[0] = rb_str_new2("With argument");
VALUE mystring = rb_class_new_instance(1, strargv, rb_cString);
A few points to notice there:
- Usage of rb_intern to convert C strings to Ruby symbols.
- Usage of rb_const_get to get a Ruby constant object.
- Usage of rb_class_new_instance, taking the number of arguments, then a C array containing the arguments given to the initializer, and last the class itself (as a VALUE).
- The use of rb_str_new2 to create a string (see the Strings section further down).
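The constant lookups done with rb_const_get have a direct counterpart in Ruby, Module#const_get, which makes the C sequence easier to picture. The sketch below is plain Ruby under that assumption. The lookup_class helper is a name of mine, and MyModule::MyClass is declared empty here just to make the snippet self-contained.

```ruby
module MyModule
  class MyClass
  end
end

# Resolve a class by name, starting from Object and going through its module,
# the way the C code chains two rb_const_get calls.
def lookup_class(module_name, class_name)
  Object.const_get(module_name).const_get(class_name)
end

# No initializer argument
myobject = lookup_class(:MyModule, :MyClass).new
# One initializer argument
mystring = Object.const_get(:String).new("With argument")

puts myobject.class
puts mystring
```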
Calling methods of existing Ruby objects
Ruby way:
result = myobject.my_method(nil, true, false)
puts "Hello world!"
C way:
// Get the method's symbol
ID sym_mymethod = rb_intern("my_method");
// Call the method, giving 3 parameters
VALUE result = rb_funcall(myobject, sym_mymethod, 3, Qnil, Qtrue, Qfalse);
// Get the puts method's symbol
ID sym_puts = rb_intern("puts");
// Call puts, from Kernel
rb_funcall(rb_mKernel, sym_puts, 1, rb_str_new2("Hello world!"));
Here we note:
- The call to rb_funcall, taking the object on which the method is called, then the method’s symbol, the argument count, and the arguments. Generally, methods called in Ruby without an explicit receiver are called on Kernel (rb_mKernel) or on self.
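The closest Ruby equivalent to rb_funcall that I know of is Object#send: the method is picked by its symbol at run time, and visibility is not enforced, so private methods can be called too. The sketch below only illustrates that idea in Ruby. Greeter and call_by_name are hypothetical names used for the demonstration.

```ruby
class Greeter
  def my_method(param1, param2, param3)
    [param1, param2, param3]
  end

  private

  def secret
    "private result"
  end
end

# Receiver, method symbol, then the arguments: the same order as rb_funcall,
# minus the explicit argument count.
def call_by_name(receiver, name, *args)
  receiver.send(name, *args)
end

myobject = Greeter.new
p call_by_name(myobject, :my_method, nil, true, false)
p call_by_name(myobject, :secret)
call_by_name(Kernel, :puts, "Hello world!")
```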
Numeric types conversions between C and Ruby
Ruby has types that translate pretty well into numerical C types, such as integers or floats. The C API provides convenient macros to convert one way or the other:
// rb_myint is a Ruby integer (Fixnum)
int c_myint = FIX2INT(rb_myint);
// Create a new Ruby Fixnum based on the C int
VALUE rb_mynewint = INT2FIX(c_myint + 42);
// rb_myfloat is a Ruby Float
double c_myfloat = NUM2DBL(rb_myfloat);
// Create a new Ruby Float based on the C double
VALUE rb_mynewfloat = DBL2NUM(c_myfloat + 0.5);
- FIX2INT and INT2FIX are used to convert C integers from and to Ruby Fixnum. They only apply to values in the Fixnum range; NUM2INT and INT2NUM are the variants to use when the Ruby integer may be larger.
- NUM2DBL and DBL2NUM are used to convert C doubles from and to Ruby Float.
Strings
The Ruby C API has everything needed to handle strings.
// rb_mystring is a Ruby String. Get the corresponding C string (! Not guaranteed to be NULL terminated)
char* c_mystring = RSTRING_PTR(rb_mystring);
long c_mystring_len = RSTRING_LEN(rb_mystring);
// Create a Ruby String from a C string (NULL terminated this time)
VALUE rb_mynewstring = rb_str_new2("This is a C string");
- RSTRING_PTR is used to get the underlying C string: a pointer to the stored string, with no copy or memory allocation. Beware however: you can’t rely on a terminating NULL character, first because your Ruby string can contain NULL characters, and then because the Ruby C API does not guarantee it.
- RSTRING_LEN is used to get the string’s length, in bytes. This is the safest way to get your string’s length. Do not use the C function strlen, as it can be confused by extra or missing NULL characters.
- rb_str_new2 is used to create a Ruby string from a NULL terminated C string.
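The strlen pitfall is easy to reproduce from Ruby itself. In this sketch, strlen_like and rstring_len_like are helper names of mine that imitate what strlen and RSTRING_LEN would report for a given string. No C is involved, so take it as an illustration of the byte counts only.

```ruby
# Length a C strlen() would report: the number of bytes before the first NUL.
def strlen_like(str)
  bytes = str.b                          # binary copy, so offsets are in bytes
  bytes.index("\0") || bytes.bytesize
end

# Length RSTRING_LEN reports: the full byte count, NULs included.
def rstring_len_like(str)
  str.bytesize
end

with_nul = "ab\0cd"
accented = "\u00e9t\u00e9"               # 3 characters, 5 bytes in UTF-8

puts strlen_like(with_nul)               # 2
puts rstring_len_like(with_nul)          # 5
puts accented.length                     # 3
puts rstring_len_like(accented)          # 5
```

The last two lines show a second thing to keep in mind: the length you get in C is a number of bytes, not a number of characters.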
Arrays
Here again, every array manipulation can be done using the C API:
Ruby way:
rb_myelement = rb_myarray[3]
rb_myarray << true
rb_mynewarray = [ nil, true, false ]
C way:
// rb_myarray is a Ruby array containing at least 4 elements
long c_myarray_len = RARRAY_LEN(rb_myarray);
// Access the 4th element
VALUE rb_myelement = rb_ary_entry(rb_myarray, 3);
// Push the element "true" at the end of the array
rb_ary_push(rb_myarray, Qtrue);
// Create an array from scratch, containing nil, true, false
VALUE rb_mynewarray = rb_ary_new3(3, Qnil, Qtrue, Qfalse);
// Do the same using a C array
VALUE c_myarray[3];
c_myarray[0] = Qnil;
c_myarray[1] = Qtrue;
c_myarray[2] = Qfalse;
VALUE rb_mynewarray2 = rb_ary_new4(3, c_myarray);
- RARRAY_LEN returns the number of elements in the array.
- rb_ary_entry returns the element stored at the given index.
- rb_ary_push adds an element at the end of the array.
- rb_ary_new3 creates an array from the list of arguments.
- rb_ary_new4 creates an array from a C array.
Hashes
Hashes are also easily accessed:
Ruby way:
rb_four = rb_myhash['foo']
rb_myhash['yop'] = 42
rb_mynewhash = {}
C way:
// rb_myhash is { 'foo' => 4, 'bar' => 2 }
// Get the value of key "foo"
VALUE rb_four = rb_hash_aref(rb_myhash, rb_str_new2("foo"));
// Add a new pair 'yop' => 42
rb_hash_aset(rb_myhash, rb_str_new2("yop"), INT2FIX(42));
// Create a Hash from scratch
VALUE rb_mynewhash = rb_hash_new();
- rb_hash_aref returns the value indexed by a given key.
- rb_hash_aset sets the value of a given key.
- rb_hash_new creates a new hash.
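Notice that the C code looks up 'foo' with a brand new string built by rb_str_new2, and still finds the entry: hash keys are matched by content, not by object identity. The Ruby sketch below shows the same rule, plus the classic trap of using a symbol where the key is a string. The sample_hash helper is a name of mine.

```ruby
# The hash used in the C example
def sample_hash
  { 'foo' => 4, 'bar' => 2 }
end

fresh_key = String.new("foo")        # a new object with the same content
p sample_hash[fresh_key]             # 4: found, keys are compared by content
p sample_hash[:foo]                  # nil: a Symbol is not equal to a String
p sample_hash['nope']                # nil: missing key, no exception

rb_myhash = sample_hash
rb_myhash['yop'] = 42                # rb_hash_aset
p rb_myhash
```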
Next step
Now we’ve seen how to access basic Ruby structures and how to extend Ruby by creating new modules, classes and methods. These are the basics.
Next we’ll go deeper into it: Memory management, blocks, wrapping C structures, and many more await!
Stay tuned.