The Ruby C API – Basics

In my two previous posts, we saw how to write C extensions and how to package them. Now the time has come to look at the Ruby C API itself.

This post deals with the basics; the next ones will go deeper.
A little C knowledge should be enough to follow most of it.

The goal of these posts is not to be exhaustive, but rather to teach the most useful techniques quickly through simple examples.

In all the examples below, I give both the Ruby and the C versions, so that you can compare their behavior.

  1. C API usage
  2. Ruby objects
  3. Declaring modules, classes and methods
  4. Instantiating objects of a given Ruby class
  5. Calling methods of existing Ruby objects
  6. Numeric types conversions between C and Ruby
  7. Strings
  8. Arrays
  9. Hashes
  10. Next step

C API usage

The Ruby C API is accessed by including a single header file: ruby.h.

#include <ruby.h>

This file declares the whole Ruby C API, types…

Then every C extension has to implement a function named Init_xxxxx (xxxxx being your extension’s name), which is executed when your extension is required.

void Init_xxxxx() {
  // Code executed during "require"
}

Ruby objects

The type system is soooooo easy: every Ruby object, whatever its class, is handled in C through the VALUE type. In fact VALUE is just a pointer-sized unsigned integer (think uintptr_t, or void* to simplify things). This is how Ruby’s dynamic typing is implemented in C.

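There is also another type representing Ruby symbols (like :my_symbol): this is the C type ID. Strictly speaking, an ID is the interned identifier behind a symbol rather than a Symbol object (which is a VALUE); the ID2SYM and SYM2ID macros convert between the two. To get an ID from a name (as a C string), we use the rb_intern function: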

ID sym_myclass = rb_intern("MyClass");

Therefore, each time we use the Ruby C API, we have to use VALUE types to handle Ruby objects, and ID types to handle symbols.

The Ruby C API also defines a bunch of handy C constants for standard Ruby objects:

C            Ruby
Qnil         nil
Qtrue        true
Qfalse       false
rb_cObject   Object
rb_mKernel   Kernel
rb_cString   String

You guessed it: most of Ruby’s standard modules are named rb_mModuleName, and standard classes are named rb_cClassName.

Declaring modules, classes and methods

A good example is worth a thousand words.

What we want to achieve in Ruby:

module MyModule
  class MyClass
    def my_method(param1, param2, param3)
    end
  end
end

How we can get it in C:

static VALUE myclass_mymethod(
  VALUE rb_self,
  VALUE rb_param1,
  VALUE rb_param2,
  VALUE rb_param3)
{
  // Code executed when calling my_method on an object of class MyModule::MyClass
  // The function must always return a VALUE: an empty Ruby method returns nil
  return Qnil;
}

void Init_xxxxx()
{
  // Define a new module
  VALUE mymodule = rb_define_module("MyModule");
  // Define a new class (inheriting Object) in this module
  VALUE myclass = rb_define_class_under(mymodule, "MyClass", rb_cObject);
  // Define a method in this class, taking 3 arguments, and using the C function "myclass_mymethod" as its body
  rb_define_method(myclass, "my_method", myclass_mymethod, 3);
}

We can notice the following:

  • The use of the VALUE type for every Ruby object (modules and classes are Ruby objects).
  • The use of the rb_define_module and rb_define_class_under API functions to define new Ruby modules and classes.
  • The definition of the body of our method as a C function.
  • The rb_self argument, always the first argument of a C function implementing a method. It contains the Ruby self object.
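
For readers more at home in Ruby, the three C calls in Init_xxxxx map closely onto Ruby's own reflective API. The following is only a rough Ruby-side sketch of that mapping, not what the extension itself runs: the nil-returning body is a placeholder I chose, and unlike rb_define_module, const_set does not reopen a module that already exists.

```ruby
# Rough Ruby counterpart of Init_xxxxx, one statement per C API call.

# rb_define_module("MyModule")
mymodule = Object.const_set(:MyModule, Module.new)

# rb_define_class_under(mymodule, "MyClass", rb_cObject)
myclass = mymodule.const_set(:MyClass, Class.new(Object))

# rb_define_method(myclass, "my_method", myclass_mymethod, 3)
myclass.send(:define_method, :my_method) do |param1, param2, param3|
  nil # placeholder body (assumption): the original method is empty
end

# The last argument of rb_define_method (3) is the method's arity
puts MyModule::MyClass.instance_method(:my_method).arity # prints 3
```

Seen this way, the C version is mostly a matter of passing the same names and objects to functions instead of using Ruby syntax.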

Instantiating objects of a given Ruby class

Ruby way:

myobject = MyModule::MyClass.new
mystring = String.new("With argument")

C way:

// Get symbol for our module's name
ID sym_mymodule = rb_intern("MyModule");
// Get the module
VALUE mymodule = rb_const_get(rb_cObject, sym_mymodule);
// Get symbol for our class' name
ID sym_myclass = rb_intern("MyClass");
// Get the class
VALUE myclass = rb_const_get(mymodule, sym_myclass);
// Create a new object, using the default initializer, with no argument
VALUE myobject = rb_class_new_instance(0, NULL, myclass);

// Use String initializer with 1 argument
VALUE strargv[1];
strargv[0] = rb_str_new2("With argument");
VALUE mystring = rb_class_new_instance(1, strargv, rb_cString);

A few points to notice there:

  • Usage of rb_intern to convert C strings to Ruby symbols.
  • Usage of rb_const_get to get a Ruby constant object.
  • Usage of rb_class_new_instance, taking the number of arguments, then a C array containing the arguments given to the initializer, and last the class itself (as a VALUE, not its name).
  • The use of rb_str_new2 to create a string (see further section on Strings).
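
Each of these lookups has a direct Ruby-level counterpart, which can help when checking what a sequence of C calls is expected to do. A small sketch, assuming MyModule::MyClass exists as in the previous section (it is declared here, empty, only so that the snippet runs on its own):

```ruby
# Stand-in definition so that the snippet is self-contained
module MyModule
  class MyClass
  end
end

# rb_intern("MyModule") then rb_const_get(rb_cObject, sym_mymodule)
mymodule = Object.const_get(:MyModule)
# rb_intern("MyClass") then rb_const_get(mymodule, sym_myclass)
myclass = mymodule.const_get(:MyClass)

# rb_class_new_instance with 0 argument: allocate, then call initialize
myobject = myclass.new
# rb_class_new_instance(1, strargv, rb_cString)
mystring = String.new("With argument")

puts myobject.class # prints MyModule::MyClass
puts mystring       # prints With argument
```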

Calling methods of existing Ruby objects

Ruby way:

result = myobject.my_method(nil, true, false)
puts "Hello world!"

C way:

// Get the method's symbol
ID sym_mymethod = rb_intern("my_method");
// Call the method, giving 3 parameters
VALUE result = rb_funcall(myobject, sym_mymethod, 3, Qnil, Qtrue, Qfalse);

// Get the puts method's symbol
ID sym_puts = rb_intern("puts");
// Call puts, from Kernel
rb_funcall(rb_mKernel, sym_puts, 1, rb_str_new2("Hello world!"));

Here we note:

  • The call to rb_funcall, taking the receiver (the object on which the method is called), then the method’s ID, the argument count, and the arguments themselves. Methods that Ruby code calls without an explicit receiver are generally called on self, and many of them (puts for instance) are defined in Kernel (rb_mKernel).
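
In Ruby terms, rb_funcall behaves much like Object#send: the method is chosen at run time from its name, and visibility is not checked, so private methods can be called too. A minimal sketch with a hypothetical Greeter class (not part of the original example):

```ruby
# Hypothetical class used only for this illustration
class Greeter
  def greet(name)
    "Hello #{name}!"
  end

  private

  def secret
    42
  end
end

greeter = Greeter.new

# rb_funcall(greeter, rb_intern("greet"), 1, rb_str_new2("world"))
puts greeter.send(:greet, "world") # prints Hello world!

# Like rb_funcall, send ignores visibility...
puts greeter.send(:secret) # prints 42

# ...whereas an ordinary call with an explicit receiver does not
begin
  greeter.secret
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

# rb_funcall(rb_mKernel, rb_intern("puts"), 1, rb_str_new2("Hello world!"))
Kernel.send(:puts, "Hello world!")
```

This is worth keeping in mind when calling into code you do not own: from C, nothing stops you from invoking a method its author meant to keep private.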

Numeric types conversions between C and Ruby

Ruby has types that translate pretty well into numerical C types, such as integers or floats. The C API provides convenient macros to convert one way or the other:

// rb_myint is a Ruby integer (Fixnum)
int c_myint = FIX2INT(rb_myint);
// Create a new Ruby Fixnum based on the C int
VALUE rb_mynewint = INT2FIX(c_myint + 42);

// rb_myfloat is a Ruby Float
double c_myfloat = NUM2DBL(rb_myfloat);
// Create a new Ruby Float based on the C double
VALUE rb_mynewfloat = DBL2NUM(c_myfloat + 0.5);

  • FIX2INT and INT2FIX are used to convert C integers from and to Ruby Fixnum. They are only valid for values that are, or fit in, a Fixnum; for arbitrary integers, NUM2INT and INT2NUM are the safer choice, as they check the type and the range of the value.
  • NUM2DBL converts any Ruby Numeric to a C double, and DBL2NUM creates a Ruby Float from a C double.

Strings

The Ruby C API has everything needed to handle strings.

// rb_mystring is a Ruby String. Get the corresponding C string (! Not guaranteed to be NUL-terminated)
char* c_mystring = RSTRING_PTR(rb_mystring);
long c_mystring_len = RSTRING_LEN(rb_mystring);
// Create a Ruby String from a C string (NUL-terminated this time)
VALUE rb_mynewstring = rb_str_new2("This is a C string");

  • RSTRING_PTR is used to get the underlying C string: a pointer to the stored string, with no copy or memory allocation. Beware however: you can’t rely on a terminating NUL character, first because your Ruby string can itself contain NUL characters, and then because the Ruby C API does not guarantee one. When you need a genuine NUL-terminated C string, StringValueCStr returns one, and raises an error if the string contains NUL characters.
  • RSTRING_LEN is used to get the string’s length, in bytes. This is the only reliable way to know your string’s length. Do not use the C function strlen, as it can be confused by extra or missing NUL characters.
  • rb_str_new2 is used to create a Ruby string from a NUL-terminated C string (rb_str_new takes an explicit length instead).
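
To see why RSTRING_LEN is the length to trust, it helps to look at the same strings from the Ruby side. In the sketch below, c_strlen is a hypothetical helper that mimics what C's strlen would report on the string's bytes; bytesize is the Ruby method returning the byte count, which is what RSTRING_LEN gives you in C.

```ruby
# Hypothetical helper: the length C's strlen would report, i.e. the
# number of bytes before the first NUL byte (assuming the buffer is
# NUL-terminated at all)
def c_strlen(str)
  str.b.index("\0") || str.bytesize
end

binary = "ab\0cd"        # a NUL byte in the middle
accent = "\u00e9t\u00e9" # "été": 3 characters, 5 bytes in UTF-8

puts binary.bytesize  # prints 5 (what RSTRING_LEN returns)
puts c_strlen(binary) # prints 2 (strlen stops at the first NUL)

puts accent.length    # prints 3 (characters)
puts accent.bytesize  # prints 5 (bytes, what RSTRING_LEN returns)
```

The second pair shows another trap: the C-side length counts bytes, not characters, so it differs from String#length as soon as the string holds multi-byte characters.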

Arrays

Here again, every array manipulation can be done using the C API:

Ruby way:

rb_myelement = rb_myarray[3]
rb_myarray << true
rb_mynewarray = [ nil, true, false ]

C way:

// rb_myarray is a Ruby array containing at least 4 elements
long c_myarray_len = RARRAY_LEN(rb_myarray);
// Access the 4th element
VALUE rb_myelement = rb_ary_entry(rb_myarray, 3);
// Push the element "true" at the end of the array
rb_ary_push(rb_myarray, Qtrue);

// Create an array from scratch, containing nil, true, false
VALUE rb_mynewarray = rb_ary_new3(3, Qnil, Qtrue, Qfalse);
// Do the same using a C array
VALUE c_myarray[3];
c_myarray[0] = Qnil;
c_myarray[1] = Qtrue;
c_myarray[2] = Qfalse;
VALUE rb_mynewarray2 = rb_ary_new4(3, c_myarray);

  • RARRAY_LEN returns the number of elements in the array.
  • rb_ary_entry returns the element stored at the given index.
  • rb_ary_push adds an element at the end of the array.
  • rb_ary_new3 creates an array from the list of arguments (recent Ruby versions also name it rb_ary_new_from_args).
  • rb_ary_new4 creates an array from a C array (also named rb_ary_new_from_values).
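
One detail worth knowing about rb_ary_entry: like Array#at in Ruby, it never raises on a bad index. An index past the end gives nil (Qnil in C), and a negative index counts from the end. A short Ruby illustration of the same behavior, with a made-up array:

```ruby
rb_myarray = [:a, :b, :c, :d] # sample data (assumption)

p rb_myarray.at(3)  # prints :d  -> rb_ary_entry(rb_myarray, 3)
p rb_myarray.at(10) # prints nil -> rb_ary_entry(rb_myarray, 10) is Qnil
p rb_myarray.at(-1) # prints :d  -> rb_ary_entry(rb_myarray, -1)

# rb_ary_push appends in place and returns the array itself
rb_myarray.push(true)
p rb_myarray.length # prints 5   -> RARRAY_LEN(rb_myarray)
```

So in C, a Qnil result does not tell you whether the slot holds nil or the index was out of range: compare against RARRAY_LEN when the difference matters.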

Hashes

Hashes are also easily accessed:

Ruby way:

rb_four = rb_myhash['foo']
rb_myhash['yop'] = 42
rb_mynewhash = {}

C way:

// rb_myhash is { 'foo' => 4, 'bar' => 2 }
// Get the value of key "foo"
VALUE rb_four = rb_hash_aref(rb_myhash, rb_str_new2("foo"));
// Add a new pair 'yop' => 42
rb_hash_aset(rb_myhash, rb_str_new2("yop"), INT2FIX(42));
// Create a Hash from scratch
VALUE rb_mynewhash = rb_hash_new();

  • rb_hash_aref returns the value indexed by a given key.
  • rb_hash_aset sets the value of a given key.
  • rb_hash_new creates a new hash.
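
Note that the rb_hash_aref call above looks the key up with a String created on the spot by rb_str_new2. This works because hashes compare keys by content (through hash and eql?), not by object identity, and a missing key simply gives nil (Qnil) unless the hash has a default value. The Ruby equivalent, as a sketch:

```ruby
rb_myhash = { 'foo' => 4, 'bar' => 2 }

# A brand-new String object, as rb_str_new2("foo") would create
key = String.new("foo")

p rb_myhash[key]       # prints 4: same content, so same key
p rb_myhash['missing'] # prints nil: Qnil on the C side

# rb_hash_aset(rb_myhash, rb_str_new2("yop"), INT2FIX(42))
rb_myhash[String.new("yop")] = 42
p rb_myhash['yop']     # prints 42
```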

Next step

Now we’ve seen how to access basic Ruby structures and how to extend Ruby by creating new modules, classes and methods. These are the basics.
Next we’ll go deeper: memory management, blocks, wrapping C structures, and much more await!

Stay tuned.

About Muriel Salvan

I am a freelance project manager and polyglot developer, expert in Ruby and Rails. I created X-Aeon Solutions and rivierarb Ruby meetups. I also give trainings and conferences on technical topics. My core development principles: Plugins-oriented architectures, simple components, Open Source power, clever automation, constant technology watch, quality and optimized code. My experience includes big and small companies. I embrace agile methodologies and test driven development, without giving up on planning and risks containment methods as well. I love Open Source and became a big advocate.