Serialize Ruby objects: ruby-serial

Many times I have come across the need to serialize a whole bunch of Ruby objects, with all their references.
This is often useful to save the state of a program at one point, and retrieve it later in another session.

Then I looked for a solution which could guarantee the following:

  • Serialize Ruby objects directly, without having to convert/copy them to a specific format (no ORM, just native API).
  • Be efficient enough to handle a lot of objects (either big objects or a ton of smaller ones), namely several hundred MB of data.
  • Serialize user-made objects (not just native types).
  • Serialize/deserialize using the same format in different Ruby versions.
  • Be backward compatible when new serialization versions come out.
  • Be optimized enough not to serialize the same object twice when it is shared among several Ruby objects.
  • Be able to deserialize shared objects without duplicating them (keep references, don’t copy memory).
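
To make the last two points concrete, here is a minimal sketch of what "keeping references" means after a round trip. It uses Ruby's built-in Marshal only to illustrate the property (Marshal does preserve shared references inside a single dump); the variable names are made up for this example.

```ruby
# Two slots of the array point at the very same String instance.
shared = 'shared'
graph = [shared, shared]

# Marshal is used here only to illustrate the property; it is not
# how ruby-serial works internally.
copy = Marshal.load(Marshal.dump(graph))

puts copy == graph            # => true  (same content)
puts copy[0].equal?(copy[1])  # => true  (still one shared instance)
puts copy[0].equal?(shared)   # => false (a new object, not the original)
```

`equal?` compares object identity, while `==` compares content: the requirement is that identity between slots survives the round trip, not just content.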

And I could not find a simple solution that handled all of that.
Most of the serialization libraries I found either

  • required a JSON/YAML kind of data conversion (and mem copy),
  • were not compatible across Ruby versions (yes, I’m looking at you, Marshal),
  • or wasted space and memory by keeping a human-readable format.
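
As a small illustration of the first point, here is what a plain JSON round trip does with the standard `json` library. The `Point` class and its `to_h` method are hypothetical, written only for this sketch.

```ruby
require 'json'

# Hypothetical class used only for this illustration.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  # The manual conversion step that a JSON-based approach requires.
  def to_h
    { 'x' => @x, 'y' => @y }
  end
end

shared = 'shared'
data = [shared, shared, Point.new(1, 2).to_h]

restored = JSON.parse(JSON.generate(data))

puts restored[0] == restored[1]       # => true  (same content)
puts restored[0].equal?(restored[1])  # => false (two distinct strings now)
puts restored[2].class                # => Hash  (the Point class is gone)
```

The data survives, but the shared string has been duplicated and the user-made object comes back as a plain Hash that has to be converted again by hand.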

So I decided to create one using what I found best in other solutions: ruby-serial.

Basically, ruby-serial uses the great MessagePack serialization library to encode Ruby objects on the fly (without mem copy), simulating a JSON-like structure for user-made objects and shared object references.
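
To give an idea of what such a JSON-like structure could look like, here is a rough sketch. It is not ruby-serial's actual format or code: the `RefSketch` module, the `'ref'`, `'class'`, `'ivars'` and `'shared'` keys, and the `Account` class are all assumptions made up for this illustration. It only handles acyclic graphs of strings, numbers, arrays and simple objects, and it uses the standard `json` library as a stand-in for MessagePack so that it runs without any gem.

```ruby
require 'json'

# Hypothetical sketch, NOT ruby-serial's real wire format.
module RefSketch
  SCALARS = [Integer, Float, NilClass, TrueClass, FalseClass].freeze

  def self.scalar?(obj)
    SCALARS.any? { |klass| obj.is_a?(klass) }
  end

  # Turn an acyclic object graph into plain hashes, arrays and scalars.
  def self.encode(root)
    counts = Hash.new(0).compare_by_identity
    count(root, counts)
    table = {}
    tree = build(root, counts, {}.compare_by_identity, table)
    { 'shared' => table, 'root' => tree }
  end

  # First pass: count how many times each instance is referenced.
  def self.count(obj, counts)
    return if scalar?(obj)
    counts[obj] += 1
    return if counts[obj] > 1
    children(obj).each { |child| count(child, counts) }
  end

  def self.children(obj)
    case obj
    when String then []
    when Array then obj
    else obj.instance_variables.map { |iv| obj.instance_variable_get(iv) }
    end
  end

  # Second pass: instances seen more than once go to the shared table
  # and are replaced by a {'ref' => id} marker.
  def self.build(obj, counts, ids, table)
    return obj if scalar?(obj)
    return plain(obj, counts, ids, table) if counts[obj] < 2
    unless ids.key?(obj)
      ids[obj] = ids.size.to_s
      table[ids[obj]] = plain(obj, counts, ids, table)
    end
    { 'ref' => ids[obj] }
  end

  def self.plain(obj, counts, ids, table)
    case obj
    when String then obj
    when Array then obj.map { |e| build(e, counts, ids, table) }
    else
      ivars = obj.instance_variables.to_h do |iv|
        [iv.to_s, build(obj.instance_variable_get(iv), counts, ids, table)]
      end
      { 'class' => obj.class.name, 'ivars' => ivars }
    end
  end

  # Rebuild the graph; each shared entry is restored once and reused.
  def self.decode(data, cache = {})
    restore(data['root'], data['shared'], cache)
  end

  def self.restore(node, table, cache)
    case node
    when Array then node.map { |e| restore(e, table, cache) }
    when Hash
      if node.key?('ref')
        cache[node['ref']] ||= restore(table[node['ref']], table, cache)
      else
        obj = Object.const_get(node['class']).allocate
        node['ivars'].each do |name, value|
          obj.instance_variable_set(name, restore(value, table, cache))
        end
        obj
      end
    else node
    end
  end
end

# Hypothetical user-made class for the demo.
class Account
  attr_accessor :name, :comment
end

shared = 'shared text'
account = Account.new
account.name = 'John'
account.comment = shared

# JSON stands in for MessagePack here.
wire = JSON.generate(RefSketch.encode([shared, 1, account]))
back = RefSketch.decode(JSON.parse(wire))
puts back[0].equal?(back[2].comment) # => true
```

The shared string is written once in the `'shared'` table and both places that used it only carry a small reference marker, which is why it can come back as a single instance.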

Its API is quite straightforward (same as Marshal) and can be customized easily.
Here is an example of its installation and usage:

> gem install ruby-serial

require 'ruby-serial'

# Create example
class User
  attr_accessor :name
  attr_accessor :comment
  def ==(other)
    other.is_a?(User) && (@name == other.name) && (@comment == other.comment)
  end
end
shared_obj = 'This string instance will be shared'
user = User.new
user.name = 'John'
user.comment = shared_obj # shared_obj is referenced here
obj = [
  'My String',
  shared_obj, # shared_obj is also referenced here
  1,
  user
]
 
# Get obj as a serialized String
serialized_obj = RubySerial::dump(obj)

# Get back our objects from the serialized String
deserialized_obj = RubySerial::load(serialized_obj)

# Both objects are the same
puts "Same? #{obj == deserialized_obj}"
# Prints: Same? true

# The shared object is still shared!
puts "Shared? #{deserialized_obj[1].object_id == deserialized_obj[3].comment.object_id}"
# Prints: Shared? true

ruby-serial is still young and may lack some features.
It is under active development, so feel free to report any problem you find while using it.

Complete documentation is accessible here. RDoc here.

Contributions are very welcome!

Enjoy!

About Muriel Salvan

I am a freelance project manager and polyglot developer, expert in Ruby and Rails. I created X-Aeon Solutions and the rivierarb Ruby meetups. I also give training sessions and conference talks on technical topics. My core development principles: plugin-oriented architectures, simple components, Open Source power, clever automation, constant technology watch, and quality, optimized code. My experience includes big and small companies. I embrace agile methodologies and test-driven development, without giving up on planning and risk containment methods. I love Open Source and have become a big advocate of it.