Several times in my life as a developer I have found Dependency Graph Programming very useful. That means mapping processes into invokable targets that depend on other targets, which need to be invoked first. This is typically what Rake was created for.
This is especially useful when I want to map processes that can be represented as dependency graphs, and that need
- a lot of computing time,
- to reuse previously computed results, so that only a minimal set of code is re-executed when input data changes,
- heavy file processing with intermediate files.
For such processes, tools such as Make, Ant or Rake are a perfect fit.
However, each time I mapped processes with these tools, I faced the same problem: the Dependency Graph itself needs to change dynamically during processing. That means that some targets need to modify other targets’ dependencies. This is rarely implemented in those tools, so complex workarounds are needed.
The most obvious example is found when dealing with C library compilation and linking (a very common use case, especially when using Make). If you want to dynamically compute the list of object files to include in a library (based on their symbols, location, naming, whatever …), you want to change the library target’s dependencies dynamically. This can be done by adding a new target that computes the dependencies, and making the library’s target depend on this new target. Initially, mylib.so depends only on compute_dependencies, and each object file (c_file_1.o to c_file_3.o) depends on its C source file.
Processing this graph to build mylib.so will first invoke target compute_dependencies. This target will compute that only files c_file_1.c and c_file_2.c need to be included in mylib.so, and so will modify the dependency graph: mylib.so now depends on compute_dependencies, c_file_1.o and c_file_2.o, while c_file_3.o is left out.
This way, a new build of mylib.so that should change the dependencies will change the graph again, by executing the compute_dependencies target again and adapting the list of object files for mylib.so.
To get this working, the build system needs to have the following properties:
- be able to change dependencies during a target’s execution,
- reprocess dependencies if they have changed during a target’s execution,
- keep dependencies sorted for each target (it is important that the first dependency computing other dependencies be invoked first).
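These three properties can be sketched in a few lines of plain Ruby. The `MiniGraph` class below is a hypothetical toy written for illustration only (it is not Rake's API, and every name in it is an assumption): dependencies are kept in ordered arrays, a target's action may rewrite another target's array, and `invoke` loops until the list it just walked is still the current one.

```ruby
# Hypothetical toy build graph, for illustration only (not Rake's API).
class MiniGraph
  attr_reader :log

  def initialize
    @deps = {}     # target name => ordered Array of prerequisite names
    @actions = {}  # target name => block run when the target executes
    @done = {}     # target name => true once the target has been invoked
    @log = []      # execution order, kept for inspection
  end

  # Declare a target with an ordered dependency list and an optional action.
  def target(name, deps = [], &action)
    @deps[name] = deps.dup
    @actions[name] = action
  end

  # Live, mutable dependency list of a target (property 1).
  def deps(name)
    @deps.fetch(name)
  end

  def invoke(name)
    return if @done[name]
    @done[name] = true
    # Walk the dependencies in order (property 3) and start over whenever
    # the list was changed while walking it (property 2).
    loop do
      snapshot = deps(name).dup
      snapshot.each { |dep| invoke(dep) }
      break if deps(name) == snapshot
    end
    @log << name
    @actions[name]&.call(self)
  end
end

graph = MiniGraph.new
%w[c_file_1 c_file_2 c_file_3].each do |base|
  graph.target("#{base}.c")
  graph.target("#{base}.o", ["#{base}.c"])
end
# This target rewrites the dependencies of mylib.so while it executes.
graph.target("compute_dependencies") do |g|
  g.deps("mylib.so").replace(["compute_dependencies", "c_file_1.o", "c_file_2.o"])
end
graph.target("mylib.so", ["compute_dependencies"])

graph.invoke("mylib.so")
puts graph.log.inspect
```

In this sketch c_file_3.o is never built, because compute_dependencies left it out of the list. Already-invoked targets are simply skipped on the second pass, which is why compute_dependencies has to stay first in the list without being executed twice.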
And now onto Rake. The previous example can be coded this way:
    require 'rake'

    # Dump targets' invocation
    Rake.application.options.trace = true

    # Tasks for c files
    task :"c_file_1.c"
    task :"c_file_2.c"
    task :"c_file_3.c"
    task :"c_file_1.o" => :"c_file_1.c"
    task :"c_file_2.o" => :"c_file_2.c"
    task :"c_file_3.o" => :"c_file_3.c"

    # Begin by defining only the compute_dependencies
    # dependency
    task :"mylib.so" => :"compute_dependencies" do |t|
      puts "Dependencies of #{t}: #{t.prerequisites.inspect}"
    end

    # The task modifying the graph
    task :"compute_dependencies" do |t|
      puts "Modify dependencies of :mylib.so"
      Rake::Task[:"mylib.so"].prerequisites.replace(
        [ :"compute_dependencies", :"c_file_1.o", :"c_file_2.o" ]
      )
    end

    # Build the library: this will output the whole
    # invocation/execution sequence
    Rake::Task[:"mylib.so"].invoke
Using Rake 0.8.7, here is the output:
    ** Invoke mylib.so (first_time)
    ** Invoke compute_dependencies (first_time)
    ** Execute compute_dependencies
    Modify dependencies of :mylib.so
    ** Invoke c_file_1.o (first_time)
    ** Invoke c_file_1.c (first_time)
    ** Execute c_file_1.c
    ** Execute c_file_1.o
    ** Invoke c_file_2.o (first_time)
    ** Invoke c_file_2.c (first_time)
    ** Execute c_file_2.c
    ** Execute c_file_2.o
    ** Execute mylib.so
    Dependencies of mylib.so: [:compute_dependencies, :"c_file_1.o", :"c_file_2.o"]
We can see that the compute_dependencies invocation has modified the graph, and the modifications were taken into account, as c_file_1.o and c_file_2.o were correctly invoked before mylib.so. It works perfectly.
Using Rake 0.9.2.2, here is the output:
    ** Invoke mylib.so (first_time)
    ** Invoke compute_dependencies (first_time)
    ** Execute compute_dependencies
    Modify dependencies of :mylib.so
    ** Execute mylib.so
    Dependencies of mylib.so: [:compute_dependencies, :"c_file_1.o", :"c_file_2.o"]
Now it is broken: dependencies have been modified (as shown by the last log line), but the invocation chain has not been re-evaluated, and targets c_file_1.o and c_file_2.o have not been invoked.
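One plausible explanation for this difference (an assumption on my part, not something checked against the Rake sources of either version) is whether the prerequisites are iterated as the live array or as a copy taken before the loop starts. The plain Ruby sketch below shows the two behaviours on an ordinary array; the variable names are hypothetical and no Rake code is involved.

```ruby
NEW_DEPS = [:compute_dependencies, :"c_file_1.o", :"c_file_2.o"].freeze

# Case 1: iterate the live array. Replacing its content during the
# iteration is seen by the loop, which goes on with the new elements.
live = [:compute_dependencies]
seen_live = []
live.each do |name|
  seen_live << name
  live.replace(NEW_DEPS) if name == :compute_dependencies
end

# Case 2: iterate a copy built before the loop starts. The original array
# is modified as well, but the loop never sees the new elements.
deps = [:compute_dependencies]
seen_copy = []
deps.map { |name| name }.each do |name|
  seen_copy << name
  deps.replace(NEW_DEPS) if name == :compute_dependencies
end

puts "live iteration visited: #{seen_live.inspect}"
puts "copy iteration visited: #{seen_copy.inspect}"
puts "dependencies after the copy iteration: #{deps.inspect}"
```

The first loop visits all three names, which matches the 0.8.7 trace; the second visits only compute_dependencies even though the array ends up with three entries, which matches the 0.9.2.2 trace.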
A simple way to fix it for Rake 0.9.2.2 is to rewrite its Rake::Task#invoke_prerequisites method:
    module Rake
      class Task
        # Keep original method
        alias :invoke_prerequisites_ORG :invoke_prerequisites
        # Rewrite it
        def invoke_prerequisites(task_args, invocation_chain)
          prerequisites_changed = true
          while (prerequisites_changed)
            # Keep original prerequisites list
            original_prerequisites = prerequisite_tasks.clone
            # Call original method (this call might change the prerequisites list)
            invoke_prerequisites_ORG(task_args, invocation_chain)
            prerequisites_changed = (prerequisite_tasks != original_prerequisites)
          end
        end
      end
    end
And now the output proves targets have been invoked correctly:
    ** Invoke mylib.so (first_time)
    ** Invoke compute_dependencies (first_time)
    ** Execute compute_dependencies
    Modify dependencies of :mylib.so
    ** Invoke compute_dependencies
    ** Invoke c_file_1.o (first_time)
    ** Invoke c_file_1.c (first_time)
    ** Execute c_file_1.c
    ** Execute c_file_1.o
    ** Invoke c_file_2.o (first_time)
    ** Invoke c_file_2.c (first_time)
    ** Execute c_file_2.c
    ** Execute c_file_2.o
    ** Execute mylib.so
    Dependencies of mylib.so: [:compute_dependencies, :"c_file_1.o", :"c_file_2.o"]
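On a more recent Ruby, the same idea can probably be written without the alias, using Module#prepend and super. This is only a sketch under assumptions: it needs Ruby 2.0 or later, it assumes the installed Rake still defines Rake::Task#invoke_prerequisites(task_args, invocation_chain) and Rake::Task#prerequisite_tasks as 0.9.2.2 does, and the module name DynamicPrerequisites and the short task names are mine. Classes that define their own invoke_prerequisites are not covered by it.

```ruby
require 'rake'

# Hypothetical module name. Same loop as the alias-based patch, but the
# original method is reached through super instead of an alias.
module DynamicPrerequisites
  def invoke_prerequisites(task_args, invocation_chain)
    loop do
      # Keep the prerequisites list as it was before the invocation
      before = prerequisite_tasks
      # Call the original method (this call might change the list)
      super
      # Stop once an invocation leaves the list untouched
      break if prerequisite_tasks == before
    end
  end
end

Rake::Task.prepend(DynamicPrerequisites)

# Reduced version of the example: one object file, no trace output.
order = []
Rake::Task.define_task(:"c_file_1.o") { order << :"c_file_1.o" }
Rake::Task.define_task(:compute_dependencies) do
  order << :compute_dependencies
  Rake::Task[:"mylib.so"].prerequisites.replace(
    [:compute_dependencies, :"c_file_1.o"]
  )
end
Rake::Task.define_task(:"mylib.so" => :compute_dependencies) { order << :"mylib.so" }

Rake::Task[:"mylib.so"].invoke
puts order.inspect
```

The loop condition is the same as in the alias version; only the way the original method is reached differs, so no extra method name is left on Rake::Task.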
I think this can come in very handy for a lot of processes.
EDIT: A pull request has already been made for integrating these specs here.