How to pilot external processes’ stdin and stdout in real time using Ruby

I finally made it!
Following up on my previous post about failing to pilot processes on Windows, I searched again and finally found a solution.

[EDIT] I have bundled the solution I explain in this article in a nice gem that you can use very easily: ProcessPilot.

The 4 underlying problems I had not understood were the following:

Ruby has its own buffering mechanism for STDOUT. This was the reason I was getting the output only when the process terminated. This problem does not affect non-Ruby programs.
Windows processes cannot take interactive input from a file connected to stdin.
Windows processes can only redirect STDOUT and STDERR to files (at least with the libraries I tested).
Native Ruby processes on Windows cannot have STDOUT redirected to a file in one thread while another thread reads the same file: this breaks the STDIN pipes between parent and child processes (however, this works under the Cygwin environment).
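The first problem is easy to reproduce with the standard library alone. The sketch below is an illustration, not part of the original solution: the helper name `gap_between_lines` is made up, and it assumes a reasonably recent Ruby (`RbConfig.ruby` and the array form of `IO.popen`). It launches a tiny child script twice and measures the delay between the arrival of its two output lines.

```ruby
require 'rbconfig'

# Hypothetical helper: run a small child Ruby script through a pipe and
# return the delay (in seconds) between receiving its first and second line.
def gap_between_lines(sync)
  script = "$stdout.sync = #{sync}; puts 'first'; sleep 1; puts 'second'"
  IO.popen([RbConfig.ruby, '-e', script]) do |io|
    io.gets                  # blocks until the first line reaches the parent
    first_seen = Time.now
    io.gets                  # blocks until the second line reaches the parent
    Time.now - first_seen
  end
end

buffered_gap = gap_between_lines(false) # default behaviour when piped
synced_gap   = gap_between_lines(true)  # buffering disabled in the child

puts format('buffered: %.2fs between lines', buffered_gap)
puts format('synced:   %.2fs between lines', synced_gap)
```

With buffering left on, both lines should reach the parent together when the child exits, so the gap is close to zero. With `$stdout.sync = true` the gap should be close to the one-second sleep, which is the real-time behaviour needed for piloting.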

So here are the points leading to the solution:

Use files to get STDOUT and STDERR
Use a proper descriptor to pilot STDIN in real time (no file)
Pilot from an external thread
Do not use STDOUT and STDERR redirections inside the child program; instead, create the child process with STDOUT and STDERR connected directly to files at creation time
When your external process is a Ruby program, deactivate Ruby internal buffering (I don’t like this point but could not find a better way)

A lot of this is already implemented in the childprocess gem (it connects STDOUT/ERR/IN correctly, is threaded, and gives access to stdin in real time).
It really needs better documentation, but reading its code was enough to get things working.
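To make the points above concrete, here is a minimal sketch that uses only Ruby's standard library (`Process.spawn`) instead of childprocess. This is a swapped-in technique for illustration: it assumes a recent Ruby, the child script is a made-up stand-in, and I have not verified that it behaves the same way on native Windows, which is the case this article is really about.

```ruby
require 'rbconfig'
require 'tempfile'

# Stand-in child program (hypothetical): prompts without a newline, then
# echoes what it received on stdin.
child_script = <<~'RUBY'
  $stdout.sync = true
  $stdout << 'Type a string: '
  puts $stdin.gets
RUBY

out_file = Tempfile.new('child_out')
stdin_read, stdin_write = IO.pipe

# STDOUT is connected to the file when the process is created;
# STDIN stays a live pipe that the parent can write to at any time.
pid = Process.spawn(RbConfig.ruby, '-e', child_script,
                    in: stdin_read, out: out_file.path)
stdin_read.close # the parent only needs the write end

captured = +''
File.open(out_file.path, 'r') do |reader|
  deadline = Time.now + 10
  # Poll the file until the prompt shows up (it has no trailing newline).
  until captured.include?('Type a string: ')
    raise 'child did not prompt in time' if Time.now > deadline
    captured << reader.read
    sleep 0.05
  end

  stdin_write.write("My string 1\n")
  stdin_write.flush
  Process.wait(pid)
  captured << reader.read # collect what the child wrote before exiting
end
stdin_write.close

puts captured
```

The parent reads the output through its own read-only handle on the file, which is the same idea as reopening std.out in r mode in the childprocess version.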

So now onto the code…

I used 2 different test processes: 1 Ruby script and 1 Windows batch file. Both of them produce output and ask for user input, with short sleeps in between to make the process easier to follow:

The Ruby test program:

puts 'Line 1'
sleep 1
puts 'Line 2'
$stdout << 'Type a string: '
puts $stdin.gets
puts 'Line 3'
sleep 1
puts 'Line 4'
$stdout << 'Type another string: '
puts $stdin.gets
puts 'Line 5'

The Batch test program:

@echo off
set var=
echo Line 1
ruby -e "sleep 1"
echo Line 2
set /p var="Type a string: " %=%
echo %var%
echo Line 3
ruby -e "sleep 1"
echo Line 4
set /p var="Type another string: " %=%
echo written: %var%
echo Line 5

Both of these programs produce the same output when run from the command line (except that the batch file prefixes its second echo with "written: "):

Line 1
Line 2
Type a string: My string 1
My string 1
Line 3
Line 4
Type another string: My string 2
My string 2
Line 5

Here is the code that correctly pilots the batch test program:

require 'childprocess'

process = ChildProcess.build("test.bat")

# Indication of stdin usage
process.duplex = true

# Specify files for stdout/stderr
# ! Use w+ mode to make it possible for our monitoring
# thread to reopen the file in r mode
process.io.stdout = File.new('std.out', 'w+')
process.io.stderr = File.new('std.err', 'w+')

# Start the process: this creates the background
# thread running our command
process.start

# In our main thread: open the STDOUT/ERR files
stdout = File.open('std.out', 'r')
stderr = File.open('std.err', 'r')
stdin = process.io.stdin

# Implement a blocking read of a new string on an IO.
# Make sure we wait for the end of a string before
# returning.
# This is done to ensure we will get the new string
# we are expecting.
# Proper implementation should add a timeout, and
# have a more efficient algo.
#
# Parameters:
# * *io* (_IO_): The IO to query
# Return:
# * _String_: The next string from IO (separator is $/)
def get_out_str(io)
  rStr = ''

  # Concatenate chunks until we have the separator.
  # As we deal with the child's output flow, it is possible
  # to have a line without ending already written in the
  # file and already flushed by the IO (a prompt, for example).
  while (rStr[-1..-1] != $/)
    newChunk=nil
    while ((newChunk = io.gets) == nil)
      sleep 0.1
    end
    rStr.concat(newChunk)
  end

  return rStr
end

# Send a synchronized input to an IO.
# Make sure it will be flushed.
#
# Parameters:
# * *io* (_IO_): The IO to send to
# * *str* (_String_): The string to send
def send_str_in(io, str)
  io.write str
  io.flush
end

# Now display the output step by step, and send inputs
# when needed.
# Add some short sleeps to make it easier to follow
puts "=Line1=> #{get_out_str(stdout)}"
puts "=Line2=> #{get_out_str(stdout)}"
sleep 1
send_str_in(stdin, "My string 1\n")
puts "=InputLine1=> #{get_out_str(stdout)}"
puts "=Line3=> #{get_out_str(stdout)}"
puts "=Line4=> #{get_out_str(stdout)}"
sleep 1
send_str_in(stdin, "My string 2\n")
puts "=InputLine2=> #{get_out_str(stdout)}"
puts "=Line5=> #{get_out_str(stdout)}"

# Wait for the process termination in case it is late
while !process.exited?
  sleep 1
end

And here is its output, without any input entered manually:

>ruby -w pilot.rb
=Line1=> Line 1
=Line2=> Line 2
=InputLine1=> Type a string: My string 1
=Line3=> Line 3
=Line4=> Line 4
=InputLine2=> Type another string: written: My string 2
=Line5=> Line 5
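The comments of get_out_str say that a proper implementation should add a timeout. Here is one possible sketch of that, assuming a recent Ruby. The name `get_out_str_with_timeout` and its `timeout` and `poll_interval` parameters are mine, not part of the original script, and the demo uses a temporary file written by a thread as a stand-in for the child's std.out.

```ruby
require 'tempfile'
require 'timeout'

# Hypothetical variant of get_out_str: same polling logic, but gives up
# with Timeout::Error if no complete line shows up in time.
def get_out_str_with_timeout(io, timeout = 5, poll_interval = 0.05)
  deadline = Time.now + timeout
  str = +''
  until str.end_with?($/)
    chunk = io.gets
    if chunk
      str << chunk # may be a partial line; keep looping until the separator
    elsif Time.now > deadline
      raise Timeout::Error, "no complete line after #{timeout}s (got #{str.inspect})"
    else
      sleep poll_interval
    end
  end
  str
end

# Demo: a writer thread plays the role of the child process.
out_file = Tempfile.new('std_out')
reader = File.open(out_file.path, 'r')
writer = Thread.new do
  out_file.write('partial ')
  out_file.flush
  sleep 0.3
  out_file.write("line\n")
  out_file.flush
end

line = get_out_str_with_timeout(reader)
writer.join
puts line.inspect

# Nothing more is written, so a second read should time out.
timed_out = begin
  get_out_str_with_timeout(reader, 0.2)
  false
rescue Timeout::Error
  true
end
puts "timed out: #{timed_out}"

reader.close
out_file.close!
```

In the pilot script this would replace get_out_str, so that a child which never prints the expected line raises an error instead of blocking forever.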

For executing a Ruby process, things are a little different: we need to make sure STDOUT is not buffered by Ruby internals.
The simple way to do so is to use this as one of the first lines of your external Ruby program:

STDOUT.sync = true

However, I prefer not to modify the sources of external Ruby programs. Therefore I came up with a special function that launches a Ruby file with arguments (which should cover 99% of the ways the Ruby interpreter is used).
This function uses a little wrapper to execute the external Ruby file with STDOUT buffering disabled.

Replace the ChildProcess creation from the former script with the following:

# Prepare a Ruby process with its arguments to be executed
#
# Parameters:
# * *rbfile* (_String_): The rb file to execute
# * *args* (<em>list<String></em>): The arguments list [optional = []]
# Return:
# * _ChildProcess_: Corresponding child process
def prepare_rb_process(rbfile, *args)
  return ChildProcess.build(*([ 'ruby', 'wrapper.rb', rbfile ] + args))
end

process = prepare_rb_process("test.rb")

And use the wrapper file, responsible for executing a Ruby file with its arguments, but with STDOUT buffering disabled:

# Disable STDOUT buffering
$stdout.sync = true

# Get the rb file to execute
rb_file = ARGV[0]

# Adapt ARGV for this rb file to get its arguments
# correctly
# TODO: Maybe adapt other variables ...
ARGV.replace(ARGV[1..-1])

load rb_file

And here you go with the output:

>ruby -w pilot.rb
=Line1=> Line 1
=Line2=> Line 2
=InputLine1=> Type a string: My string 1
=Line3=> Line 3
=Line4=> Line 4
=InputLine2=> Type another string: My string 2
=Line5=> Line 5
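Regarding the TODO in the wrapper: because the target file is run with load, $0 still points to wrapper.rb, so a script guarded by `if __FILE__ == $0` would not see itself as the main program. One alternative worth considering is the interpreter's -r option, which requires a small file before running the target and leaves $0 and ARGV untouched. The sketch below is not what the article uses; the file names sync_stdout.rb and target.rb are hypothetical, and it assumes a recent Ruby.

```ruby
require 'rbconfig'
require 'tmpdir'

output = Dir.mktmpdir do |dir|
  # Hypothetical file names, created in a temporary directory for the demo.
  sync_lib = File.join(dir, 'sync_stdout.rb')
  target   = File.join(dir, 'target.rb')

  # The preloaded file only disables STDOUT buffering.
  File.write(sync_lib, "$stdout.sync = true\n")

  # The target reports what it sees; it knows nothing about the preload.
  File.write(target, <<~'RUBY')
    puts "sync=#{$stdout.sync} main=#{__FILE__ == $0} args=#{ARGV.inspect}"
  RUBY

  # Equivalent to: ruby -r /path/to/sync_stdout.rb target.rb a b
  IO.popen([RbConfig.ruby, '-r', sync_lib, target, 'a', 'b'], &:read)
end

puts output
```

With childprocess, this would mean building the process from 'ruby', '-r', the path of the preload file, the rb file and its arguments. This alternative is only a sketch; the results in this article come from the wrapper approach.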

The pilot script and its wrapper have been tested on the Windows 7 native terminal (cmd.exe) and on the Cygwin environment: they work perfectly with both Ruby 1.8.7 and 1.9.2.

Conclusion:

It is now possible to pilot both Ruby and non-Ruby applications in real time, parsing both STDOUT and STDERR, and completely controlling the STDIN flow.

I will make a nice Gem wrapping up all this very soon.

I hope this will spare other people the 2 days of headaches it took me to come up with this solution!

Enjoy

PS: Thanks to @luislavena and @_philant_ for their precious help on this matter!
