Short Order Ruby - Ben Griffiths

So what is "short order ruby"? We've all spent time working on big projects where it takes a while to get anything done. Ben enjoys the quick turnaround you can get with a short Ruby script. This is a talk about the magic you can pull off with a few tricks up your sleeve.

Ben Griffiths presenting Short Order Ruby

What kind of magic does he mean? Well, last week @eBeth ran a challenge on Twitter, asking people to publish a day in their life by posting pictures online. Ben's mate took the challenge, then realised that all this stuff would disappear from Twitter after a few days. Could it be made a bit more permanent? Ben has the skills to cook up a script to scrape all the data off the web (and, most likely, you do too). This talk is a celebration of those skills, and of the tools you can use to do it. It's about gluing together little bits of software to create something cool.

This idea of gluing stuff together may remind you of the Unix command line. Ben quoted the principle behind it: it "is the idea that the power of a system comes more from the relationships among programs than from the programs themselves."

He gave a quick example of how you can count the number of lines in a bunch of HTML files in a directory:

$ cat *.html | wc -l

You can go further, and loop over a bunch of filenames. This is a pipemill:

$ find . -path '*.html' |
(
 while read -r filename; do
   gzip "$filename"
 done
)

So far everything we've seen has been Unix, but you can do the same thing with Ruby:

$ find . -path '*.html' |\
ruby -e \
  'while(gets) do
    `gzip #{$_}`;
   end'

Note $_ -- a special variable that contains the return value of the last call to gets.

Ruby lets you do it even more easily with the -n flag, which operates on every line of input, effectively building the while loop for you:

$ find . -path '*.html' | ruby -ne '`gzip #{$_}`'
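
One catch with the backtick versions above: each filename is pasted into a shell command, so a name containing spaces or quotes can break it. Here is a minimal sketch of the same pipemill as a small script, assuming you would rather strip the newline yourself and hand gzip its argument directly. The helper names each_filename and gzip_command are made up for illustration; they are not from the talk.

```ruby
require 'stringio'

# Yield each filename from a newline-separated stream, minus the trailing
# newline. Blank lines are skipped.
def each_filename(io)
  return enum_for(:each_filename, io) unless block_given?

  io.each_line do |line|
    name = line.chomp
    yield name unless name.empty?
  end
end

# Hypothetical helper: build an argument list for gzip. Passing a list to
# system() bypasses the shell, so no quoting is needed.
def gzip_command(name)
  ['gzip', name]
end

# Demo on an in-memory stream. In a real pipemill you would pass $stdin
# and call system(*gzip_command(name)) instead of printing.
names = StringIO.new("./index.html\n./my page.html\n")
each_filename(names) { |name| p gzip_command(name) }
```

Swap the StringIO for $stdin and the `p` for `system(*gzip_command(name))` and it should slot in after `find . -path '*.html' |`.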

Lots of Unix commands will either take a filename on which to operate, or take some input on STDIN. For example, wc:

$ cat file.html | wc
$ wc file.html

We can do the same thing in Ruby with ARGF. It's a global object that provides a stream either from STDIN or from the files that you name on the command line (multiple files are joined together):

# reverse.rb
ARGF.each do |line|
  puts line.chomp.reverse
end

Run it like this:

$ cat file1 file2 | ruby reverse.rb
$ ruby reverse.rb file1 file2

Neat. ARGF is actually one of the very few things that I'm glad Ruby borrowed from Perl.
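
If you want to see the "multiple files are joined together" behaviour without creating files by hand, here is a small self-contained sketch. It fakes the command line by filling ARGV before ARGF is first read. The names reversed_lines and sample_file are my own, purely for illustration.

```ruby
require 'stringio'
require 'tempfile'

# Reverse every line of the given stream. Defaults to ARGF, so it reads
# the files named in ARGV, or STDIN if ARGV is empty.
def reversed_lines(input = ARGF)
  input.each_line.map { |line| line.chomp.reverse }
end

# Hypothetical helper: write a throwaway file so the demo needs no setup.
def sample_file(contents)
  file = Tempfile.new('argf-demo')
  file.write(contents)
  file.close
  file
end

first  = sample_file("abc\n")
second = sample_file("def\n")

# Simulate `ruby reverse.rb first second` by filling ARGV before ARGF is read.
ARGV.replace([first.path, second.path])
puts reversed_lines
```

ARGF walks through both files as if they were one stream, so the script never needs to know how many files it was given.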

Here's the first magic Ruby script that Ben can recall writing:

#!/usr/bin/env ruby -i
# Using the -i flag allows us to edit the files in place.

Header = DATA.read
ARGF.each_line do |e|
  puts Header if ARGF.pos - e.length == 0
  puts e
end

__END__
#--
# Don't steal this. mkay?
#++

Beware -i (on the first line) as it'll directly modify any files that you specify on the command line. The DATA variable is populated from the bottom of the file; it's the stuff that follows __END__. You can probably work out what the script does by reading it. If not, create a new text file, add a few lines to it, and then try Ben's script on it.
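
If the -i and DATA magic feels a bit too magic, the same job can be done with plain file reads and writes. This is only a sketch of an equivalent, not Ben's code: prepend_header is a hypothetical name, and the check that skips files which already start with the header is my own addition.

```ruby
require 'tempfile'

HEADER = "#--\n# Don't steal this. mkay?\n#++\n"

# Put the header at the top of the file at `path`, rewriting it in place.
# Returns true if the file was changed, false if the header was already there.
def prepend_header(path, header = HEADER)
  body = File.read(path)
  return false if body.start_with?(header)

  File.write(path, header + body)
  true
end

# Demo on a throwaway file.
demo = Tempfile.new('header-demo')
demo.write("puts 'hello'\n")
demo.close

prepend_header(demo.path)
puts File.read(demo.path)
```

Unlike the one-liner version, this can be run twice over the same files without stacking up headers.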

Here's another example. Ben once needed to send his colleagues a graph every week, based on some figures in a MySQL database. He thought the obvious way to do that was to create an ASCII bar chart, using Ruby (I like the use of "obvious"). When you change the pager in MySQL all the output from a query gets passed through the program that you specify (much like the pipe examples above). So here's the chart (the bars are the wrong length -- there's only so fast a boy can type so I've blatantly added the bars in after the event):

mysql> \P barchart.rb
  PAGER set to 'barchart.rb'
mysql> select * FROM languages;

Ruby    ============================= (200/60%)
Java    ============== (100/30%)
Fortran ======= (30/9%)
C#      = (3/1%)

4 rows in set (0.00 sec)
mysql>

Here's the code for parsing the MySQL input:

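# Assumed setup (not shown in the talk): totals default to zero.
chart = Hash.new(0)

# Skip the first three lines of MySQL's output.
3.times { ARGF.readline }

ARGF.each do |line|
  match = line.match(MYSQL_REGEX)
  category, value = $1, $2
  next unless category && value
  chart[category] += value.to_f
end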

Then Ben used this to generate the output (Ben actually drew the bars with a swanky ASCII block character, which I've substituted for an "=" sign, due to abject laziness):

max_value = chart.values.max
max_label_len = chart.keys.map { |x| x.size }.max

chart.each do |category, value|
  # Scale against the largest value so the longest bar is 40 characters wide.
  scaled_value = ((value / max_value) * 40).round
  stars = "=" * scaled_value
  printf("%-#{max_label_len}s %s (%d)\n",
      category, stars, value)
end
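
To try the idea end to end without a database, here is a self-contained sketch that parses a pasted MySQL-style table and draws the bars. The contents of MYSQL_REGEX weren't shown in the talk, so ROW_PATTERN below is purely my guess at a pattern for "| label | number |" rows, and parse_rows and render are hypothetical names.

```ruby
require 'stringio'

# Assumed row format: "| label | number |". Border lines and the column
# header do not match, so they are skipped.
ROW_PATTERN = /^\|\s*(.+?)\s*\|\s*(\d+(?:\.\d+)?)\s*\|$/

# Sum the value column per label.
def parse_rows(io)
  chart = Hash.new(0.0)
  io.each_line do |line|
    match = ROW_PATTERN.match(line.chomp)
    next unless match

    chart[match[1]] += match[2].to_f
  end
  chart
end

# Return one formatted line per label; the largest value gets `width` bars.
def render(chart, width = 40)
  return [] if chart.empty?

  max_value = chart.values.max
  label_len = chart.keys.map(&:size).max
  chart.map do |category, value|
    bar_len = max_value.zero? ? 0 : (value / max_value * width).round
    format("%-#{label_len}s %s (%d)", category, '=' * bar_len, value)
  end
end

SAMPLE = <<~TABLE
  +----------+-------+
  | language | count |
  +----------+-------+
  | Ruby     |   200 |
  | Java     |   100 |
  +----------+-------+
TABLE

puts render(parse_rows(StringIO.new(SAMPLE)))
```

Pass ARGF instead of the StringIO and the script should work as a MySQL pager in the same way as Ben's, as long as the rows really do look like the assumed format.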

Okay, let's move onward to a discussion of buffering. Try this:

$ ruby -e \
  'loop { puts 1; sleep 1 }' | cat

Nothing will come out for a while. Why? When Ruby's output goes to a pipe rather than a terminal, it is buffered and only passed along once the buffer fills up. It's a performance thing, but it introduces a delay into the pipeline, which can cause trouble. If you want to turn it off, you can do it like this:

$ ruby -e \
  '$stdout.sync = true
   loop { puts 1; sleep 1 }' | cat
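
You can watch the buffering happen inside a single Ruby process. This is a sketch using IO.pipe as a stand-in for the shell pipe; buffering_demo is a made-up name. One assumption worth flagging: IO.pipe turns sync on for its write end, so the sketch switches it off to mimic what $stdout does when it is piped.

```ruby
# Returns what a reader sees before and after the writer flushes.
def buffering_demo
  reader, writer = IO.pipe
  writer.sync = false  # mimic a piped $stdout, which is buffered

  writer.puts 1        # sits in Ruby's internal buffer for now
  before = reader.read_nonblock(16, exception: false)

  writer.flush         # push the buffered data into the pipe
  after = reader.read_nonblock(16, exception: false)

  [before, after]
ensure
  reader&.close
  writer&.close
end

p buffering_demo
```

Before the flush the reader has nothing to read; afterwards the line is there. Setting sync to true simply makes every write behave as if it were followed by a flush.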

You can call other commands from your script with the backtick operator (also available as %x). Both forms capture the output of the command (including its trailing newline) so that you can store it in a variable:

>> username = %x{whoami}
>> username = `whoami`

You can also use system() (which returns true if the command worked successfully) or exec() which will quit your current program, replacing it with the command that you launch. They're less useful in the short order hacking world, more appropriate for "proper work".
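
Here is a short sketch of how the return values differ, assuming a Unix-like system with sh, echo and true on the PATH. The helpers capture and describe are hypothetical names of my own.

```ruby
# Backticks return the command's output; chomp drops the trailing newline.
def capture(command)
  `#{command}`.chomp
end

# system() returns true on success, false on a non-zero exit status, and
# nil if the command could not be run at all.
def describe(*command)
  case system(*command)
  when true  then 'succeeded'
  when false then "failed with status #{$?.exitstatus}"
  else            'could not be run'
  end
end

puts capture('echo hello')
puts describe('true')
puts describe('sh', '-c', 'exit 3')
puts describe('no-such-command-xyz')
```

exec() is left out because it would replace the running script, so nothing after it would ever execute.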

Here's a neat one:

def count_lines(text)
  IO.popen('wc -l', 'r+') do |wc|
    wc.puts text  # wc's STDIN
    wc.close_write
    result = wc.read  # wc's STDOUT
  end
end

popen() does a "piped open". It lets you write to another program's STDIN and read from its STDOUT. The mode says which direction you want: r (the default) to read from the pipe, w to write to it, or r+ to do both, as count_lines does.
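
To make the modes concrete, here is a small sketch using echo and tr (both assumed to be installed, as on any Unix-like system). The method names read_from and upcase_via_tr are invented for the example.

```ruby
# 'r' mode: we can only read what the command writes to its STDOUT.
def read_from(*command)
  IO.popen(command, 'r') { |io| io.read }
end

# 'r+' mode: write to the command's STDIN, then read its STDOUT.
def upcase_via_tr(text)
  IO.popen(['tr', 'a-z', 'A-Z'], 'r+') do |tr|
    tr.write text
    tr.close_write  # send EOF so tr knows the input is complete
    tr.read
  end
end

# With 'w' mode we could only write; the command's output would go
# straight to our own STDOUT rather than back to us.

puts read_from('echo', 'hi')
puts upcase_via_tr("hello\n")
```

The close_write call matters: without it, a command that waits for end of input would keep waiting while we wait for its output.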

You can also fork a child process in Ruby, and keep a handle on the input and output streams of the parent and child:

IO.popen('-', 'r+') do |filehandle|
  if filehandle
    # I am the parent process
  else
    # I am the child process
  end
end
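
Filling in the two branches gives something like the sketch below, which assumes a platform that supports fork (so not Windows). In the child, $stdin and $stdout are connected to the parent's end of the pipe. The method name ask_child is hypothetical.

```ruby
# Fork a child, send it one line, and return whatever it prints back.
def ask_child(message)
  IO.popen('-', 'r+') do |child|
    if child
      # Parent: `child` is the pipe to the forked process.
      child.puts message
      child.close_write
      child.read
    else
      # Child: the block argument is nil; talk to the parent via STDIN/STDOUT.
      line = $stdin.gets.to_s.chomp
      $stdout.puts "child received: #{line}"
      $stdout.flush
    end
  end
end

puts ask_child('ping')
```

The child process exits when its branch of the block finishes, so only the parent carries on to the final puts.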

You may have noticed that we've not seen STDERR yet in any of these Ruby snippets. You can get at it if you use popen3() instead:

require 'open3'

Open3.popen3('rm NO_SUCH_FILE') do |stdin, stdout, stderr|
  puts stderr.read
end
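
If you only need the final output rather than live streams, the same library has capture3, which collects STDOUT, STDERR and the exit status in one call. A minimal sketch, assuming sh is available; the run helper and its hash layout are my own choices.

```ruby
require 'open3'

# Run a command and collect everything it produced.
def run(*command)
  stdout, stderr, status = Open3.capture3(*command)
  { out: stdout, err: stderr, code: status.exitstatus }
end

p run('sh', '-c', 'echo to-stdout; echo to-stderr >&2; exit 3')
```

popen3 is still the one to reach for when you need to interact with a long-running program.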

Ben used these tricks to run a text adventure (and not just any game, one of his favourites; Sherlock Holmes and the Adventure of the Crown Jewels) on Campfire. dfrotz is a command line program that runs text adventure games. Here's a snippet of the code that interacts with dfrotz:

frotzin, frotzout, frotzerr = Open3.popen3(
  "/Users/ben/bin/dfrotz -Z0 -w 3000 " \
  "-p /Users/ben/if/Sherlock21.z5"
)

So how do you get the text from dfrotz and inject it into Campfire? There's a Campfire API for squirting text at Campfire (not shown here), but this is the code that reads text from dfrotz and sends it off to the Campfire room:

Thread.new do
  while output = select([frotzout, frotzerr], nil, nil, 3) do
    output[0].each do |stream|
      room.speak(stream.gets)
    end
  end
end

select() is asking the frotzout and frotzerr streams whether they have any data that is ready to be processed. When there's anything to read on frotzout or frotzerr then select() will return them, so their readable data can be read and squirted into the Campfire room.
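
You can try select() without dfrotz or Campfire by using a couple of pipes as stand-in streams. This is just a sketch; ready_lines is a hypothetical helper and the pipes play the part of frotzout and frotzerr.

```ruby
# Return one line from each stream that has data waiting, or an empty
# array if nothing turns up before the timeout (in seconds).
def ready_lines(streams, timeout)
  ready, = IO.select(streams, nil, nil, timeout)
  return [] unless ready # select returns nil when the timeout expires

  ready.map(&:gets)
end

quiet_reader, _quiet_writer = IO.pipe
busy_reader, busy_writer = IO.pipe
busy_writer.puts 'You are in a maze of twisty passages.'

p ready_lines([quiet_reader, busy_reader], 1)
p ready_lines([quiet_reader], 0.1)
```

Only the stream with data waiting comes back from select(), which is why Ben's loop can safely call gets on whatever it returns.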

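All you need to do now is take the text that people type into Campfire and send it back to dfrotz. Note that only messages written entirely in upper case are passed on, presumably so that ordinary chat in the room isn't mistaken for game commands: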

room.listen do |m|
  message = m[:message]
  if message == message.upcase then
    frotzin.puts message
  end
end

That's it; text adventures in Campfire. Impressive.

Further things that Ben recommends you check out:

  • The Ruby PTY Library; allows you to operate a pseudo terminal.
  • pbcopy and pbpaste; interact with the Mac clipboard from the terminal.
  • Command line parsing - a lot of libraries exist for doing this. Apparently trollop is good.
  • Unix tools such as awk, sed, grep and find.

Here's a good quote, from Edsger Dijkstra: "The tools we use have a profound (and devious!) influence on our thinking habits, and, therefore, on our thinking abilities." Hear hear.

A superb talk from Mr Griffiths. Find him on Twitter at @beng. Take control of your computer, people!
