Name: | FormatR
Description: | Perl like formats for ruby
Author: | Paul Rubel (
Release: | 1.09
Homepage: |
Date: | 29 January 2005
License: | You can redistribute it and/or modify it under the same terms as Ruby.
Copyright © 2002,2003,2005 Paul Rubel
To test this code:
Run test_format.rb with no arguments. If nothing is amiss you should see OK
(??/?? tests ?? asserts). This tests the format output against perl output
(a copy of which is in the test directory if you don’t have perl). If you
would like to see the format output, run test_format.rb --keep, which will
place the tests’ output in the files format_testfile{1-10}.
Class FormatR::Format in module FormatR provides Perl-like formats for Ruby. For a
summary of the methods you’re likely to need, please see FormatR::Format. Formats are used to create
output with a consistent layout but with changing values.
For example:
require "format.rb"
include FormatR

top_ex = <<DOT
Piggy Locations for @<< @#, @###
month, day, year
Number: location toe size
DOT

ex = <<TOD
@) @<<<<<<<<<<<<<<<< @#.##
num, location, toe_size
TOD

body_fmt = Format.new(top_ex, ex)

num = 1
month = "Sep"
day = 18
year = 2001
["Market", "Home", "Eating Roast Beef", "Having None", "On the way home"].each {|location|
  toe_size = (num * 3.5)
  # print one body line using the variables in the current binding
  body_fmt.printFormat(binding)
  num += 1
}
When run, the above code produces the following output:
Piggy Locations for Sep 18, 2001
Number: location toe size
1) Market 3.50
2) Home 7.00
3) Eating Roast Beef 10.50
4) Having None 14.00
5) On the way home 17.50
More examples are found in test_format.rb.
Supported Format Fields
Standard perl formats
These are explained in the Perl documentation on formats (perlform)
and include:
- left justified text, @<<<
- right justified text, @>>
- centered text, @||| (for all three, the length of the field is the number
of characters in the field)
- Fields that start with a ^ are also supported. The ^ signifies that the
input is a large string; after being printed, the variable has the
printed portion removed from its value.
- Numeric formats of the form @##.## let you decide where you want the
decimal point. Extra zeroes are added to the fractional part, but if the
whole portion is too big it is written out regardless of your specification
(the whole is regarded as more important than the fraction).
- A line that contains a ~ will be suppressed if it would be blank.
- A line that contains ~~ will repeat until it is blank. Be sure to use this
feature with at least one field starting with a ^.
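The field types listed above can be approximated with core Ruby string methods. The following is only a rough sketch of the behavior described, not FormatR's implementation: the helper names are hypothetical, truncation to the field width is an assumption, and the ^ helper cuts at a fixed width rather than at word boundaries.

```ruby
# Core-Ruby analogues of the picture fields; FormatR itself is not used here.
# All helper names are hypothetical and exist only for illustration.

# Like @<<<: left justify within `width` characters (truncation is assumed).
def left_field(text, width)
  text.to_s[0, width].ljust(width)
end

# Like @>>>: right justify within `width` characters.
def right_field(text, width)
  text.to_s[0, width].rjust(width)
end

# Like @|||: center within `width` characters.
def centered_field(text, width)
  text.to_s[0, width].center(width)
end

# Like @#.##: fixed number of decimals with a minimum total width.
# A whole part that is too large is still written out in full.
def numeric_field(value, width, decimals)
  format("%#{width}.#{decimals}f", value)
end

# Like a ^ field: return the next chunk and remove it from the string.
# Simplification: cuts at a fixed width instead of at whitespace.
def take_chunk!(text, width)
  text.slice!(0, width).to_s
end

puts "[" + left_field("Home", 8) + "]"
puts "[" + right_field("Home", 8) + "]"
puts "[" + centered_field("Home", 8) + "]"
puts numeric_field(3.5, 4, 2)
puts numeric_field(1234.5, 4, 2)
```

Calling take_chunk! repeatedly until the string is empty mirrors how a ~~ line keeps repeating while a ^ field still has text left.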
Scientific formats of the form @.G##, @.g##, @.E##, and @.e##
- The use of G, g, E, and e is consistent with their use in printf.
- If a G or g is specified, the number of characters before the exponent,
excluding the decimal point, gives the number of significant figures to
be used in the output. For example, @.#G### with the value 1.234e-14 will
print 1.23E-14, which has 3 significant figures. The format @##.##g### with
the value 123.4567E200 produces 1.23457e+202, with 6 significant figures.
The capitalization of G affects whether the e is lower- or upper-case.
- If an E or e is used, the number of hashes between the decimal point and the
E or e tells how many digits to print after the decimal point. The number
of hashes after the precision argument just adds to the number of spaces
available; I can’t see how to reasonably adjust that given the other
constraints. For example, the format @##.#E### with the value 123.4567E200
produces 1.2E+202 since there is only one hash after the decimal point.
- More examples of using the scientific formats can be found in
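Since the scientific fields are described as consistent with printf, the three worked values above can be reproduced with Ruby's Kernel#format. This is a sketch only: the pairing of each picture with a printf directive is an assumption made for illustration, and FormatR is not called.

```ruby
# Reproduce the documented outputs with plain printf-style directives.
# The picture-to-directive mapping in the comments is an assumption.

g_upper = format("%.3G", 1.234e-14)     # compare with @.#G###    (3 significant figures)
g_lower = format("%.6g", 123.4567e200)  # compare with @##.##g### (6 significant figures)
e_upper = format("%.1E", 123.4567e200)  # one digit after the decimal point

puts g_upper  # 1.23E-14
puts g_lower  # 1.23457e+202
puts e_upper  # 1.2E+202
```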
The class FormatR::FormatReader can
be used to read in text that has been output with a given format. It
attempts to extract the values of the variables used as the input. It does
a good job with simple formats, though I’m sure that there are complex ones
that can confuse it. Multi-line formats are supported, but as the program
can’t be sure what the initial input looked like, or how it was
broken across lines, every piece of a line is made to have at least one
space after it.
For example: if you had the following format:
and you fed it the string abcdef you would get the following:
But when var is assigned from the text read back, it would be var = ‘abc def’
I don’t know how to decide which is better. Perhaps an argument would
The classes of variables
It’s not always possible to infer the class of the variable that made
the format. Because the reader does not take in a binding to compare
against, many variables will end up as strings. Numeric formats should come
out as numbers, but all others will be strings and will need to be converted
manually.
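Manual conversion of the returned strings might look like the sketch below. The result hash here is hypothetical: it is written by hand to resemble what readFormat is described as returning, and to_number is an illustrative helper, not part of FormatR.

```ruby
# Hand-written stand-in for a FormatReader result (hypothetical values).
res = { "var_one" => "1", "var_two" => " 2 ", "var_four" => 4.3 }

# Convert a value to a number: numbers pass through unchanged,
# strings are tried as an Integer first and then as a Float.
def to_number(value)
  return value if value.is_a?(Numeric)
  Integer(value)
rescue ArgumentError
  Float(value)
end

converted = res.transform_values { |v| to_number(v) }
puts converted.inspect
```

Integer and Float raise an ArgumentError on text that is not a number, so a field that really is a string will surface as an error rather than silently becoming 0, which String#to_i would do.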
Using the FormatReader is
relatively simple. You pass a format to the constructor and then call
readFormat with an array of formatted text. It will return a hash
with the key/value pairs of the variables in the format. It can also be
called with a block, which is passed the hash.
For example:
f = []
# make a format
f.push( '<?xml version="1.0"?>' )
f.push( '@@@ Blah @@@ }Blah @< @|| @#.#' )
f.push( 'var_one,var_one,var_one,var_one,var_one,var_one,' +
        ' var_two, var_three, var_four')
f.push( '@<<< @<<<')
f.push( 'var_one,var_one')
format = Format.new(f)
# set values and print it out
var_one, var_two, var_three, var_four = 1, 2, 3, 4.3
output_filename = "format_testfile12"
File.open( output_filename, File::CREAT | File::WRONLY | File::TRUNC ) { |file|
  format.io = file
  format.printFormat(binding)
}
# read in the output
output = []
File.open( output_filename ){ |file|
  output = file.readlines()
}
# make a new FormatReader
reader = FormatReader.new(format)
# read in the values
res = reader.readFormat(output)
# check that the values are correct
assert(res['var_one'] == var_one.to_s)
assert(res['var_two'] == var_two.to_s)
assert(res['var_three'] == var_three.to_s)
assert(res['var_four'] == var_four)
# or using a block for reading multiple lines:
reader.readFormat(output) do |res|
  assert(res['var_one'] == var_one.to_s)
  assert(res['var_two'] == var_two.to_s)
  assert(res['var_three'] == var_three.to_s)
  assert(res['var_four'] == var_four)
end
- Added a block form of readFormat that lets you loop through output instead
of having to make your own loop
- Moved to Test::Unit from RubyUnit.
- Made things work with 1.8.0pre releases. Hopefully we’ll be ready for
1.8.0 when it finally comes out while maintaining 1.6.x compatibility.
- You can now use formats without having to use eval. If you pass in a hash
of names to values, that can be used instead. There is also an optimization
you can use by calling format.useHash(true), which will turn your binding
into a hash while the format is being printed. This may speed things up.
The default is still to use eval so that things do not break, as some
dynamic formats may not work with a hash: when a value is computed using
side effects of some other evaluation that has taken place while printing
the format, a hash won’t work. You can also use the
printFormatWithHash method if you want to avoid evaling entirely. test_four
in test_format.rb shows one example of how to use hashes to print formats.
- Page numbers are now working correctly. Previously, a page number in
a header or footer was problematic. The printing of a page has been
refactored and now works much better.
- Thanks to Amos Gouaux for suggesting the setLinesLeft method!
- I thought that the ~ had to be at the front of the picture line, but this
isn’t so. If you place the ~~ anywhere in the line it will repeat
until the line is empty.
- Added the FormatReader to read in
formatted text and get values back
- Hugh Sasse sent in a patch to clean up warnings. I was sloppy with my
spacing but hopefully have learned better. Thanks Hugh!
- Fixed a bug in repeating lines using ~~ when the last line wouldn’t
get placed correctly unless it ended with a ' '.
- Fixed a bug where a line that started with a <, >, or | would lose
this character if there wasn’t a @ or ^ before it. The parsing of the
non-picture parts of a picture line is greatly improved.
- Added a scientific notation formatter so you can use @#.##E##, @##.#e##,
@#.##G##, or @##.#g##. The use of G and E is consistent with their use in
printf. If a G or g is specified, the number of characters before the
exponent, excluding the decimal point, gives the number of significant
figures to be used in the output. If an E or e is used, the number of hashes
between the decimal point and the E tells how many digits to print after
the decimal point. The number of hashes after the E just adds to the number
of spaces available; I can’t see how to reasonably adjust that given
the other constraints.
- If perl isn’t there use cached output to test against.
- Better packaging: new versions won’t write over the older ones when
you unpack.
- Changed the Format.new call. In
the past you could pass in an IO object as a second parameter. You now need
to use the Format.io= method, as the signature of Format.new has changed.
None of the examples used the second parameter so hopefully
it’s safe to change.
- Added optional arguments to Format.new so you can set top, middle,
and bottom all at once like so: Format.new(top, middle, bottom), or
even Format.new(top, middle). If
you want a bottom without a top you’ll either need to call setBottom
or pass nil or an empty format for top like so: Format.new(nil, middle, bottom).
- Made the testing script clean up after itself unless you pass the -keep flag.
- Modified setTop and setBottom so you can pass in a string or an array of
strings that can be used to specify a format instead of having to create
one yourself. Thanks again to Hugh Sasse for not settling for a second rate
- Moved test_format.rb over to runit.
- Added functionality so that if you pass in a format string, or array of
strings to setTop or setBottom it does the right thing. This way you
don’t need to make the extra formats just to pass them in.
- Allow formats to be passed in as arrays of strings as well as just long strings.
- Added functionality so that if the first format on a page is too long to
fit on that page it will be printed partially with a bottom. Perl seems to
just print the whole thing and ignore the page size in this case.
- Fixed a bug where if your number didn’t have a fractional part it
would crash if you used a format that needs a fractional portion like @##.##.
- On the recommendation of Hugh Sasse added finishPageWithoutFF(aBinding,
io=@io) and finishPageWithFF(aBinding, io=@io) which will print out blank
lines until the end of the page and then print the bottom, with and without
a ^L. Only works on fixed sized bottoms.
- Moved to rdoc for generating documentation.
- Bottoms work iff you have a fixed size format and print out a top
afterward. This means that you will only get a bottom if you print a
top right after it so the last format page you print won’t have a
bottom. It’s impossible to figure out if you are done with the format
and therefore need to print the bottom. Perhaps in a future release we can
just take fixed sized bottoms off the available size and get them to work
that way.
- Added support for Format.pageNumber()
- Support ~ to be a space
- Support ~ to suppress lines when the variables are empty
- Support ~~ to repeat until the variables are empty
- Support comments. If the first character in a line is a # the line is a comment.
- Testing now compares against perl, it’s a bit easier than writing the
tests manually.
- Added support for the ^ character to start a format
- Added end of page characters and introduced line counts.
- Added the ability to manipulate the line count in case you write to the
file handle yourself
- Added format sizes. They just give the number of lines in the current
format. They don’t try to iterate and get some total count including
tops and bottoms.
- If you use bottom be sure to check that you’re happy with the output.
It doesn’t currently work with variable sized bottoms. You can use
the finishPageWith{out}FF(…) methods to print out a bottom if
you’re done printing but haven’t finished a page.
- Watch out for @#@??? as formats, see [ruby-talk:27782] and
[ruby-talk:27734]. This should be fixed in a future version of ruby. The
basic problem is that here documents are equivalent to "" strings and
not '' strings, so they will evaluate variables in them. If this is a
problem, be sure to just make a long string with '' and pass
that in. You can also pass in an array of strings.
- Rounding seems to be broken in perl. If you try to print the following
format with 123.355 you won’t get the same answer; you’ll get
123.35 and 123.36. FormatR rounds up and plans
to keep doing so unless there is a convincing reason not to.
format TEST_FORMAT =
^#.### ^##.##
$num, $num
I’m betting that perl must use round to even or odd. This needs to be
looked into.
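The here-document caveat in the list above can be seen with plain Ruby. In this sketch the class name, the instance variable @num, and the picture @#@num are hypothetical; the point is only that an unquoted here document interpolates #@num like a "" string, while a single-quoted terminator keeps the text literal like a '' string.

```ruby
# Demonstrates interpolation inside here documents; FormatR is not used.
class HeredocDemo
  def initialize
    @num = 42
  end

  # Unquoted terminator: behaves like a "" string, so #@num is evaluated.
  def interpolated
    <<DOT
@#@num
DOT
  end

  # Single-quoted terminator: behaves like a '' string, nothing is evaluated.
  def literal
    <<'DOT'
@#@num
DOT
  end
end

demo = HeredocDemo.new
puts demo.interpolated.inspect
puts demo.literal.inspect
```

Quoting the terminator is an alternative to building one long '' string by hand when a picture line contains a # directly followed by @ or $.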
To Do/Think about:
Thanks go to
Hugh Sasse for his enlightening comments and suggestions. He has been
incredibly helpful in making this package usable. Amos Gouaux has also been
helpful with suggestions and code. Thanks to both of you.