This tutorial will describe the basic usages of the tool and the types supported by RubyBreaker.
RubyBreaker takes advantage of test cases that already come with the source program. It is recommended that RubyBreaker is run as a Rake task, which does require a minimum change in the Rakefile (but no code change in the source program) but is better for a long-term maintenance. Regardless of the mode you choose to run, no source code change is required.
Let us briefly see how RubyBreaker can be run directly as a command-line program to understand the overall concept of the tool. We will explain how to use RubyBreaker in a Rakefile later. The following is the banner of RubyBreaker when running in the command-line mode:
Usage: rubybreaker.rb [options] prog[.rb]
-b, --break MODULES Specify modules/classes to 'break'
-c, --check MODULES Specify modules/classes to check
-l, --libs LIBRARIES Specify libraries to load
--debug Run in debug mode
--style STYLE Select type signature style - underscore or camelize
--io-file FILE Specify I/O file
--save-output Save output to file
-s, --[no-]stdout Show output on screen
-a, --[no-]append Append output to input file
-v, --verbose Show messages in detail
-h, --help Show this help text
Here is an example of running Rubybreaker as a command.
$ rubybreaker -v -s -l lib.rb -b A,B prog.rb
This runs RubyBreaker in verbose mode (-v) and will display
the output on the screen (-s). Before RubyBreaker runs prog.rb, it will
import (-l) lib.rb and break (-b) classes A and B.
Here is the source of lib.rb:
class A
def foo(x)
x.to_s
end
end
class B
def bar(y,z)
y.foo(z)
end
end
And, prog.rb simply imports the library file and executes A#foo with a
number:
require "lib"
A.new.foo(1)
After running the command shown above, the following output will be generated and displayed on the screen:
class A
typesig("foo(fixnum[to_s]) -> string")
end
This example shows how RubyBreaker documents the type of A#foo.
Here, the typesig method call registers foo as a method type that takes
an object that has Fixnum#to_s method and returns a String. This
method is made available simply by importing rubybreaker. Now, assume
that an additional code, B.new.bar(A.new,1), is added at the end of
prog.rb. It generate the following result:
class A
typesig("foo(fixnum[to_s]) -> string")
end
class B
typesig("bar(a[foo], fixnum[to_s]) -> string")
end
Keep in mind that RubyBreaker is designed to gather type information based on the actual execution of the source program. This means the program should be equipped with test cases that have a reasonable program path coverage. Additionally, RubyBreaker assumes that test runs are correct and the program behaves correctly (for those test runs) as intended by the programmer. This assumption is not a strong requirement, but is necessary to obtain precise and accurate type information.
RubyBreaker gives you an option to run the early dynamic type check. If a
method is documented and its module is specified to be type checked, then
RubyBreaker will verify if the argument types and the return type are
appropriate at runtime. To allow this feature in command-line, use -c
(checking) option:
rubybreaker -l lib.rb -c A prog.rb
Here, the library code lib.rb will be imported (-l) and class A will
be instrumented for type checking (-c) during the execution of prog.rb.
We will explain how to run type checking in non-command mode.
Instead of manually inserting the entry point indicator in the source program, you can take advantage of Ruby’s built-in testing framework. This is preferred to modifying the source program directly, especially for the long term program maintainability. But no worries! This method is as simple as the previous one.
require "test/unit"
require "rubybreaker" # This should come after test/unit.
class TestClassA < Test::Unit::TestCase
def setup()
RubyBreaker.break(Class1, Class2, ...)
...
end
# ...tests!...
end
That’s it! The only requirements are to indicate to RubyBreaker which modules
and classes to "break" and to place require rubybreaker after
require test/unit.
The requirement is same for RSpec but use before instead of setup to
specify which modules and classes to "break".
require "rspec"
require "rubybreaker"
describe "TestClassA Test"
before { RubyBreaker.break(Class1, Class2, ...) }
...
# ...tests!...
end
By running RubyBreaker along with the Rakefile, you can avoid modifying the
source program at all. (You no longer need to import rubybreaker in the
test cases neither.) Therefore, this is the recommended way to use
RubyBreaker. The following code snippet describes how it can be done:
require "rubybreaker/task"
...
desc "Run RubyBreaker"
Rake::RubyBreakerTestTask.new(:"rubybreaker") do |t|
t.libs << "lib"
t.test_files = ["test/foo/tc_foo1.rb"]
# ...Other test task options..
t.rubybreaker_opts << "-v" # run in verbose mode
t.break = ["Class1", "Class2", ...] # specify what to monitor
end
Note that RubyBrakerTestTask can simply replace your TestTask block in
Rakefile. In fact, the former is a subclass of the latter and includes all
features supported by the latter. The only additional options are
rubybreaker_opts which is RubyBreaker command-line options and
break which specifies which modules and classes to monitor. Since
Class1 and Class2 are not recognized by this Rakefile, you must use
string literals to specify modules and classes (and with full namespace).
If this is the route you are taking, there needs no editing of the source program whatsoever. This task will take care of instrumenting the specified modules and classes at proper moments.
Previously, we explained how to perform type checking in the command mode.
You can also specify which modules/classes to type check in the Rakfile. The
following shows an example usage of check option.
require "rubybreaker/task"
...
desc "Run RubyBreaker"
Rake::RubyBreakerTestTask.new(:"rubybreaker") do |t|
t.libs << "lib"
t.test_files = ["test/foo/tc_foo1.rb"]
# ...Other test task options..
t.rubybreaker_opts << "-v" # run in verbose mode
t.break = ["Class1", "Class2", ...] # specify classes to monitor
t.check = ["Class3", ....] # specify classes to check
end
Alternatively, you can use RubyBreaker.check() to specify classes to type
check in the setup code of the test case.
The annotation language used in RubyBreaker resembles the method documentation used by Ruby Core Library Doc. Each type signature defines a method type using the name, argument types, block type, and return type. But, let us consider a simple case where there is one argument type and a return type.
class A
...
typesig("foo(fixnum) -> string")
end
In RubyBreaker, a type signature is recognized by the meta-class level
method typesig which takes a string as an argument. This string is the
actual type signature written in the Ruby Type Annotation Language. This
language is designed to reflect the common documentation practice used by
Ruby Core Library Doc. It starts with the name of the method. In the
above example, foo is currently being given a type. The rest of the
signature takes a typical method type symbol, (x) -> y where x is the
argument type and y is the return type. In the example shown above, the
method takes a Fixnum object and returns a String object. Note that
these types are in lowercase, indicating they are objects and not modules or
classes themselves.
There are several types that represent an object: nominal, duck, fusion, nil, 'any', 'or', optional, variable-length, and block. Each type signature itself represents a method type or a method list type (explained below).
This is the simplest and most intuitive way to represent an object. For
instance, fixnum is an object of type Fixnum. Use lower-case letters and
underscores instead of camelized name. MyClass, for example would be
my_class in RubyBreaker type signatures. There is no particular
reason for this convention other than it is the common practice used in
RubyDoc. Use / to indicate the namespace delimiter ::. For example,
NamspaceA::ClassB would be represented by namespace_a/class_b in
a RubyBreaker type signature.
This type is similar to the nominal type but is referring to the current
object--that is, the receiver of the method being typed. RubyBreaker will
auto-document the return type as a self type if the return value is the same
as the receiver of that call. It is also recommended to use this type over
a nominal type (if the return value is self) since it depicts more
precise return type.
This type is inspired by the Ruby Language’s duck typing, "if it
walks like a duck and quacks like a duck, it must be a duck." Using this
type, an object can be represented simply by a list of method names. For
example [walks, quacks] is an object that has walks and quacks
methods. Note that these method names do not reveal any type
information for themselves.
Duck type is very flexible but can be too lenient when trying to restrict
the type of an object. RubyBreaker provides a type called the fusion type
which lists method names but with respect to a nominal type. For
example, fixnum[to_f, to_s] represents an object that has methods to_f
and to_s whose types are same as those of Fixnum. This is more
restrictive (precise) than [to_f, to_s] because the two methods must have
the same types as to_f and to_s methods, respectively, in Fixnum.
A nil type represents a value of nil and is denoted by nil.
RubyBreaker also provides a way to represent an object that is compatible with
any type. This type is denoted by ?. Use caution with this type because
it should be only used for an object that requires an arbitrary yet most
specific type--that is, ? is a subtype of any other type, but any
other type is not a subtype of ?. This becomes a bit complicated for
method or block argument types because of their contra-variance
characteristic. Please refer to the section Subtyping.
Any above types can be "or"ed together, using ||, to represent an object
that can be either one or the other. It does not represent an object that
has to be both (which is not supported by RubyBreaker).
Another useful features of Ruby are the optional argument type and the
variable-length argument type. The former represents an argument that has a
default value (and therefore does not have to be provided). The latter
represents zero or more arguments of the same type. These are denoted by
suffices, ? and *, respectively.
One of the Ruby’s prominent features is the block argument. It allows
the caller to pass in a piece of code to be executed inside the callee. This
code block can be executed by the Ruby construct, yield, or by directly
calling the call method of the block object. In RubyBreaker, this type can
be respresented by curly brackets. For instance, {|fixnum,string| -> string} represents a block that takes two arguments--one Fixnum and one
String--and returns a String.
RubyBreaker does supports nested blocks as Ruby 1.9 finally allows them.
However, keep in mind that RubyBreaker cannot automatically document the
block types due to yield being a language construct rather than a method,
which means it cannot be captured by meta-programming!
Method type is similar to the block type, but it represents an actual method and not a block object. It is the "root" type that the type annotation language supports, along with method list types. Method list type is a collection of method types to represent more than one type information for the given method. Why would this type be needed? Consider the following Ruby code:
def foo(x)
case x
when Fixnum
1
when String
"1"
end
end
There is no way to document the type of foo without using a method list
type. Let’s try to give a method type to foo without a method list. The
closest we can come up with would be foo(fixnum or string) -> fixnum and string. But RubyBreaker does not have the "and" type in the type annotation
language because it gives me an headache! (By the way, it needs to be an
"and" type because the caller must handle both Fixnum and String return
values.)
It is a dilemma because Ruby programmers actually enjoy using this kind of
dynamic type checks in their code. To alleviate this headache, RubyBreaker
supports the method list type to represent different scenarios depending on
the argument types. Thus, the foo method shown above can be given the
following method list type:
typesig("foo(fixnum) -> fixnum")
typesig("foo(string) -> string")
These two type signatures simply tell RubyBreaker that foo has two method
types--one for a Fixnum argument and another for a String argument.
Depending on the argument type, the return type is determined. In this
example, a Fixnum is returned when the argument is also a Fixnum and a
String is returned when the argument is also a String. When
automatically documenting such a type, RubyBreaker looks for the (subtyping)
compatibility between the return types and "promote" the method type to a
method list type by spliting the type signature into two (or more in
subsequent "promotions").