= Tree Representation Module
This module supports the creation, search, manipulation, and serialization of tree structures.
Trees are implemented with Treebank::Node objects. Each Node has a writable label that may be any arbitrary object and a list of other child Node objects.
require 'treebank'
=> true
p = Treebank::Node.new('parent')
=> <Treebank::Node parent []>
p.create_child!('child1')
=> <Treebank::Node child1 []>
p.create_child!('child2')
=> <Treebank::Node child2 []>
Node has a subclass Treebank::ParentedNode that keeps track of the parent of the given node and has methods for iterating up the ancestor tree.
Node objects support breadth- and depth-first iteration. The functions each_depth_first and each_breadth_first yield a node and its descendants in the specified order. The functions depth_first_enumerator and breadth_first_enumerator wrap these functions inside Enumerator[http://www.ruby-doc.org/core/classes/Enumerable/Enumerator.html] objects. See the function documentation for details as to how the enumeration may be controlled.
The default stringification method writes a node and all its children in a bracketed tree format.
puts p
(parent
(child1 )
(child2 ))
Bracketed tree strings can be used to create Node trees.
t = Treebank::Node.from_s('(parent (child1) (child2))')
=> <Treebank::Node parent [child1 child2]>
puts t
(parent
(child1 )
(child2 ))
The bracketed tree format is the one used by the {Penn Treebank Project}[http://www.cis.upenn.edu/~treebank] to annotate linguistic structure.
The children of a node can be dereferenced with the array operator using a list of child offsets. For example:
> t = Treebank::Node.from_s("(A (B b) (C (D d) (E e)))")
=> <Treebank::Node A [B C]>
> t[1,0]
=> <Treebank::Node D [d]>
Here the index (1,0) finds the 0th child of the node's 1st child, which is the node (D d).
= History
1-0-0:: First release
1-1-0:: Removed unnecessary fsa dependency from gemspec
2-0-0:: Changed from_s initialization
2-1-0:: Add indented multiline stringification; Add preterminal?
3-0-0:: Add Node.children function. Add child deferencing by index. The each_depth_first and each_breadth_first functions now yield nodes instead of returning private enumerator objects. To get enumerator objects, use depth_first_enumerator and breadth_first_enumerator instead. Likewise, the each_parent function in the ParentedNode class yields nodes, while parent_enumerator returns an enumerator.
= See Also
Lingua::Treebank[http://search.cpan.org/~kahn/Lingua-Treebank/] implements similar functionality in Perl.
= Copyright
Copyright 2006-2008, William Patrick McNeill
This program is distributed under the GNU General Public License.
= Author
W.P. McNeill mailto:billmcn@u.washington.edu