E-Database Brian Jepson

A Tree Grows in Perl

Some examples of true cross-platform programming in Perl


I've been thinking a lot about Perl lately. Maybe it has something to do with the fact that just after I finished working on O'Reilly and Associates' Unix and Win32 Perl Resource Kit, I worked on a project with ActiveState, the company that provides one of the leading distributions of Perl on Win32. Then I spent some time at the Perl Conference listening to how Perl is making everyone's life a little bit more interesting, if not a little bit better. While you might think I've gotten my fill of Perl by now, I'm actually quite excited about it right now.

One of the nice things about working on the Perl Resource Kit was that I had the pleasure of writing another book (the Perl Resource Kit Utilities Guide); I'm always happy when I get to write more books. But the best thing about the Perl Resource Kit was that two companies, ActiveState and O'Reilly, put resources into the improvement of a freely redistributable tool--Perl. This is anything but business as usual.

When a company puts resources into a software package that it owns, there is a tight coupling between the success of the software package and the success of the company. In the case of a freely redistributable software package such as Perl, the coupling is much looser: O'Reilly certainly has no shortage of books on Perl, but Perl's success does not guarantee increased book sales.

I believe that only individuals can pour their hearts into freely redistributable software. Corporations simply don't have hearts to pour into software, but they can do the next best thing: pay their employees to make nonproprietary enhancements to the software.

The New Perl

Enough of the soapbox; you're probably all wondering what O'Reilly and ActiveState did to Perl--clone it? Splice a second head onto it? Before I tell you what, I have to say they didn't do it alone; they had the help of the oneperl team and the Perl porting community. The oneperl team is a group of people charged with merging the core Perl source with the Win32 Perl source. (At one time, Perl for Win32 development branched off from the core Perl development and gained some platform-specific features at the expense of portability.)

Although the core Perl integrated support for Win32 some time ago, Windows users had to choose between using a distribution of Perl based on the core Perl sources or ActiveState's Perl for Win32, a specialized version of Perl that was not compatible with extensions designed for core Perl.

For instance, I could choose to use a distribution based on core Perl, and I would get nice things like the Tk module (a GUI environment for Perl), the GD module (a GIF-generation package for Perl), and a database independence module (DBI) for Perl. But I wouldn't get Win32-specific tools, such as PerlScript (an ActiveX scripting engine based on Perl that lets you use Perl instead of Visual Basic under Active Server Pages) or Perl for ISAPI (a version of the Perl interpreter that runs as a DLL through the Internet Server API), which runs under the Microsoft Internet Information Server family of Web servers and Web servers that support ISAPI.

When the oneperl team completed its effort, Windows users got the best of both worlds; they could use the ActiveState distribution of Perl to take advantage of Win32-specific tools and still enjoy the benefit of the many modules available for Perl. One of the first examples I developed was an Active Server Pages script that used the GD and GIFgraph modules to generate a graph each time you launched the document. This sort of thing used to be difficult with Perl on Win32!

The oneperl effort, as well as the efforts of the core Perl porters, resulted in Perl 5.005. Not only did 5.005 offer the merge of the core and Win32 Perl source, but it included some new (if slightly experimental) features such as threads and the Perl compiler.

These breakthroughs in Perl for Win32 mean I can use the same modules on Win32 that I'm so fond of on Unix. One module is DBI, which, like ODBC and JDBC, is based on drivers and driver managers. A driver manager defines the API that programmers use to talk to databases, and a driver implements the low-level details that let your programs actually connect to the database. The purpose of the driver manager is to shield the developer from the low-level interface details; these are left to the database driver.

Using the Tools

It's not enough for me to go on about some of the tools that are now in my toolchest; it's important to show you that we can build something useful with them. A good example of a GUI representation of a problematic data model is the self-referential table. (See Table 1.) Because Table 1 is small, I've shown you the contents of it in Table 2.

ColumnTypeDescription
foodCHARThe name of some kind of food (such as Milk, Skim Milk, or Cheese)
food_idCHARThe primary key.
parent_idCHARThe id of the parent of this food (for example, Milk is the parent of both Skim Milk and Cheese).
Table 1. The self-referential table.

 
foodfood_idparent_id
Food001NULL
Beans and Nuts002001
Beans003002
Nuts004002
Black Beans005003
Pecans006004
Kidney Beans007003
Red Kidney Beans008007
Black Kidney Beans009007
Dairy010001
Beverages011010
Whole Milk012011
Skim Milk013011
Cheeses014010
Cheddar015014
Stilton016014
Swiss017014
Gouda018014
Muenster019014
Coffee Milk020011
Table 2. An expanded version of the self-referential table.

Now, I'd like to see this information organized as a tree, as shown in Listing 1. (Note that the ID is included in case you want to try to correlate this to the table contents.) Is this too much to ask? Let's hope not. When I set out to tackle this problem, I realized I had done it many times before. In a former life, I tackled this problem in FoxPro, a dialect of xBase. I eventually tackled it in Perl and Java, too! So I had plenty of code to look at.

Food (001)
Dairy (010)
Beverages (011)
Coffee Milk (020)
Whole Milk (012)
Skim Milk (013)
Cheeses (014)
Cheddar (015)
Stilton (016)
Swiss (017)
Gouda (018)
Muenster (019)
Beans and Nuts (002)
Beans (003)
Black Beans (005)
Kidney Beans (007)
Red Kidney Beans (008)
Black Kidney Beans (009)
Nuts (004)
Pecans (006)

Listing 1. The Tree as we'd like to see it.

Something Old, Something New

I talked earlier about the new Perl 5.005; now I'm going to talk about something old. When I considered a test platform for the examples in this column, I decided I was going to go with Mini SQL as well as Sybase SQL Server (with the database drivers DBD::mSQL and DBD::Sybase).

Using DBD::Sybase, I can connect to a Sybase data server. I chose to use System 11, as well as an older version. I recently had the pleasure of obtaining a vintage version of Sybase for NeXTStep/Mach: Sybase SQL Server 4.1. This version is wild; it comes on two 2.88MB floppy diskettes, and it installed and ran seamlessly on my Color Turbo

NeXTStation. Amazingly, my example programs worked perfectly with both versions of Sybase, as well as Mini SQL.

Something Borrowed, Something Blue

The code that went into this example came from two sources. The visual tree component is from the Tk::Tree module for Perl/Tk. The code that traverses the tree is inspired by the expanding hierarchies example that I stumbled across in the "Microsoft SQL Server Database Developer's Companion" section of the Microsoft SQL Server Programmer's Toolkit. This example can be found on the Microsoft Developer's Network under the "Database and Messaging Services" section of Platform SDK Documentation.

Tried and true blue is the CGI example. If you are talking about Perl and databases you expect it, so I'll indulge your expectations and also supply a CGI-based example.

DBIx::Tree

DBIx::Tree is a module that deals with working over these self-referential databases. I wrote it just for this column. I figured if I was going to write some reusable code, I better go all the way. This module is available from the CPAN archive at: www.perl.com/CPAN/authors/Brian_Jepson/DBIx-Tree-0.90.tar.gz. It may also be available in the future as a package for the ActivePerl package manager.

The module is called DBIx::Tree because it uses the DBI module to achieve database independence. So far, I've tested it with Sybase SQL Server and Mini SQL, so the database independence end of things seems to be working.

If you don't have DBI and a database driver (DBD) installed on your box, you can get it quite easily. Using the ActivePerl distribution of Perl, you can run the commands:

ppm.pl install DBI DBD-drivername

If you are using Perl on Unix, or a nonActivePerl distribution on Win32, you should be able to install these modules with:

perl ýMCPAN ýe "install DBI"
perl ýMCPAN ýe "install DBD::drivername"

You will need to replace DBD::drivername with the name of the driver you select. See www.arcana.co.uk/technologia/perl/DBI/index.html for information on the DBI project and available drivers.

How does DBIx::Tree work? The first thing it needs to do is issue a query against the database, which you can handle using the do_query method. After that, you call the tree method, which uses a nonrecursive technique to traverse the tree.

DBIx::Tree is not omniscient; you need to supply certain things when you create an instance of it. The constructor is typically invoked as:

my $tree = new DBIx::Tree(connection => $dbh,
             table   => $table,
             method   => sub { disp_tree(@_) },
             columns  => [$id_col, $label_col, $parent_col],
             start_id  => $start_id);

(See Table 3 for an explanation of each parameter.)

ParameterDescription
connectionA DBI database handle.
tableThe name of the table containing the tree data.
methodA coderef to the method you want to invoke for each node.
columnsAn array reference containing the primary key column, the name column, and the parent primary key column (in that order).
start_idThe primary key ID at which to start traversing the tree.
Table 3. An explanation of each parameter in DBIx::Tree.

When it processes your tree, DBIx::Tree starts at the first element in the tree (start_id tells it where to start), and looks to see if that element has any children. If it does, it adds them to its "to-do list" (%stack). It then goes to the to-do list, and sees if there are any children to process, and as it did with the first element, adds any children of the current level to the to-do list. If it has processed all the children on a given level, it ascends a level, and looks for more items to process. Eventually, it makes its way through the tree, and processes each element. As it processes each element, it invokes a callback method, which is a method that you have to write yourself. Each time it's called, it is passed a single argument: a Perl hash (a Perl array that is indexed by a string, it is also known as an associative array). This hash contains the following elements:

You probably won't use all these parameters in an application, but because DBIx::Tree is designed to be multipurpose, enough information is included to at least support Tk (for a GUI display of the tree) and plain text or HTML rendering.

If you want to display the tree in plain text, you'll definitely be interested in the item and the level. (You can use the level number to indent the text.) Listing 2 is a simple example that prints out the text-based tree shown earlier in Listing 1. Now, if we want to get a little fancier, we'll need to develop a Web version of this. On to the CGI version!

#!/usr/bin/perl -w

use strict;

use DBI;
use DBIx::Tree;

# Connect to the Mini SQL Server. Replace the parameters with an appropriate
# connection string.
#
my $dbh = DBI->connect('DBI:mSQL:test:localhost', undef, undef);
if ( !defined $dbh ) {
    die $DBI::errstr;
}

# Create a new instance of the DBIx::Tree class.
#
my $tree = new DBIx::Tree(connection => $dbh, 
                          table      => 'food', 
                          method     => sub { disp_tree(@_) },
                          columns    => ['food_id', 'food', 'parent_id'],
                          start_id   => '001');

$tree->do_query;  # Load the data from the table.
$tree->tree;      # Make the tree. disp_tree() is called for each node.
$dbh->disconnect; # Close the database connection.

# This is the callback function for creating the tree. It's called for 
# each node in the tree.
#
sub disp_tree {

    # Get the parameters.
    #
    my %parms = @_;

    # We are interested in the item, the level, and its ID.
    #
    my $item  = $parms{item};
    my $level = $parms{level};
    my $id    = $parms{id};

    # Strip leading and trailing space from the item.
    #
    $item =~ s/^\s+//;
    $item =~ s/\s+$//;

    # Display each item and its ID, indented with (2 * level) spaces.
    #
    print ' ' x ($level * 2), "$item ($id) \n";

}

Listing 2. The source code for textexample.pl.

Trees and Webs

It's fairly easy to convert this example to run on the Web. However, instead of merely printing out formatted text, this example will use the <UL> tag to generate nested lists. It will follow two simple rules:

1. If the level increases, open another list with <UL>.
2. If the level decreases, close any lists (using </UL>) between this level and the level we're moving to. It's possible to ascend several levels, but we can only descend one level at a time.

See Listing 3 for the source code for the CGI version of this example. Its output is shown in Figure 1.


Figure 1. The output of sample.cgi.

 
#!/usr/bin/perl

use strict;

use CGI qw(:standard);
use DBI;
use DBIx::Tree;

use vars qw($curr_level);

# Connect to the Mini SQL Server.
#
my $dbh = DBI->connect('DBI:mSQL:test:localhost', undef, undef);
if ( !defined $dbh ) {
    die $DBI::errstr;
}

# Create a new instance of the DBIx::Tree class.
#
my $tree = new DBIx::Tree(connection => $dbh, 
                          table      => 'food', 
                          method     => sub { disp_tree(@_) },
                          columns    => ['food_id', 'food', 'parent_id'],
                          start_id   => '001');

$tree->do_query;  # Load the data from the table.

# Print the header and the start of the html.
#
print header();
print start_html( -title => 'Tree' );
print qq[<H1>Tree Example</H1><HR>];

$tree->tree;      # Make the tree. disp_tree() is called for each node.
$dbh->disconnect; # Close the database connection.

# Close any remaining <UL>s.
#
while ($curr_level > 0) {
    print ' ' x ($curr_level * 2), "</UL>\n";
    $curr_level--;
}

# End the HTML.
#
print end_html();

# This is the callback function for creating the tree. It's called for 
# each node in the tree.
#
sub disp_tree {

    # Get the parameters.
    #
    my %parms = @_;

    # We are interested in the item and its level.
    #
    my $item  = $parms{item};
    my $level = $parms{level};
    my $id    = $parms{id};

    # Strip leading and trailing space from the item.
    #
    $item =~ s/^\s+//;
    $item =~ s/\s+$//;

    # If the level increases, start a new list at the new level.
    #
    if ($level > $curr_level) {
        print ' ' x ($curr_level * 2), "<UL>\n";
    }

    # If the level decreases, end the lists between levels.
    #
    if ($level < $curr_level) {
        while ($curr_level > $level) {
            $curr_level--;
            print ' ' x ($curr_level * 2), "</UL>\n";
        }
    }
    $curr_level = $level;

    # Display each item.
    #
    print ' ' x ($curr_level * 2), "<LI>$item\n";

}

Listing 3. The source code for sample.cgi.

Tk and the Tree

Although it's not a widely known fact, Perl (on Win32 or the X Window System) can do GUI. The Perl/Tk module provides a nice way to develop sophisticated, cross-platform graphical applications. It is not only very easy to use, but it also allows for a great deal of expressiveness. (I mean, it's really powerful.)

Tk comes with a hierarchical list box widget called Tk::Hlist. While this widget is nice, I am somewhat partial to the Tk::Tree widget. They are both similar in some respects, but Tk::Tree pleased my eyes a little bit more.

If you don't have Tk and Tk::Tree installed on your box, you can get them quite easily. Using ActivePerl:

ppm.pl install Tk Tk-Tree

Or, using the CPAN module (Unix or a non-ActivePerl distribution of Perl Win32:

perl ýMCPAN ýe "install Tk::Tree"
perl ýMCPAN ýe "install Tk"

In either case, be prepared to wait a bit if you don't have a local package repository (or CPAN mirror) or you're connected to the net via a modem; Tk is a large package.

Tk::Tree is a fairly simple widget to work with. The key part of it is, of course, the items you want to put in a tree. Tk::Tree is designed around a filesystem metaphor--you generally add elements such as:

/
/usr
/usr/local
/usr/local/bin
/var
/var/spool

When you add these elements to the tree, you pass the full name of the element and a name that you want to display in the tree, as in:

$tree->add('/',       -text => '/');
$tree->add('/usr',      -text => 'usr');
$tree->add('/usr/local',   -text => 'local');
$tree->add('/usr/local/bin', -text => 'bin');
$tree->add('/var',      -text => 'var');
$tree->add('/var/spool',   -text => 'spool');

The first argument to add preserves the directory hierarchy information, and this is all Tk::Tree needs to know to draw the tree shown in Figure 2. Notice that I've used the short names for each node.


Figure 2. A simple tree generated by Tk::Tree.

 

We can use this approach with the food. Since the @parent_names argument to the callback method preserves the hierarchy, we can figure out that Skim Milk is actually "/Food/Dairy/Beverages/Skim Milk." And that's exactly what we do in Listing 4, the result of which is shown in Figure 3.


Figure 3. The results of tk-tree-example.pl.

 

#!/usr/bin/perl -w

use strict;

use Tk;
use Tk::Tree;
use Tk::Label;
use DBIx::Tree;

use vars qw(@list);   # the list of items in the tree.

# Connect to the datasource.
#
use DBI;
my $dbh = DBI->connect('DBI:mSQL:test:localhost', undef, undef);
if ( !defined $dbh ) {
    die $DBI::errstr;
}

# Create a new instance of the DBIx::Tree object.
#
my $dbtree = new DBIx::Tree( connection => $dbh, 
                            table      => 'food', 
                            method     => sub { disp_tree(@_) },
                            columns    => ['food_id', 'food', 'parent_id'],
                            start_id   => '001');

# Execute the query, and form the tree.
#
$dbtree->do_query;
$dbtree->tree;

# Create a new main window.
#
my $top = new MainWindow( -title  => "Tree" );

# Create a scrolled Tree widget.  Behind the scenes, we're forming
# each of the tree elements as a directory style listing. For example,
# Skim Milk is represented as "Dairy/Beverages/Skim Milk".  As long
# as we add the elements in the order in which they appear in the 
# tree, the tree will be able to figure out which element is the
# parent of each node we add.
#
my $tree = $top->Scrolled( 'Tree',
                           -separator       => '/',
                           -exportselection => 1,
                           -scrollbars      => 'oe',
                           -height => 20,
                           -width  => -1);
# Pack the tree.
#
$tree->pack( -expand => 'yes',
             -fill   => 'both',
             -padx   => 10,
             -pady   => 10,
             -side   => 'top' );

# When we ran $dbtree->tree earlier, the @list array was populated.
# It doesn't have a top element, so we need to pre-pend one to the 
# list ('/' below).
#
foreach ( '/', @list ) {

    # We don't want the user to see "Dairy/Beverages/Skim Milk",
    # so we'll strip off all but the last words for the label.
    #
    my $text = (split( /\//, $_ ))[-1]; 

    # If we're on /, let's make its label blank.
    #
    if ($_ eq '/') {
        $text = "";
    }

    # Add the item (in $_) with $text as the label.
    #
    $tree->add( $_, -text => $text );
    
}

$tree->autosetmode();

MainLoop();
$dbh->disconnect;

# This is the callback for the $dbtree->tree method. Each time
# A node is added, this method is called.
#
sub disp_tree {

    # Get the paremeters
    #
    my %parms = @_;

    # Get the item name and the list of parent nodes.
    #
    my $item        = $parms{item};
    my @parent_name = @{ $parms{parent_name} };

    # Always need a leading /
    #
    my $treeval = "/";

    # Add each parent to the item name, separated by a /
    #
    foreach (@parent_name) {

        s/^\s+//;   # strip leading...
        s/\s+$//;   # ... and trailing whitespace
        $treeval .= "$_/";

    }

    # Strip leading and trailing whitespace from the item name.
    #
    $item =~ s/^\s+//;
    $item =~ s/\s+$//;

    # add the item name to the string.
    #
    $treeval .= $item;

    # Add this item to the array of items.
    #
    push @list, $treeval;

}

Listing 4. The source code for tk-tree-example.pl.

Tree-mendous!

Well, that's it--after three examples, you're probably sick of trees. But if you ever need to manipulate, brief, debrief, or number a tree, you can start here with some of the examples. Let's not miss the forest for the trees, though. One of the things we've seen here is true cross-platform programming in Perl. Not only do the examples run on either Windows or Unix, but they also should run with any database supported by a DBI driver. In examining these leafy examples, we've seen that Perl 5.005's biggest gift to the community is interoperability.

Brian Jepson has written a fistful of books on Perl and other topics, such as Internet database development. He is also the Minister of Information for the SMT Computing Society, the elite technical strike force that manages the care and feeding of AS220's (www.as220.org) Technology Project. You can reach him via email at bjepson@ids.net.


 
search - home - archives - contacts - site index
 

Copyright © 1998 Miller Freeman Inc. All Rights Reserved
Redistribution without permission is prohibited.

Questions? Comments? We would love to hear from you!