The following article was printed in October 1980 of the magazine „BYTE".
|
Bill Walker
University of Oklahoma School of Electrical Engineering and Computer Science 202 W Boyd Normian OK 73019 |
Figure 3:
A simple binary tree.
Data is arranged within a binary tree so that the sequence right child, parent, left child gives the contents of the tree in sorted order.
Here, the sorted order is 12, 15, 25.
|
Figure 4:
Adding to a binary tree.
The colored arrow shows the direction of search when placing a new node in a binary tree.
Here, 14 is greater than 12, so it is ptaced as a right child of 12.
Here, the sorted order is 12, 14, 15, 25.
|
Figure 5:
Example of a binary tree.
Here, more nodes have been added, giving the sequence 10, 12, 14, 15, 20, 25, 26.
|
Figure 6:
Example of a binary tree.
Here (6a), the binary tree of figure 5 is reproduced with numbers above each node that give the order in which the nodes were added to the tree.
The table below (6b) gives the node number, key, left pointer, and right pointer of each node; both pointers refer to the node number of a node, not its key value.
The ward NIL indicates there is no node in the direction given by the pointer.
|
Figure 7:
Computer representation of a binary tree.
In the BASIC program of listing 1, each node is represented by four numbers:
the node number (given for purposes of illustration only), the key, and the right and left pointers.
To "thread" the tree so that a thread of pointers runs through the nodes in their sorted-key order, a right pointer, when negative, points not to a right child node but to the ancestor node that follows the current node in the sorted sequence; these upward right pointers are drawn in color.
|
Listing 1: BASIC program for sorting a threaded binary tree. This program allows the user to create, add to, delete from, and list a binary tree. The tree is threaded in that it is possible to print the tree sorted in ascending key sequence by using the pointers associated with each node. This version is written in Microsoft BASIC and currently runs an an Apple II. With the deletion of line 5, it has also been run an the Commodore PET, the Radio Shack TRS-80, and a PDP-10. It should also run without modification on any other computer using Microsoft BASIC. [Listing] |
Listing 2: Running the program of listing 1. In this listing, a hundred random numbers are generated, used to create a binary tree, and listed in sorted key sequence. The node containing key value of 8 is deleted, a node with key value of 39 is added, and the tree is listed again. | |||||||||||||||||||||||||||||||||||||
The upward pointers in figures 7 and 8 are highlighted for clarity.
Note that only one node in the tree, the rightmost node (containing the largest erttry in the sorted list), has a right pointer with a value of zero.
This is a signal that the end of the tree has been reached.
Let us look at the process of reading the sorted numbers from the tree in figure 7.
We enter the tree at the root node, which contains the integer 15.
We move ieft down the tree until we encounter the first node with a Ieft pointer of zero (which we will also refer to as NIL).
This is node number 7, the one that contains 10 as its key.
We print the value 10 and follow the right pointer back up the tree to node 2, which contains a 12.
We print the 12 and follow the right pointer of this node to node 4.
Node 4 has no left child, therefore we print 14 and follow the right pointer back up the tree to node 1.
At node 1, we print 15 and follow the right pointer to node 3.
Node 3 has a left child, so we move from node 3 along the left pointer to node 6.
Node 6 has no left child, so we print the contents of node 6, which is 20.
Following the right pointer of node 6 back to node 3, we print its value, 25.
Following the right pointer of node 3 to node 5, we find the 26, which has no left child.
So we print 26 and look at the right pointer of this node.
Since the right pointer is NIL, the rightmost node must have no sucessor.
This means it is the last node in the tree, so we end the sort.
This is certainly a lot of trouble--unless you want to do more than just sort your numbers once.
And it is here that the binary tree shows its greatest advantage.
In most sort procedures, if you add a number to the list to be sorted, you have to completely re-sort the new list.
But with a binary tree, all you have to do is add one node and list the tree (both of which take much less time than a complete re-sort).
Figure 8 shows the addition of one node, a 13, to be added to the tree of figure 7.
The 13, being the eighth number in our list, becomes node 8.
It is added to the tree at the bottom of the appropriate branch, necessitating the change of only one pointer in the old tree:
the left pointer of node 4 now points to node 8.
All the other pointers remain intact, thus keeping computer time to a minimum.
Best of all, no key values of nodes have been altered in the least.
It is not hard to write a subroutine that will traverse a binary tree in the proper order.
Nor is it difficult to write a subroutine to add a node to a tree--or, equivalently to add a number to a sorted list.
Deletion of nodes from the tree, however, is not as obvious a procedure as one might suppose.
Detaching the node to be deleted is simply a matter of destroying the pointers from other nodes to the undesired node.
But repair of the tree diagram--connecting nodes below the deleted node to the body of the tree--is not quite so easy.
There are several cases to be considered, all of which are handled by the subroutine DEL.
The binary-tree-sort program in listing 1 was designed as a main routine and a collection of subroutines.
A brief description of each subroutine (given below) and a flowchart describe the operation of each of the subroutine modules.
Each of the subroutines was written in structured fashion.
Although the main routine is also structured, the current listing reflects a certain amount of modification and experimentation;
in other words, it has not been rewritten for optimal structure.
The main routine of listing 1 contains the following sequence:
an array of 100 random integers is created and used to create a sorted binary tree;
the binary tree is listed;
then the user is given the options of adding to the tree, listing the tree, or deleting from the tree.
All the necessary subroutines are included in this listing.
This program was originially written in FORTRAN and was later translated to BASIC, running first on a PET, then on an Apple II.
The program contains some redundancies so that the program will run as written on both machines;
the only restriction is that line 5 must be deleted before running the program on the PET.
The structured programming techniques used in writing the original FORTRAN version proved to be quite necessary when translating and debugging the BASIC version.
I feel that structured programming techniques are essential to a program of any size.
Subroutine Descriptions
FIX:
The purpose of the subroutine FIX (see flowchart in figure 9) is to initialize all the space to be used for tree nodes before the binary tree is built.
The FIX subroutine must be presented with N, the maximum number of nodes to be available to the tree.
Once the tree has N nodes, there are no provisions in this program to make additional space available.
BILDER:
Subroutine BILDER adds a new data entry to the binary tree, placing it in correct relation to the other nodes in the tree.
Within the BILDER routine, the node to be added is numbered Q and has a value of KEY(Q).
The algorithm (see flowchart in figure 10) compares the node to be added, Q, with other nodes (starting with the root node and moving down the tree) until it can be placed in proper relation to a parent node (a node that has a NIL pointer pointing to where node Q should go).
SEARCH:
The SEARCH subroutine (see figure 11) returns the node number of a key with value ALPHA.
The node number (or zero if the node is not found) is returned in the variable SEARCH.
The search done is not a linear search but rather a binary search like the one used in BILDER.
A binary search is named such because each decision halves the area to be searched.
LIST:
The LIST subroutine (see figure 12) lists the nodes of the tree in ascending key sequence by traversing the tree from its leftmost to its rightrnost node.
LIST follows an optimal path down the leftmost branch of the tree until it encounters a node the left pointer of which has value NIL;
this node contains the smallest key value.
Having found the smallest key value, LIST then calls SUC (the successor subroutine) repeatedly to find successor nodes;
key values are printed in the order they are encountered.
When a node is found that has a right pointer of value NIL, the node with the highest key value has been found, and the LIST subroutine has finished.
SUC:
The SUC subroutine, shown in figure 13, when given the node numbered P, returns the value SUC, which is the number of the node that follows node P in the ascending key sequence.
If the right pointer of P, named Q, is an upward pointer (value less than zero), the absolute value of the pointer is the successor node.
If Q is a downward pointer, the leftmost child of Q is the successor node.
A SUC value of NIL is returned if node P is the rightmost node in the tree.
PAR:
The PAR subroutine, shown in figure 14, returns in variable PAR the number of the parent node of node P.
(Remember that every node in a binary tree except the root node has exactly one parent node.)
As in BILDER and LIST, a binary search starting at the root uses the values of the current node key and the key of node P to guide the search down the tree until the parent of P is found.
This method is much faster than linearly traversing the tree using the SUC subroutine.
DEL:
The DEL subroutine to delete a node from the tree is the most complicated of the subroutines presented.
This is due largely to the necessity of reestablishing the thread that runs through the tree.
The algorithm for node deletion is given in figure 15 and table 1;
the examples in figure 16 are referred to by table 1.
A Pascal Implementation
A second and somewhat different version of the binary tree program is given in listing 3.
This version does not use a threaded tree (ie: there are no upward right pointers).
Because of this, the subroutine to add a node is almost trivial, and the delete routine is greatly simplified.
This implementation is, in general, altogether nicer.
The main change in this listing (compared to the BASIC program of listing 1) is the use of the scan procedure to traverse and list the tree.
The scan procedure uses a stack, which is maintained by the initstack and push procedures, along with the pop function.
The scan algorithm is as follows:
Due to the nature of the Pascal language, the listing itself serves to document the algorithm.
In all fairness to BASIC, it must be stated that the Pascal program of listing 3 is simpler partially due to the absence of the thread running through the binary tree.
Acknowledgements
I would like to thank the Computer Corner of Amarillo, Texas, for the use of their printer to produce listings 1 and 2.
Scanned by
Werner Cirsovius
|