The following article appeared in the April 1985 issue of the magazine "Computer Language".
A fundamental article on the subject of recursion
Recursive
Procedures

By John Snyder


Over recent years, many of us in the programming business have discovered structured programming. The longer a person has been programming and the more unstructured the language environment, the more traumatic the discovery. In my case, it was an enlightenment.
I will never forget struggling through my first totally structured program (in COBOL). For a while, I thought it nearly impossible to follow the rules. But when it was done, I realized the power of the method. Regardless of the language, I have not written an unstructured program since. I have written many large programs that executed bug-free the first time they were run. I have made major modifications and additions to existing programs with relative ease and minimal damage to the original code. Obviously, I have become an advocate of structured programming.
I hope some readers can identify with my experience because this article is about a similar discovery, every bit as powerful but less well-known and less used — recursive procedures. My objective is to describe what they are and what they can do and to communicate their importance, particularly to those who have never written a recursive routine.
If you know what a recursive procedure is, but you cannot imagine ever having the need to write one — read on! You do not know what you are missing. If you do not know what a recursive procedure is, you are about to be exposed to one of the wonders of programming. Finally, I will present a small but interesting problem, which is not at all a classical recursion problem. However, after analysis, it begs for a recursive solution.

Definitions
Recursive procedures are not new, and they were not born out of programming. Recursively defined mathematical functions have been around for hundreds of years. Some of the oldest programming languages were designed with recursion in mind because it was required by the problems they addressed. For example, the LISP language was invented in 1958 by John McCarthy. Today, it remains the primary language for artificial intelligence programming. Recursion is an integral part of both its data structures and its function definitions.
A recursive procedure is simply a process which uses itself as a subprocess. The process has to be identified with a name, to be used for reference. In COBOL or FORTRAN, such a process would be known as a subroutine, in Pascal a procedure or function, in C a function, and in Forth a word definition. Normally, references to execute a process are known as calls. So, what makes a recursive procedure different is that within the code defining a process, there is a call to the process being defined. Sounds like infinite loop time! But, if all goes well, it isn't.
As a simple example, let us look at the factorial function, defined as follows: If n is a positive integer, n factorial, denoted n!, is

n! = n * (n-1) * (n-2) * ... * 3 * 2 * 1

(as usual, the asterisk, *, means multiplication).
In words, n factorial is the multiplicative product of all positive integers less than or equal to n.
In any programming language, it is not difficult to code a factorial process without resorting to recursion. But let us look at what recursion can do for this problem. First, we note that

n! = n * ((n-1)!)

So, if we know what (n-1)! is, we can calculate n! with a single multiplication. By the same token, if we knew (n-2)!, we could have calculated (n-1)! with a single multiplication. This unraveling of the definition suggests recursion. We can create a recursive process for factorial calculation as in Listing 1.
This coding is not in any particular language; it is just metacode. If you have never seen a recursive procedure before, study this example and play computer in your head until you develop a feel for how it works. As with structured programming, you must learn to think recursively to develop recursive algorithms.
This process illustrates several important characteristics of recursive procedures. First, note how short and sweet it is — the problem is distilled to its essence. Second, with any recursive procedure, a stopper must be provided to avoid the infinite loop. That is, there must be a condition of execution which prevents the recursive call, and that condition must always eventually be realized. In this case, it is N becoming zero, which breaks the chain of calls.
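To make Listing 1 concrete, here is a minimal sketch of the same routine in C, the language used later in Listing 2. The function name, the argument type, and the use of a long result are my own choices, not part of the original listing:

/* Recursive factorial, following the metacode of Listing 1.               */
/* Returns 1 for n <= 0; that test is the stopper which breaks the chain   */
/* of calls. The result overflows a 32-bit long for n greater than 12.     */
long factorial(int n)
{
    if (n <= 0)
        return 1;                      /* stopper: no recursive call       */
    else
        return n * factorial(n - 1);   /* the routine calls itself         */
}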

How they work
Programming languages either provide for recursion or they do not. Recursive procedures can be written in any programming language, but if the language does not provide for recursion, it will be a difficult task to program recursive routines.
Why is this so? Recursion depends upon two major factors. The first factor is the linkage mechanism, used when one routine calls another. The linkage mechanism is responsible for saving whatever is necessary in the calling routine, setting up access to the parameters being passed and, finally, relinquishing control to the called routine while storing an address to which to return when the routine has completed execution.
There are two ways to pass parameters: by address and by value. Passing parameters by address causes the called routine to operate directly on the values of the parameters residing in the calling routine. If the called routine changes a parameter value, it is also changed in the calling routine. Passing parameters by value gives the called routine its own copies of the parameters, which may be changed independently of the parameters as they reside in the calling routine.
If you think about it, it is obvious that recursive routines must pass parameters by value. Otherwise, each successive call to the routine (by itself) will destroy the values in the earlier calls. By the same reasoning, the return address to the calling routine must be stored so that it is not destroyed by successive calls, or the called routine will never find its way back through the chain of linkage. Keep in mind, when a recursive routine calls itself, it does not load in another copy of the code to execute. The calling code and the called code are one and the same.
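In C terms, the difference looks roughly like this (a small sketch of my own, not taken from the article): a parameter passed by value arrives as a private copy, while passing its address lets the called routine change the caller's variable.

#include <stdio.h>

void by_value(int x)          /* x is a private copy of the caller's value */
{
    x = 99;                   /* the caller's variable is untouched        */
}

void by_address(int *x)       /* x holds the caller's address              */
{
    *x = 99;                  /* the caller's variable is changed          */
}

int main(void)
{
    int n = 1;
    by_value(n);
    printf("after by_value:   %d\n", n);   /* still 1 */
    by_address(&n);
    printf("after by_address: %d\n", n);   /* now 99  */
    return 0;
}
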
The second factor affecting recursion is the method of allocating storage to local variables, used only by the called routine. It should be clear from the discussion of parameter passing that the same space cannot be used for local variables from every call. Some sort of temporary storage must be allocated by the called routine, so that each call has its own copy of all local variables. In essence, recursive routines are characterized by having the same machine instructions operate on data areas which are uniquely defined for each call.
This should give you an appreciation for recursion's dependency on the programming language. If the linkage mechanism and local variables do not comply with the requirements of recursion, you must program your own saving and restoration of linkage, parameter, and local data items. Although this can be done by tricks within the language itself, it may be easier to write assembler-language routines to do the dirty work. For example, you could write an assembler interface between the calling routine and the called routine which allocated a temporary work area and made copies of linkage addresses and parameters.
FUNCTION FACTORIAL(N) 
INTEGER N 
IF N IS NOT GREATER THAN 0 THEN 
   RESULT IS 1 
ELSE 
   RESULT IS N * FACTORIAL(N-1) 
Listing 1.

Popular languages which do support recursion are Pascal, C, and Forth. Those which do not are BASIC, COBOL, and FORTRAN. Frankly, on a microprocessor, I believe that if you want to write a recursive routine, you should use a language which supports it. It is not worth the hassle to implement your own recursive capabilities. It might be worth it for a large application on a mainframe or minicomputer, due to other considerations in language selection.
The basic tool for allocating temporary storage is a stack. It consists of a block of computer memory and a pointer (usually a register) to indicate the area currently being used. When data is stored on the stack (commonly referred to as a PUSH), the pointer is automatically shifted to the next available area. You literally build a stack of data. When data is retrieved from the stack (referred to as a POP), the pointer is automatically shifted back so that the memory can be reused. Stacks can be used for many important programming functions, and they are the engine of recursion.
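As a rough C sketch of the idea (mine, not the article's), a stack is nothing more than a block of memory and a pointer that moves up on a PUSH and back down on a POP:

/* A tiny explicit stack: a block of memory plus a pointer to the area    */
/* currently in use. No overflow or underflow checks in this sketch.      */
#define STACKSIZE 64

int stack[STACKSIZE];
int *sp = stack;              /* points to the next free slot             */

void push(int value)          /* store a value; the pointer shifts up     */
{
    *sp++ = value;
}

int pop(void)                 /* shift the pointer back; slot is reusable */
{
    return *--sp;
}
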
When a routine is called in a recursive language, the stack is used to PUSH all linkage data, including parameter values. The called routine will then allocate additional space in the stack for all local variables. For the latter function, instead of using PUSHes and POPs, the routine explicitly moves the stack pointer enough to accommodate all local variables and references the space directly. When the routine completes execution, it moves the pointer back and POPs the return address. Thus, no matter how many recursive calls are made, each call gets its own area in the stack.
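You can watch this happen with a small experiment of my own devising (not part of the article): print the address of a local variable at each level of a recursive call, and a distinct stack area appears for every call.

#include <stdio.h>

void descend(int depth)
{
    int local = depth;            /* a fresh local variable for every call */
    printf("depth %d: local at %p\n", depth, (void *)&local);
    if (depth > 0)
        descend(depth - 1);       /* each recursive call gets a new frame  */
}

int main(void)
{
    descend(3);                   /* prints four distinct addresses        */
    return 0;
}
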
Most microprocessors have hardware stack support. That is, they provide registers and machine instructions for performing stack operations. In Pascal and C, the use of the stack is invisible to the programmer. One routine simply calls another and the object code contains the stack operations.
In Forth, the data stack is explicit at the software level (linkage is handled through the dictionary structure, a method unique to Forth). In fact, stacks are an integral part of the Forth language for all data storage, manipulation, and retrieval functions. If you want to learn all about what stacks can be used for, learn Forth. Stacks are one of the keys to Forth's astonishing speed as an interpretive language. Hardware stack support also simplifies assembler-language recursive programming. Lack of hardware stack support makes life difficult for assembler-language programmers and authors of compilers and interpreters.
The astute reader will have noticed a problem in the call-by-value discussion. Suppose the calling routine is passing a giant array to the called routine as a parameter. Is the entire array going to be duplicated in the called routine? In the stack?
In general, the answer is no. Arrays are passed by address, not by value, to avoid wasting storage. In C, the programmer has no choice. According to the language definition, individual variables are passed by value, and arrays are passed by address.
In Pascal, the programmer can dictate whether each parameter is to be passed by value (referred to, appropriately, as value parameters) or by address (referred to as variable parameters). But Forth is a different ball game because all words are universal and parameter lists, as such, do not exist. However, everything on the Forth data stack is essentially passed by value, and any variables, including arrays, are passed by address.
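A quick C illustration of the array rule (my own sketch, not from the article): the array name travels as an address, so a change made by the called routine shows up in the caller, while a plain integer travels as a copy.

#include <stdio.h>

void touch(int copy, int table[])   /* table[] is really a pointer         */
{
    copy = 0;                       /* changes only the local copy         */
    table[0] = 0;                   /* changes the caller's array          */
}

int main(void)
{
    int n = 5;
    int a[3] = { 5, 6, 7 };
    touch(n, a);
    printf("n = %d, a[0] = %d\n", n, a[0]);   /* prints n = 5, a[0] = 0    */
    return 0;
}
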
Does array passing by address cause a problem in recursion? It possibly could, depending upon the problem, but it usually does not. If you were writing a recursive routine which required a fresh copy of an array, passed as a parameter, for each call, you would have to ensure that each call had a copy with which to work.
However, it is the nature of most recursive algorithms that if an array is involved, it is something which is being scanned or manipulated at the element level from call to call, rather than being overhauled in its entirety.
Which brings us to another very useful (but not absolutely essential) recursive tool — the software pointer. The pointer allows indirect reference to a data item by an address reference rather than an explicit variable name. Pointers are efficient to use in accessing arrays because they avoid subscript calculations. They are also useful for passing arrays as parameters to called routines since the pointers themselves may be passed by value, and you avoid having to pass a subscript separately, thus saving a parameter. Pointers may be operated on arithmetically; to go to the next element of an array, you add one to the pointer for the array.
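As a concrete C example (again my own, not from the article), a pointer can walk an array without any subscript arithmetic; the zero-terminator convention used in the sample problem below makes the end of the array easy to detect.

#include <stdio.h>

int main(void)
{
    int element[] = { 4, 8, 15, 0 };   /* zero-terminated array             */
    int *p = element;                  /* p points at the first entry       */

    while (*p != 0) {                  /* the zero entry marks the end      */
        printf("%d\n", *p);
        p++;                           /* "add one" moves to the next entry */
    }
    return 0;
}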

What they can do
Unlike structured programming, recursive procedures should not be used for every program. (Some would argue that not every program should be structured either, such as mainframe on-line modules. However, that is another article.) The factorial function, for example, despite its elegance, is inefficient as a recursive procedure. The overhead of the recursive calls, in speed and probably even memory, is greater than that of a straightforward nonrecursive routine.
If you have encountered recursive procedures before, they were probably being used in an application with a treelike data structure, maybe artificial intelligence or system utilities, like sorting. Trees are the classic case where use of recursive procedures is essential. Wherever you are in a tree, it looks like (and, in fact, is) the top of another tree. So you develop routines which process from the tops of trees looking down and call themselves when they get to the top of a new (sub)tree.
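To make the tree case concrete, here is a minimal C sketch of my own (assuming a simple binary-tree node, which the article does not define) of a routine that processes a tree from the top down and calls itself for each subtree:

struct node {
    int          value;
    struct node *left;        /* each subtree is the top of another tree   */
    struct node *right;
};

/* Visit every node: process the top, then recurse into the two subtrees. */
void walk(struct node *top)
{
    if (top == 0)             /* empty (sub)tree: the stopper              */
        return;
    /* ... process top->value here ... */
    walk(top->left);
    walk(top->right);
}
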
The point I'm trying to make is that recursive procedures are very useful in a wide variety of applications beyond the classic cases. Modern languages make them more accessible to the programming community. But, somewhere along the way, we forgot to teach programmers about them.
I read an otherwise excellent book on Pascal, which mentioned in only one sentence that procedures and functions could call themselves, but then said it would not be discussed further — that is, it was a subject beyond the scope of the book. Now, I do not expect a book about a programming language to do a dissertation on recursion, but it could at least give a hint of the significance of this capability or a reference for further study.
Recursion is simply a form of looping where each call is a cycle, pass, or iteration of the loop. Once you begin to think recursively, you will look at almost any program which requires looping as a candidate for a recursive procedure. I do not mean the big loops that process record-by-record from a file. I am talking about looping logic: nested loops, search loops, and particularly indefinite loops which can end at any time and/or may fail along the way and have to be redone.
The first special thing about recursive looping is that you start fresh with each iteration. You only carry the baggage from the previous iteration that you want to carry. You can concentrate on a little piece of the problem at a time, limiting the program view to the immediate data situation.
The second special thing about recursive looping is that you are actually building a chain of iterations. You do not have to complete the loop unless you want to, that is, if you are successful in what you were trying to accomplish. If you fail, you can back up in the loop as far as you want.
I am sure this all sounds very esoteric. The best way to begin to appreciate what recursion can do for you is to study some sample recursive procedures and then start developing your own. Before we launch into my example, I would like to also refer you to an ingenious recursive algorithm for sorting called Quicksort.
Unlike many recursive sorts, it does not build tree structures. It uses recursion as a looping tool, as I described earlier. You can read about Quicksort in an article called "Bubble Sort, Insertion Sort, and Quicksort Compared" by Richard G. Larson in the premier issue of COMPUTER LANGUAGE.
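For the curious, the heart of Quicksort can be sketched in a few lines of C. This is my own simplified version, using the last entry as the pivot; see the referenced article for a proper treatment and performance comparisons.

static void swap(int *a, int *b)
{
    int t = *a; *a = *b; *b = t;
}

/* Sort v[lo..hi] in place.  Each call partitions the piece around a pivot */
/* and then calls itself on the two smaller pieces; a piece of zero or one */
/* entries is the stopper.                                                 */
void quicksort(int v[], int lo, int hi)
{
    int i, last;

    if (lo >= hi)                     /* zero or one entries: nothing to do */
        return;
    last = lo;                        /* v[hi] is the pivot                 */
    for (i = lo; i < hi; i++)
        if (v[i] < v[hi])
            swap(&v[i], &v[last++]);
    swap(&v[last], &v[hi]);           /* put the pivot in its final place   */
    quicksort(v, lo, last - 1);
    quicksort(v, last + 1, hi);
}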

A sample problem
In this final section I will present a problem and discuss the development of a recursive solution. It is a good example of the usefulness of recursion in everyday programming. I encountered it in my work, financial application systems, not in artificial intelligence and not even in systems programming.
Suppose you have two arrays, SUM and ELEMENT, such that each entry in the SUM array is the sum of one or more entries in the ELEMENT array. In addition, when all entries in the SUM array are factored into sums of ELEMENT entries, each ELEMENT entry is used once and only once as a factor.
The problem is to develop an algorithm that will discover how the SUMs can be factored into the ELEMENTs. The solution may or may not be unique. That is, there may be several ways to do the factoring. However, once we have the algorithm for finding one solution, we should be able to extend it to find all possible solutions.
This is a very practical problem. Given one set of things, composed of another set of things, we wish to uncover the composition. For simplicity, I use addition as the composition method. However, very little of the algorithm will actually depend upon the composition method, and it can be translated to other well-defined methods, even nonarithmetic methods.
One of the interesting things about this problem is that it does not sound very difficult. As an exercise, once you understand the problem, you should stop reading and try to write a conventional algorithm to solve it. You will discover that it is very messy.
Obviously, the basic approach is trial and error. It may have occurred to you that sorting the ELEMENTs and maybe even the SUMs will reduce the number of trials. An ELEMENT entry cannot be a factor of a SUM entry (or what is left of a SUM entry after partial factoring) if the ELEMENT entry is greater than the SUM entry.
However, no matter how you try to simplify it, you can still go a long way down the road only to discover that it's the wrong road. You can factor all but the last two SUMs and find that what you are left with in ELEMENTs will not work with the remaining two SUMs. This means one or more of the earlier SUMs is not properly factored. So you need to back up and try again.
From the recursive standpoint, I look at this problem as trying to find a thread through all of the ELEMENTs so that when the thread is followed, it will produce all of the SUMs. I need to find where to start the thread, where to go next, then next, etc., factoring the first SUM, then continuing until all of the SUMs are factored. I may hit a dead end at any point and have to back up, unraveling, to try a new thread.
Each step in the attempted thread will be a recursive call to a routine that will find a candidate for the next factor and then continue the thread by calling itself again. Any call may fail to find a next factor, in which case I break the chain of calls back to the nearest point where I can try an alternate thread. If all threads are tested, eventually I will find a successful one with a chain of calls running through all the ELEMENTs. That is, when I find a solution, my depth in recursive calls will be equal to the number of ELEMENTs.
This reminds us that recursion is not necessarily cheap, especially in terms of memory. If each call takes X bytes of the stack for linkage, parameter, and local data items, and I have Y ELEMENTs, I need X times Y bytes of stack memory for a work area. The old trade-offs of memory vs. speed and simplicity never go away.
However, let me get back to basics on developing the algorithm. Since SUMs and ELEMENTs with a value of zero are irrelevant to this problem, I can use the convention of terminating each array with a zero value (any zeros in either array to start with must, of course, be removed). This allows me to use pointers and not have to keep track of subscripts. If I find a zero value being pointed to in either array, I know I am at the end of that array.
Next, I need a method of recording my factoring results. How do I keep track of the fact that the third SUM has been factored into the second, sixth, and twentieth ELEMENTs? I have chosen to use a MARKER array running in parallel with the ELEMENT array.
If a MARKER entry is zero, the corresponding ELEMENT entry has not yet been assigned as a factor of any SUM. (MARKER is filled with zeros to start.) If a MARKER entry is not zero, the corresponding ELEMENT entry has been assigned as a factor of the SUM entry subscripted by the MARKER value (subscripts starting with one, not zero). MARKER values of one indicate factors of the first SUM; values of two, factors of the second SUM; and so on. Thus, the MARKER array is my thread at any point in the factoring. When the factoring is completed, no MARKER values of zero will remain, and the MARKER array records the factoring.
The procedure itself, as mentioned briefly earlier, will be a routine with the job of finding a candidate to be the next ELEMENT factor of a given SUM. If it is successful, it will continue by calling itself again. If it fails, it will indicate its failure to the previous call, so it can back up and try again.
The routine will need four parameters:
♦ A pointer to the current SUM being factored
♦ A pointer to the ELEMENT array, positioned to the entry with which to start the search for the next possible factor of the current SUM
♦ A pointer to the MARKER array, synchronized with the pointer to the ELEMENT array, that is, pointing to the same entry number
♦ A LEVEL number, which is really the subscript of the current SUM, to use in setting entries of the MARKER array.
In addition, the routine needs a global reference to the beginning of the ELEMENT and MARKER arrays. As each individual SUM is being factored, the pointers are moved through these two arrays. However, once a SUM entry is completely factored, and the next SUM entry is to be started, the routine must have a way of returning the search pointers to the beginning of these two arrays. Note that these pointers could also be passed as parameters, but since they are constants and would take up stack space as parameters, it is more efficient to make them globals.
Finally, the routine needs a method of returning its result. Did it find a factor or not? I have used a simple return-code switch. Zero means success, one means failure.
With this background, I can describe the routine in words:

1. Using the ELEMENT and MARKER pointers, scan for an ELEMENT entry which could be used as a factor of the current SUM. It must be unused (corresponding MARKER entry is zero) and less than or equal to the SUM. If one is not found, return failure code to previous call.
2. Subtract the ELEMENT entry factor from the current SUM and store the LEVEL number in the corresponding MARKER entry.
3. Increment the ELEMENT and MARKER pointers to the next entry.
4. If the current SUM entry is not reduced to zero, call the routine again using the CURRENT pointers and the REDUCED SUM value.
5. If SUM entry is reduced to zero, check to see if it is the last SUM entry. If it is the last SUM, return success code to previous call. If it is not, call the routine again using a pointer to the NEXT SUM entry, pointers to the START of the ELEMENT and MARKER arrays, and the NEXT HIGHER LEVEL number.
6. Check the result code of the recursive call in either step four or five. If successful, return success code to the previous call. If not successful, back out the processing in step two (add back the last ELEMENT entry to the current SUM and zero the last MARKER entry), then go to step one.

That is all there is to it! If you did try to develop a conventional routine, I am sure you can appreciate the simplicity of this algorithm. For reference, I have included the listing of this routine coded as a C language function (Listing 2). If you want to try an interesting recursion problem of your own, figure out how to extend and/or utilize this routine to find all possible factorings for a given SUM and ELEMENT array instead of just one.
Recursive procedures are fun — enjoy them!

John Snyder is a vice president at DISC Inc., Baltimore, Md.

/****************************************************************
**             MULTIPLE ADDITION FACTORING FUNCTION            **
**                                                             **
** name         factor                                         **
**                                                             **
** synopsis     result = factor(sum, element, marker, level);  **
**              int result;     function return code           **
**                              0 = factoring successful       **
**                              1 = factoring failed           **
**              int *sum;       array of sums to be factored,  **
**                              assumed to be terminated with  **
**                              a zero entry                   **
**              int *element;   array of factor elements,      **
**                              assumed to be sorted in        **
**                              ascending numerical sequence   **
**                              and terminated with a zero     **
**                              entry                          **
**              int *marker;    array of factor markers        **
**              int level;      current sum (subscript) number **
**                                                             **
**  globals     int *starte;    constant pointer to first      **
**                              entry of element array         **
**              int *startm;    constant pointer to first      **
**                              entry of marker array          **
**                                                             **
** description  Factor function will find the elements which   **
**              comprise the sums by recursive "threading"     **
**              analysis. Each call will try to identify an    **
**              entry in the element array which can be used   **
**              as an additive factor of the current entry in  **
**              the sum array. If the search is successful,    **
**              the factoring is continued by a recursive      **
**              call, until all sum entries are factored.      **
**              Each entry in the element array must be used   **
**              once, and only once, in the total factoring    **
**              of all sums.                                   **
****************************************************************/
/** GLOBALS TO START OF ELEMENT AND MARKER ARRAYS             **/
int *starte, *startm;
/** START OF FUNCTION                                         **/
factor(sum, element, marker, level)
int *sum, *element, *marker, level;
{
    /** LOCAL DATA DECLARATION                                **/
    int result;
    /** MAIN SEARCH LOOP                                      **/
    for ( ; ; ) {
        /** FIND AN UNUSED, POSSIBLE FACTOR OF SUM            **/
        for ( ; *marker && *element <= *sum; element++, marker++)
            ;
        /** CHECK FOR ELEMENT TOO BIG OR END - FAILURE        **/
        if (*element > *sum || *element == 0)
            return(1);
        /** FOUND ONE - DEDUCT FROM SUM AND MARK IT           **/
        *sum -= *element++;
        *marker++ = level;
        /** IS SUM COMPLETELY FACTORED ?                      **/
        if (*sum)
            /** IF NOT - GO FOR NEXT FACTOR                   **/
            result = factor (sum, element, marker, level);
        else
            /** IF SO - IS IT THE LAST SUM ?                  **/
            if (*(sum+1) == 0)
                return(0);
            else
                /** IF NOT - START ON NEXT SUM                **/
                result = factor(sum+1, starte, startm, level+1);
        /** WAS THREAD SUCCESSFUL ?                           **/
        if (result == 0)
            return (0);
        /** IF NOT - BACK OUT LAST FACTOR AND CONTINUE        **/
        *sum += *(element-1);
        *(marker-1) = 0;
    }
}
Listing 2.
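As a usage sketch (my own, not part of the original article), Listing 2 could be driven by a small test program appended to the same file. The sums are reduced in place, so the SUM array is consumed while the MARKER array records the factoring; with this sample data the routine should report 7 = 3 + 4 and 5 = 2 + 3.

#include <stdio.h>

int main()
{
    int sum[]     = { 7, 5, 0 };          /* zero-terminated sums              */
    int element[] = { 2, 3, 3, 4, 0 };    /* sorted ascending, zero-terminated */
    int marker[]  = { 0, 0, 0, 0, 0 };    /* parallel to element, all zeros    */
    int i;

    starte = element;                     /* globals required by factor()      */
    startm = marker;

    if (factor(sum, element, marker, 1) == 0)
        for (i = 0; element[i] != 0; i++)
            printf("element %d is a factor of sum %d\n", element[i], marker[i]);
    else
        printf("no factoring found\n");
    return 0;
}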

HTML conversion by Werner Cirsovius
January 2015
© CL Publications