The following article was printed in April 1981 of the magazine „Microcomputing".
The assembler source of the listing illustrated here will also be found in SIG/M1 volume 75. [Here the original 8080-source and here the adapted Z80-source for the M80 assembler]. Anywhere in the largeness of the use-net I found a similar utility of a Shell-Metzner sort; [here as 8080-assembler routine and here as Z80-assembler routine]. Find another sample in SIG/M volume 78.
An interesting article concerning sort of data.

No list of records is too short or long for this assembly-language Shell-Metzner sort.

Sorta Super Fast

By Albert J. Marino

This program will let you sort 32K in one minute. This means you can sort 2000 names in one minute and, in fact, can sort 10,000 names in a reasonable time. The routine can easily be used with any BASIC or other language as long as it supports external (user) routines and has a POKE or similar function.
I originally wrote a machine language program that worked on the algorithm shown in Fig. 1.
Fig. 1.
This routine was short and simple, required no scratch space and was about four times faster than a modified bubble sort. I have never seen this algorithm implemented by anyone else and it might be unique; it certainly is good enough for most applications.
However, when the files reached 2000 names, I found that this routine was taking two to three minutes. Since the time for the routine is approximately a function of N2, this meant that 10,000 names would take over an hour. I decided I needed a better routine.
The one I settled on was the Shell-Metzner sort shown in Fig. 2.
Fig. 2.
Listing 1 shows the 8080 assembly-language program for this sort. Of course, the program will work for any 8080- or Z-80-based computer. The listing happens to be assembled at location AB00H but can be reassembled for any starting address. Note that the routine takes 203 bytes, including the 16-byte scratch area at the beginning.

Using the Routine

Once the program is in memory, it can be used by any other program, as long as it is not overwritten. To use it follow these steps:

Test Results

To give times for comparison, I filled my memory with a random pattern of 0s and 1s. This minimum number of values for each location results in the maximum times for sorting. Test results are shown in Table 1a. The results for some of the same tests using the original algorithm are shown in Table 1b.
AMOUNT OF # # TIME TO SORT
MEMORY RECORDS BYTES/REC SECONDS
16K 4K 4 22
16K 2K 8 22
16K 1K 16 15
32K 8K 4 65
32K 4K 8 75
32K 2K 16 54
32K 1K 32 34
Table 1a.
16K 4K 4 775
16K 2K 8 285
16K 1K 16 110
Table 1b.
The improvement is significant especially for large numbers since the new algorithm time is approximately a function of N log(N).

An Example

In general, a good technique to use is:
Fig. 3.
As an example, suppose you want to do an alphabetical sort of names stored in a file in ASCII format (this is the normal string and file format of most languages). Getting the correct results is not difficult, but it does require some care. Consider the following correctly sorted names:
de la Fleur, Conrad
Diebold, Tom
DiGeorge, Olaf
DiGeorgeo, Dick
di Georgeo, Len
Di Georgeo, Pete
Dillingham, Sam
If you had sorted these names by just loading last name first name, into memory, you would have gotten:
Di Georgeo, Pete
DiGeorge, Olaf
DiGeorgeo, Dick
Diebold, Tom
Dillingham, Sam
de la Fleur, Conrad
di Georgeo, Len
To get around this problem, make the following changes as the names are loaded into memory for sorting:
For example, de la Fleur, Conrad, becomes "delafleur,conxx" and Jones, Sam, becomes "jones,sam00000xx," where xx is the pointer and the 0's are blanks. This allows 14 letters to sort on. With over 6000 names, I have not had a mistake caused by this length restriction; with less names you might get by with even less.
After you've done the above and sorted using this routine, you can use the now-sorted pointers to access the original records in alphabetical order.

Large Files

If the file is too large to be loaded into memory, even using the above method of loading only the sort field then you need an additional technique. One method, if you don't know anything about the distribution of the field you're sorting on, is:
This method works for files of any size, even ones that span multiple disks.

Conclusion

The sorting routine given here should let you sort your files on any field within a reasonable time. For those who wish, this program is available on a CP/M eight-inch disk or TRS-80 five-inch disk for $15 from Compleat Systems, 9551 Casaba Ave., Chatsworth, CA 91311 (213-993-1479).
The disk also contains a Z-80 version that takes 25 percent less space and runs 25 percent faster, and two additional utilities (search and move).

Address correspondence to Albert J. Marino, Compleat Systems, 9551 Casaba Ave., Chatsworth, CA 91311
1. SIG/M (Special Interest Group/Microcomputers), a part of the Amateur Computer Club of New Jersey, used to compile public-domain software into disk volumes for regular release. The SIG/M disk set was started in 1980 and volumes 000 up to 310 may be found here.

Scanned by Werner Cirsovius
August 2002, extension December 2009
© Microcomputing