Im Magazin „BYTE" wurde in der Ausgabe September 1976 der folgende Artikel abgedruckt.
Ein grundlegender Artikel über die Wandlung in und von verschiedenen Zahlenarten.
Im Anhang1 findet sich ein Testprogramm für alle Zahlenarten.
|
|
How to Do a Number of Conversions
James Brown
2518 FinleySt #636
Irving TX 75062
|
Perhaps one of the more difficult tasks on any small computer is the conversion from numcric characters to a form usable by the machine and back again.
That is, given some type of input output device (Teletype or TV typewriter) connected to your computer, it would be desirable to have the capability of entering a string of numeric characters (consecutive digits) through the keyboard.
The computer would then perform some operation on that number.
Finally, the result of that compulation is displayed back on the IO device.
Since the computer's natural language is N bit binary (i.e., ones and zeros), how can such a string be converted?
An example of the problem is: How do I convert the three character decimal string '196' into the binary integer equivalent 1100 0100 (or octal 204, or hexadecimal C4)?
Table 1:
Hexadecimal Codes of Setected ASCII Characters (high order bit assumed zero).
|
Hexadecimal | ASCII |
Hexadecimal | ASCII |
Hexadecimal | ASCII |
Code | Character |
Code | Character |
Code | Character |
|
|
|
|
|
|
00
:
0A
:
0D
:
20
:
2B
2C
2D
2E
2F
|
NUL
line feed
car. ret.
space
+
,
-
.
/
|
30
31
32
33
34
35
36
37
38
39
3A
|
0
1
2
3
4
5
6
7
8
9
:
|
40
41
42
43
44
45
46
47
|
@
A
B
C
D
E
F
G
|
|
Converting a decimal (base 10) number into binary can be a long and involved operation.
Let us work our way into decimal conversion by considering what would be necessary to do the following conversions in order of increasing complexity:
- Binary character strings (ASCII 0 or 1) to or from unsigned 8 bit integers.
- Octal character strings (ASCII 0 to 7) to or from unsigned 8 bit integers.
- Hexadecimal character strings (ASCII 0 to 9, A to F) to or from unsigned 16 bit integers.
- Signed decimal character strings (ASCII 0 to 9, +, -) to or from signed 16 bit integers.
Before we start, let us examine what the computer sees when a character is read from the keyboard, assuming that the keyboard speaks ASCII.
Examining
table 1, notice that each character is assigned a uniquc binary value.
Not only are the numeric characters 0 thru 9 grouped together;
but, if the left hand four bits were dropped, there would be a direct correspondence to the binary equivalents, of 0 thru 9.
As shown below, this is a fairly simple lask:
Algorithm:
'ASCI char' (AND) (0000 1111) = result
Examples
'0': (0011 0000) (AND) (0000 1111) = 0000 0000
'1': (0011 0001) (AND) (0000 1111) = 0000 0001
'9'; (0011 1001) (AND) (0000 1111) = 0000 1001
In each case, the result is a binary number in the low order of the byte after the AND operation has masked the high order bits.
Binary Conversions
Converting the ASCII character codes for 1 and 0 into a true binary value is perhaps the simplest to actually implement, and is a good starting point in understanding how number conversions work.
All of the other routines follow the basic plan presented here.
In the preceding, zapping the left four bits to get a binary value has one fatal flaw;
it only works for one character.
In developing something to handle a two character string such as '10', it might as well accept ASCII strings with any length, as long as the result can be contained in eight bits (an arbitrary choice).
The simplest way of doing this is to perform the conversion one character at a lime as they are entered and develop the result as each character of the string ('1' or '0') is processed.
Clearly the first step is to read the character and convert it into the binary value 1 or 0, using the masking technique shown earlier.
Since most computers have some type of shift instruction (see
note 1), this is an effective way of moving the new bit into the result which is being calculated.
Specifically, we must shift the result left one bit and then OR the new converted value to it.
This is malhematically equivalent to multiplying by 2 and adding.
For example, the four character binary string '1011' is entered and converted to the binary number 1011.
This is equivalent to the expression:
1 * 23 + 0 * 22 + 1 * 21 + 1 * 20 = 11 (base 10)
and could be accomplished by the following sequence:
- answer: = 0
- INPUT character
- character: = character (AND) 01 (hex)
- answer: = answer (SHIFT LEFT) 1
- answer: = answer (OR) character
- GOTO 2.
If those four characters were all I wanted to enter, I now need to teil the computer to stop looping, since there is a possibility of entering as many as eight characters.
The simplest way of doing this is to have the routine recognize some sort of delimiter (ie: some character other than '0' or '1').
Looking, once again, at
table 1, the characters, space, period, comma, carriage return, line feed, are all less than the character '0', when considered as binary values.
This condition is rather handy, since the same set of machine instructions could recognize a variety of delimitcrs without rewriting if I want to change what delimeter means.
Looking further, if the special characters between the 1 and A are excluded as delimiters, the following pair of tests checks for both delimiters and invalid characters.
- If the character is less than a '0' then finished.
- If the character is greater than a '1' then illegal character.
There is one further consideration that this routine should take into account.
The routine should check for a string of characters whose value would exceed the maximum value which could be contained in 8 bits (anything over 255 decimal).
Notice that the routine really cannot count the number of characters entered since nine zeros and a one are still the value one, even though 10 characters were processed.
Most computers have something called a carry bit or overflow flag.
During a shift left this carry bit usually receives the most significant bit from the register being shifted.
Thus, as soon as the carry bit becomes a one, then the result has overflowed 8 bits;
and the number being entered is too big.
Figure 1a shows the detailed flow of the binary input procedure;
listing 1a shows the 8080 assembly code of this procedure.
Listing 1a:
The BIN Routine Specified for an 8080.
This listing, as all the listings of this article, shows the symbolic code and absolute machine code for an 8080 version of the routine.
The notes refer to absolute addresses which must be adjusted when relocating the code to some address in memory address space:
BIN reads the '1' and '0' characters of an ASCII encoded binary string, leaving up to 8 bits of input in B.
[8080 Quelle and Z80 Quelle] |
|
Output is simply the reverse process but has error checking omitted.
Since the input was left to right, the output should be the same.
(It is extremely frustrating to enter the character string '1100' and have the string'0011'printed out)
Fortunately most computers have a route left instruction (
note 1).
If I choose to always print 8 characters per 8 bit value (after all, the computer is working, not me), the output routine should perform the following steps:
- value = value (ROTATE LEFT) 1
- character = value (AND) 1
- character = character (OR) '0' (ASCII character code for '0' is hex 30)
- OUTPUT character
- GOTO 1.
|
Listing 1b:
The BOT Routine Specified for an 8080.
This routine writes out a string of 8 binary encoded ASCII digits, taken from the B register.
[8080 Quelle und Z80 Quelle] |
Figure 1b contains the flow diagram for this procedure, and
listing 1b shows typical code for an 8080 Computer.
Figure 1a:
The BIN Routine Flowchart.
This routine treats successive ASCII '0' and '1' characters of Input as the digits of a binary string.
The digits are shifted into ANSWER until an illegal character or overflow returns control.
In the 8080 code of listing 1a, ANSWER is register B.
|
Figure 1b:
The BOT Routine Flowchart.
BOT is a binary output routine which writes an 8 digit ASCII binary string converted from ANSWER.
The digits are printed high order first in a loop which shifts out the bits one by one.
In the 8080 code of listing 1b, ANSWER is supplied by regster B.
|
|
|
Octal Conversions
For octal input from strings with ASCII characters '0' thru '7', the binary input routine can be used with some modifications.
As shown in
figure 2a, the illegal character check now looks for something greater than a '7', the shift left is now three bits instead of one, and the mask used on the character during the logical AND Operation is now an octal 7.
The octal output routine was a bit of a problem because the vaiue is an 8 bit quantity.
Hence, the routine must process the first two bits, then the next three, then the next three, left to right, as indicated on the fiow chart.
In my implementation, the 8080 had a rotate which would flow through the carry flag.
Thus the bits as they are handled are shown below, after the value is loaded into the A register and carry reset to zero.
Carry A Register
0 bb bbb bbb
RAL: b bb bbb bb0
RAL: b bb bbb b0b
RAL: b bb bbb 0bb
At this point carry and the A register are saved and a character put out.
Processing then continues at the first rotate, after the saved information is restored.
The A register plus carry, in effect, operates as if the machine has a 9 bit register.
Listing 2a:
The OIN Routine Specified for an 8080.
This routine accepts an inptut string of ASCII octal characters and collects the results in ANSWER (CPU register B).
Conversion ends with invalid characters or an overflow.
[8080 Quelle und Z80 Quelle] |
|
|
Listing 2b:
The OOT Routine Specified for an 8080.
This routine converts the contents of ANSWER (CPU register B) into a 3 digit ASCII string of octal characters, transferring the result to the output device during the conversion.
[8080 Quelle und Z80 Quelle] |
Figure 2a:
The OIN Routine Flowchart.
OIN is the octal version of an input routine;
its logic is an extension of the simpler BIN routine.
OIN treats successive characters from ASCII '0' to '7' as octal digits which are shifted into ANSWER.
The mutine accepts input until an illegal octal character or overflow occurs.
In the 8080 code of listing 2a, ANSWER is register B.
|
Figure 2b:
The OOT Routine Flowchart.
OOT is the octal version of an output routine for character string conversion.
Its logic is complicated by the fact that 8 bits is not an even multiple of 3 bits.
Thus there is a speciat case which treats the carry flag as a ninth bit for the first (high order) output digit.
Then the basic logic consists of shifting 3 places, extracting 3 bits and creating an ASCII character from '0' to '7'.
This routine in its 8080 implementation uses the stack as a temporary data area, as shown in listing 2b.
|
|
|
Hexadecimal
Input and output of hexadecimals employs logic similar to the preceding routines, with the following differences:
- ASCII '0' through '9' and 'A' through 'F' are legal numbers.
- The shift left is now four bits.
- On input if the character is ASCII 'A' through 'F', then a binary 9 is added lo generate a correct value in the low order 4 bits which are then masked as usual:
ASCII A = hexadecimal 41 + 09 = 4A (and) 0F = 0A
- On output if a 4 bit binary value is greater than a 9, then a 7 is added to the value.
The conversion is then completed by adding hexadecimal 30, the ASCII code for 0 (zero).
For example:
00 + 30 = 30 or ASCII '0'
09 + 30 = 39 or ASCII '9'
0A + 07 = 11 + 30 = 41 or ASCII 'A'
0F + 07 = 16 + 30 = 46 or ASCII 'F'
The software of 16 bit unsigned hexadecimal input and output conversion is shown in listings
listings 3a and
3b as implemented for an 8080 computer.
The flow charts of
figures 3a and
3b outline the logic for adaptation to other computers.
When this was implemented, an arbitrary choice was made to use 16 bit values instead of 8 bit.
This can lead to some inconvenience on an 8 bit microprocessor without 16 bit operations.
However, certain instructions were available on the 8080 to perform double register operations (two 8 bit registers treated as a single unit).
The 8080 DAD instruction performs 16 bit addition on the (H,L) register pair using another specified register pair.
When the 8080 instruction DAD H is encountered, the vaiue in (H,l) is doubled, thus in effect shifting that pair of registers left one bit.
For input shifting, it was a simple matter of performing four of these and then using an OR to the low order 8 bits from the value generated as a result of step 3 above.
Output necessitated four groups of DAD H and RAL operations to shift a bit into carry, then rotate it into the A register before step 4 was performed (see
listing 3b).
|
Listing 3a:
The XIN Routine Specified for an 8080.
This rouline accepts an input string of ASCII hexadecimal characters and collects the results as a 16 bit number in ANSWER (CPU register pair H and L).
[8080 Quelle und Z80 Quelle] |
|
Listing 3b:
The XOT Routine Specified for an 8080.
This routine converts the contents of ANSWER (CPU register pair H and L) into a 4 digit ASCII string of hexadecimal characters, transferring the results to the output device with PUT.
[8080 Quelle und Z80 Quelle] |
Figure 3a:
The XIN Routine Flowchart.
XIN is the hexadezimal version of the input algorithm, with the extension of accepting 16bit values.
The XIN routine tests for the validity of the hexadecimal digits, then converts the low order bits to a binary version of the digit.
This value is then shifted into the ANSWER being prepared.
In the 8080 Version of this routine (listing 3a), ANSWER becomes the HL index register pair, and the 8080's double precision addition operation is utilized.
Conversion terminates with an invalid character or when overflow occurs.
|
|
|
Figure 3b:
The XOT Routine Flowchart.
XOT converts a 16 bit quantity in ANSWER into a series of ASCII hexadecimal characters, starting with the high order digit.
The logic shifts out 4 bits at a time into the accumulator, adjusts the value if alphabetic codes are present then prints the ASCII version obtained by adding '0' to the value.
Four digits are created and printed prior to return.
|
|
Decimal Integer Conversions
Purely out of habit, I choose to use leading minus sign to indicatc negative, ASCII '-', with '+' or nothing to indicate positive integers.
Again I felt that a 16 bit routine would be more useful than an 8 bit one, allowing two's complement binary values for 32767 to -32768 instead of 127 to -128 (see
note 2).
Input was fairly straightforward, as shown by
listing 4a and
figure 4a.
If the first character read is a '-', set the minus flag.
Then for all numbers read, if the minus flag is set, the value is negated.
The developing answer is multiplied by 10 and the new value read added to it.
The implementation shown performs multiplication by repeated addition for simplicity, although a hardware multipiy instruction would certainly improve performance if it were availabie.
Decimal output, unfortunately, could not be made quite so simple, primarily because there really exists no decimal (base 10) left shift.
This left two alternatives, either repetitively diyide by 10 stacking the remainders, or perform a succession of pseudo divisions by subtracting appropriate constants.
The latter technique was chosen due to the complexity of multi register division.
The code of such a routine for an 8080 is shown in
listing 4b, and the corresporiding flow chart is
figure 4b.
The output routine checks the initial value to determinc if it is negative, and if so, output the ASCII character '-'.
If the value is negative, it is negated (making it positive) so that positive and negative numbers can be handled the same way.
A table containing powers of 10 (10,000; 1,000; 100; 10; 1) was then utilized to perform pseudo divisions by successive subtraction.
This is outlined in the flow diagram in
figure 4b.
For the 8080 implementation, there is no 16 bit subtraction, hence a multiple precision subtract operation is coded.
The handling of signed numbers is optional, as well as the zero suppression.
They were included because it is easier to take them out than to try to divine where they go and how to do it.
Many microprocessors have an instruction which maintains decimal numbers.
Given the 8 bit quantity hexadecimal 79, assume a hexadecimal 02 is added to it, giving the hexadecimal value 7b.
This instruction then can be used to adjust this result back to two decimal digits, 4 bits each.
The value then would appear as hexadecimal 81, which can be thought of as adding the decimal numbers 79 + 2, giving 81.
If computations are to be made in this packed decimal mode, then the hexadecimal routine presented could be used to input and output these values.
In conclusion, these routines are not presented as the final answer in riumber conversions.
In order to implement any or all of these routines on your own personal computer, the flow diagrams may be more useful than the sample 8080 implementation.
That implementation is targeted for Intel's 8080 microprocessor, one of the most widely used hobby computers at the time of this writing.
All the routines made full use of certain special features and strange quirks of the 8080 microprocessor.
Whatever your particular machine, the time spent in understanding these routines should save you a few headaches in your next program.
|
Listing 4a:
The DIN Routine Specified for an 8080.
This routine converts an ASCII decimal string of the form 'SXXXXX' into a signed 16 bit quantity in ANSWER (the CPU's H and L register pair).
The 'S' can be either '+', '-' or a null string (");
the 'X' can be a decimal digit '0' to '9' or a null string.
(Thus a successful canversion can involve from 1 to 6 characters.)
Conversion is terminated by an overflow or an invalid character.
[8080 Quelle und Z80 Quelle] |
|
TENSTABLE:
LOCATION VALUE(DECIMAL) (HEX)
0 10 000 2710
2 1 000 03E8
4 100 0064
6 10 000A
8 1 0001
NOTE:
INTEL FORMAT IN LISTING 4b REQUIRES LOW
ORDER HEXADECIMAL BYTE AT FIRST (LOW)
ADDRESS.
|
Listing 4b:
The DOT Routine Specified for an 8080.
The routine converts the signed two's complement number in ANSWER (register pair H and L) into an ASCII signed decimal string with leading zero suppression.
The result is sent to the output device during the conversion.
[8080 Quelle und Z80 Quelle] |
|
Assumptions
The assumptions for the procedures of this article are:
- An input and output subroutine exists (GET and PUT) which preserve CPU registers except A.
- The conversion process is itself a subroutine.
- The conversion process need not save any registers.
- Validating characters is done (though not necessary).
- Overflow checking is done (again not necessary and in some instances not desirable).
- All values are treated as unsigned integers (except the decimal routines).
- Non significant leading zeros are not required on input.
- Leading zeros are printed on output (except for decimal).
|
NOTES
Note 1:
During a left shift, as the high order bit
leaves the register, it enters the carry bit and
the vacated low order bit receives a zero.
For example: Before : Carry=0 A=1001 0111
After : Carry=1 A=0010 1110
During a rotate left, as a bit leaves the high
order bit position, that value is shiftad into the
vacated low order bit position. On the Intel 8080,
two types of rotate are available:
1. RRL : rotate accumulator copying
swapped bit to carry.
before: Carry=0 A=1001 0111
after: Carry=1 A=0010 1111
2. RAL : rotate accumutator thru carry
before: Carry=0 A=1001 0111
after: Carry=1 A=0010 1110
On Computers with a rotate through the
carry bit, new bits can be shifted into the
accumulator while old bits are shifted out.
|
Note 2:
Two's complement arithmetic uses the high
order bit of a value to indicate sign; 1 is
negative and 0 is positive. A negative value is
formed by complementing all bits of the value
(1 to 0 and 0 to 1) and adding one. Thus, the
largest positive value for a 16 bit quantity is a
hexadecimal 7FFF, and the smallest negative
value is a hexadecimal 8000, or decima) 32767
to -32768. The 8 bit values are 7F to 80 or
127 to -128.
For example: given the value 1, create the
value -1.
0000 0001 = 1 Start with 1
1111 1110 Complement all 16 bits
+1 Add 1
1111 1111 =- 1 Giving the value -1.
|
|
1. |
Das Programm CONV testet die Ein- und Ausgabe der verschiedenen Zahlenarten
|
Eingescanned von
Werner Cirsovius
April 2012
© BYTE Publications Inc.