Next: Implications of Finite Word
Up: The Programming Environment
Previous: The Interpreter and Compiler
While the basic memory unit in a digital computer is the bit,
holding a value of 1 or 0 (equivalently, ON or OFF),
most information of interest to the user requires more than a simple
two-valued representation. It is therefore practical to form groups of bits,
each consisting of some carefully chosen number of bits, and to design the
computer hardware to deal with these groups as complete units. In modern
computers the most basic group consists of 8 bits and is called the byte.
Most computers also implement groupings of 2, 4, and 8
bytes for purposes to be discussed below. Figure 1.2
illustrates the most frequently used word sizes.
Figure 1.2:
The bit, holding a value of 1 or 0, and typical groups of bits
of practical use in computer memory and for computation.

This is a useful approach from the user's point of view, and also from
the technical implementation point of view. If the user requires
information in chunks of 8, 16, or 32 bits, then there is no reason not
to build the hardware so that it can transfer 8, 16, or 32 bits in one
cycle rather than one bit at a time, and a similar advantage can be gained by
building the basic arithmetic and logic unit so that it can
perform the basic operations of arithmetic and logic on these larger
chunks as well. This is the first and by far the most significant step
towards implementing parallelism in computers. Most computers are now
designed to handle long words, consisting of 32 bits.
The task of the computer is to process information, which it can
represent only by various bit patterns, in a manner defined by the user's
computer program. The significance of these bit patterns as data lies
with the user and programmer, who have considerable freedom in choosing
the representations. In fact, in most programs, different memory
locations will be defined to have completely different interpretations
for any bit pattern. This leads to the possibility of assigning a data type to each variable in a program. The usual data types
(we will use the names familiar from the C language; other
languages have equivalents) include the unsigned integer (non-negative), the signed integer, and the float (a finite subset of the rational numbers),
as well as the char for single characters. Most numeric data types
exist in at least two sizes. The integer data type usually
has a short (usually 16 bits) and a long size. Similarly,
the float type usually contains 32 bits, and its larger counterpart
is the double with 64 bits.
Figure 1.3:
A simple array of bytes in memory. This example illustrates an array as
a sequence of cells of the same data type, in this case interpreted as a
vector, since it is a one-dimensional array. This vector is also
an example of a character string, since the basic data type is the
byte, used here to hold one character, with the additional property that
it is terminated by a special character, the null character, which
has the value 0 and is represented by the notation '\0'. The
array is shown twice, the first copy showing the numerical value (in
decimal) of each byte, and the second the interpretation of these
values as characters in the ASCII alphabet. Above the array of cells the
address of each byte is shown, and it is clear that the bytes are
contiguous in memory. The starting address of 123 is of no particular
interest, and could lie anywhere in memory.

Of course, it is straightforward to develop the idea of a sequence of
memory cells holding the same data type, and sharing a common name, with
only a sequential index to distinguish them. This then allows the use of
arrays, vectors, and character strings as illustrated
in Figure 1.3. In scientific computation this makes
matrices quite easy to implement.
While the implications of the finite number of bits used in each data
type are easy to understand for the integer types, they are quite
complicated for the float and double data types. In
particular, there is always a temptation to treat the float and
double data types as if they were real numbers, which they
most definitely are not. In fact some languages use the data type name
real, which adds to the confusion.
Charles Dyer
20020424