IDL DATA TYPES

1.  DIGITS, BITS, BYTES, AND WORDS

    We have reached the point where you need to know a little about the internal workings of computers: specifically, how a computer stores numbers and characters.

    Humans think of numbers expressed in powers of ten, or decimal numbers.  This means that there are 10 digits (0 -> 9), and you begin counting with these digits.  When you reach the highest number expressible by a single digit, you use two digits and generate the next series of numbers, 10 -> 99, and so on with more digits.  This can be expressed mathematically; note the following, using 5198 as an example.

5198 = 10^3 * 5 + 10^2 * 1 + 10^1 * 9 + 10^0 * 8

    Fundamentally, all computer information is stored in the form of binary numbers, meaning powers-of-two.  How many digits?  Two!  They are 0 and 1.  The highest number expressible by a single digit is 1.  The two-digit numbers range from 10 to 11 and so on with more digits, but what decimal numbers do these binary numbers represent?  Let's look at an example using 1101.  Notice that we use the base of 2 instead of 10.

1101 (binary) = 2^3 * 1 + 2^2 * 1 + 2^1 * 0 + 2^0 * 1 = 8 + 4 + 0 + 1 = 13 (decimal)

    But wait a minute!  The word "digit" is a misnomer - it implies something about 10 fingers.  Hence, it's the word bit that is appropriate.  Each binary "digit" is really a bit.  So the binary number 1101 is a 4-bit number.  What decimal number does the binary number 1001 equal?
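
    If you want to check the expansion of 1101 at the IDL prompt, here is a minimal sketch.  The variable names (bits, n, value) are made up for this illustration; the idea is simply to add up bit * 2^position, exactly as in the sum above.

```idl
; Evaluate the binary number 1101 by summing bit * 2^position.
; The names bits, n, and value are illustrative only.
bits = [1, 1, 0, 1]              ; most significant bit first
n = n_elements(bits)             ; number of bits (4 here)
value = 0
for i = 0, n - 1 do value = value + bits[i] * 2^(n - 1 - i)
print, value                     ; prints 13
```

    Change the contents of bits to try other binary numbers.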

    For convenience, computers and their programmers group the bits into groups of eight.  Each group of 8 bits is called a byte.  Consider, then, the binary number 11111111; it's the maximum-sized number that can be stored in a byte.  What is this number?

    Finally, computers group the bytes into words.  The oldest PCs dealt with 8-bit words - one byte.  The Pentiums and Sparcs deal with 32-bit words - four bytes.  What's the largest number you can store in a 4-byte word?  And how about negative numbers?  We'll learn answers to these questions below.

    Below we describe how IDL (and everybody else) gets around this apparent upper limit on numbers.  They do this by defining different data types.  We don't cover all data types below - specifically, we omit the Complex (yes, complex numbers!) data types, as well as the hexadecimal and octal notations for writing integer constants, which you can look up if you are interested.  Please refer to §2.2 in your textbook, "Practical IDL Programming", for more information.

2.  INTEGER DATA TYPES IN IDL

    Integer data types store the numbers just like you'd expect.  IDL supports integers of four different lengths: 1, 2, 4, and 8 bytes.  The shorter the word, the less memory required; the longer the word, the larger the numbers can be.  Different requirements require different compromises.

2.1.  1 byte:  The Byte Data Type

    The byte data type is a single byte long and never negative.  Therefore, its values run 0 -> 255.  Images are always represented in bytes.  The data might not be in bytes, but the numbers that the computer sends to the video processor card are always bytes.  Video screens require lots of memory and really quick processing speed, so bytes are ideal.  You generate an array using bytarr() for a zeroed array or bindgen() for an index array; you can generate a single byte variable by saying x=3b, for example.  If a byte number exceeds 255 during a calculation, then it will "wrap around"; for example, 256 wraps to 0, 257 to 1, etc.
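
    The following lines are a short sketch you can type at the IDL prompt to see byte variables and wrap-around in action.  The variable names are arbitrary choices for this example.

```idl
x = 3b                  ; a single byte variable (note the b suffix)
help, x                 ; reports the type as BYTE
z = bytarr(4)           ; four bytes, all set to zero
idx = bindgen(4)        ; byte index array holding 0, 1, 2, 3
print, idx
y = 255b                ; the largest value a byte can hold
print, y + 1b           ; byte arithmetic wraps around: prints 0
print, byte(257)        ; converting 257 to a byte: prints 1
```

    Note that the wrap-around happens silently; IDL does not warn you.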

2.2.  2 byte:  Integers and Unsigned Integers

    With 2 bytes, numbers that are never negative are called Unsigned Integers.  They can range from 0 -> 256^2 - 1, or 0 -> 65535.  You generate an array using uintarr() for a zeroed array or uindgen() for an index array.  How do you think unsigned integers wrap around?
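
    Here is a small sketch for experimenting with unsigned integers.  The U suffix on a constant (not mentioned above) marks it as unsigned; the variable names are made up for the example.  The last line is left for you to try.

```idl
u = uindgen(5)          ; unsigned index array holding 0, 1, 2, 3, 4
help, u                 ; reports the type as UINT
z = uintarr(3)          ; three unsigned integers, all zero
big = 65535U            ; largest unsigned integer; U marks the constant as unsigned
print, big + 1U         ; try it: does the result match your guess?
```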

    Normally you want the possibility of negative numbers and you use Integers.  Two bytes give 256^2 = 65536 possible values, and half of them, 256^2 / 2 = 32768, are negative.  The other half must cover zero as well as the positive numbers, so there is one more negative value than there are positive values: integers cover the range -32768 -> 32767.  You generate an array using intarr() or indgen().  What happens with wrap around?  What if x = 5, y = 30000 and z = x * y?  Check it out!
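
    One way to run this experiment is sketched below.  The second half shows a common workaround, converting one operand to a 4-byte long with long() before multiplying; the name zl is just an illustrative choice.

```idl
x = 5
y = 30000
z = x * y               ; both operands are 2-byte integers, so the product is too
print, z                ; not 150000: compare with the range -32768 -> 32767
zl = long(x) * y        ; promote one operand to a 4-byte long first
print, zl               ; prints 150000
```

    The point of the comparison is that the type of the operands, not the size of the answer, decides how the result is stored.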

2.3.  4 bytes:  Long Integers and Unsigned Long Integers

    The discussion here is exactly like that for 2-byte integers, except that 256^2 becomes 256^4.  What are the limits on these numbers?  See IDL help under "Data Types" and "Integer Constants" for more information.  You generate arrays using ulonarr() or ulindgen() and lonarr() or lindgen().
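
    As a quick sketch, the lines below create long and unsigned long values.  The L and UL suffixes on constants are standard IDL notation that is not covered above; the variable names are arbitrary.

```idl
a = 3L                  ; L suffix: 4-byte long integer constant
b = 3UL                 ; UL suffix: 4-byte unsigned long constant
help, a, b              ; reports LONG and ULONG
la = lindgen(4)         ; long index array holding 0, 1, 2, 3
ua = ulindgen(4)        ; unsigned long index array
lz = lonarr(4)          ; zeroed long array
uz = ulonarr(4)         ; zeroed unsigned long array
help, la, ua, lz, uz
print, 30000L * 5L      ; prints 150000, which would not fit in 2 bytes
```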

2.4.  8 bytes:  64-bit Long Integers and Unsigned 64-bit Long Integers

    The discussion here is exactly like that for 2-byte integers, except that 256^2 becomes 256^8.  What are the limits on these numbers?  See IDL help under "Data Types" and "Integer Constants" for more information.  You generate arrays using ulon64arr() or ul64indgen() and lon64arr() or l64indgen().
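
    A similar sketch works for the 8-byte types.  The LL and ULL suffixes are standard IDL notation for 64-bit constants (an addition to what is described above), and the variable names are illustrative.

```idl
a = 3LL                 ; LL suffix: 8-byte (64-bit) integer constant
b = 3ULL                ; ULL suffix: unsigned 64-bit integer constant
help, a, b              ; reports LONG64 and ULONG64
la = l64indgen(4)       ; 64-bit index array holding 0, 1, 2, 3
ua = ul64indgen(4)      ; unsigned 64-bit index array
lz = lon64arr(4)        ; zeroed 64-bit array
uz = ulon64arr(4)       ; zeroed unsigned 64-bit array
print, 100000LL * 100000LL   ; prints 10000000000, too large for 4 bytes
```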

3.  FLOATING DATA TYPES IN IDL

    The problem with integer data types is that you can't represent anything other than integral numbers - no fractions!  Moreover, if you divide two integer numbers and the result should be fractional, it won't be; instead, the fractional part is discarded (e.g., 5/3 is calculated as 1).  To get around this, the floating data type uses some of the bits to store an exponent, which may be positive or negative.  You throw away some of the precision of the integer representation in favor of being able to represent a much wider range of numbers.
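
    You can see the difference between integer and floating division with a few lines at the IDL prompt.  This is only a sketch; the particular numbers are chosen for illustration.

```idl
print, 5 / 3            ; integer division: prints 1
print, (-5) / 3         ; the fraction is dropped, truncating toward zero: prints -1
print, 5. / 3           ; one float operand makes the result a float: prints 1.66667
print, float(5) / 3     ; explicit conversion with float(): same result
```

    A single floating operand is enough to make IDL carry out the whole division in floating point.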

3.1.  4 bytes:  Floats

    "Floating point" means floating decimal point - it can wash all around.  With Floats, the exponent can range from about -38 -> +38 and there are about 6 digits of precision.  You generate an array using fltarr() or findgen() and a single variable by including a decimal point (e.g., x = 3.) or using exponential notation (e.g., x = 3e5).
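
    The sketch below creates a few floats and then probes the limited precision.  The test value 1e-8 is an arbitrary choice, picked because it is smaller than what a 4-byte float can resolve next to 1.

```idl
x = 3.                  ; a decimal point makes the constant a float
y = 3e5                 ; exponential notation: 3 * 10^5
help, x, y              ; both are reported as FLOAT
f = findgen(4)          ; float index array holding 0., 1., 2., 3.
z = fltarr(4)           ; zeroed float array
; Adding 1e-8 to 1.0 is lost in a float, so the comparison is true.
print, (1.0 + 1e-8) eq 1.0   ; prints 1
```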

3.2.  8 bytes:  Double-Precision

    Like Float, but the exponent can range from about -307 -> +307 and there are about 16 digits of precision.  You generate an array using dblarr() or dindgen() and a single variable by writing something like x = 3d or x = 3d5.
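
    Here is a short sketch contrasting double and float precision.  The test value of 1 part in 10^8 is an arbitrary choice for illustration: a double can resolve it next to 1, while a float cannot.

```idl
x = 3d                  ; d marks a double-precision constant
y = 3d5                 ; double-precision 3 * 10^5
help, x, y              ; both are reported as DOUBLE
d = dindgen(4)          ; double index array
z = dblarr(4)           ; zeroed double array
print, (1d0 + 1d-8) eq 1d0   ; prints 0: a double keeps the small addition
print, (1.0 + 1e-8) eq 1.0   ; prints 1: a float loses it
```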

4.  STRINGS

    Strings store characters - letters, symbols, and numbers (but numbers as characters - you can't calculate with strings!).  A string constant such as 'hello' consists of five letters.  It takes 5 bytes to store this constant - one byte for each character.  There are 256 possible characters for each of the bytes; with 2*26 letters (lowercase and capitals) and 10 digits, this leaves 194 other possibilities, which are used for things like semicolon, period, etc.  You can generate an array of strings with strarr() or sindgen() and a single string using ' ' like this: x = 'Hi there!!!'.
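
    A few lines at the IDL prompt illustrate these points.  This is a sketch; strlen() and byte() are standard IDL routines not mentioned above, used here to show the length of a string and the byte value stored for each character.

```idl
x = 'Hi there!!!'
print, strlen(x)        ; number of characters: prints 11
print, byte('hello')    ; one byte per character: 104 101 108 108 111
s = strarr(3)           ; array of three empty strings
print, sindgen(3)       ; index values 0, 1, 2 stored as strings
print, '3' + '4'        ; prints 34: strings are joined, not added
```

    The last line shows why you can't calculate with strings: the + operator glues the characters together instead of doing arithmetic.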

5.  STRUCTURES

    Structures are a special data type that allows variables of different types and sizes to be packaged into one entity.  This is different from an array, where every element must be the same data type.  There are two kinds of structures in IDL:  an anonymous structure (a package of arbitrary variables) and a named structure (a package of variables that conform to a template created by the user).  Structures are used when it makes sense to collect and store a group of related items.  Your textbook does a good job discussing structures, so please refer to §2.7 in Gumley's "Practical IDL Programming" for more detailed information.
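
    As a minimal sketch of the two kinds of structures, consider the lines below.  All of the names and values (star, name, mag, n_obs, point, x, y) are hypothetical and chosen only for illustration.

```idl
; Anonymous structure: a package of arbitrary variables.
star = {name: 'Vega', mag: 0.03, n_obs: 12}
print, star.name              ; access a field with a period
star.n_obs = 13               ; fields can be changed like ordinary variables
help, star, /structure        ; lists each field and its data type

; Named structure: the first use defines the template called POINT.
p = {point, x: 0.0, y: 0.0}
q = {point}                   ; a new POINT with all fields set to zero
q.x = 1.5
help, q, /structure
```

    Notice that star mixes a string, a float, and an integer in one variable, which an array cannot do.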


Written by Carl Heiles (U. C. Berkeley) and edited by Min Y. H. Hubbard