In this chapter, we'll take a look at some of the ways that Perl handles data. All computer programs use data in some way. Some use it to personalize the program. For example, a mail program might need to remember your name so that it can greet you upon starting. Another program, say one that searches your hard disk for files, might remember your last search parameters in case you want to perform the same search twice.
A literal is a value that is represented "as is" or hard-coded in your source code. When you see the four characters 45.5 in a program, they refer to the value forty-five and a half. Perl uses four types of literals: numbers, strings, arrays, and associative arrays.
Associative arrays will be discussed in Chapter 3 "Variables." Numbers, strings, and regular arrays will be discussed in the following sections.
Numeric literals are frequently used. They represent a number
that your program will need to work with. Most of the time you
will use numbers in base ten-the base that everyone uses. However,
Perl will also let you use base 8 (octal) or base 16 (hexadecimal).
Note |
For those of you who are not familiar with non-decimal numbering systems, here is a short explanation. In decimal notation, or base ten, the value 15 signifies (1 * 10) + 5, or 15 in base 10. In octal notation, or base eight, the value 15 signifies (1 * 8) + 5, or 13 in base 10. In hexadecimal notation, or base 16, the value 15 signifies (1 * 16) + 5, or 21 in base 10. Base 16 needs six extra characters in addition to 0 to 9 so that each position can hold one of 16 values. The letters A-F represent the values 10 through 15. So the hexadecimal value BD is equal to (B * 16) + D, or (11 * 16) + 13, which is 189 in base 10. |
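The arithmetic in the note can be checked in Perl itself, which reads a leading 0 as octal and a leading 0x as hexadecimal. The following is only a small sketch: the variable names are made up for illustration, and variables themselves are covered in Chapter 3.

```perl
use strict;
use warnings;

# The same two digits, "15", written in three different bases.
my $decimal = 15;      # base 10: (1 * 10) + 5
my $octal   = 015;     # base 8:  (1 * 8)  + 5
my $hex     = 0x15;    # base 16: (1 * 16) + 5
my $bd      = 0xBD;    # base 16: (11 * 16) + 13

# print shows numbers in base 10, whatever base the literal used.
print "decimal 15 = $decimal\n";
print "octal   15 = $octal\n";
print "hex     15 = $hex\n";
print "hex     BD = $bd\n";
```

Running it should show 15, 13, 21, and 189, matching the hand calculations.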
If you will be using very large or very small numbers, you might
also find scientific notation to be of use.
Note |
If you're like me, you probably forgot most of the math you learned in high school. However, scientific notation has always stuck with me. Perhaps because I liked moving decimal points around. Scientific notation looks like 10.23E+4, which is equivalent to 102,300. You can also represent small numbers if you use a negative sign. For example, 10.23E-4 is .001023. Simply move the decimal point to the right if the exponent is positive and to the left if the exponent is negative. |
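As a quick check of the two values in the note, the sketch below stores each scientific-notation literal and prints it. The variable names are illustrative assumptions, not part of the chapter's listings.

```perl
use strict;
use warnings;

my $large = 10.23E+4;   # decimal point moves four places to the right
my $small = 10.23E-4;   # decimal point moves four places to the left

# Perl prints both values in ordinary decimal form.
print "$large\n";
print "$small\n";
```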
Let's take a look at some different types of numbers that you can use in your program code.
First, here are some integers.
An integer. Integers are numbers with no decimal components.
An integer in octal format. This number is 35, or (4 * 8) + 3, in base 10.
An integer in hexadecimal format. This number is also 35, or (2 * 16) + 3 in base 10.
123 043 0x23
Now, some numbers and fractions-also called floating point values. You will frequently see these values referred to as a float value for simplicity's sake.
A float with a value in the tenths place. You can also say 100 and 5/10.
A float with a fraction value out to the thousandths place. You can also say 54 and 534/1000.
100.5 54.534
Here's a very small number.
A very small float value. You can represent this value in scientific notation as 3.4E-5.
.000034
String literals are groups of characters surrounded by quotes so that they can be used as a single datum. They are frequently used in programs to identify filenames, display messages, and prompt for input. In Perl you can use single quotes ('), double quotes ("), and back quotes (`).
The following examples show you how to use string literals. String literals are widely used to identify filenames or when messages are displayed to users. First, we'll look at single-quoted strings, then double-quoted strings.
A single-quoted string is pretty simple. Just surround the text
that you'd like to use with single quotes.
Note |
The real value of single-quoted strings won't become apparent until you read about variable interpolation in the section "Examples: Variable Interpolation" in Chapter 3 "Variables." |
A literal that describes one of my favorite role-playing characters.
A literal that describes the blessed cleric that frequently helps WasWaldo stay alive.
'WasWaldo the Illusionist' 'Morganna the Fair'
Strings are pretty simple, huh? But what if you wanted to use
a single quote inside the literal? If you did this, Perl would
think you wanted to end the string early and a compiler error
would result. Perl uses the backslash (\) character to indicate
that the normal function of the single quote (ending a literal) should
be ignored for a moment.
Tip |
The backslash character is also called an escape character, perhaps because it lets the next character escape from its normal interpretation. |
A literal that comments on WasWaldo's fighting ability. Notice how the single quote is used.
Another comment from the peanut gallery. Notice that double quotes can be used directly inside single-quoted strings.
'WasWaldo can\'t hit the broad side of a barn.' 'Morganna said, "WasWaldo can\'t hit anything."'
The single quotes are used here specifically so that the double quotes can be used to surround the spoken words. Later, in the section on double-quoted literals, you'll see that the single quotes can be replaced by double quotes if you'd like.
You must know only one more thing about single-quoted strings. You can add a line break to a single-quoted string simply by adding line breaks to your source code, as demonstrated by Listing 2.1.
Tell Perl to begin printing.
More Lines for Perl to display.
The single quote ends the string literal.
Listing 2.1 02LST01.PL-Using Embedded Line Breaks to Skip to a New Line
print 'Bill of Goods

Bread:    $34.45
Fruit:    $45.00
          ======
          $79.45';
Figure 2.1 shows a bill of goods displayed using one long single-quoted literal.
Figure 2.1 : A bill of goods displayed using one long single-quoted literal.
You can see that with single-quoted literals, even the line breaks in your source code are part of the string.
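To recap the single-quoted rules covered so far, here is a minimal sketch. The variable names are assumptions made for illustration (variables are covered in Chapter 3). It also shows that a sequence such as \n is not translated inside single quotes.

```perl
use strict;
use warnings;

# \' inside single quotes produces a plain apostrophe.
my $escaped = 'WasWaldo can\'t hit the broad side of a barn.';

# A line break typed inside the literal becomes part of the string.
my $two_lines = 'first line
second line';

# Inside single quotes, \n is NOT a newline: this string holds
# the four characters a, backslash, n, and b.
my $not_newline = 'a\nb';

print $escaped, "\n";
print $two_lines, "\n";
print length($not_newline), "\n";
```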
Double-quoted strings start out simple, then become a bit more
involved than single-quoted strings. With double-quoted strings,
you can use the backslash to add some special characters to your
string. Chapter 3 "Variables," will talk about how
double-quoted strings and variables interact.
Note |
Variables-which are described in Chapter 3 "Variables"-are simply locations in the computer's memory where Perl holds the various data types. They're called variables because the content of the memory can change as needed. |
The basic double-quoted string is a series of characters surrounded by double quotes. If you need to use the double quote inside the string, you can use the backslash character.
This literal is similar to one you've already seen. Just the quotes are different.
Another literal that uses double quotes inside a double-quoted string.
"WasWaldo the Illusionist" "Morganna said, \"WasWaldo can't hit anything.\""
Notice how the backslash in the second line is used to escape the double quote characters. And the single quote can be used without a backslash.
One major difference between double- and single-quoted strings
is that double-quoted strings have some special escape sequences
that can be used. Escape sequences represent characters that are
not easily entered using the keyboard or that are difficult to
see inside an editor window. Table 2.1 shows all of the escape
sequences that Perl understands. The examples following the table
will illustrate some of them.
Escape Sequence | Description or Character |
\a | Alarm/bell |
\b | Backspace |
\e | Escape |
\f | Form Feed |
\n | Newline |
\r | Carriage Return |
\t | Tab |
\x0B | Vertical Tab (given here as a hexadecimal byte because Perl string literals do not recognize \v) |
\$ | Dollar Sign |
\@ | At Sign |
\nnn | Any Octal byte (for example, \033) |
\xnn | Any Hexadecimal byte (for example, \x1B) |
\cn | Any Control character (for example, \cC) |
\l | Change the next character to lowercase |
\u | Change the next character to uppercase |
\L | Change the following characters to lowercase until a \E sequence is encountered. Note that you need to use an uppercase E here; lowercase will not work. |
\Q | Quote meta-characters as literals. See Chapter 10, "Regular Expressions," for more information on meta-characters. |
\U | Change the following characters to uppercase until a \E sequence is encountered. Note that you need to use an uppercase E here; lowercase will not work. |
\E | Terminate the \L, \Q, or \U sequence. Note that you need to use an uppercase E here; lowercase will not work. |
\\ | Backslash |
Note |
In the next chapter, "Variables," you'll see why you might need to use a backslash when using the $ and @ characters. |
This literal represents the following: WasWaldo is 34 years old. The \u is used twice in the first word to capitalize the w characters. And the hexadecimal notation is used to represent the age using the ASCII codes for 3 and 4.
This literal represents the following: The kettle was HOT!. The \U capitalizes all characters until a \E sequence is seen.
"\uwas\uwaldo is \x33\x34 years old." "The kettle was \Uhot\E!"
For more information about ASCII codes, see Appendix E, "ASCII Table."
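The two literals above can be tried out with a short script like the following sketch. The variable names are illustrative only.

```perl
use strict;
use warnings;

# \u capitalizes the next character; \x33 and \x34 are the
# ASCII codes for the digits 3 and 4.
my $age_line = "\uwas\uwaldo is \x33\x34 years old.";

# \U capitalizes everything up to the \E.
my $kettle_line = "The kettle was \Uhot\E!";

print "$age_line\n";      # WasWaldo is 34 years old.
print "$kettle_line\n";   # The kettle was HOT!
```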
Actually, this example isn't too difficult, but it does involve looking at more than one literal at once and it's been a few pages since our last advanced example. Let's look at the \t and \n escape sequences. Listing 2.2, a program displaying a bill with several items, will produce the output shown in Figure 2.2.
Figure 2.2 : A bill of goods displayed using newline and tab characters.
Display a literal as the first, second, and third lines of the output.
Display literals that show what was purchased
Display a separator line.
Display the total.
Listing 2.2 02LST02.PL-Using Tabs and Newline Characters to Print
print "Bill of Goods

Bread:\t\$34.45\n";
print "Fruit:\t";
print "\$45.00\n";
print "\t======\n";
print "\t\$79.45\n";
Tip |
Notice that Figures 2.1 and 2.2 look identical. This illustrates a cardinal rule of Perl: there's always more than one way to do something. |
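This program uses two methods to cause a line break: a line break typed directly inside the string in the source code, and the \n (newline) escape sequence.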
I recommend using the \n character so that when looking at your
code in the future, you can be assured that you meant to cause
a line break and did not simply press the ENTER key by mistake.
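To see that the two methods really do produce the same string, consider this small sketch. The variable names are made up for illustration, and it assumes the source file is saved with ordinary newline line endings.

```perl
use strict;
use warnings;

# Method 1: a line break typed inside the literal.
my $embedded = "Bread:
Fruit:";

# Method 2: the \n escape sequence.
my $escaped = "Bread:\nFruit:";

# Compare the two strings character by character.
print $embedded eq $escaped ? "identical\n" : "different\n";
```

The second form keeps the whole literal on one source line, which is why it is easier to spot as intentional.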
Caution |
If you are a C/C++ programmer, this material is not new to you. However, Perl strings are not identical to C/C++ strings because they have no ending NULL character. If you are thinking of converting C/C++ programs to Perl, take care to modify any code that relies on the NULL character to end a string. |
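The point in the caution can be illustrated with a short sketch: a Perl string may contain a zero byte in the middle without being cut short. The variable name is an assumption for illustration.

```perl
use strict;
use warnings;

# \0 places a zero byte in the middle of the string. In C this
# would mark the end of the string; in Perl it is ordinary data.
my $with_nul = "abc\0def";

print length($with_nul), "\n";   # 7
```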
It might be argued that back-quoted strings are not really a data type. That's because Perl uses back-quoted strings to execute system commands. When Perl sees a back-quoted string, it passes the contents to Windows, UNIX, or whatever operating system you are using.
Let's see how to use the back-quoted string to display a directory listing of all text files in the perl5 directory.
Figure 2.3 shows what the output of such a program might look like.
Figure 2.3 : Using a back-quoted string to display a directory.
Print the directory listing.
print `dir *.txt`;
All of the escape sequences used with double-quoted strings can be used with back-quoted strings.
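A back-quoted string also hands the command's output back to your program, so you can keep it instead of printing it right away. The sketch below assumes that an echo command is available on your system; the variable name is illustrative.

```perl
use strict;
use warnings;

# The back quotes run the command and return whatever it printed.
# "echo hello" is only an example command; substitute one that
# exists on your operating system.
my $output = `echo hello`;

print "The command printed: $output";
```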
Perl uses arrays-or lists-to store a series of items. You could use an array to hold all of the lines in a file, to help sort a list of addresses, or to store a variety of items. We'll look at some simple arrays in this section. In the next chapter, "Variables," you'll see more examples of how useful arrays can be.
In this section, we'll look at printing an array and see how arrays are represented in Perl source code.
This example shows an empty array, an array of numbers and an array of strings.
Figure 2.4 shows the output of Listing 2.3.
Figure 2.4 : The output from Listing 2.3, showing different array literals.
Print the contents of an empty array.
Print the contents of an array of numbers.
Print the contents of an array of strings.
Print the contents of an array with different data types.
Listing 2.3 02LST03.PL-Printing Some Array Literals
print "Here is an empty array:" . () . "<-- Nothing there!\n";
print (12, 014, 0x0c, 34.34, 23.3E-3);
print "\n";
print ("This", "is", 'an', "array", 'of', "strings");
print "\n";
print ("This", 30, "is", 'a', "mixed array", 'of', 0x08, "items");
The fourth line of this listing shows that you can mix single-
and double-quoted strings in the same array. You can also mix
numbers and strings interchangeably, as shown in the last line.
Note |
Listing 2.3 uses the period, or concatenation, operator to join a string representation of the empty array with the string "Here is an empty array:" and the string "<-- Nothing there!\n". You can read more about operators in Chapter 4 "Operators." |
Note |
In this and other examples in this chapter, the elements of an array will be printed with no spaces between them. You will see how to print with spaces in the section "Strings Revisited" in Chapter 3 "Variables." |
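One detail of Listing 2.3 is easy to miss: the octal and hexadecimal literals are stored as plain numbers, so they print in base ten. The following sketch stores two of the listing's arrays in array variables, which are a Chapter 3 topic; the names are assumptions for illustration.

```perl
use strict;
use warnings;

# 12, 014 (octal), and 0x0c (hexadecimal) are the same number.
my @numbers = (12, 014, 0x0c, 34.34, 23.3E-3);

# Strings and numbers can sit side by side in one array.
my @mixed = ("This", 30, "is", 'a', "mixed array", 'of', 0x08, "items");

print scalar(@numbers), " numbers\n";
print $numbers[0], $numbers[1], $numbers[2], "\n";   # 121212
print @mixed;
print "\n";
```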
Many times a simple list is not enough. If you're a painter, you might have one array that holds the names of orange hues and one that holds the names of yellow hues. To print them, you can use Perl's ability to specify a sub-array inside your main array definition.
While this example is not very "real-world," it gives you the idea behind specifying an array by using sub-arrays.
Print an array that consists of two sub-arrays.
Print an array that consists of an array, a string, and another array.
print (("Bright Orange", "Burnt"), ("Canary Yellow", "Sunbeam"));
print (("Bright Orange", "Burnt"), " Middle ", ("Canary Yellow", "Sunbeam"));
So far, we haven't talked about the internal representations of data types. That's because you almost never have to worry about such things with Perl. However, it is important to know that, internally, the sub-arrays are merged into the main array. In other words, the array:
(("Bright Orange", "Burnt"), ("Canary Yellow", "Sunbeam"))
is exactly equivalent to
("Bright Orange", "Burnt", "Canary Yellow", "Sunbeam")
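The merging of sub-arrays can be confirmed by counting elements, as in this sketch. It stores the two forms in array variables, which Chapter 3 covers; the names are illustrative only.

```perl
use strict;
use warnings;

my @nested = (("Bright Orange", "Burnt"), ("Canary Yellow", "Sunbeam"));
my @flat   = ("Bright Orange", "Burnt", "Canary Yellow", "Sunbeam");

# Both arrays hold four separate strings, not two sub-arrays.
print scalar(@nested), "\n";   # 4
print scalar(@flat), "\n";     # 4

# The third element is a plain string in both cases.
print "$nested[2]\n";          # Canary Yellow
```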
At times you might need an array that consists of sequential numbers or letters. Instead of making you list out the entire array, Perl has a shorthand notation that you can use.
Perl uses two periods (..) to replace a consecutive series of values. Not only is this method quicker to type-and less prone to error-it is easier to understand. Only the end points of the series are specified; you don't need to manually verify that every value is represented. If the .. is used, then automatically you know that a range of values will be used.
Print an array consisting of the numbers from 1 to 15.
Print an array consisting of the numbers from 1 to 15 using the shorthand method.
print (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15); print "\n"; print (1..15);
The two arrays used in the previous example are identical, but
they were specified differently.
Note |
The double periods in the array specification are called the range operator. The range operator is also discussed in Chapter 4 "Operators." |
You can also use the shorthand method to specify values in the middle of an array.
Print an array consisting of the numbers 1, 2, 7, 8, 9, 10, 14, and 15.
Print an array consisting of the letters A, B, F, G, H, Y, Z.
print (1, 2, 7..10, 14, 15);
print "\n";
print ("A", "B", "F".."H", "Y", "Z");
The range operator works by taking the lefthand value, adding one to it, then appending that new value to the array. Perl continues to do this until the new value reaches the righthand value. You can use letters with the range operator because the ASCII table uses consecutive values to represent consecutive letters.
For more information about ASCII codes, see Appendix E, "ASCII Table."
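To see exactly which values the range operator fills in, the sketch below stores the two arrays and prints them with spaces between the elements. Placing an array inside a double-quoted string is a Chapter 3 technique, and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my @numbers = (1, 2, 7..10, 14, 15);
my @letters = ("A", "B", "F".."H", "Y", "Z");

# An array inside double quotes is printed with spaces between
# its elements, which makes the expanded ranges easy to read.
print "@numbers\n";   # 1 2 7 8 9 10 14 15
print "@letters\n";   # A B F G H Y Z
```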
This chapter introduced you to both numeric and string literals. You learned that literals are values that are placed directly into your source code and never changed by the program. They are sometimes referred to as hard-coded values.
You read about numbers and the three different bases that can be used to represent them-decimal, octal, and hexadecimal. Very large or small numbers can also be described using scientific notation.
Strings were perhaps a bit more involved. Single-, double-, and back-quoted strings are used to hold strings of characters. Back-quoted strings have an additional purpose. They tell Perl to send the string to the operating system for execution.
Escape sequences are used to represent characters that are difficult to enter through the keyboard or that have more than one purpose. For example, using a double quote inside a double-quoted string would end the string before you really intended. The backslash character was introduced to escape the double quote and change its meaning.
The next chapter, "Variables," will show you how Perl uses your computer memory to store data types and also will show you ways that you can manipulate data.
Answers to Review Questions are in Appendix A.