One of the reasons the Internet has blossomed so quickly is that everyone can understand the protocols that are spoken on the net. A protocol is a set of commands and responses. There are two layers of protocols that I'll mention here. The low-level layer is called TCP/IP and, while it is crucial to the Internet, we can effectively ignore it. The high-level protocols like ftp, smtp, pop, http, and telnet are what you'll read about in this chapter. They use TCP/IP as a facilitator to communicate between computers. The protocols all have the same basic pattern: one system begins a conversation, the two systems exchange commands and responses, and then the conversation ends.
Figure 18.1 is what the protocol for sending mail looks like. The end-user creates a mail message and then the sending system uses the mail protocol to hold a conversation with the receiving system.
Figure 18.1: All protocols follow this communications model.
Internet conversations are done with sockets, in a manner similar to using the telephone or shouting out a window. I won't kid you: sockets are a complicated subject. They are discussed in the "Sockets" section that follows. Fortunately, you only have to learn about a small subset of the socket functionality in order to use the high-level protocols.
Table 18.1 provides a list of the high-level protocols that you
can use. This chapter will not be able to cover them all, but
if you'd like to investigate further, the protocols are detailed
in documents at the http://ds.internic.net/ds/dspg0intdoc.html
Web site.
Protocol | Description |
auth | Authentication |
echo | Checks servers to see if they are running |
finger | Lets you retrieve information about a user |
ftp | File Transfer Protocol |
nntp | Network News Transfer Protocol - Usenet News Groups |
pop | Post Office Protocol - incoming mail |
smtp | Simple Mail Transfer Protocol - outgoing mail |
time | Time Server |
telnet | Lets you connect to a host and use it as if you were a directly connected terminal |
Each protocol is also called a service; hence the terms mail server
and ftp server. Underlying all of the high-level protocols is the
very popular Transmission Control Protocol/Internet Protocol, or TCP/IP.
You don't need to know about TCP/IP in order to use the high-level
protocols. All you need to know is that TCP/IP enables a server
to listen and respond to an incoming conversation. Incoming
conversations arrive at something called a port. A port
is an imaginary place where incoming packets of information can
arrive (just like a ship arrives at a sea port). Each type of
service (for example, mail or file transfer) has its own port
number.
Tip |
If you have access to a UNIX machine, look at the /etc/services file for a list of the services and their assigned port numbers. Users of Windows 95 (and, I suspect, Windows NT) can look in \windows\services. |
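If you'd rather ask Perl than read the services file yourself, the built-in getservbyname() function performs the same lookup. The following is only a sketch: servicePort() is a helper name I made up for illustration, and the fallback numbers (25 for smtp, 37 for time) are the assignments mentioned in this chapter, used when the lookup fails or is not implemented.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: returns the TCP port assigned to $service, or
# $default if the lookup fails or the system does not implement it.
sub servicePort {
    my($service, $default) = @_;

    # getservbyname() returns just the port number in scalar context.
    # The eval guards against systems where the function is missing.
    my($port) = eval { scalar(getservbyname($service, 'tcp')) };

    return(defined($port) ? $port : $default);
}

# Fallback values used only when the services database has no entry.
my(%fallback) = ('smtp' => 25, 'time' => 37);

foreach my $service (sort(keys(%fallback))) {
    printf("%-6s %d\n", $service, servicePort($service, $fallback{$service}));
}
```

The listings later in this chapter use the shorter `getservbyname(...) || default` idiom; the helper simply wraps that idea so the fallback is explicit.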
In this chapter, we take a quick look at sockets, and then turn our attention to examples that use them. You see how to send and receive mail. Sending mail is done using the Simple Mail Transfer Protocol (SMTP), which is detailed in an RFC numbered 821. Receiving mail is done using the Post Office Protocol (POP) as detailed in RFC 1725.
Sockets are the low-level links that enable Internet conversations. There are a whole slew of functions that deal with sockets. Fortunately, you don't normally need to deal with them all. A small subset is all you need to get started. This section focuses on those aspects of sockets that are useful in Perl. There are whole areas of sockets that I won't mention.
Table 18.2 lists all of the Perl functions that relate to sockets
so you have a handy reference. But remember, you probably won't
need them all.
Function | Description |
accept(NEWSOCKET, SOCKET) | Accepts a socket connection from clients waiting for a connection. The original socket, SOCKET, is left alone, and a new socket is created for the remote process to talk with. SOCKET must have already been opened using the socket() function. Returns true if it succeeded, false otherwise. |
bind(SOCKET, PACKED_ADDRESS) | Binds a network address to the socket handle. Returns true if it succeeded, false otherwise. |
connect(SOCKET, PACKED_ADDRESS) | Attempts to connect to a socket. Returns true if it succeeded, false otherwise. |
getpeername(SOCKET) | Returns the packed address of the remote side of the connection. This function can be used to reject connections for security reasons, if needed. |
getsockname(SOCKET) | Returns the packed address of the local side of the connection. |
getsockopt(SOCKET, LEVEL, OPTNAME) | Returns the socket option requested, or undefined if there is an error. |
listen(SOCKET, QUEUESIZE) | Creates a queue for SOCKET with QUEUESIZE slots. Returns true if it succeeded, false otherwise. |
recv(SOCKET, BUFFER, LEN, FLAGS) | Attempts to receive LEN bytes of data into a buffer from SOCKET. Returns the address of the sender, or the undefined value if there's an error. BUFFER will be grown or shrunk to the length actually read. However, you must initialize BUFFER before use. For example, my($buffer) = '';. |
select(RBITS, WBITS, EBITS, TIMEOUT) | Examines file descriptors to see if they are ready or if they have exception conditions pending. |
send(SOCKET, BUFFER, FLAGS, [TO]) | Sends a message to a socket. On unconnected sockets you must specify a destination (the TO parameter). Returns the number of characters sent, or the undefined value if there is an error. |
setsockopt(SOCKET, LEVEL, OPTNAME, OPTVAL) | Sets the socket option requested. Returns undefined if there is an error. OPTVAL may be specified as undefined if you don't want to pass an argument. |
shutdown(SOCKET, HOW) | Shuts down a socket connection in the manner indicated by HOW. If HOW = 0, all incoming information will be ignored. If HOW = 1, all outgoing information will be stopped. If HOW = 2, then both sending and receiving are disallowed. |
socket(SOCKET, DOMAIN, TYPE, PROTOCOL) | Opens a specific TYPE of socket and attaches it to the name SOCKET. See "The Server Side of a Conversation" for more details. Returns true if successful, false if not. |
socketpair(SOCK1, SOCK2, DOMAIN, TYPE, PROTO) | Creates an unnamed pair of sockets in the specified domain, of the specified type. Returns true if successful, false if not. |
Note |
If you are interested in knowing everything about sockets, you need to get your hands on some UNIX documentation. The Perl set of socket functions is pretty much a duplication of those available using the C language under UNIX. Only the parameters are different because Perl data structures are handled differently. You can find UNIX documentation at http://www.delorie.com/gnu/docs/ on the World Wide Web. |
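A few of the functions in Table 18.2 can be tried out without any network at all. The sketch below uses socketpair() to create two connected sockets inside one process, then pushes a message through them with send(), shutdown(), and recv(). The helper name sendThroughPair() and the handle names LEFT and RIGHT are made up for this illustration, and the example assumes a system that supports AF_UNIX sockets (most UNIX systems do; Windows 95 may not).

```perl
#!/usr/bin/perl -w
use strict;
use Socket;

# Hypothetical helper: sends $message into one end of a local socket
# pair and returns whatever arrives at the other end.
sub sendThroughPair {
    my($message) = @_;
    my($buffer)  = '';      # recv() needs an initialized buffer.

    socketpair(LEFT, RIGHT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die("socketpair: $!");

    defined(send(LEFT, $message, 0))
        or die("send: $!");
    shutdown(LEFT, 1);      # HOW = 1: no more outgoing information.

    # A short local message normally arrives in one piece; a robust
    # program would loop until it has everything it expects.
    defined(recv(RIGHT, $buffer, length($message), 0))
        or die("recv: $!");

    close(LEFT);
    close(RIGHT);
    return($buffer);
}

print(sendThroughPair('ping'), "\n");
```

Because both ends live in the same script, this is a handy way to experiment with the send() and recv() parameters before you involve a real server.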
Programs that use sockets inherently use the client-server paradigm. One program creates a socket (the server) and another connects to it (the client). The next couple of sections will look at both server programs and client programs.
Server programs will use the socket() function to create a socket; bind() to give the socket an address so that it can be found; listen() to see if anyone wants to talk; and accept() to start the conversation. Then the send() and recv() functions can be used to hold the conversation. And finally, the socket is closed with the close() function.
The socket() call will look something like this:
$tcpProtocolNumber = getprotobyname('tcp') || 6;
socket(SOCKET, PF_INET(), SOCK_STREAM(), $tcpProtocolNumber) or die("socket: $!");
The first line gets the TCP protocol number using the getprotobyname() function. Some systems, such as Windows 95, do not implement this function, so a default value of 6 is provided. Then, the socket is created with socket(). The socket name is SOCKET. Notice that it looks just like a file handle. When creating your own sockets, the first parameter is the only thing that you should change. The rest of the function call will always use the last three parameters shown above. The actual meaning of the three parameters is unimportant at this stage. If you are curious, please refer to the UNIX documentation previously mentioned.
Socket names exist in their own namespace. Actually, there are several pre-defined namespaces that you can use. The namespaces are called protocol families because the namespace controls how a socket connects to the world outside your process. For example, the PF_INET namespace used in the socket() function call above is used for the Internet.
Once the socket is created, you need to bind it to an address with the bind() function. The bind() call might look like this:
$port = 20001;
$internetPackedAddress = pack('Sna4x8', AF_INET(), $port, "\0\0\0\0");
bind(SOCKET, $internetPackedAddress) or die("bind: $!");
Every Internet socket resides on a computer that has a symbolic name.
The server's name in conjunction with a port number makes up a
socket's address. For example, www.water.com:20001.
Symbolic names also have a numeric equivalent known as the dotted
decimal address. For example, 145.56.23.1. Port numbers are a
way of determining which socket at www.water.com
you'd like to connect to. All port numbers below 1024 (or the
symbolic constant, IPPORT_RESERVED)
are reserved for special sockets. For example, port 37 is reserved
for a time service and 25 is reserved for the smtp service. The
value of 20,001 used in this example was picked at random. The
only limitations are that the value must be 1024 or higher and that
no two sockets on the same computer can have the same port number.
Tip |
You can always refer to your own computer using the dotted decimal address of 127.0.0.1 or the symbolic name localhost. |
The second line of this short example creates a full Internet socket address using the pack() function. This is another complicated topic that I will sidestep. As long as you know the port number and the server's address, you can simply plug those values into the example code and not worry about the rest. The important part of the example is the "\0\0\0\0" string. This string holds the four numbers that make up the dotted decimal Internet address. If you already know the dotted decimal address, convert each number to octal and replace the appropriate \0 in the string.
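If converting numbers to octal by hand sounds tedious, Perl can do the work. The sketch below shows three equivalent ways to build the four-byte address for 145.56.23.1: the octal string described above, pack() with the decimal numbers, and the inet_aton() function from the Socket module. It also uses the Socket module's pack_sockaddr_in() as an alternative to the hand-written 'Sna4x8' format. The helper name dottedToPacked() is made up for this illustration.

```perl
#!/usr/bin/perl -w
use strict;
use Socket;

# Hypothetical helper: turns '145.56.23.1' into the four-byte string
# that the packed socket address expects.
sub dottedToPacked {
    my($dotted) = @_;
    return(pack('C4', split(/\./, $dotted)));
}

my($octalString) = "\221\070\027\001";          # 145, 56, 23, 1 in octal
my($packed)      = dottedToPacked('145.56.23.1');
my($viaSocket)   = inet_aton('145.56.23.1');    # Socket module equivalent

print("all three forms match\n")
    if $octalString eq $packed and $packed eq $viaSocket;

# pack_sockaddr_in() builds the complete socket address, so the layout
# of the structure does not have to be spelled out by hand.
my($socketAddress) = pack_sockaddr_in(20001, $viaSocket);
my($port, $addr)   = unpack_sockaddr_in($socketAddress);
print(inet_ntoa($addr), ":$port\n");
```

Using pack_sockaddr_in() is a design choice rather than a requirement; its advantage is that the Socket module knows the address layout for the system it was built on, which is exactly the detail that the SunOS comments in the listings later in this chapter work around.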
If you know the symbolic name of the server instead of the dotted decimal address, use the following line to create the packed Internet address:
$internetPackedAddress = pack('S n a4 x8', AF_INET(), $port, (gethostbyname('www.remotehost.com'))[4]);
After the socket has been created and an address has been bound to it, you need to create a queue for the socket. This is done with the listen() function. The listen() call looks like this:
listen(SOCKET, 5) or die("listen: $!");
This listen() statement will create a queue that can handle 5 remote attempts to connect. The sixth attempt will fail with an appropriate error code.
Now that the socket exists, has an address, and has a queue, your program is ready to begin a conversation using the accept() function. The accept() function makes a copy of the socket and starts a conversation with the new socket. The original socket is still available and able to accept connections. You can use the fork() function, in UNIX, to create child processes to handle multiple conversations. The normal accept() function call looks like this:
$addr = accept(NEWSOCKET, SOCKET) or die("accept: $!");
Now that the conversation has been started, use print(), send(), recv(), read(), or write() to hold the conversation. The examples later in the chapter show how the conversations are held.
Client programs will use socket() to create a socket and connect() to initiate a connection to a server's socket. Then input/output functions are used to hold a conversation. And the close() function closes the socket.
The socket() call for the client program is the same as that used in the server:
$tcpProtocolNumber = getprotobyname('tcp') || 6;
socket(SOCKET, PF_INET(), SOCK_STREAM(), $tcpProtocolNumber) or die("socket: $!");
After the socket is created, the connect() function is called like this:
$port = 20001;
$internetPackedAddress = pack('Sna4x8', AF_INET(), $port, "\0\0\0\0");
connect(SOCKET, $internetPackedAddress) or die("connect: $!");
The packed address was explained in "The Server Side of a Conversation." The SOCKET parameter has no relation to the name used on the server machine. I use SOCKET on both sides for convenience.
The connect() function is a blocking function. This means that it will wait until the connection is completed. You can use the select() function to set non-blocking mode, but you'll need to look in the UNIX documentation to find out how. It's a bit complicated to explain here.
After the connection is made, you use the normal input/output functions or the send() and recv() functions to talk with the server.
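To see the server and client steps working together, here is a minimal sketch that plays both roles on one machine. It assumes a UNIX-like system where fork() works and the loopback address (127.0.0.1) is available. The function name loopbackConversation() and the handle names SERVER, CLIENT, and NEWSOCKET are made up for this illustration, and I use the Socket module's pack_sockaddr_in() rather than a hand-written pack() format. Binding to port 0 asks the system to pick any free port, which getsockname() then reports.

```perl
#!/usr/bin/perl -w
use strict;
use Socket;
use POSIX ();

# Hypothetical helper: a child process (the client) sends $message to
# the parent process (the server), which returns the line it received.
sub loopbackConversation {
    my($message) = @_;
    my($proto)   = getprotobyname('tcp') || 6;

    # Server side: socket(), bind(), listen().
    socket(SERVER, PF_INET, SOCK_STREAM, $proto) or die("socket: $!");
    bind(SERVER, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die("bind: $!");
    listen(SERVER, 5) or die("listen: $!");
    my($port) = (unpack_sockaddr_in(getsockname(SERVER)))[0];

    my($pid) = fork();
    die("fork: $!") unless defined($pid);

    if ($pid == 0) {
        # Client side: socket(), connect(), talk, close().
        close(SERVER);
        socket(CLIENT, PF_INET, SOCK_STREAM, $proto) or die("socket: $!");
        connect(CLIENT, pack_sockaddr_in($port, INADDR_LOOPBACK))
            or die("connect: $!");
        select(CLIENT); $| = 1; select(STDOUT);   # unbuffered output
        print CLIENT ("$message\n");
        close(CLIENT);
        POSIX::_exit(0);    # leave without touching the parent's buffers
    }

    # Server side again: accept() the connection and read one line.
    accept(NEWSOCKET, SERVER) or die("accept: $!");
    my($line) = scalar(<NEWSOCKET>);
    close(NEWSOCKET);
    close(SERVER);
    waitpid($pid, 0);

    $line = '' unless defined($line);
    chomp($line);
    return($line);
}

print(loopbackConversation('hello from the client'), "\n");
```

A real server would normally loop around accept() and hand each new socket to a child process; this sketch handles exactly one conversation so that the order of the calls stays easy to follow.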
The rest of the chapter will be devoted to looking at examples of specific protocols. Let's start out by looking at the time service.
It is very important that all computers on a given network report the same time. This allows backups and other regularly scheduled events to be automated. Instead of manually adjusting the time on every computer in the network, you can designate a time server. The other computers can use the time server to determine the correct time and adjust their own clocks accordingly.
Listing 18.1 contains a program that can retrieve the time from any time server in the world. Modify the example to access your own time server by setting the $remoteServer variable to your server's symbolic name.
Turn on the warning compiler option.
Load the Socket module.
Turn on the strict pragma.
Initialize the $remoteServer to the symbolic name of the time server.
Set a variable equal to the number of seconds in 70 years.
Initialize a buffer variable, $buffer.
Declare $socketStructure.
Declare $serverTime.
Get the tcp protocol and time port numbers, providing defaults in case the getprotobyname() and getservbyname() functions are not implemented.
Initialize $serverAddr with the Internet address of the time server.
Display the current time on the local machine, also called the localhost.
Create a socket using the standard parameters.
Initialize $packFormat with format specifiers.
Connect the local socket to the remote socket that is providing the time service.
Read the server's time as a 4 byte value.
Close the local socket.
Unpack the server's time from a long (4 byte) value in network order into a number.
Adjust the server time by the number of seconds in 70 years.
Display the server's name, the number of seconds difference between the remote time and the local time, and the server's time.
Declare the ctime() function.
Return a string reflecting the time represented by the parameter.
Listing 18.1 18LST01.PL-Getting the Time from a Time Service
#!/usr/bin/perl -w

use Socket;
use strict;

my($remoteServer) = 'saturn.planet.net';
my($secsIn70years) = 2208988800;
my($buffer) = '';
my($socketStructure);
my($serverTime);

my($proto) = getprotobyname('tcp') || 6;
my($port) = getservbyname('time', 'tcp') || 37;
my($serverAddr) = (gethostbyname($remoteServer))[4];

printf("%-20s %8s %s\n", "localhost", 0, ctime(time()));

socket(SOCKET, PF_INET, SOCK_STREAM, $proto)
    or die("socket: $!");

my($packFormat) = 'S n a4 x8'; # Windows 95, SunOs 4.1+
#my($packFormat) = 'S n c4 x8'; # SunOs 5.4+ (Solaris 2)

connect(SOCKET, pack($packFormat, AF_INET(), $port, $serverAddr))
    or die("connect: $!");

read(SOCKET, $buffer, 4);
close(SOCKET);

$serverTime = unpack("N", $buffer);
$serverTime -= $secsIn70years;

printf("%-20s %8d %s\n", $remoteServer, $serverTime - time, ctime($serverTime));

sub ctime {
    return(scalar(localtime($_[0])));
}
Each operating system has a different method of updating the local time, so I'll leave it in your hands to figure out how to do that.
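The one piece of Listing 18.1 that may look like magic is the adjustment by 2208988800. The time service counts seconds from January 1, 1900, while Perl's time() function counts from January 1, 1970, so the difference is the number of seconds in those 70 years (which include 17 leap days). The sketch below isolates that conversion so you can try it without a time server; decodeTimeReply() is a made-up helper name, and the four-byte reply is faked with pack().

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: converts the 4-byte reply of a time server into
# the seconds-since-1970 value that Perl's time() function uses.
sub decodeTimeReply {
    my($reply) = @_;

    # 70 years of 365 days, plus the 17 leap days from 1904 to 1968.
    my($secsIn70years) = (70 * 365 + 17) * 86400;    # 2208988800

    return(undef) unless length($reply) == 4;
    return(unpack('N', $reply) - $secsIn70years);
}

# Fake the reply a server would send exactly one day after the start
# of 1970, so no network connection is needed.
my($fakeReply) = pack('N', 2208988800 + 86400);

print(decodeTimeReply($fakeReply), "\n");            # prints 86400
```

Checking the length first matters in practice: if the server closes the connection early, read() leaves fewer than four bytes in the buffer and unpack() would return an undefined value.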
The next section is devoted to sending mail. First the protocol will be explained and then you will see a Perl script that can send a mail message.
Before you send mail, the entire message needs to be composed.
You need to know where it is going, who gets it, and what the
text of the message is. When this information has been gathered,
you begin the process of transferring the information to a mail
server.
Note |
The mail service will be listening for your connection on TCP port 25. But this information will not be important until you see some Perl code later in the chapter. |
The message that you prepare can only use 7-bit ASCII characters. If you need to send binary information (like files), encode it using the MIME standard. The details of MIME can be found at the http://ds.internic.net/ds/dspg0intdoc.html Web site.
SMTP uses several commands to communicate with mail servers. These
commands are described in Table 18.3. The commands are not case-sensitive,
which means you can use either Mail or MAIL. However, remember
that mail addresses are case-sensitive.
Command | Description |
Basic Commands | |
HELO | Initiates a conversation with the mail server. When using this command you can specify your domain name so that the mail server knows who you are. For example, HELO mailhost2.planet.net. |
MAIL | Indicates who is sending the mail. For example, MAIL FROM: <[email protected]>. Remember this is not your name, it's the name of the person who is sending the mail message. Any returned mail will be sent back to this address. |
RCPT | Indicates who is receiving the mail. For example, RCPT TO: <[email protected]>. You can indicate more than one user by issuing multiple RCPT commands. |
DATA | Indicates that you are about to send the text (or body) of the message. The message text must end with the following five-character sequence: "\r\n.\r\n". |
QUIT | Indicates that the conversation is over. |
Advanced Commands (see RFC 821 for details) | |
EXPN | Indicates that you are using a mailing list. |
HELP | Asks for help from the mail server. |
NOOP | Does nothing other than get a response from the mail server. |
RSET | Aborts the current conversation. |
SEND | Sends a message to a user's terminal instead of a mailbox. |
SAML | Sends a message to a user's terminal and to a user's mailbox. |
SOML | Sends a message to a user's terminal if they are logged on; otherwise, sends the message to the user's mailbox. |
TURN | Reverses the role of client and server. This might be useful if the client program can also act as a server and needs to receive mail from the remote computer. |
VRFY | Verifies the existence and user name of a given mail address. This command is not implemented in all mail servers. And it can be blocked by firewalls. |
Every command will receive a reply from the mail server in the
form of a three digit number followed by some text describing
the reply. For example, 250 OK
or 500 Syntax error, command unrecognized.
The complete list of reply codes is shown in Table 18.4. Hopefully,
you'll never see most of them.
Code | Description |
211 | A system status or help reply. |
214 | Help Message. |
220 | The server is ready. |
221 | The server is ending the conversation. |
250 | The requested action was completed. |
251 | The specified user is not local, but the server will forward the mail message. |
354 | This is a reply to the DATA command. After getting this, start sending the body of the mail message, ending with "\r\n.\r\n". |
421 | The mail server will be shut down. Save the mail message and try again later. |
450 | The mailbox that you are trying to reach is busy. Wait a little while and try again. |
451 | The requested action was not done. Some error occurred in the mail server. |
452 | The requested action was not done. The mail server ran out of system storage. |
500 | The last command contained a syntax error or the command line was too long. |
501 | The parameters or arguments in the last command contained a syntax error. |
502 | The mail server has not implemented the last command. |
503 | The last command was sent out of sequence. For example, you might have sent DATA before sending RCPT. |
504 | One of the parameters of the last command has not been implemented by the server. |
550 | The mailbox that you are trying to reach can't be found or you don't have access rights. |
551 | The specified user is not local; part of the text of the message will contain a forwarding address. |
552 | The mailbox that you are trying to reach has run out of space. Store the message and try again tomorrow or in a few days, after the user gets a chance to delete some messages. |
553 | The mail address that you specified was not syntactically correct. |
554 | The mail transaction has failed for unknown causes. |
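You rarely need to memorize individual reply codes, because the first digit already tells you how to react. In RFC 821, codes that start with 2 mean success, 3 means the server is waiting for more (as with the 354 reply to DATA), 4 means a temporary failure that is worth retrying, and 5 means a permanent failure. The sketch below splits a reply line into its parts on that basis. The function name parseReply() and the short labels it returns are my own, chosen only for illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: splits one reply line into the three-digit code,
# a rough category based on the first digit, and the descriptive text.
# Returns an empty list if the line does not look like a reply.
sub parseReply {
    my($reply) = @_;

    $reply =~ s/\r?\n\z//;                      # drop the line ending
    return() unless $reply =~ /^([1-5])(\d\d)(?:[ -](.*))?$/;

    my($class, $code) = ($1, "$1$2");
    my($text) = defined($3) ? $3 : '';

    my(%category) = (
        '1' => 'preliminary',
        '2' => 'success',
        '3' => 'send more',
        '4' => 'try again later',
        '5' => 'failed',
    );

    return($code, $category{$class}, $text);
}

foreach my $line ("250 OK\r\n", "354 Enter mail\r\n",
                  "500 Syntax error, command unrecognized\r\n") {
    my($code, $category, $text) = parseReply($line);
    print("$code [$category] $text\n");
}
```

A script that sends mail can then make a single decision per reply: carry on for "success" and "send more", retry later for "try again later", and give up for "failed".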
Now that you've seen all of the SMTP commands and reply codes, let's see what a typical mail conversation might look like. In the following conversation, the '>' lines are the SMTP commands that your program issues. The '<' lines are the mail server's replies.
>HELO
<250 saturn.planet.net Hello [email protected] [X.X.X.X],pleased to meet you
>MAIL From: <(Rolf D'Barno, 5th Circle Archer)>
<250 <(Rolf D'Barno, 5th Circle Archer)>... Sender ok
>RCPT To: <[email protected]>
<250 <[email protected]>... Recipient ok
>DATA
<354 Enter mail, end with "." on a line by itself
>From: (Rolf D'Barno, 5th Circle Archer)
>Subject: Arrows
>This is line one.
>This is line two.
>.
<250 AAA14672 Message accepted for delivery
>QUIT
<221 saturn.planet.net closing connection
The lines that begin with '>' are the commands that are sent to the server. Some of the SMTP commands are a bit more complex than others. In the next few sections, the MAIL, RCPT, and DATA commands are discussed. You will also see how to react to undeliverable mail.
The MAIL command tells the mail server to start a new conversation. It's also used to let the mail server know where to send a mail message to report errors. The syntax looks like this:
MAIL FROM:<reverse-path>
If the mail server accepts the command, it will reply with a code of 250. Otherwise, the reply code will be greater than 400.
In the example shown previously:
>MAIL From:<([email protected])>
<250 <([email protected])>... Sender ok
The reverse-path is different from the name given as the sender following the DATA command. You can use this technique to give a mailing list or yourself an alias. For example, if you are maintaining a mailing list to your college alumni, you might want the name that appears in the reader's mailer to be '87 RugRats instead of your own name.
You tell the mail server who the recipient of your message is by using the RCPT command. You can send more than one RCPT command for multiple recipients. The server will respond with a code of 250 to each command. The syntax for the RCPT is:
RCPT TO:<forward-path>
Only one recipient can be named per RCPT command. If the recipient is not known to the mail server, the response code will be 550. You might also get a response code indicating that the recipient is not local to the server. If that is the case, you will get one of two responses back from the server: a 251 reply, which means that the server will forward the message, or a 551 reply, whose text contains a forwarding address for you to try.
After starting the mail conversation and telling the server who the recipient or recipients are, you use the DATA command to send the body of the message. The syntax for the DATA command is very simple:
DATA
After you get the standard 354 response, send the body of the
message followed by a line with a single period to indicate that
the body is finished. When the end of message line is received,
the server will respond with a 250 reply code.
Note |
The body of the message can also include several header items like Date, Subject, To, Cc, and From. |
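Putting the pieces of the DATA command together, the text you send after the 354 reply consists of the header lines, a blank line, the body, and finally the line with a single period. The sketch below assembles that text as one string. The function name buildDataPayload() is made up for illustration. The sketch also handles one detail from RFC 821 that is easy to miss: if a line of your own message starts with a period, you must send it with an extra period in front so that the server does not mistake it for the end of the message.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: builds everything that follows the DATA command.
# $headers is a reference to an array of "Name: value" strings and
# $body is the text of the letter.
sub buildDataPayload {
    my($headers, $body) = @_;

    # Header lines, a blank separator line, then the body lines.
    my(@lines) = (@{$headers}, '', split(/\r?\n/, $body));

    # Double any leading period so it cannot end the message early.
    foreach my $line (@lines) {
        $line =~ s/^\./../;
    }

    # Every line ends with "\r\n"; a lone period finishes the message.
    return(join("\r\n", @lines) . "\r\n.\r\n");
}

my($payload) = buildDataPayload(
    ['From: Rolf D\'Barno', 'Subject: Arrows'],
    "This is line one.\n.This line starts with a period.\n"
);
print($payload);
```

Building the whole payload first, rather than sending it piece by piece, also makes it easy to print the message for debugging before it ever reaches the mail server.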
The mail server is responsible for reporting undeliverable mail, so you may not need to know too much about this topic. However, this information may come in handy if you ever run a list service or if you send a message from a temporary account.
An endless loop happens when an error notification message is sent to a non-existent mailbox. The server keeps trying to send a notification message to the reverse-path specified in the MAIL command.
The answer to this dilemma is to specify an empty reverse path in the MAIL command of a notification message like this:
MAIL FROM:<>
An entire mail session that delivers an error notification message might look like the following:
MAIL FROM:<>
250 ok
RCPT TO:<@[email protected]>
250 ok
DATA
354 send the mail data, end with .
Date: 12 May 96 12:34:53
From: [email protected]
To: [email protected]
Subject: Problem delivering mail.

Robin, your message to [email protected] was not delivered.
SILVER.COM said this:
"550 No Such User"
.
250 ok
I'm sure that by now you've had enough theory and would like to
see some actual Perl code. Without further explanation, Listing
18.2 shows you how to send mail.
Caution |
The script in Listing 18.2 was tested on Windows 95. Some comments have been added to indicate changes that are needed for SunOS 4.1+ and SunOS 5.4+ (Solaris 2). The SunOS comments were supplied by Qusay H. Mahmoud, also known as Perlman on IRC. Thanks, Qusay! |
Turn on the warning compiler option.
Load the Socket module.
Turn on the strict pragma.
Initialize $mailTo which holds the recipient's mail address.
Initialize $mailServer which holds the symbolic name of your mail server.
Initialize $mailFrom which holds the originator's mail address.
Initialize $realName which holds the text that appears in the From header field.
Initialize $subject which holds the text that appears in the Subject header field.
Initialize $body which holds the text of the letter.
Declare a signal handler for the Interrupt signal. This handler will trap users hitting Ctrl+c or Ctrl+break.
Get the protocol number for the tcp protocol and the port number for the smtp service. Windows 95 and NT do not implement the getprotobyname() or getservbyname() functions so default values are supplied.
Initialize $serverAddr with the mail server's Internet address.
Test the result to see if it is defined; if not, then the gethostbyname() function failed.
Create a socket called SMTP using standard parameters.
Initialize $packFormat with format specifiers.
Connect the socket to the port on the mail server.
Change the socket to use unbuffered input/output. Normally, sends and receives are stored in an internal buffer before being sent to your script. This line of code eliminates the buffering steps.
Create a temporary buffer. The buffer is temporary because it is local to the block surrounded by the curly brackets.
Read two responses from the server. My mail server sends two responses when the connection is made. Your server may only send one response. If so, delete one of the recv() calls.
Send the HELO command. The sendSMTP() fuNCtion will take care of reading the response.
Send the MAIL command indicating where messages that the mail server sends back (like undeliverable mail messages) should be sent.
Send the RCPT command to specify the recipient.
Send the DATA command.
Send the body of the letter. Note that no responses are received from the mail server while the letter is sent.
Send a line containing a single period indicating that you are finished sending the body of the letter.
Send the QUIT command to end the conversation.
Close the socket.
Define the closeSocket() function which will act as a signal handler.
Close the socket.
Call die() to display a message and end the script.
Define the sendSMTP() function.
Get the debug parameter.
Get the smtp command from the parameter array.
Send the smtp command to STDERR if the debug parameter is true.
Send the smtp command to the mail server.
Get the mail server's response.
Send the response to STDERR if the debug parameter is true.
Split the response into reply code and message, and return just the reply code.
Listing 18.2 18LST02.PL-Sending Mail with Perl
#!/usr/bin/perl -w

use Socket;
use strict;

my($mailTo) = '[email protected]';
my($mailServer) = 'mailhost2.planet.net';
my($mailFrom) = '[email protected]';
my($realName) = "Rolf D'Barno";
my($subject) = 'Test';
my($body) = "Test Line One.\nTest Line Two.\n";

$main::SIG{'INT'} = 'closeSocket';

my($proto) = getprotobyname("tcp") || 6;
my($port) = getservbyname("smtp", "tcp") || 25;
my($serverAddr) = (gethostbyname($mailServer))[4];

# The address is undefined if the name could not be resolved.
if (! defined($serverAddr)) {
    die('gethostbyname failed.');
}

socket(SMTP, AF_INET(), SOCK_STREAM(), $proto)
    or die("socket: $!");

my($packFormat) = 'S n a4 x8'; # Windows 95, SunOs 4.1+
#my($packFormat) = 'S n c4 x8'; # SunOs 5.4+ (Solaris 2)

connect(SMTP, pack($packFormat, AF_INET(), $port, $serverAddr))
    or die("connect: $!");

select(SMTP); $| = 1; select(STDOUT); # use unbuffered i/o.

{
    my($inpBuf) = '';

    recv(SMTP, $inpBuf, 200, 0);
    recv(SMTP, $inpBuf, 200, 0);
}

sendSMTP(1, "HELO\n");
sendSMTP(1, "MAIL From: <$mailFrom>\n");
sendSMTP(1, "RCPT To: <$mailTo>\n");
sendSMTP(1, "DATA\n");

send(SMTP, "From: $realName\n", 0);
send(SMTP, "Subject: $subject\n", 0);
send(SMTP, $body, 0);

sendSMTP(1, "\r\n.\r\n");
sendSMTP(1, "QUIT\n");

close(SMTP);

sub closeSocket { # close smtp socket on error
    close(SMTP);
    die("SMTP socket closed due to SIGINT\n");
}

sub sendSMTP {
    my($debug) = shift;
    my($buffer) = @_;

    print STDERR ("> $buffer") if $debug;
    send(SMTP, $buffer, 0);

    recv(SMTP, $buffer, 200, 0);
    print STDERR ("< $buffer") if $debug;

    return( (split(/ /, $buffer))[0] );
}
This program displays:
> HELO
< 250 saturn.planet.net Hello [email protected] [207.3.100.120], pleased to meet you
> MAIL From: <[email protected]>
< 250 <[email protected]>... Sender ok
> RCPT To: <[email protected]>
< 250 <[email protected]>... Recipient ok
> DATA
< 354 Enter mail, end with "." on a line by itself
> .
< 250 TAA12656 Message accepted for delivery
> QUIT
< 221 saturn.planet.net closing connection
The lines that begin with '>' are the commands that were sent to the server. The body of the letter is not shown in the output.
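Listing 18.2 assumes that every reply arrives in a single recv() call and that nothing goes wrong. RFC 821 also allows a server to spread one reply over several lines; every line except the last has a hyphen right after the code (for example, 250-). The sketch below shows one way to read a complete reply line by line and to stop the script as soon as an unexpected code arrives. The names readReply() and expectReply() are made up for this illustration, and the example is fed from a string instead of a socket so that you can run it without a mail server. If you use it with a real socket, read and write with print() and the <> operator throughout rather than mixing them with send() and recv().

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: reads one complete (possibly multi-line) reply
# from $handle. Returns the reply code and the joined reply text.
sub readReply {
    my($handle) = @_;
    my($code, @text);

    while (defined(my $line = <$handle>)) {
        $line =~ s/\r?\n\z//;
        next unless $line =~ /^(\d\d\d)([ -]?)(.*)$/;
        $code = $1;
        push(@text, $3);
        last if $2 ne '-';          # no hyphen: this was the last line
    }

    return($code, join("\n", @text));
}

# Hypothetical helper: dies unless the next reply has one of the
# acceptable codes; otherwise returns the code.
sub expectReply {
    my($handle, @acceptable) = @_;
    my($code, $text) = readReply($handle);

    die("connection closed by server\n") unless defined($code);
    die("SMTP error: $code $text\n")
        unless grep { $_ eq $code } @acceptable;

    return($code);
}

# Canned replies stand in for a mail server in this demonstration.
my($canned) = "250-saturn.planet.net Hello\r\n250 HELP\r\n"
            . "354 Enter mail\r\n";
open(my $fakeServer, '<', \$canned) or die("open: $!");

print(expectReply($fakeServer, 250), "\n");
print(expectReply($fakeServer, 354), "\n");
close($fakeServer);
```

With helpers like these, the conversation in Listing 18.2 could stop at the first rejected command instead of continuing to send the rest of the message.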
The flip side to sending mail is, of course, receiving it. This is done using POP, the Post Office Protocol. Since you've already read about the SMTP protocol in detail, I'll skip describing the details of POP. After all, the details can be read in the RFC documents when they are needed. Instead, I'll use the POP3Client module (available on the CD-ROM) to demonstrate receiving mail.
Listing 18.3 contains a program that will filter your mail.
It will display a report of the authors and subject line for any
mail that relates to EarthDawn, a role-playing game from
FASA. This program will not delete any mail from the server, so
you can experiment with confidence.
Note |
Before trying to run this program, make sure that the POP3Client module (POP3Client.pm) is in the Mail subdirectory of the library directory. You may need to create the Mail subdirectory as I did. On my system, this directory is called it is probably different on your system though. See your system administrator if you need help placing the file into the correct directory. |
Caution |
This script was tested using Windows 95. You might need to modify it for other systems. On SunOS 5.4+ (Solaris 2), you'll need to change the POP3Client module to use a packing format of 'S n c4 x8' instead of 'S n a4 x8'. Other changes might also be needed. |
Turn on the warning compiler option.
Load the POP3Client module. The POP3Client module will load the Socket module automatically.
Turn on the strict pragma.
Declare some variables used to hold temporary values.
Define the header format for the report.
Define the detail format for the report.
Initialize $username to a valid username for the mail server.
Initialize $password to a valid password for the user name.
Create a new POP3Client object.
Iterate over the mail messages on the server. $pop->Count holds the number of messages waiting on the server to be read.
Initialize a flag variable. When it is set to true, the script has found a mail message relating to EarthDawn.
Iterate over the headers in each mail message. The Head() method of the POP3Client module returns the header lines one at a time in the $_ variable.
Store the author's name if looking at the From header line.
Store the subject if looking at the Subject line.
This is the filter test. It checks to see if the word "EarthDawn" is in the subject line. If so, the $earthDawn flag variable is set to true (or 1).
This line is commented out; normally it would copy the text of the message into the @body array.
This line is also commented out; it will delete the current mail message from the server. Use with caution! Once deleted, you can't recover the messages.
Set the flag variable, $earthDawn, to true.
Write a detail line to the report if the flag variable is true.
Listing 18.3 18LST03.PL-Creating a Mail Filter
#!/usr/bin/perl -w

use Mail::POP3Client;
use strict;

my($i, $from, $subject);

format main::STDOUT_TOP =
@||||||||||||||||||||||||||||||||||||||||||||||||| Pg @<
"Waiting Mail Regarding EarthDawn",                $%
Sender                 Subject
---------------------- --------------------------------
.

format main::STDOUT =
@<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$from,                 $subject
.

my($username)   = 'medined';
my($password)   = 'XXXXXXXX';
my($mailServer) = 'mailhost2.planet.net';

my($pop) = Mail::POP3Client->new($username, $password, $mailServer);

for ($i = 1; $i <= $pop->Count; $i++) {
    my($earthDawn) = 0;

    foreach ($pop->Head($i)) {
        $from    = $1 if /From:\s(.+)/;
        $subject = $1 if /Subject:\s(.+)/;

        if (/Subject: .*EarthDawn/) {
            # @body = $pop->Body($i);
            # $pop->Delete($i);
            $earthDawn = 1;
        }
    }

    if ($earthDawn) {
        write();
    }
}
This program displays:
         Waiting Mail Regarding EarthDawn          Pg 1
Sender                 Subject
---------------------- ---------------------------------
Bob.Schmitt            [EarthDawn] Nethermancer
Doug.Stoechel          [EarthDawn] Weaponsmith
Mindy.Bailey           [EarthDawn] Troubador
When you run this script, you should change $username, $password, and $mailServer and the filter test to whatever is appropriate for your system.
You could combine the filter program with the send mail program (from Listing 18.2) to create an automatic mail-response program. For example, if the subject of a message is "Info," you can automatically send a predefined message with information about a given topic. You could also create a program to automatically forward the messages to a covering person while you are on vacation. I'm sure that with a little thought you can come up with a half-dozen ways to make your life easier by automatically handling some of your incoming mail.
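As a rough sketch of the automatic-response idea, the decision about which predefined message to send can be kept in a hash keyed by subject. The %cannedReplies table, its contents, and the chooseReply() function are hypothetical names invented for this illustration; the header lines would come from the Head() method in a real script, and the reply would be handed to the send mail program.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical table of subjects and their predefined replies.
my(%cannedReplies) = (
    'info'   => "Thanks for asking. Our information sheet follows.\n",
    'prices' => "Our current price list follows.\n",
);

# Find the Subject line in a list of header lines and return the
# matching predefined reply, or undef if no automatic reply applies.
sub chooseReply {
    my(@headers) = @_;
    my($subject) = '';

    foreach (@headers) {
        $subject = $1 if /^Subject:\s*(.+)/;
    }
    $subject =~ s/\s+$//;     # strip trailing whitespace and line endings
    return($cannedReplies{lc($subject)});
}

my(@sample) = ("From: Bob.Schmitt", "Subject: Info");
my($reply)  = chooseReply(@sample);
print(defined($reply) ? $reply : "No automatic reply.\n");
```

Keeping the replies in a table means that a new topic can be supported by adding one hash entry rather than another if statement.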
Occasionally it's good to know if a server is up and functioning.
The echo service is used to make that determination. Listing 18.4
shows a program that checks whether two servers are up.
Caution |
Windows 95 (and perhaps other operating systems) can't use the SIGALRM interrupt signal. This might cause problems if you use this script on those systems because the program will wait forever when a server does not respond. |
Turn on the warning compiler option.
Load the Socket module.
Turn on the strict pragma.
Display a message if the red.planet.net server is reachable.
Display a message if the saturn.planet.net server is reachable.
Declare the echo() function.
Get the host and timeout parameters from the parameter array. If no timeout parameter is specified, 5 seconds will be used.
Declare some local variables.
Get the tcp protocol and echo port numbers.
Get the server's Internet address.
If $serverAddr is undefined then the name of the server was probably incorrect and an error message is displayed.
Check to see if the script is running under Windows 95.
If not under Windows 95, store the old alarm handler function, set the alarm handler to be an anonymous function that simply ends the script, and set an alarm to go off in $timeout seconds.
Initialize the status variable to true.
Create a socket called ECHO.
Initialize $packFormat with format specifiers.
Connect the socket to the remote server.
Close the socket.
Check to see if the script is running under Windows 95.
If not under Windows 95, reset the alarm and restore the old alarm handler function.
Return the status.
Listing 18.4 18LST04.PL-Using the Echo Service
#!/usr/bin/perl -w

use Socket;
use strict;

print "red.planet.net is up.\n"    if echo('red.planet.net');
print "saturn.planet.net is up.\n" if echo('saturn.planet.net');

sub echo {
    my($host)    = shift;
    my($timeout) = shift || 5;

    # $packFormat must be declared because the strict pragma is on.
    my($oldAlarmHandler, $status, $packFormat);

    my($proto)      = getprotobyname("tcp") || 6;
    my($port)       = getservbyname("echo", "tcp") || 7;
    my($serverAddr) = (gethostbyname($host))[4];

    return(print("echo: $host could not be found, sorry.\n"), 0)
        if ! defined($serverAddr);

    # Win32::IsWin95() exists only in Win32 builds of Perl.
    if (0 == Win32::IsWin95()) {
        $oldAlarmHandler = $SIG{'ALRM'};
        $SIG{'ALRM'} = sub { die(); };
        alarm($timeout);
    }

    $status = 1;    # assume the connection will work.

    socket(ECHO, AF_INET(), SOCK_STREAM(), $proto)
        or die("socket: $!");

    $packFormat = 'S n a4 x8';     # Windows 95, SunOs 4.1+
    #$packFormat = 'S n c4 x8';    # SunOs 5.4+ (Solaris 2)

    connect(ECHO, pack($packFormat, AF_INET(), $port, $serverAddr))
        or $status = 0;

    close(ECHO);

    if (0 == Win32::IsWin95()) {
        alarm(0);
        $SIG{'ALRM'} = $oldAlarmHandler;
    }

    return($status);
}
This program will display:
echo: red.planet.net could not be found, sorry.
saturn.planet.net is up.
When dealing with the echo service, you only need to make the connection in order to determine that the server is up and running. As soon as the connection is made, you can close the socket.
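The alarm handler in Listing 18.4 ends the whole script when a server does not answer in time. A common variation, sketched here under the assumption that your system supports SIGALRM, wraps the slow operation in eval so that only that operation is abandoned. The withTimeout() function is a hypothetical helper, not part of the listing, and sleep() stands in for a connection attempt that hangs.

```perl
#!/usr/bin/perl -w
use strict;

# Run $code, giving up after $timeout seconds. Returns 1 if $code
# finished in time and 0 if the alarm fired first or $code died.
sub withTimeout {
    my($timeout, $code) = @_;

    my($finished) = eval {
        local $SIG{'ALRM'} = sub { die("timeout\n"); };
        alarm($timeout);
        $code->();
        alarm(0);               # cancel the pending alarm
        1;
    };
    alarm(0);                   # make sure no alarm is left pending
    return($finished ? 1 : 0);
}

print("fast task finished: ", withTimeout(2, sub { 1 }), "\n");
print("slow task finished: ", withTimeout(1, sub { sleep(5) }), "\n");
```

With this approach a script could report that one server timed out and still go on to check the next one.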
Most of the program should be pretty familiar to you by now. However, you might not immediately realize what the return statement in the middle of the echo() function does. The return statement is repeated here:
return(print("echo: $host could not be found, sorry.\n"), 0) if ! defined($serverAddr);
The statement uses the comma operator to execute two expressions where normally you would see one. The value of the last expression evaluated becomes the value of the whole series. In this case, because echo() is called in a scalar context, a zero value is returned. I'm not recommending this style of coding, but I thought you should see it at least once. Now, if you see this technique in another programmer's scripts, you'll understand it better. The return statement could also be written like this:
if (! defined($serverAddr)) {
    print("echo: $host could not be found, sorry.\n");
    return(0);
}
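One subtlety of the comma form is worth seeing in action: the value that comes back depends on the context of the call. The following sketch uses two made-up functions, note() and lookup(), that mimic the shape of the return statement in echo(). In scalar context only the last value is returned; in list context every value in the series is returned.

```perl
#!/usr/bin/perl -w
use strict;

my(@log);

# Record a message and return 1, much as print() returns a true value.
sub note {
    push(@log, $_[0]);
    return(1);
}

# Same shape as the return statement in echo().
sub lookup {
    my($found) = shift;
    return(note("not found"), 0) if ! $found;
    return(1);
}

my($scalarResult) = scalar(lookup(0));   # scalar context: last value only
my(@listResult)   = lookup(0);           # list context: every value

print("scalar context returns: $scalarResult\n");
print("list context returns:   @listResult\n");
```

The two-statement version of the test always returns a single zero, which is one more reason to prefer it.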
One of the backbones of the Internet is the ability to transfer files. There are thousands of ftp servers from which you can download files. From the latest graphics board drivers to the best in shareware to the entire set of UNIX sources, ftp is the answer.
The program in Listing 18.5 downloads the Perl FAQ in compressed
format from ftp.cis.ufl.edu and displays a directory in two formats.
Caution |
The ftplib.pl file can be found on the CD-ROM that accompanies this book. Please put it into your Perl library directory. I have modified the standard ftplib.pl that is available from the Internet to allow the library to work under Windows 95 and Windows NT. |
Turn on the warning compiler option.
Load the ftplib library.
Turn on the strict pragma.
Declare a variable to hold directory listings.
Turn debugging mode on. This will display all of the protocol commands and responses on STDERR.
Connect to the ftp server providing a userid of anonymous and your email address as the password.
Use the list() function to get a directory listing without first changing to the directory.
Change to the /pub/perl/faq directory.
Start binary mode. This is very important when getting compressed files or executables.
Get the Perl FAQ file.
Use list() to find out which files are in the current directory and then print the list.
Use dir() to find out which files are in the current directory and then print the list.
Turn debugging off.
Change to the /pub/perl/faq directory.
Use list() to find out which files are in the current directory and then print the list.
Listing 18.5 18LST05.PL-Using the ftplib Library
#!/usr/bin/perl -w

require('ftplib.pl');
use strict;

my(@dirList);

ftp::debug('ON');
ftp::open('ftp.cis.ufl.edu', 'anonymous', '[email protected]') or die($!);

@dirList = ftp::list('pub/perl/faq');

ftp::cwd('/pub/perl/faq');
ftp::binary();
ftp::gets('FAQ.gz');

@dirList = ftp::list();
print("list of /pub/perl/faq\n");
foreach (@dirList) {
    print("\t$_\n");
}

@dirList = ftp::dir();
print("list of /pub/perl/faq\n");
foreach (@dirList) {
    print("\t$_\n");
}

ftp::debug();
ftp::cwd('/pub/perl/faq');

@dirList = ftp::list();
print("list of /pub/perl/faq\n");
foreach (@dirList) {
    print("\t$_\n");
}
This program displays:
<< 220 flood FTP server (Version wu-2.4(21) Tue Apr 9 17:01:12 EDT 1996) ready.
>> user anonymous
<< 331 Guest login ok, send your complete e-mail address as password.
>> pass .....
<< 230- Welcome to the
<< 230- University of Florida
.
.
.
<< 230 Guest login ok, access restrictions apply.
>> port 207,3,100,103,4,135
<< 200 PORT command successful.
>> nlst pub/perl/faq
<< 150 Opening ASCII mode data connection for file list.
<< 226 Transfer complete.
>> cwd /pub/perl/faq
<< 250 CWD command successful.
>> type i
<< 200 Type set to I.
>> port 207,3,100,103,4,136
<< 200 PORT command successful.
>> retr FAQ.gz
<< 150 Opening BINARY mode data connection for FAQ.gz (75167 bytes).
<< 226 Transfer complete.
>> port 207,3,100,103,4,138
<< 200 PORT command successful.
>> nlst
<< 150 Opening BINARY mode data connection for file list.
<< 226 Transfer complete.
list of /pub/perl/faq
        FAQ
        FAQ.gz
>> port 207,3,100,103,4,139
<< 200 PORT command successful.
>> list
<< 150 Opening BINARY mode data connection for /bin/ls.
<< 226 Transfer complete.
list of /pub/perl/faq
        total 568
        drwxrwxr-x 2 1208 31 512 Nov 7 1995 .
        drwxrwxr-x 10 1208 68 512 Jun 18 21:32 ..
        -rw-rw-r-- 1 1208 31 197446 Nov 4 1995 FAQ
        -rw-r--r-- 1 1208 31 75167 Nov 7 1995 FAQ.gz
list of /pub/perl/faq
        FAQ
        FAQ.gz
I'm sure that you can pick out the different ftp commands and responses in this output. Notice that the ftp commands and responses are only displayed when the debugging feature is turned on.
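One command in the output that may not be obvious is port. Its six numbers tell the server where to open the data connection: the first four are the client's Internet address and the last two are the high and low bytes of the port number. The following sketch decodes the argument; decodePort() is a hypothetical helper and is not part of ftplib.pl.

```perl
#!/usr/bin/perl -w
use strict;

# Decode the argument of an ftp PORT command into an Internet address
# and a port number. Returns an empty list if the argument is malformed.
sub decodePort {
    my(@parts) = split(/\s*,\s*/, $_[0]);

    return() if @parts != 6;
    return() if grep { !/^\d+$/ || $_ > 255 } @parts;

    my($ip)   = join('.', @parts[0..3]);
    my($port) = $parts[4] * 256 + $parts[5];   # high byte, then low byte
    return($ip, $port);
}

my($ip, $port) = decodePort('207,3,100,103,4,135');
print("data connection: $ip port $port\n");
```

For the first port command in the output this works out to address 207.3.100.103 and port 1159, because 4 * 256 + 135 is 1159.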
One of the most valuable services offered on the net is Usenet newsgroups. Most newsgroups are question and answer forums. You post a message, perhaps asking a question, and usually you get a quick response. In addition, a small number of newsgroups are used to distribute information. Chapter 22, "Internet Resources," describes some specific newsgroups that you might want to read.
Like most services, NNTP uses a client/server model. You connect
to a news server and request information using NNTP. The protocol
consists of a series of commands and replies. I think NNTP is
a bit more complicated than the other protocols because the variety
of things you might want to do with news articles is larger.
Caution |
Some of the NNTP commands will result in very large responses. For example, the LIST command will retrieve the name of every newsgroup that your server knows about. Because there are over 10,000 newsgroups it might take a lot of time for the response to be received. |
I suggest using Perl to filter newsgroups or to retrieve all the articles available and create reports or extracts. Don't use Perl for a full-blown news client. Use Java, Visual Basic, or another language that is designed with user interfaces in mind. In addition, there are plenty of great free or inexpensive news clients available. Why reinvent the wheel?
Listing 18.6 contains an object-oriented program that encapsulates a small number of NNTP commands so that you can experiment with the protocol. Only the simplest of the commands have been implemented to keep the example small and uncluttered.
Turn on the warning compiler option.
Load the Socket module.
Turn on the strict pragma.
Begin the News package. This also starts the definition of the News class.
Define the new() function-the constructor for the News class.
Get the class name from the parameter array.
Get the name of the news server from the parameter array.
Declare a hash with two entries-the class properties.
Bless the hash.
Call the initialize() function that connects to the server.
Define a signal handler to gracefully handle Ctrl+C and Ctrl+Break.
Return a reference to the hash-the class object.
Define the initialize() function-connects to the news server.
Get the object reference from the parameter array.
Get the protocol number, port number, and server address.
Create a socket.
Initialize the format for the pack() function.
Connect to the news server.
Modify the socket to use non-buffered I/O.
Call the getInitialResponse() function.
Define getInitialResponse()-receive response from connection.
Get the object reference from the parameter array.
Initialize a buffer to hold the response.
Get the response from the server.
Print the response if debugging is turned on.
Define closeSocket()-signal handler.
Close the socket.
End the script.
Define DESTROY()-the destructor for the class.
Close the socket.
Define debug()-turns debugging on or off.
Get the object reference from the parameter array.
Get the state (on or off) from the parameter array.
Turn debugging on if the state is on or 1.
Turn debugging off if the state is off or 0.
Define send()-send a NNTP command and get a response.
Get the object reference from the parameter array.
Get the command from the parameter array.
Print the command if debugging is turned on.
Send the command to the news server.
Get a reply from the news server.
Print the reply if debugging is turned on.
Return the reply to the calling routine.
Define article()-gets a news article from the server.
Get the object reference from the parameter array.
Get the article number from the parameter array.
Return the response to the ARTICLE command. No processing of the response is needed.
Define group()-gets information about a specific newsgroup.
Get the object reference from the parameter array.
Get the newsgroup name from the parameter array.
Split the response using space characters as a delimiter.
Define help()-gets a list of commands and descriptions from server.
Return the response to the HELP command.
Define quit()-ends the session with the server.
Send the QUIT command.
Close the socket.
Start the main package or namespace.
Declare some local variables.
Create a News object.
Turn debugging on.
Get information about the comp.lang.perl.misc newsgroup.
If the reply is good, display the newsgroup information.
Turn debugging off.
Initialize some loop variables. The loop will execute at most 6 times.
Start looping through the article numbers.
Read an article, split the response using newline as the delimiter.
Search through the lines of the article for the From and Subject lines.
Display the article number, author, and subject.
Turn debugging on.
Get help from the server. The help text will be displayed because debugging is on.
Stop the NNTP session.
Define the min() function-find smallest element in parameter array.
Store the first element into $min.
Iterate over the parameter array.
If the current element is smaller than $min, set $min equal to it.
Return $min.
Listing 18.6 18LST06.PL-Using the NNTP Protocol to Read Usenet News
#!/usr/bin/perl -w

use Socket;
use strict;

package News;

sub new {
    my($class)  = shift;
    my($server) = shift || 'news';
    my($self) = {
        'DEBUG'  => 0,
        'SERVER' => $server,
    };

    bless($self, $class);
    $self->initialize();
    $main::SIG{'INT'} = 'News::closeSocket';
    return($self);
}

sub initialize {
    my($self) = shift;

    my($proto)      = getprotobyname('tcp') || 6;
    my($port)       = getservbyname('nntp', 'tcp') || 119;
    my($serverAddr) = (gethostbyname($self->{'SERVER'}))[4];

    socket(SOCKET, main::AF_INET(), main::SOCK_STREAM(), $proto)
        or die("socket: $!");

    my($packFormat) = 'S n a4 x8';     # Windows 95, SunOs 4.1+
    #my($packFormat) = 'S n c4 x8';    # SunOs 5.4+ (Solaris 2)

    connect(SOCKET, pack($packFormat, main::AF_INET(), $port, $serverAddr))
        or die("connect: $!");

    select(SOCKET);
    $| = 1;
    select(main::STDOUT);

    $self->getInitialResponse();
}

sub getInitialResponse {
    my($self)   = shift;
    my($inpBuf) = '';

    recv(SOCKET, $inpBuf, 200, 0);
    print("<$inpBuf\n") if $self->{'DEBUG'};
}

sub closeSocket {       # close NNTP socket on error
    close(SOCKET);
    die("\nNNTP socket closed due to SIGINT\n");
}

sub DESTROY {
    close(SOCKET);
}

sub debug {
    my($self)  = shift;
    my($state) = shift;

    $self->{'DEBUG'} = 1 if $state =~ m/on|1/i;
    $self->{'DEBUG'} = 0 if $state =~ m/off|0/i;
}

sub send {
    my($self)   = shift;
    my($buffer) = @_;

    print("> $buffer") if $self->{'DEBUG'};

    # CORE::send is the built-in function, not this method.
    CORE::send(SOCKET, $buffer, 0);

    # Use a large number to receive because some articles
    # can be huge.
    recv(SOCKET, $buffer, 1000000, 0);
    print("< $buffer") if $self->{'DEBUG'};

    return($buffer);
}

# NNTP Commands

sub article {
    my($self)          = shift;
    my($articleNumber) = shift;

    return($self->send("ARTICLE $articleNumber\n"));
}

sub group {
    my($self)      = shift;
    my($newsgroup) = shift;

    split(/ /, $self->send("GROUP $newsgroup\n"));
}

sub help {
    return($_[0]->send("HELP\n"));
}

sub quit {
    $_[0]->send("QUIT\n");
    close(SOCKET);
}

package main;

my(@lines, $from, $help, $subject);

my($obj) = News->new('jupiter.planet.net');

$obj->debug('ON');
my($replyCode, $numArticles, $firstArticle, $lastArticle) =
    $obj->group('comp.lang.perl.misc');

if (211 == $replyCode ) {
    printf("\nThere are %d articles, from %d to %d.\n\n",
        $numArticles, $firstArticle, $lastArticle);
}
$obj->debug('OFF');

my($loopVar);
my($loopStart) = $firstArticle;
my($loopEnd)   = min($lastArticle, $firstArticle+5);

for ($loopVar = $loopStart; $loopVar <= $loopEnd; $loopVar++) {
    @lines = split(/\n/, $obj->article($loopVar));

    foreach (@lines) {
        $from    = $1 if (/From:\s(.*?)\s/);
        $subject = $1 if (/Subject:\s(.*)/);
    }

    print("#$loopVar\tFrom: $from\n\tSubject: $subject\n\n");
}

$obj->debug('ON');
$help = $obj->help();

$obj->quit();

sub min {
    my($min) = shift;

    foreach (@_) {
        $min = $_ if $_ < $min;
    }
    return($min);
}
This program displays:
<200 jupiter.planet.net InterNetNews NNRP server INN 1.4 22-Dec-93 ready (post
> GROUP comp.lang.perl.misc
< 211 896 27611 33162 comp.lang.perl.misc

There are 896 articles, from 27611 to 33162.

#27611  From: [email protected]
        Subject: Re: How do I suppress this error message

#27612  From: [email protected]
        Subject: Re: find and replace

#27613  From: [email protected]
        Subject: GRRRR!!!! Connect error!

#27614  From: [email protected]
        Subject: Re: Why does RENAME need parens?

#27615  From: [email protected]
        Subject: Re: Date on Perl 2ed moved?

#27616  From: Tim
        Subject: Re: How do I suppress this error message

> HELP
< 100 Legal commands
  authinfo user Name|pass Password
  article [MessageID|Number]
  body [MessageID|Number]
  date
  group newsgroup
  head [MessageID|Number]
  help
  ihave
  last
  list [active|newsgroups|distributions|schema]
  listgroup newsgroup
  mode reader
  newgroups yymmdd hhmmss ["GMT"] [<distributions>]
  newnews newsgroups yymmdd hhmmss ["GMT"] [<distributions>]
  next
  post
  slave
  stat [MessageID|Number]
  xgtitle [group_pattern]
  xhdr header [range|MessageID]
  xover [range]
  xpat header range|MessageID pat [morepat...]
  xpath xpath MessageID
Report problems to <[email protected]>
.
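The reply to the GROUP command packs several values into one line: the reply code, the number of articles, the first and last article numbers, and the newsgroup name. Here is a small sketch that gives those values names and refuses any reply whose code is not 211. The parseGroupReply() function is a hypothetical helper that is not part of Listing 18.6.

```perl
#!/usr/bin/perl -w
use strict;

# Split a GROUP reply into named fields. Returns an empty list when the
# reply code is not 211, the code for a successful GROUP command.
sub parseGroupReply {
    my($reply) = shift;

    $reply =~ s/\s+$//;                       # drop the trailing CR/LF
    my($code, $count, $first, $last, $name) = split(/ +/, $reply);

    return() if ! defined($code) || $code ne '211';
    return('count' => $count, 'first' => $first,
           'last'  => $last,  'name'  => $name);
}

my(%group) = parseGroupReply("211 896 27611 33162 comp.lang.perl.misc\r\n");
print("$group{'name'}: $group{'count'} articles, ",
      "numbered $group{'first'} to $group{'last'}\n");
```

Stripping the line ending first keeps it from being left on the end of the newsgroup name, which is what happens when the raw reply is split on spaces.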
The program previously listed is very useful for hacking but it is not ready for professional use in several respects. The first problem is that it pays no attention to how large the incoming article is. It will read up to one million characters. This is probably not good. You might consider a different method. The second problem is that it ignores error messages sent from the server. In a professional program, this is a bad thing to do. Use this program as a launchpad to a more robust application.
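One possible "different method" relies on the way NNTP ends a multi-line response: with a line holding a single period. The sketch below reads line by line until it sees that marker, and it stops early on error replies, whose codes begin with 4 or 5. It assumes that the socket is read with the <> operator rather than recv(), and that the caller only uses it for commands with multi-line responses, such as ARTICLE or HELP. The readResponse() function, its $maxLines limit, and the sample article are all made up for the illustration.

```perl
#!/usr/bin/perl -w
use strict;

# Read one multi-line NNTP response from the filehandle $fh. Returns the
# status line and a reference to the text lines that follow it.
sub readResponse {
    my($fh, $maxLines) = @_;

    my($status) = scalar(<$fh>);
    return() if ! defined($status);
    $status =~ s/\r?\n$//;

    my(@lines);
    # Codes starting with 4 or 5 are errors and carry no text lines.
    return($status, \@lines) if $status =~ /^[45]/;

    while (defined(my $line = <$fh>)) {
        $line =~ s/\r?\n$//;
        last if $line eq '.';            # end-of-response marker
        $line =~ s/^\.\././;             # a leading ".." stands for "."
        push(@lines, $line);
        last if @lines >= $maxLines;     # hypothetical safety limit
    }
    return($status, \@lines);
}

# An in-memory filehandle stands in for the socket in this sketch.
my($canned) = "220 27611 article\r\nFrom: Tim\r\nSubject: Test\r\n"
            . "\r\n..hi\r\n.\r\n";
open(my $fh, '<', \$canned) or die("open: $!");

my($status, $lines) = readResponse($fh, 1000);
print("$status\n");
foreach (@$lines) {
    print("  $_\n");
}
close($fh);
```

Because the status line is returned separately, the calling routine can check the reply code before it looks at the text.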
Unfortunately, the HTTP protocol is a bit too extensive to cover in this introductory book. However, if you've read and understood the examples in this chapter, then you'll have little problem downloading some modules from the CPAN archives and quickly writing your own Web crawling programs. You can find out more about CPAN in Chapter 22, "Internet Resources."
In order to get you started, there are two files on the CD-ROM, URL.PL and URL-GET.PL. These libraries will retrieve Web documents when given a specific URL. Place them into your Perl directory and run the program in Listing 18.7. It will download the Perl home page into the $perlHomePage variable.
Load the url_get library.
Initialize $perlHomePage with the contents of the Perl home page.
Listing 18.7 18LST07.PL-Retrieving the Perl Home Page
require 'url_get.pl';
$perlHomePage = url_get('http://www.perl.com');
The HTTP standard is kept on the http://info.cern.ch/hypertext/www/protocols/HTTP/HTTP2.html Web page.
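To give you a feel for the protocol, here is a sketch of the text a client might send to ask for a Web page. It is not taken from url_get.pl; it assumes a plain http:// URL and an HTTP/1.0 request, and the parseUrl() and buildRequest() functions are hypothetical names. Like the other protocols in this chapter, the request is just a command line followed by a blank line.

```perl
#!/usr/bin/perl -w
use strict;

# Split a simple http:// URL into host, port, and path. This sketch
# ignores user names, character escaping, and other kinds of URLs.
sub parseUrl {
    my($url) = shift;

    return() if $url !~ m!^http://([^/:]+)(?::(\d+))?(/.*)?$!i;
    return($1, defined($2) ? $2 : 80, defined($3) ? $3 : '/');
}

# Build the text of an HTTP/1.0 GET request for the URL.
sub buildRequest {
    my($host, $port, $path) = parseUrl($_[0]);

    return(undef) if ! defined($host);
    my($hostHeader) = ($port == 80) ? $host : "$host:$port";
    return("GET $path HTTP/1.0\r\nHost: $hostHeader\r\n\r\n");
}

print(buildRequest('http://www.perl.com'));
```

The string returned by buildRequest() is what would be written to a socket connected to the Web server; the server's reply is a status line, some header lines, a blank line, and then the document.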
Learning Internet protocols will give you a very valuable skill and enable you to save time by automating some of the more mundane tasks you do. I'm sure you'll be able to come up with some fascinating new tools to make yourself more productive. For example, the other day I stumbled across a Web site that searched a newsgroup for all URLs mentioned in the messages and stored them on a Web page sorted by date. This is an obvious time saver. That Webmaster no longer needs to waste time reading the messages to find interesting sites to visit.
You started this chapter off with a list of some protocols or services that are available. Then you learned that protocols are a set of commands and responses that both a server and a client understand. The high-level protocols (like mail and file-transfer) rest on top of the TCP/IP protocol. TCP/IP was ignored because, like any good foundation, you don't need to know its details in order to use it.
Servers and clients use a different set of functions. Servers use socket(), bind(), listen(), accept(), close(), and a variety of I/O functions. Clients use socket(), connect(), close(), and a variety of I/O functions.
On the server side, every socket must have an address that consists of the server's address and a port number. The port number can be any number greater than 1024. The name and port are combined using a colon as a delimiter. For example, www.foo.com:4000.
Next, you looked at an example of the time service. This service is useful for synchronizing all of the machines on a network.
SMTP or Simple Mail Transfer Protocol is used for sending mail. There are only five basic commands: HELO, MAIL, RCPT, DATA, and QUIT. These commands were discussed and then a mail sending program was shown in Listing 18.2.
The natural corollary to sending mail is receiving mail, which is done with the POP or Post Office Protocol. Listing 18.3 contained a program to filter incoming mail looking for a specific string. It produced a report of the messages that contained that string in the subject line.
After looking at POP, you saw how to use the Echo service to see if a server was running. This service is of marginal use in Windows operating systems because they do not handle the SIGALRM signal, so a process might wait forever for a server to respond.
Then, you looked at ftp or File Transfer Protocol. This protocol is used to send files between computers. The example in Listing 18.5 used the ftplib library to retrieve the Perl Frequently Asked Questions file.
NNTP was next. The news protocol can retrieve articles from a news server. While the example was a rather large program, it still only covered a few of the commands that are available.
Lastly, the HTTP protocol was mentioned. A very short, two-line program was given to retrieve a single Web page.
Answers to Review Questions are in Appendix A.