Behind the Scenes

Many people ask how certain Pine features are implemented. This section outlines some of the details.

Address Books

Beginning with Pine 4.00 there are two types of address book storage. There are local address books, which are the address books that are stored in a local file (the address books Pine has had all along); and there are remote address books, which are stored on an IMAP server.

Information About Remote Address Books

NOTE: The remote address book capability does not allow you to access an existing local address book from a remote system! That is, you can't set the remote address book to something like {remote.host}.addressbook and expect to access the existing .addressbook file on remote.host. Instead, you need to create a new remote address book in a new, previously unused remote mail folder. Then you can use the Select and Apply Save commands in the address book screen to Save all of the entries from an existing local address book to the new remote address book.

Beginning with Pine 4.00 there is a new type of address book called a remote address book. A remote address book is stored in a mail folder on an IMAP server. A Pine remote address book is just like a Pine local address book in that it is not interoperable with other email clients. The folder is a regular folder containing mail messages, but those messages are special. The first message must be a Pine remote address book header message, which contains the header x-pine-addrbook. The last message in the folder contains the address book data. In between the first and the last message are old versions of the address book data. The address book data is simply stored in the message as it would be on disk, with no MIME encoding. When the address book is used, the data from the last message in the folder is copied to a local file, and that file is then used exactly as a local address book file is used. When a change is made, the modified local file is appended to the remote folder in a new message. In other words, the local file is just a cache copy of the data in the remote folder. Each client which uses the remote address book will have its own cache copy of the data. Whenever a copy is done, the entire address book is copied, not just the entries which have changed.

Pine can tell that the remote data has changed by one of several methods. If the date contained in the Date header of the last message has changed then it knows it has changed. If the UID of the last message has changed, or the number of messages in the folder has changed, it knows that it has changed. When Pine discovers the folder has changed it gets a new copy and puts it in the local cache file.

There is a new configuration file variable for remote address books called remote-abook-metafile. The variable is the name of a file in which information about remote address books is stored. There is one line in the metafile for each remote address book. The information stored there is the name of the cache file and information to help figure out when the remote folder has changed. If the metafile or any of the cache files is deleted then Pine will rebuild them the next time it runs.

Remote address books have names that look just like regular remote mail folder names. For example:

{host.domain}foldername

Pine decides whether or not an address book is remote simply by looking at the first character of the address book name and comparing it to '{'.
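The first-character test is easy to sketch. The following Python is an illustration only, not Pine's code; the function names are hypothetical, and the splitting of the name into host and folder parts is an assumption based on the {host.domain}foldername form shown above.

```python
def is_remote_abook(name: str) -> bool:
    """Apply the rule from the text: remote if the name starts with '{'."""
    return name.startswith("{")


def split_remote_name(name: str) -> tuple[str, str]:
    """Split '{host.domain}foldername' into (host, folder).

    Assumption: the host part is everything between '{' and the first '}'.
    """
    if not is_remote_abook(name):
        raise ValueError("not a remote address book name: %r" % name)
    host, brace, folder = name[1:].partition("}")
    if not brace:
        raise ValueError("missing closing brace in %r" % name)
    return host, folder
```

For example, is_remote_abook(".addressbook") is false, so that name would be treated as a local file.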

Information About All Address Books

The address book is named, by default, .addressbook in the user's Unix home directory, or in the case of PC-Pine, ADDRBOOK, in the same directory as the PINERC file. There may be more than one address book, and the default name can be overridden via an entry in any of the Pine configuration files. The two configuration variables address-book and global-address-book are used to specify the names of the address books. Each of these variables is a list variable. The total set of address books for a user is the combination of all the address books specified in these two lists. Each entry in the list is an optional nickname followed by an address book name. The nickname is everything up to the last space before the file name. The global-address-book list will typically be configured in the system-wide configuration file, though a user may override it like most other variables. Address books which are listed in the global-address-book variable are forced read-only, and are typically shared among multiple users.
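As a rough illustration of the "optional nickname followed by an address book name" rule, an entry can be split at its last space. This is a sketch in Python, not Pine's parser; the function name is hypothetical, and it assumes the address book name itself contains no spaces.

```python
def parse_abook_entry(entry: str) -> tuple[str, str]:
    """Split one address-book list entry into (nickname, name).

    The nickname is everything up to the last space; it is empty
    when the entry holds only an address book name.
    """
    nickname, _, name = entry.strip().rpartition(" ")
    return nickname.strip(), name
```

An entry such as "Work Contacts {host.domain}abook" would yield the nickname "Work Contacts" and the name "{host.domain}abook".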

Local address books (or local cache files for remote address books) are simple text files with lines in the format:

<nickname>TAB<fullname>TAB<address>TAB<fcc>TAB<comments>

The last two fields are optional. A "line" may be made up of multiple actual lines in the file by using continuation lines, which are lines beginning with SPACE characters. The line breaks may be after TABs or in between addresses in a distribution list. Each actual line in the file must be less than 1000 characters in length.
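The line format and continuation rule can be sketched as follows. This Python is illustrative only: the function and field names are hypothetical, and dropping the leading spaces of a continuation line when joining is an assumption about how the pieces fit back together.

```python
FIELDS = ("nickname", "fullname", "address", "fcc", "comments")


def read_entries(text: str) -> list[dict]:
    """Parse address book text into a list of field dictionaries."""
    logical: list[str] = []
    for raw in text.splitlines():
        if raw.startswith(" ") and logical:
            # Continuation line: glue it onto the previous logical line.
            logical[-1] += raw.lstrip(" ")
        elif raw:
            logical.append(raw)

    entries = []
    for line in logical:
        fields = line.split("\t")
        # Trailing fields (fcc, comments) are optional; pad with "".
        fields += [""] * (len(FIELDS) - len(fields))
        entries.append(dict(zip(FIELDS, fields)))
    return entries
```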

Nicknames (the first field) are short names that the user types instead of typing in the full address. There are several characters which aren't allowed in nicknames in order to avoid ambiguity when parsing the address (SPACE, COMMA, @, ", ;, :, (, ), [, ], <, >, \). Nicknames aren't required. In fact, none of the fields is required.

The fullname field is usually stored as Last_name, First_name, in order that a sort on the fullname field comes out sorted by Last_name. If there is an unquoted comma in the fullname, Pine will flip the first and last name around and get rid of the comma when using the entry in a composition. It isn't required that there be a comma, that's only useful if the user wants the entries to sort on last names.
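The flip can be sketched in a few lines. This is an illustration in Python rather than Pine's routine; the function name is hypothetical, and treating only double quotes as quoting is an assumption.

```python
def display_name(fullname: str) -> str:
    """Turn 'Last, First' into 'First Last' at the first unquoted comma."""
    in_quotes = False
    for i, ch in enumerate(fullname):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            last = fullname[:i].strip()
            first = fullname[i + 1:].strip()
            return (first + " " + last).strip()
    # No unquoted comma: use the name as stored.
    return fullname
```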

The address field takes one of two forms, depending on whether the entry is a single (simple) address or a distribution list. For a simple entry, the address field is an RFC 822 address. This could be either the email-address part of the address, i.e., the part that goes inside the brackets (<>), or it could be a full RFC 822 address. The phrase part of the address (the fullname) is used unless there is a fullname present in the fullname field of the address book entry. In that case, the fullname of the address book entry replaces the fullname of the address. For a distribution list, the <address> is in the format:

"(" <address>, <address>, <address>, ... ")"

The only purpose for the parentheses around the list of addresses is to make it easier for the parsing routines to tell that it is a list instead of a simple entry. The two are displayed differently and treated slightly differently in some cases, though most of the distinction has disappeared. Each of the addresses in a list can be a full RFC 822 address with fullname included, or it may be just the simple email-address part of the address. This allows the user to have a list which includes the fullnames of all the list members. In both the simple and list cases, addresses may also be other nicknames which appear in this address book or in one of the other address books. (Those nicknames are searched for by looking through the address books in the order they appear in the address book screen, with the first match winning.) Lists may be nested. If addresses refer to each other in a loop (for example, list A includes list B which includes list A again), this is detected and flagged. In that case, the address will be changed to "**** address loop ****".
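Nickname expansion with loop detection might look roughly like this. The Python below is a sketch under stated assumptions, not Pine's algorithm: each address book is modeled as a hypothetical dictionary mapping a nickname to either one address or a list of members, and the books are given in screen order so that the first match wins.

```python
LOOP_MARKER = "**** address loop ****"


def expand(name: str, books: list[dict], active: tuple = ()) -> list[str]:
    """Expand a nickname into addresses, flagging reference loops.

    'active' holds the nicknames currently being expanded; meeting
    one of them again means the entries refer to each other in a loop.
    """
    for book in books:
        if name in book:
            if name in active:
                return [LOOP_MARKER]
            target = book[name]
            members = target if isinstance(target, list) else [target]
            result: list[str] = []
            for member in members:
                result.extend(expand(member, books, active + (name,)))
            return result
    # Not a nickname in any book: treat it as a plain address.
    return [name]
```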

The optional fcc field is a folder name, just like the fcc field in the composer headers. If the first address in the To field of a composition comes from an address book entry with an fcc field, then that fcc is placed in the fcc header in the composer.

The comments field is just a free text field for storing comments about an entry. By default, neither the fcc nor the comments field is shown on the screen in the address book screen. You may make those fields visible by configuring the variable addressbook-formats. They are also searched when you use the WhereIs command in the address book screen and are visible when you View or Update an entry.

The address book is displayed in the order that it is stored. When the user chooses a different sorting criterion, the data is actually sorted and stored, as opposed to showing a sorted view of the data.

When the address book is written out, it is first written to a temporary file and if that write is successful it is renamed. This guards against errors writing the file that might destroy the whole address book. The address book is re-written after each change. If the address book is a remote address book, the file is then appended to the remote mail folder using IMAP.
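The write-then-rename pattern is a common one and can be sketched as below. This Python is only an illustration of the technique; the function name and the temporary file prefix are hypothetical, and the fsync call is an extra precaution that the text does not mention.

```python
import os
import tempfile


def write_abook(path: str, data: str) -> None:
    """Write data to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename stays on one file system.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".abook-")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Only a fully written file replaces the old address book.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

If the write fails partway through, the original file is left untouched and only the temporary file is discarded.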

The end-of-line character(s) in the address book file are those native to the system writing it. So it is <LF> on Unix and <CR><LF> on PC's. However, both Unix and PC versions of Pine can read either format, so it should be possible to share a read-only address book among the two populations (using NFS, for example). The end-of-line character for the LookUp file is always just <LF>, even on a PC.

Address Book Lookup File

Starting in 3.90 there is an additional file for each address book, called the LookUp file. It usually has the same name as the address book file with the suffix ".lu" appended. (It might have a different name if a file name length restriction prohibited that name.) This file is created and maintained by Pine. If it is deleted, Pine will recreate it next time it runs. Its purpose is to speed up lookups for large address books and to reduce memory requirements for large address books. A fairly detailed description of how it is used is given in src/pine/adrbklib.h.

The lookup file changes whenever the address book itself is changed. If it doesn't exist, Pine attempts to create it. If Pine doesn't have permission to create the lookup file with the standard name, it will create a temporary version in a temp directory. You want to avoid this since it would have to be rebuilt every time Pine was run, and rebuilding takes a significant time for a large address book.

So, if you're going to have a shared address book in a read-only directory, it is highly desirable to create the lookup file so that the users sharing it won't each have to create a copy in a temp directory. You can do that by running Pine and accessing the address book under a user id which does have permission to write the file, or by using the -create_lu command line argument to Pine. If users may be using a shared address book that needs updating, it is best to move the old address book to another name rather than copying over it, since the file may be open in running Pine sessions. It is also best to make the lookup file for the new address book before moving it and the address book file into place; otherwise users may get stuck attempting to initialize the new lookup file.

The lookup file contains a timestamp which records the mtime of the address book file when the lookup file was last updated. Whenever a user runs Pine, the current mtime of the address book is checked against this timestamp, and if they differ, Pine will want to rebuild the lookup file. Because of this, it isn't a good idea to build the lookup file and then put the address book and lookup file into place with a plain copy, which gives the address book file a new mtime. You should move the files or copy them in some way which preserves the address book file's mtime (e.g., use mv or tar).

Validity Checking of Address Books

There is no file locking done on Pine address books. However, there is considerable validity checking done to make sure that the address book hasn't changed unexpectedly. Whenever the address book is about to be changed, a check is made to see if the file is newer than when we read it or the remote address book folder has changed since we last copied it. If either of these is true, the change is aborted.

There is an automatic, behind-the-scene check that happens every so often, also. For example, if someone else changes one of the address books that you have configured, your Pine's copy of the address book will usually be updated automatically without you noticing. This checking happens at the same time as new mail checking takes place, unless you are actively using the address book, in which case it happens more frequently.

Another sort of validity check is that the lookup file contains a timestamp internally that is supposed to match the time that the address book file itself was last modified. If the lookup file timestamp doesn't match the date of the address book file, a new lookup file is built. If you are having trouble, it is always ok to remove the lookup file and restart. Pine will automatically rebuild the lookup file.

One other validity check happens when looking up an entry in the address book file. An entry is looked up by first getting an offset into the address book file from the lookup file. A seek to that location is done and then the entry is read. An entry should be at the start of a line. If it isn't, something is wrong. In that case, the lookup file is rebuilt and the operation is repeated if possible.
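The start-of-line test can be sketched as follows. This Python is a simplified illustration, not the code in adrbklib.h: the function name is hypothetical, and the lookup file itself is not modeled, only the check applied to an offset obtained from it.

```python
from typing import Optional


def entry_at_offset(path: str, offset: int) -> Optional[str]:
    """Return the line starting at offset, or None if offset is invalid.

    A valid entry starts at the beginning of the file or right after a
    newline, and does not begin with a space (which would make it a
    continuation line). A None result is the cue to rebuild the lookup
    file and retry.
    """
    with open(path, "rb") as f:
        if offset > 0:
            f.seek(offset - 1)
            if f.read(1) != b"\n":
                return None
        line = f.readline()
    if not line or line.startswith(b" "):
        return None
    return line.rstrip(b"\r\n").decode("latin-1")
```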

Checkpointing

Periodically Pine will save the whole mail folder to disk to prevent loss of any mail or mail status in the case that it gets interrupted, disconnected, or crashes. The period of time Pine waits to do the checkpoint is calculated to be minimally intrusive. The timing can be changed (but usually isn't) at compile time. Folder checkpointing happens for both local folders and those being accessed with IMAP. The delays are divided into three categories:

Good Time: This occurs when Pine has been idle for more than 30 seconds. In this case Pine will checkpoint if 12 changes to the file have been made, or if at least one change has been made and a checkpoint hasn't been done for five minutes.
Bad Time: This occurs just after Pine has executed some command. Pine will checkpoint if there are 36 outstanding changes to the mail file, or if there is at least one change and no checkpoint has been done for ten minutes.
Very Bad Time: This occurs while composing a message. In this case, Pine will only checkpoint if at least 48 changes have been made, or if at least one change has been made and no checkpoint has been done for twenty minutes.
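The three categories reduce to a pair of thresholds each. The following Python sketch restates them as a decision function; the names are hypothetical, and how Pine counts changes or measures elapsed time internally is not shown here.

```python
# (changes that force a checkpoint, seconds after which one change suffices)
THRESHOLDS = {
    "good": (12, 5 * 60),
    "bad": (36, 10 * 60),
    "very_bad": (48, 20 * 60),
}


def should_checkpoint(situation: str, changes: int,
                      seconds_since_checkpoint: float) -> bool:
    """Decide whether to checkpoint, using the thresholds in the text."""
    max_changes, max_age = THRESHOLDS[situation]
    if changes >= max_changes:
        return True
    return changes >= 1 and seconds_since_checkpoint >= max_age
```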

Debug Files

If Unix Pine is compiled with the compiler DEBUG option on (the default), then Pine will produce debugging output to a file. The file is normally .pine-debugX in the user's home directory where X goes from 1 to 4. Number 1 is always the most recent session and 4 the oldest. Four are saved because often the user has gone in and out of Pine a few times after a problem has occurred before the expert actually gets to look at it. The amount of output in the debug files varies with the debug level set when Pine is compiled and/or as a command line flag. The default is level 2. This shows very general things and records errors. Level 9 produces copious amounts of output for each keystroke.

PC-Pine creates a single debug file named PINEDEBG.TXT in the same directory as the PINERC file.

Filters

Pine is not designed to process email messages as they are delivered; rather Pine depends on the fact that some other program (sendmail, etc) will deliver messages and Pine simply reads the email folders which that other program creates. For this reason, Pine cannot filter incoming email into different folders. It can, however, work alongside most of the programs available over the Internet which perform this task. Pine is known to operate successfully with the Elm filter program and with procmail.

Pine allows users to specify a set of incoming-folders. Pine will separate out all the folders listed as incoming-folders and offer convenient access to these. We hope that in the future Pine will be able to offer new message counts for all of the incoming folders, but we haven't done this so far because of the performance penalty.

Folder Formats and Name Extensions

A folder is a group of messages. The default format used by Unix Pine is the Berkeley mail format. It is also used by the standard mail command and by elm. Unix Pine also understands message folders in other formats, such as Tenex, MH, MMDF, and Netnews.

PC-Pine reads and writes local (PC) folders in a special format similar to the Tenex format. Near as we can tell, PC-Pine is the only program to use this format. Beginning with version 3.90, PC-Pine also includes a ReadOnly driver for the Berkeley mailbox format. That means that you can import Unix mail folders, or mount them via NFS or SMB, and PC-Pine can read them, but not modify them.

Extensions. In the past, file name extensions have been significant in both Unix Pine and PC-Pine, but this has caused more problems than it solved. Therefore, on Unix Pine extensions no longer have any special meaning, and this is the trend for PC-Pine as well.

By default, PC-Pine adds ".MTX" to the name of any local (PC) folders that are referenced, and suppresses the extension from the "Folder List" display. Now that PC-Pine can read more than one folder format, the MTX extension no longer implies a particular format, and is largely irrelevant. By using the folder-extension option, you can change this behavior. In particular, you may set folder-extension to the "null string" (a pair of double quotes) which tells PC-Pine to neither add nor hide-from-view any folder name extension.

The reason you might wish to over-ride the MTX default is that recent versions of PC-Pine have the ability to open (albeit ReadOnly) normal Unix mail folders. Since it might be inconvenient to rename all of them to have an MTX extension, it is possible with this option to switch PC-Pine's behavior so that such folders can be seen and accessed without changing their names. However, doing this means that your existing PC-Pine local folders will have apparently changed their names. For example, if you had a local folder named "FOO" it will now appear in the "Folder List" as "FOO.MTX". If you wish to save additional messages to that folder, you will need to enter the full name, "FOO.MTX" at the Save prompt. Likewise for GoTo.

If you wish to permanently avoid having to deal with folder name extensions, you will need to set this option to the null string by entering two double-quote marks, and you will need to rename your existing local folders to not have an MTX extension. In DOS this can be done in one command, once you have changed to your mail directory: RENAME *.MTX *.

We don't know why you might wish to, but you could also use this option to tell PC-Pine to use an extension other than MTX. In this case, enter the three characters you desire to use in lieu of "MTX". Note that your existing folders will need to be renamed to correspond to this new extension.

Berkeley Mail Format

This format comes to us from the ancient UNIX mail program, /bin/mail. (Note that this doesn't have anything to do with Berkeley, but we call it the Berkeley mail file format anyway.) This program was actually used to interactively read mail at one time, and is still used on many systems as the local delivery agent. In the Berkeley mail format, a folder is a simple text file. Each message (including the first) must start with a separator line which takes approximately the form:

From [email protected] Wed Aug 11 14:32:33 1993

Each message ends with two blank lines. There are actually several different variations in the date part of the string, twenty at last count. Because of the format of the separators, lines in the mail message beginning with "From ", space included, risk being confused as message separator lines. Some mail programs will interpret any line beginning with "From " as a message separator, while others --including Pine-- will not be confused unless the line really looks like a message separator, complete with address and date. Such lines will be modified to begin with ">From ". In deference to other mail programs, you may also set the save-will-quote-leading-froms feature, in which case any line beginning with "From " will be modified as above. If you see this occasionally in incoming mail messages, the culprit is not Pine but the message delivery program being used at your site.
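The two quoting policies can be contrasted in a short sketch. This Python is illustrative only: the function name is hypothetical, and the regular expression recognizes just one date layout (weekday, month, day, time, four-digit year), which is an assumption; as noted above, many more variations exist.

```python
import re

# One assumed separator layout: "From addr Wed Aug 11 14:32:33 1993".
SEPARATOR = re.compile(
    r"^From \S+ +(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d{1,2} "
    r"\d{2}:\d{2}:\d{2} \d{4}$"
)


def quote_from_lines(body: str, quote_all: bool = False) -> str:
    """Prefix '>' to body lines that could be taken for separators.

    With quote_all=False only lines that really look like a separator
    are changed; quote_all=True mimics the behavior described for
    save-will-quote-leading-froms.
    """
    out = []
    for line in body.split("\n"):
        if line.startswith("From ") and (quote_all or SEPARATOR.match(line)):
            line = ">" + line
        out.append(line)
    return "\n".join(out)
```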

You can fool Pine into thinking a file is a mail folder by copying a suitable message separator from a real folder to the beginning of the file and wherever you want message boundaries. The vast majority of INBOXes Pine reads and folders it writes are of this format.

Tenex and MTX Formats

Like the Berkeley format, the Tenex folder format uses a single file per folder. Historically, the name of Tenex-format folders ended with .txt, but this rule is no longer enforced. The file format consists of a header line followed by the message text for each message. The header is in one of two forms:

dd-mmm-yy hh:mm:ss-zzz,n;ffffffffffff
dd-mmm-yyyy hh:mm:ss sssss,n;ffffffffffff

and is immediately followed by a newline (and the message text). The fields in the formats are:

dd            two-digit day of month (leading space if a single-digit day)
mmm           three-letter English month name (Jan, Feb, etc.)
yy            two-digit year in 20th century (obsolete)
yyyy          four-digit year
hh            two-digit hour in 24-hour clock (leading zero if single-digit)
mm            two-digit minute (leading zero)
ss            two-digit second (leading zero)
zzz           three-letter North American time zone (obsolete)
sssss         signed four-digit international time zone as in RFC 822
n             one or more digits of the size of the following message in bytes
ffffffffffff  twelve-digit octal flags value

Punctuation is as given above.

The time in the header is the time that message was written to the folder. The flags are interpreted as follows: the high order 30 bits are used to indicate user flags, the next two bits are reserved for future usage, the low four bits are used for system flags (010 = answered, 04 = flagged urgent, 02 = deleted, 01 = seen).
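To make the header layout and flag bits concrete, here is a small Python sketch. It is an illustration only: the names are hypothetical, and it handles just the second (four-digit year, numeric time zone) form of the header.

```python
import re

HEADER = re.compile(
    r"^(?P<day>[ \d]\d)-(?P<mon>[A-Za-z]{3})-(?P<year>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) (?P<zone>[+-]\d{4}),"
    r"(?P<size>\d+);(?P<flags>[0-7]{12})$"
)

# System flag bits as listed in the text (octal values).
SYSTEM_FLAGS = {0o10: "answered", 0o4: "flagged", 0o2: "deleted", 0o1: "seen"}


def parse_header(line: str) -> dict:
    """Parse one Tenex-style header line into its components."""
    m = HEADER.match(line)
    if m is None:
        raise ValueError("not a recognized header: %r" % line)
    flags = int(m["flags"], 8)  # twelve octal digits = 36 bits
    return {
        "date": "%s-%s-%s" % (m["day"].strip(), m["mon"], m["year"]),
        "time": m["time"],
        "zone": m["zone"],
        "size": int(m["size"]),
        "system": sorted(n for bit, n in SYSTEM_FLAGS.items() if flags & bit),
        # High-order 30 bits: skip 4 system bits and 2 reserved bits.
        "user": flags >> 6,
    }
```

The size field tells a reader how many bytes of message text follow the header line, so the next header can be located without scanning the message.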

If a Tenex-format (or empty) file named mail.txt exists in a Pine user's home directory, this triggers special processing in Pine. When INBOX is opened, mail is automatically moved from /usr/spool/mail into mail.txt in the user's home directory.

The format used by PC-Pine is identical to the Tenex format, with two exceptions: the folder name ends with .MTX instead of .txt (this is a requirement in the MTX format), and DOS-style CR/LF newlines are used instead of UNIX-style LF newlines.

Netnews Format

The netnews format is a ReadOnly format which uses directories under /usr/spool/news as folders. The /usr/spool/news/ prefix is removed and all subsequent "/" (slash) characters are changed to "." (period). For example, the netnews folder name comp.mail.misc refers to the directory name /usr/spool/news/comp/mail/misc. In addition, the news folder name must appear in the file /usr/lib/news/active for it to be recognized. Individual messages are stored as files in that directory, with file names being the ASCII form of a number assigned to that message. The default locations above can be changed with the config variables news-spool-directory and news-active-file-path.
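The name mapping works in both directions and can be sketched as below. The Python function names are hypothetical; the spool default is the path given in the text, and the check against the active file is left out.

```python
import posixpath


def newsgroup_to_dir(group: str, spool: str = "/usr/spool/news") -> str:
    """Map a news folder name such as comp.mail.misc to its directory."""
    return posixpath.join(spool, *group.split("."))


def dir_to_newsgroup(path: str, spool: str = "/usr/spool/news") -> str:
    """Map a spool directory back to its news folder name."""
    prefix = spool.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError("%r is not under %r" % (path, spool))
    return path[len(prefix):].strip("/").replace("/", ".")
```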

Folder Locking

There are two kinds of locking which Pine has to worry about. The first might be called program-contention locking. This affects the times when a program is performing actual updates on a folder. An update might be a message delivery program appending a message (sendmail delivering a message to an INBOX), status changes (checkpoints by Pine every few minutes) or deletion of messages (an expunge in Pine). For moderate sized mail messages, these operations should not last for more than a few seconds. The second kind of locking has to do with user-contention situations. This would be the case when one folder is shared by a group of people or even when one person starts multiple email sessions all of which access the same folders and INBOX.

There are two standard locking mechanisms which handle program-contention locking. To be on the safe side, Pine implements both of them. The older mechanism places a file xxxx.lock (where xxxx is the name of the file being locked) in the same directory as the file being locked. This makes use of the fact that directory operations are atomic in UNIX and mostly works across NFS. There are involved algorithms used to determine if a lock has been held for an excessive amount of time and should be broken. The second program-contention locking mechanism uses the flock() system call on the mailbox. This is much more efficient and the locks can't get stuck because they go away when the process that created them dies. This is usually found on 4BSD and related machines.
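The older xxxx.lock mechanism relies on exclusive file creation being atomic. A minimal Python sketch of that idea follows; the function names are hypothetical, and the algorithms for breaking stale locks mentioned above are deliberately omitted.

```python
import os


def acquire_dot_lock(path: str) -> bool:
    """Try to create path + '.lock'; return True if we now hold the lock."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only
        # one process can succeed in creating the lock file.
        fd = os.open(path + ".lock",
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release_dot_lock(path: str) -> None:
    """Remove the lock file created by acquire_dot_lock."""
    os.unlink(path + ".lock")
```

Unlike an flock() lock, a lock file of this kind stays behind if the process that created it dies, which is why the real mechanism needs rules for breaking old locks.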

In addition to these, Pine--through the c-client library--provides robust locking which prevents several users (or several instances of the same user) having a mail file open (for update) at once. This user-contention lock is held the entire time that the folder is in use.

With IMAPd 7.3(63) and Pine 3.84 and higher, the second Pine session which attempts to open a particular folder (usually INBOX) with Pine will "win". That is to say, the second session will have read/write access to the folder. The first user's folder will become read-only. (Note that this is exactly the opposite of the behavior prior to Pine 3.84 where the second open was read-only. Having the latest open be read-write seems to match more closely with what users would like to have happen in this situation.) Pine's additional locking is only effective against multiple uses of Pine or other programs using the c-client library, such as MailManager, ms, IMAPd and a few others. Beginning with Pine 3.85, there is a -o command line flag to intentionally open a mailbox read-only.

Pine locking on UNIX systems works by creating lock files in /tmp of the form \usr\spool\mail\joe. The system call flock() is then used on these files; the existence of the file alone does not constitute a lock. This lock is created when the folder is opened and destroyed when it is closed. When the folder is actually being written, the standard UNIX locks are also created.

If a folder is modified by some other program while Pine has it open, Pine will give up on that mail file, concluding it's best not to do any further reads or writes. This can happen if another mailer that doesn't observe Pine's user-contention locks (e.g. elm or mail) is run while Pine has the mail folder open. Pine checkpoints files every few minutes, so little data can be lost in these situations.

PC-Pine does not do any folder locking. It depends on IMAP servers to handle locking of remote folders. It is assumed that only one Pine session can be running on the PC at a time, so there is no contention issue around folders on the PC itself.

INBOX and Special Folders

The INBOX folder is treated specially. It is normally kept open constantly so that the arrival of new mail can be detected. The name INBOX refers to wherever new mail is retrieved on the system. If the inbox-path variable is set, then INBOX refers to that. IMAP servers understand the concept of INBOX, so specifying the folder {imap.u.example.edu}INBOX is meaningful. The case of the word INBOX is not important, but Pine tends to display it in all capital letters.

The folders for sent mail and saved messages are also somewhat special. They are automatically created if they are absent and recreated if they are deleted.

Internal Help Files

The file pine.hlp in the pine subdirectory of the distribution contains all the help text for Pine. On UNIX, it is compiled right into the Pine binary as strings. This is done to simplify installation and configuration. The pine.hlp file is in a special format that is documented at the beginning of the file. It is divided into sections, each with a name that winds up being referenced as a global variable. This file is processed by two awk scripts and turned into C files that are compiled into Pine.

PC-Pine, which tries to run on machines with as little as 640k of memory, leaves the Pine help text out of the executable. PINE.EXE, PINE.HLP, and PINE.NDX are all needed for PC-Pine's help system.

International Character Sets

While Pine was designed in the U.S. and used mostly for English-language correspondence, it is a goal for Pine to handle email in almost any language. Many sites outside of the U.S. run Pine in their native language. The default character set for Pine is US-ASCII. That can be changed in the personal or system-wide configuration file with the variable character-set.

When reading incoming email, Pine allows all character sets to pass through. Pine doesn't actually display the characters but simply passes them through; it is up to the actual display device to show the characters correctly. When composing email, Pine will accept input in any language and tag the message according to the character-set variable. Again, it is up to the input device to generate the correct sequences for the character set being used.

With the exception of UNICODE-1-1-UTF-7, the outgoing message is checked to see if it is all US-ASCII text (and contains no escape characters). In that case, the text will be labeled as US-ASCII even if the character-set variable is set to something else. The theory is that every reasonable character set will have US-ASCII as a subset, and that it makes sense to label the text with the lowest-common-denominator label so that more mailers will be able to display it. Text in the UNICODE-1-1-UTF-7 character set is never re-labeled as US-ASCII. If the outgoing message is not all US-ASCII text, then it will be labeled with the character-set variable set by the user. If the user has not set the character-set variable then it will be labeled as X-UNKNOWN-CHARSET.
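The labeling rules in this paragraph can be condensed into one function. This Python sketch is an illustration only; the function name is hypothetical, and "escape characters" is taken to mean the ESC byte (0x1B), which is an assumption.

```python
from typing import Optional


def outgoing_charset(text: bytes, configured: Optional[str]) -> str:
    """Pick the character set label for an outgoing message body."""
    # UNICODE-1-1-UTF-7 text is never re-labeled as US-ASCII.
    if configured and configured.upper() == "UNICODE-1-1-UTF-7":
        return configured
    # All 7-bit and free of escape characters: use the
    # lowest-common-denominator label.
    if all(b < 0x80 and b != 0x1B for b in text):
        return "US-ASCII"
    return configured or "X-UNKNOWN-CHARSET"
```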

BUG: If you prepare a UNICODE-1-1 document and read it into the composer with ^R, Pine may mistreat it. If your document, when misviewed as 8-bit bytes, does not contain any individual bytes greater than 0x7f, then Pine will re-label your outgoing message as US-ASCII, even if your message is really in Unicode Cyrillic, Arabic, or Thai. On the other hand, if your UNICODE-1-1 document, when misviewed as 8-bit bytes, does contain at least one individual byte greater than 0x7f, as is likely for Unicode French/German/Spanish, Greek, Japanese, and Chinese, then Pine will retain the UNICODE-1-1 label.

The character sets are:

US-ASCII           Standard 7 bit English characters
ISO-8859-1         8 bit European "latin 1" character set
ISO-8859-2         8 bit European "latin 2" character set
ISO-8859-3         8 bit European "latin 3" character set
ISO-8859-4         8 bit European "latin 4" character set
ISO-8859-5         8 bit Latin and Cyrillic
ISO-8859-6         8 bit Latin and Arabic
ISO-8859-7         8 bit Latin and Greek
ISO-8859-8         8 bit Latin and Hebrew
ISO-8859-9         8 bit European "latin 5" character set
ISO-8859-10        8 bit European "latin 6" character set
KOI8-R             8 bit Latin and Russian
VISCII             8 bit Latin and Vietnamese
ISO-2022-JP        Latin and Japanese
ISO-2022-KR        Latin and Korean
UNICODE-1-1        Unicode
UNICODE-1-1-UTF-7  Mail-safe Unicode
ISO-2022-JP-2      Multilingual

Earlier versions of Pine made use of the character set tags associated with text in MIME to decide if the text should be displayed or not. Depending on the character set tag and the character-set variable in Pine, the text was either displayed as is, displayed with some characters filtered out, or not displayed at all. The current version uses a much simpler algorithm in order to maximize the chance that useful contents are readable by the user. It simply displays all messages of type text and makes no attempt to filter out characters that may be in the wrong character set. If the text is tagged as something other than US-ASCII and the tag does not match the character set that the character-set variable is set to, then a warning is printed at the start of the message. In that case, it is possible that the text will be displayed incorrectly. For example, if the text is one variant of ISO-8859 and the display device is another variant, some of the characters may show up on the screen as the wrong character. Or if the text is Japanese and the display device is not, some parts of the message may be total gibberish (which will look like ASCII gibberish). On the other hand, the parts of the Japanese message that really are US-ASCII will be readable in the midst of the gibberish.

In the case of PC-Pine, the character values cannot be passed through to the display device unaltered since MS-DOS uses various non-standard character sets called "Code Pages".

The mapping between DOS Code Page and standard character set is controlled by the character-set variable in the PINERC file and the PC's installed Code Page. PC-Pine will automatically map common characters in IBM Code Pages 437, 850, 860, 863, and 865 to ISO-8859-1 and back when the PINERC has character-set=ISO-8859-1. Pine will also map common characters for IBM Code Page 866 to ISO-8859-5 and back when character-set=ISO-8859-5. The mappings are bi-directional, and applied to all saved text attachments in the defined character set, messages exported, etc.

Alternatively, the translation tables can be configured externally and applied at run time whenever the character-set variable is set to something other than "US-ASCII" (the default). PC-Pine looks in the text file pointed to by the environment variable ISO_TO_CP for the table to use for mapping text matching the type defined by the character-set variable into the local Code Page value. PC-Pine looks in the text file pointed to by the environment variable CP_TO_ISO for the table to use for mapping text in the local Code Page into outbound text tagged with the character-set variable's value.

A text file containing a character set mapping table is expected to contain 256 elements where each element is a decimal number separated from the next element by white-space (space, tab or newline, but no commas!). The index of the element is the character's value in the source character set, and the element's value is the corresponding character's value in the destination character set.
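
As an illustration of the table format just described, here is a minimal Python sketch (not PC-Pine's own code) that parses such a file and applies it to a string of bytes. The function names and the sample table are hypothetical. The sample maps ISO-8859-1 0xE9 (e with acute accent) to 0x82, its position in Code Page 437, and leaves every other value unchanged.

```python
def parse_table(text):
    """Parse 256 whitespace-separated decimal numbers into a lookup table."""
    values = [int(token) for token in text.split()]
    if len(values) != 256:
        raise ValueError(f"expected 256 elements, got {len(values)}")
    if any(v < 0 or v > 255 for v in values):
        raise ValueError("every element must be in the range 0-255")
    # Index = character value in the source set, element = value in the destination set.
    return bytes(values)


def translate(data, table):
    """Map each byte of data through the 256-element table."""
    return data.translate(table)


# Hypothetical ISO_TO_CP style table: identity except for one character.
elements = list(range(256))
elements[0xE9] = 0x82
table_text = "\n".join(
    " ".join(str(v) for v in elements[i:i + 16]) for i in range(0, 256, 16)
)

table = parse_table(table_text)
converted = translate(b"caf\xe9", table)
```

A CP_TO_ISO table would be the reverse mapping, with 0x82 at index 0x82 replaced by 0xE9 (decimal 233).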

Interrupted and Postponed Messages

If the user is composing mail and is interrupted by being disconnected (SIGHUP, SIGTERM or end of file on the standard input), Pine will save the interrupted composition and allow the user to continue it when he or she resumes Pine. As the next Pine session starts, a message will be given that an interrupted message can be continued. To continue the interrupted message, simply go into the composer. To get rid of the interrupted message, go into the composer and then cancel the message with ^C.

Composition of half-done messages may be postponed to a later time by giving the ^O command. Other messages can be composed while postponed messages wait. All of the postponed messages are kept in a single folder. Postponing is a good way to quickly reference other messages while composing.

Message Status

The c-client library allows for several flags or status marks to be set for each message. Pine uses four of these flags: UNSEEN, DELETED, ANSWERED, and FLAGGED. The N in Pine's FOLDER INDEX means that a message is unseen: it has not been read from this folder yet. The D means that a message is marked for deletion. Messages marked with D are removed when the user Expunges the folder (which usually happens when the folder is closed or the user quits Pine). The A in Pine's FOLDER INDEX means that the message has been replied to. The * in Pine's FOLDER INDEX means that the message has been "flagged" as important. That is, the user used the Flag command to turn the FLAGGED flag on. This flag can mean whatever the user wants it to mean; it is just a way to mark some messages as being different from others, most commonly as "important". For Berkeley format folders, the message status is written into the email folder itself on the header lines marked Status: and X-Status:. In Tenex and PC-Pine's MTX folder formats, the status goes into the 36-bit octal flags.
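
For the Berkeley format case, the following Python sketch shows how the four flags could be recovered from the two header lines. The letter codes are an assumption: these notes do not list them, and the ones used here (R in Status: for a message that has been read, and A, D and F in X-Status: for answered, deleted and flagged) are the conventional ones for this folder format. The function name is hypothetical.

```python
def flags_from_headers(status="", x_status=""):
    """Return the set of Pine-relevant flags implied by the two header values.

    Assumed letter codes (not stated in these notes):
      Status:   R = message has been read
      X-Status: A = answered, D = deleted, F = flagged
    """
    flags = set()
    if "R" not in status:
        flags.add("UNSEEN")      # shown as N in the FOLDER INDEX
    if "D" in x_status:
        flags.add("DELETED")     # shown as D
    if "A" in x_status:
        flags.add("ANSWERED")    # shown as A
    if "F" in x_status:
        flags.add("FLAGGED")     # shown as *
    return flags


read_and_answered = flags_from_headers(status="RO", x_status="A")
brand_new = flags_from_headers(status="", x_status="")
```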

MIME: Reading a Message

Pine should be able to handle just about any MIME message. When a MIME message is received, Pine will display a list of all the parts, their types and sizes. It will display the attachments when possible and appropriate and allow users to Save all other attachments.

Starting with version 3.90, Pine honors the "mailcap" configuration system for specifying external programs for handling attachments. The mailcap file maps MIME attachment types to the external programs loaded on your system which can display and/or print the file. A sample mailcap file comes bundled with the Pine distribution. It includes comments which explain the syntax you need to use for mailcap. With the mailcap file, any program (mail readers, newsreaders, WWW clients) can use the same configuration for handling MIME-encoded data.

If a MAILCAPS environment variable is defined, Pine will use that to look for one or more mailcap files, which are combined. In the absence of MAILCAPS, Unix Pine will look for a personal mailcap file in ~/.mailcap and combine that with a system-wide file in /etc/mailcap. PC-Pine will look for a file named MAILCAP in the same directory as the PINERC file, and/or the directory containing the PINE.EXE executable.
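
The Unix search order can be sketched in a few lines of Python. This is only an illustration of the rule above, not Pine's code. It assumes that MAILCAPS holds a colon-separated list of file names, which is the usual Unix convention but is not stated in these notes; the function name is hypothetical.

```python
import os
import posixpath


def mailcap_search_path(environ=None, home=None):
    """Return the mailcap files to consult, in order, on a Unix system."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    mailcaps = environ.get("MAILCAPS")
    if mailcaps:
        # Assumption: entries are separated by ":" as in PATH.
        return [entry for entry in mailcaps.split(":") if entry]
    # No MAILCAPS: personal file first, then the system-wide file.
    return [posixpath.join(home, ".mailcap"), "/etc/mailcap"]


with_variable = mailcap_search_path({"MAILCAPS": "/a/mailcap:/b/mailcap"}, "/home/user")
without_variable = mailcap_search_path({}, "/home/user")
```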

Messages which include rich text or enriched text in the main body will be displayed in a very limited way (it will show bold and underlining).

If Pine sees a MIME message part tagged as type IMAGE, and Pine's image-viewer configuration variable is set, Pine will attempt to send that attachment to the named image viewing program. In the case of UNIX Pine, the DISPLAY environment variable is checked to see if an X-terminal is being used (which can handle the images). If the image-viewer variable is not set, Pine uses the mailcap system to determine what to do with IMAGE types, just as it does for any other non-TEXT type, e.g. type APPLICATION. For MIME's generic "catch all" type, APPLICATION/OCTET-STREAM, the mailcap file will probably not specify any action, but Pine users may always Save any MIME attachment to a file.

MIME type "text/plain" is handled a little bit differently than the other types. If you are viewing the main body part in the MESSAGE TEXT viewing screen, then Pine will use its internal viewer to display it. This happens even if there is a mailcap description which matches this particular type. If it is labeled as having a character set other than the one you are using, it will still be displayed by the internal viewer (perhaps incorrectly), though you will get a warning message prepended to the message in the viewing screen. However, if you view a part of type "text/plain" from the ATTACHMENT INDEX screen, then Pine will check the mailcap database for a matching entry and use it in preference to its internal viewer.

Some text attachments, specifically those which are just other email messages forwarded as MIME messages, are displayed as part of the main body of the message. This distinction allows easy display when possible (the forward as MIME case) and use of an attachment viewer when that is desirable (the plain text file attachment case).

If the parts of a multipart message are alternate versions of the same thing Pine will select and display the one best suited. For parts of type "message/external-body", the parameters showing the retrieval method will be displayed, and the retrieval process is automated. Messages of type "message/partial" are not currently supported.

MIME: Sending a Message

There are two important factors when trying to include an attachment in a message: encoding and labeling. Pine has rules for both of these which try to assure that the message goes out in a form that is robust and can be handled by other MIME mail readers.

MIME has two ways of encoding data: Quoted-Printable and Base64. Quoted-Printable leaves the ASCII text alone and only changes 8-bit and other special characters to "=" followed by two hex digits. For example, "=09" is a tab. It has the advantage that it is mostly readable and that it allows for end of line conversions between unlike systems. Base64 encoding is similar to uuencode or btoa and just encodes a raw bit stream. This encoding is designed to get text and binary files through even the most improperly implemented and configured gateways intact, even those that distort uuencoded data.
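
The difference between the two encodings is easy to see with the quopri and base64 modules from the Python standard library. This is a small demonstration using sample data chosen for illustration; it is unrelated to Pine's own encoder.

```python
import base64
import quopri

# Quoted-Printable: "=09" decodes to a tab, and ASCII text passes through as is.
decoded_tab = quopri.decodestring(b"col1=09col2")

# An 8-bit character (0xE9) becomes "=" followed by its two hex digits.
original = b"caf\xe9 au lait"
qp_encoded = quopri.encodestring(original)

# Base64 encodes the raw bit stream; nothing of the input remains readable.
b64_encoded = base64.b64encode(b"\x00\xff")
b64_decoded = base64.b64decode(b64_encoded)
```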

All attachments are encoded using Base64 encoding. This is so that the attachment will arrive at the other end looking exactly like it did when it was sent. Since Base64 is completely unreadable except by MIME-capable mailers or programs, there is an obvious tradeoff being made here. We chose to ensure absolutely reliable transport of attachments at the cost of requiring a MIME-capable mailer to read them. If the user doesn't want absolute integrity he or she may always include text (with the ^R command) in the body of a message instead of attaching it. With this policy, the only time quoted-printable encoding is used is when the main body of a message includes special foreign language characters.

When an attachment is to be sent, Pine sniffs through it to try to set the right label (content-type and subtype). An attachment that has any line longer than 500 characters, or in which more than 10% of the characters are 8-bit, is considered binary data. Pine will recognize (and correctly label) a few special types including GIF, JPEG, PostScript, and some audio formats. Another method which can be more robust and flexible for determining the content-type and subtype is to base it on the file extension. This method uses a MIME.Types File.

If it is not binary data (that is, it has only a small proportion of 8-bit characters in it), the attachment is considered 8-bit text. 8-bit text attachments are labeled "text/plain" with charset set to the value of the user's character-set variable. If an attachment is ASCII (no 8-bit characters) and contains no ESCAPE, ^N, or ^O characters (the characters used by some international character sets), then it is considered plain ASCII text. Such attachments are given the MIME label "text/plain; charset=US-ASCII", regardless of the setting of the user's character-set variable.

All other attachments are unrecognized and therefore given the generic MIME label "application/octet-stream".
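
The labeling rules above can be summarized in a short Python sketch. It is an approximation written from this description, not a copy of Pine's logic. It skips the recognition of special types (GIF, JPEG, PostScript, audio) and the MIME.Types lookup, and it assumes that ASCII data containing ESCAPE, ^N or ^O falls under "all other attachments". The function name and the default character set are hypothetical.

```python
SHIFT_CHARS = b"\x1b\x0e\x0f"  # ESCAPE, ^N, ^O


def classify_attachment(data, user_charset="ISO-8859-1"):
    """Return a MIME label for data following the rules described above."""
    eight_bit = sum(1 for byte in data if byte >= 0x80)
    longest_line = max(len(line) for line in data.split(b"\n"))

    # Binary: any line over 500 characters, or more than 10% 8-bit characters.
    if longest_line > 500 or eight_bit > 0.10 * len(data):
        return "application/octet-stream"
    # 8-bit text: labeled with the user's character-set value.
    if eight_bit:
        return f"text/plain; charset={user_charset}"
    # Plain ASCII text: no 8-bit characters and no ESCAPE, ^N or ^O.
    if not any(char in data for char in SHIFT_CHARS):
        return "text/plain; charset=US-ASCII"
    # Assumption: everything else gets the generic label.
    return "application/octet-stream"


ascii_label = classify_attachment(b"hello, world\n")
latin1_label = classify_attachment(b"x" * 50 + b"\xe9\n")
binary_label = classify_attachment(bytes(range(256)))
```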

New Mail Notification

Pine checks for new mail in the INBOX and in the currently open folder every two and a half minutes by default. It used to be 30 seconds instead of 150 seconds, but we increased it in order to reduce the load on large systems with lots of Pine users. The default can be changed at compile time in the pine/os.h file, and the value in use can be changed with the variable mail-check-interval. A new mail check can be forced by redrawing the screen with a ^L.

When there is new mail, the message(s) will appear in the index, the screen will beep, and a notice showing the sender and subject will be displayed. If there has been more than one new message since you last issued a command to Pine, the notice will show the count of new messages and the sender of the most recent one.

Questions have arisen about the interaction between Pine and external mail notification routines (biff, csh, login). Firstly and unfortunately, we have found no PC based program that will check for email on an IMAP server when PC-Pine is not running. If you find one, please tell us.

The UNIX case is more complicated. Pine sets the modification and access time on a file every time it performs a write operation (status change or expunge). You need to see which of these your email notification program is looking at to know how it will behave with Pine.

NFS

It is possible to access mail folders on NFS mounted volumes with Pine, but there are some drawbacks to doing this, especially in the case of incoming-message folders that may be concurrently updated by Pine and the system's mail delivery agent. One concern is that Pine's user-contention locks don't work because /tmp is usually not shared, and even if it was, flock() doesn't work across NFS.

The implementation of the standard UNIX ".lock" file locking has been modified to work with NFS as follows. Standard hitching post locking is used so first a uniquely named file is created, usually something like xxxx.host.time.pid. Then a link to it is created named xxxx.lock where the folder being locked is xxxx. This file constitutes the lock. This is a standard UNIX locking scheme. After the link returns, a stat(2) is done on the file. If the file has two links, it is concluded that the lock succeeded and it is safe to proceed.
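
The hitching post scheme can be sketched in Python as follows. This is a minimal illustration of the steps just described, not the c-client implementation. The function names are hypothetical, and removing the uniquely named file once the link count has been checked is an assumption, since these notes do not say when it is removed.

```python
import os
import socket
import time


def acquire_lock(folder):
    """Try to create folder.lock; return True if the lock was obtained."""
    # Step 1: create a uniquely named file, e.g. xxxx.host.time.pid.
    unique = f"{folder}.{socket.gethostname()}.{int(time.time())}.{os.getpid()}"
    lock = folder + ".lock"
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
    os.close(fd)
    try:
        # Step 2: make a link to it named xxxx.lock.
        try:
            os.link(unique, lock)
        except OSError:
            pass  # The return value is not trusted; the link count decides.
        # Step 3: stat the file. Two links means the lock succeeded.
        return os.stat(unique).st_nlink == 2
    finally:
        # Assumption: the unique name is no longer needed after the check.
        os.unlink(unique)


def release_lock(folder):
    """Remove folder.lock."""
    os.unlink(folder + ".lock")
```

The point of checking the link count rather than the result of the link call is that the count is read back from the server, so it reflects what actually happened there.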

In order to minimize the risks of locking failures via NFS, we strongly recommend using IMAP rather than NFS to access remote incoming message folders, e.g. your INBOX. However, it is generally safe to access personal saved-message folders via NFS since it is unlikely that more than one process will be updating those folders at any given time. Still, some problems may occur when two Pine sessions try to access the same mail folder from different hosts without using IMAP. Imagine the scenario: Pine-A performs a write that changes the folder. Pine-B then attempts to perform a write on the same folder. Pine-B will get upset that the file has been changed from underneath it and abort operations on the folder. Pine-B will continue to display mail from the folder that it has in its internal cache, but it will not read or write any further data. The only thing that will be lost out of the Pine-B session when this happens is the last few status changes.

If other mail readers besides Pine are involved, all bets are off. Typically, mailers don't take any precautions against a user opening a mailbox more than once and no special precautions are taken to prevent NFS problems.

Printers and Printing

UNIX Pine can print to the standard UNIX line printers or to generic printers attached to ANSI terminals using the escape sequences to turn the printer on and off. The user has a choice of three printers in the configuration.

The first setting, attached-to-ansi, makes use of escape sequences on ANSI/VT100 terminals. It uses "<ESC>[5i" to begin directing all output sent to the terminal to the printer and then "<ESC>[4i" to return to normal. Pine will send these escape sequences if the printer is set to attached-to-ansi. This works with most ANSI/VT100 emulators on Macs and PCs such as kermit, NCSA telnet, VersaTerm Pro, and WinQVT. Various terminal emulators implement the print feature differently. For example, NCSA telnet requires "capfile = PRN" in the config.tel file. Attached-to-ansi printing doesn't work at all with the telnet provided with PC-NFS. There is also a closely related method called attached-to-ansi-no-formfeed which is the same except for the lack of formfeed character at the end of the print job.
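
To make the two escape sequences concrete, here is a small Python sketch that wraps a print job the way the attached-to-ansi settings are described above. The function name is hypothetical, and placing the formfeed before the printer-off sequence is an assumption.

```python
ESC = "\x1b"
PRINTER_ON = ESC + "[5i"    # start sending terminal output to the printer
PRINTER_OFF = ESC + "[4i"   # return to normal screen output
FORM_FEED = "\f"


def wrap_for_attached_printer(text, formfeed=True):
    """Return text bracketed by the printer on and off sequences.

    formfeed=False corresponds to attached-to-ansi-no-formfeed.
    """
    tail = FORM_FEED if formfeed else ""
    return PRINTER_ON + text + tail + PRINTER_OFF


job = wrap_for_attached_printer("Subject: hello\r\n")
job_no_formfeed = wrap_for_attached_printer("Subject: hello\r\n", formfeed=False)
```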

The second selection is the standard UNIX print command. The default is lpr, but it can be changed on a system basis to anything so desired in /usr/local/lib/pine.conf.

The third selection is the user's personal choice for a UNIX print command. The text to be printed is piped into the command. Enscript or lpr with options are popular choices. The actual command is retained even if one of the other print selections is used for a while.

Both the second and third selections are actually lists of possible commands rather than single commands.

If you have a PostScript printer attached to a PC or Macintosh, then you will need to use a utility called ansiprt to get printouts on your printer. Ansiprt source code and details can be found in the ./contrib directory of the Pine distribution.

The three printer choices are for UNIX Pine only. PC-Pine can only print to the locally attached printer. All printing on PC-Pine is done via ROM BIOS Print Services (Int 17h). After verifying the existence of a local printer via the BIOS Equipment-List Service (Int 11h), it simply sends the message text, character by character, to the first printer found using ASCII CR and LF at the end of lines and followed by an ASCII FF. Note, some system adjustments using the PC's "MODE" command may be required if the printer is not on the first parallel port. PC-Pine cannot generate PostScript, so printing to exclusively PostScript printers does not work.

PC-Pine for Winsock uses the MS-Windows printer interface. A Pine print command will bring up a standard MS-Windows printer dialog box.

Save and Export

Pine users get two options for moving messages in Pine: Save and Export. Save is used when the message should remain "in the Pine realm." Saved messages include the complete header (including header lines normally hidden by Pine), are placed in a Pine folder collection and accumulate in a standard folder format which Pine can read. In contrast, the Export command is used to write the contents of a message to a file for use outside of Pine. Messages which have been exported are placed in the user's home directory (unless the feature use-current-dir is turned on), not in a Pine folder collection. Unless FullHeaderMode is toggled on, all delivery-oriented headers are stripped from the message. Even with Export, Pine retains message separators so that multiple messages can accumulate in a single file and subsequently be accessed as a folder. On UNIX systems, the Export command pays attention to the standard umask for the setting of the file permissions.

Sent Mail

Pine's default behavior is to keep a copy of each outgoing message in a special "sent mail" folder. This folder is also called the fcc for "file carbon copy". The existence, location and name of the sent mail folder are all configurable. Sent mail archiving can be turned off by setting the configuration variable default-fcc="". The sent mail folder is assumed to be in the default collection for Saves, which is the first collection named in folder-collections. The name of the folder can be chosen by entering a name in default-fcc. With PC-Pine, this can be a bit complicated. If the default collection for Saves is local (DOS), then the default-fcc needs to be SENTMAIL, which is syntax for a DOS file. However, if the default collection for Saves is remote, then the default-fcc needs to be sent-mail to match the UNIX syntax.

The configuration variable fcc-name-rule also plays a role in selecting the folder to save sent mail in.

A danger here is that the sent mail could grow without bound. For this reason, we thought it useful to encourage the users to periodically prune their sent mail folder. The first time Pine is used each month it will offer to archive all messages sent from the month before. Pine also offers to delete all the sent mail archive folders which are more than 1 month old. If the user or system has disabled sent mail archiving (by setting the configuration variable default-fcc="") there will be no pruning question.

Spell Checker

Spell checking is available for UNIX Pine only. We could not find an appropriate PC based spell checker to hook into PC-Pine. Even UNIX Pine depends on the system for its spell checking and dictionary. Pico, the text editor, uses the same spell checking scheme as Pine.

Lines beginning with ">" (usually messages included in replies) are not checked. The message text to be checked is on the standard input and the incorrect words are expected on the standard output.
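
The interface between Pine and the spelling checker can be illustrated with the Python sketch below. It is not Pine's code: the function names are hypothetical, and the speller command is a parameter whose default of spell is used only as an example.

```python
import shlex
import subprocess


def text_to_check(message):
    """Drop lines beginning with ">" so that quoted text is not checked."""
    lines = message.splitlines(keepends=True)
    return "".join(line for line in lines if not line.startswith(">"))


def misspelled_words(message, speller="spell"):
    """Feed the text to the speller on stdin and read bad words from stdout."""
    result = subprocess.run(
        shlex.split(speller),
        input=text_to_check(message),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.split()


checked = text_to_check("> Are you comming?\nYes, I am coming.\n")
```

Any program that reads text on its standard input and writes the words it rejects to its standard output fits this interface.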

The default spell checker is UNIX spell. You can replace this by setting the speller configuration variable. Pine also respects the environment variable SPELL. The spelling checker reads its words from a standard dictionary on the system. Below is a description, contributed by Bob Hurt, of how you can create your own personal dictionary with additional "correct" words.

Step 1:
Make a file with all the words you want to include in your new dictionary. I did mine with one word per line in alphabetical order. Caps don't matter at all, as far as I know.
Step 2:
At the UNIX prompt, type "cat [word file] | spellin /usr/dict/hlista > [new dict name]" where [word file] is the file you just created and [new dict name] is the name of the new dictionary that Pine will look at instead of the standard /usr/dict/hlista. I named my word file .bobwords and my dictionary .bobspell so I don't have to see them when I do a ls command (ls doesn't list "dot" files). I also put the above command into my .alias file as the command makedict so I can add a word to my word file and easily recreate my dictionary. NOTE: the new dictionary is in something called a "hashed" format, and can't be read normally.
Step 3:
Check your new dictionary. At the UNIX prompt, type "cat [word file] | spellout [new dict name]" If you did everything correctly, it should just give you another prompt. If it lists any of the words in your file, something is wrong. I can try to help if all else fails.
Step 4:
Now you have to tell UNIX to use your dictionary instead of the standard one by setting the environment variable SPELL to access your dictionary. Go into your .login or .cshrc file in your home directory (it doesn't seem to make a difference which one you use) and add the line
setenv SPELL "spell -d [new dict name]"

I also created an alias for SPELL in my .alias file so I can use the UNIX spell command to spell-check a file outside of Pine. (The .alias line is: alias spell 'spell -d [new dict name]')
Step 5:
Now you need to log off and log back on to let UNIX look at your .login (or .cshrc) file.

Here is an alternative method suggested by Zachary Leber:

Create a list (e.g. .zachwords) with the upper case followed by lower case words, sorted alphabetically.

Add this line to .cshrc:
setenv SPELL 'spell +/home/ie/rsa/.zachwords'

The limitation here is that the path must be absolute (e.g. +~/.zachwords doesn't work).

My man pages for spell show this + flag to be an easy way to do the exception list. This way you don't have to bother with hash lists or rehashing, and it seems to work across several platforms.

Terminal Emulation and Key Mapping

Pine has been designed to require as little as possible from the terminal. At the minimum, Pine requires cursor positioning, clear to end of line, and inverse video. Unfortunately, there are terminals that are missing some of these such as a vt52. Pine makes no assumptions as to whether the terminal wraps or doesn't wrap. If the terminal has other capabilities it may use some of them. Pine won't run well on older terminals that require a space on the screen to change video attributes, such as the Televideo 925. One can get around this on some terminals by using "protected field" mode. The terminal can be made to go into protected mode for reverse video, and then reverse video is assigned to protected mode.

Pine handles screens of almost any size and resizing on the fly. It catches SIGWINCH and does the appropriate thing. A screen one line high will display only the new mail notification. Screens that are less than ten columns wide don't format very nicely or work well, but will function fine again once resized to something larger. Pine sets an internal maximum screen size (currently 170x200) and decides to use either termcap or terminfo when it is compiled.

On the input side of things, Pine uses all the standard keys, most of the control keys and (in function-key mode) the function keys. Pine avoids certain control keys, specifically ^S, ^Q, ^H, and ^\ because they have other meanings outside of Pine (they control data flow, etc.). ^H is treated the same as the delete key, so the backspace and delete keys always work regardless of any configuration. There is a feature compose-maps-delete-key-to-ctrl-d which makes the delete key behave like ^D rather than ^H (deletes current character instead of previous character).

Sometimes a communications program or communications server in between you and the other end will eat certain control characters. There is a work-around when you need it. If you type two escape characters followed by a character, that will be interpreted as the character with the control key depressed. For example, ESC ESC T is equivalent to ^T.
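
The arithmetic behind this work-around is simple: a control character is the letter's ASCII code with all but the low five bits cleared. The Python sketch below shows the idea; the function names are hypothetical and this is not Pine's input code.

```python
ESC = "\x1b"


def control_char(letter):
    """Return the control character for a letter, e.g. T -> ^T (0x14)."""
    return chr(ord(letter.upper()) & 0x1F)


def decode_double_escape(keys):
    """Turn ESC ESC <char> into the matching control character."""
    if len(keys) == 3 and keys[:2] == ESC + ESC:
        return control_char(keys[2])
    return keys  # anything else passes through unchanged


ctrl_t = decode_double_escape(ESC + ESC + "T")
```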

When a function key is pressed and Pine is in regular (non-function key) mode, Pine traps escape sequences for a number of common function keys so users don't get an error message or have an unexpected command executed for each character in the function key's escape sequence. Pine expects the following escape sequences from terminals defined as VT100:

ANSI/VT100
F1: <ESC>OP
F2: <ESC>OQ
F3: <ESC>OR
F4: <ESC>OS
F5: <ESC>Op
F6: <ESC>Oq
F7: <ESC>Or
F8: <ESC>Os
F9: <ESC>Ot
F10: <ESC>Ou
F11: <ESC>Ov

Arrow keys are a special case. Pine has the escape sequences for a number of conventions for arrow keys hard coded and does not use termcap to discover them. This is because termcap is sometimes incorrect, and because many users have PCs running terminal emulators that don't conform exactly to what they claim to emulate. In some versions of Pine before 4.00 there was a compile-time macro called TERMCAP_WINS which could be set to cause the termcap or terminfo definitions to be used instead of the built-in definitions. Beginning with 4.00 there is a hidden runtime feature which can be turned on to accomplish the same thing. The feature is called termdef-takes-precedence and it can be set in any of the Pine configuration files. Some arrow keys on old terminals send single control characters like ^K (one even sends ^\). These arrow keys will not work with Pine. The most popular escape sequences for arrow keys are:

Up: <ESC>[A <ESC>?x <ESC>A <ESC>OA
Down: <ESC>[B <ESC>?r <ESC>B <ESC>OB
Right: <ESC>[C <ESC>?v <ESC>C <ESC>OC
Left: <ESC>[D <ESC>?t <ESC>D <ESC>OD
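
A hard-coded table of this kind might look like the following Python sketch. The sequences are the ones listed above; the table layout and the function name are hypothetical and are not taken from the Pine source.

```python
ESC = "\x1b"

# Each arrow key is recognized under any of four conventions.
ARROW_SEQUENCES = {
    "up": (ESC + "[A", ESC + "?x", ESC + "A", ESC + "OA"),
    "down": (ESC + "[B", ESC + "?r", ESC + "B", ESC + "OB"),
    "right": (ESC + "[C", ESC + "?v", ESC + "C", ESC + "OC"),
    "left": (ESC + "[D", ESC + "?t", ESC + "D", ESC + "OD"),
}

# Invert the table so that an incoming sequence can be looked up directly.
SEQUENCE_TO_ARROW = {
    sequence: name
    for name, sequences in ARROW_SEQUENCES.items()
    for sequence in sequences
}


def decode_arrow(sequence):
    """Return "up", "down", "right" or "left", or None if not an arrow key."""
    return SEQUENCE_TO_ARROW.get(sequence)


key = decode_arrow(ESC + "[A")
```

A single control character such as ^K is not in the table, which matches the remark that such arrow keys will not work.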

It is possible to configure an NCD X-terminal so that some of the special keys operate. Brad Greer contributes these instructions:

1. In your .Xdefaults file, include the following "translations", using lower hex values:
Pine*VT100.Translations: #override \n\
<Key>Delete: string(0x04) \n\
<Key>End: string(0x05) \n\
<Key>Escape: string(0x03) \n\
<Key>Home: string(0x01) \n\
<Key>Next: string(0x16) \n\
<Key>Prior: string(0x19) \n\
<Key>KP_Enter: string(0x18) \n\
2. Start up Pine from an xterm, and specify a "resource name". This resource name will allow the user to specify resources for Pine (that deviate from the defaults). For example, xterm -name Pine -e pine & (the resource name Pine corresponds to the translations just added in the .Xdefaults file).