
Email Out On The Internet

PageStatus: FirstDraft

Introduction

This is an attempt to provide a very brief overview of how email works "out on the Internet". Some of it was developed by reading or skimming most of RFC 821: SIMPLE MAIL TRANSFER PROTOCOL, by Jonathan B. Postel, 1982. I should also read the RFCs for POP3, IMAP, and others.

Some points are made especially to help a user familiar with the typical Windows approach to email (an email client on a workstation) understand the significantly different "traditional" Linux approach to email (an email server on a workstation).

This is intended to be simple: it does not cover all the details, but it should not be significantly inaccurate. It is the first in a series of pages eventually intended to lead a new Linux user through setup and configuration of an email server (or client) on his workstation.


UPDATE: Since writing this, I've skimmed and read portions of RFC 2821. It is much more helpful than RFC 821 and, among other things:
  • confirms (and describes to some extent) the role of DNS in email
  • points out that some things in RFC 821 are no longer in common use (deprecated?), but still required to be supported, and described only in RFC 821
  • confirms some of my other thoughts (and writings) which might have been somewhat speculative at the time I wrote them

Therefore, this page needs some refactoring, which I will start soon (maybe a first pass now).

In addition, I found a good source for the RFCs, with a nice listing by subject -- http://www.faqs.org/rfcs/np.html#SMTP. I'm going to start a list of RFCs related to email that I've read, skimmed, glanced at, or feel I should skim, glance at, or read.

The RFCs found at www.faqs.org are nice in that they are in HTML and readable on-line. They would be a little nicer if each heading in them were an anchor so I (or anyone else) could link to a particular section of the document. I'll try to find a way to make that suggestion to the people who maintain the site.


ToDos:

  1. I think I've fairly well covered (or at least understand) the route of incoming email from the Internet to an intermittently connected local email server. Other pages will deal with distributing mail within a server (postfix and/or procmail), and making it available to other machines on a (local) LAN for retrieval via POP3 or viewing via IMAP. What I may want to add to this page is some more about how outgoing email gets sent to, for example, the user's ISP from an intermittently connected local mail server. (I think this is actually pretty simple -- a connection must exist or be created to the ISP, and at that time the local MTA "relays" the email to the ISP's MTA as any other email relay transaction would occur on the Internet.)
  2. (UPDATE: RFC 2821 confirms much of this.) Mention DNS -- my assumption is that DNS works for email almost exactly like it does for things like http and ftp, except perhaps it uses MX records instead of A records (I say, without knowing much about either). Perhaps the first clause is all I want to say on this page about DNS and create a link to other pages with more explanation, like DNS, or DNS and email.
  3. A general rewrite now seems appropriate. Do I want to skim and read RFC 2821 one more time first, and maybe an RFC on IMAP? Should I use the term MTA more often in preference to SMTP server?

Email Addresses

There are several types of email addresses that you should be aware of:

  • normal (fully qualified) addresses, like: R. H. Kramer <rhkramer@fast.net> and rhkramer@fastPLEASENOSPAM.net -- mail addressed like this is delivered wherever in the world the address indicates
  • local addresses, like: root, dad, rhkramer -- mail addressed like this is delivered on the local mailserver machine (only) (such mail can be retrieved by other machines on the (local) LAN running mail clients if a POP3 or IMAP server is running on this machine -- I will eventually want this capability, so it will be covered somewhere, probably not on this page)
  • envelope addresses -- to be discussed more later -- sometimes the mail, for various reasons, is sent to an envelope address instead of the To:, CC:, or BCC: address. I don't fully understand the reasons or circumstances at this point. I think it has to do with things like mail lists and the concept presented in the next section of only one copy of an email being forwarded until the email reaches a machine where the mail must be forwarded (or distributed) in more than one direction.

(UPDATE: RFC 2821 mentions envelope addresses and confirms some of the following. My current understanding is that envelope addresses are the addresses used by the SMTP protocol to transmit the email, and are not part of the email headers. (I was going to say they were in the forward and reverse routes transmitted by the SMTP commands (forgot their names), but I really have to do more reading to understand it better.) I skimmed RFC 821 too fast, and possibly got the idea that something like this happens from other resources. I don't recall seeing the term "envelope address" in RFC 821, but I will search for it. Now that I think about it, the idea of making email copies at the last possible moment, so to speak, provides a perfect rationale for an envelope address -- until the mail reaches a machine where the mail must be split, multiple (logical) copies are in an "envelope" addressed more generally (either to a machine or to a list of users on a machine -- not sure which right now).)
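To make the difference between header addresses and envelope addresses concrete, here is a minimal Python sketch. The function name split_envelope and the example.* addresses are made up for illustration, and taking the envelope sender from the From: header is a simplifying assumption (a real MTA or mailing list may use a different address).

```python
from email.message import EmailMessage
from email.utils import getaddresses


def split_envelope(msg):
    """Return (envelope_sender, envelope_recipients, message_to_transmit).

    Hypothetical helper: the envelope is derived from the headers here,
    but it travels separately from them in the SMTP conversation.
    """
    # Assumption: use the From: header address as the envelope sender.
    sender = getaddresses(msg.get_all("From", []))[0][1]

    # Every To:, Cc:, and Bcc: address becomes an envelope recipient.
    header_values = (
        msg.get_all("To", []) + msg.get_all("Cc", []) + msg.get_all("Bcc", [])
    )
    recipients = [addr for _name, addr in getaddresses(header_values)]

    # Bcc: recipients stay in the envelope but are removed from the headers,
    # which is why the other recipients never see them.
    del msg["Bcc"]
    return sender, recipients, msg


message = EmailMessage()
message["From"] = "Sender <sender@example.net>"
message["To"] = "dad@example.org"
message["Cc"] = "Mom <mom@example.org>"
message["Bcc"] = "secret@example.com"
message.set_content("Hello")

env_sender, env_recipients, to_send = split_envelope(message)
```

After the call, env_recipients still contains the Bcc: address even though the message that will be transmitted no longer has a Bcc: header.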

SMTP, POP3, and IMAP

(Notice that this section introduces the ideas of mail relaying and bouncing.)

SMTP stands for the Simple Mail Transfer Protocol and is the primary protocol used to transport email over the Internet. (POP3 and IMAP are two other protocols that come into play for exchanging mail with an email client at the final destination.)

The SMTP protocol is driven by the email originator. When a person wants to send an email, the SMTP program on their machine contacts an SMTP server on a machine connected to their machine. (This is accurate but a little misleading -- may want a separate sentence for an email client and a workstation with an email MTA, or do I cover that adequately below.)
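As a rough illustration of "driven by the originator", the sketch below lists the commands a sending program issues, in order, during one basic SMTP transaction as described in RFC 821. The function name and the host and address values are hypothetical, and the server's replies are left out.

```python
def smtp_client_commands(client_host, sender, recipients):
    """Return the command lines the sending side issues, in order.

    Illustrative only: a real client waits for a reply code from the
    server after each command before sending the next one.
    """
    commands = [
        f"HELO {client_host}",      # the sender identifies itself
        f"MAIL FROM:<{sender}>",    # envelope sender (reverse path)
    ]
    # One RCPT TO per envelope recipient (forward path).
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    # After DATA the client sends the headers and body,
    # terminated by a line containing only ".".
    commands += ["DATA", "QUIT"]
    return commands


for line in smtp_client_commands(
    "ws.example.net", "rhkramer@example.net", ["dad@example.org"]
):
    print(line)
```

Every step is started by the sending side; the receiving server only answers.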

In the typical case, email is destined for some machine not directly connected to the email originator, and email must be relayed from machine to machine until it gets to the destination user, or a machine that accepts email for the destination user -- more later. (DNS comes into play to deal with the relaying, see RFC 2821 (I'm not finished reading it).)
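A small sketch of how DNS guides the relaying, assuming the sending MTA has already looked up the MX records for the destination domain and holds them as (preference, host) pairs. RFC 2821 says the host with the lowest preference number is tried first. The host names are invented, and this sketch does no DNS lookup itself.

```python
def order_mail_exchangers(mx_records):
    """Return MX host names in the order a sender would try them.

    mx_records is a list of (preference, host) tuples. Lower preference
    numbers are tried first. Simplification: hosts with equal preference
    are sorted by name here, whereas a real MTA would randomize them.
    """
    return [host for _preference, host in sorted(mx_records)]


# Hypothetical MX records for one destination domain.
records = [(20, "backup.example.net"), (10, "mail.example.net")]
print(order_mail_exchangers(records))
```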

The situation in which a machine accepts email for another machine arises most often in connection with a typical residential user connected to an ISP by an "intermittent" (not always connected) dial-up connection. In that case the residential user's ISP's mail server is configured to accept mail for the residential user. When mail arrives for that user, the ISP's mail server accepts it and stores it. When the user connects to his ISP, he can retrieve the mail with his email client using the POP3 protocol, or view and manage his mail while it stays on the ISP's mail server with his email client using the IMAP protocol.

As email is relayed over the Internet, each transaction is driven by a program on the sending machine running the SMTP protocol. The SMTP server on the receiving machine does one of the following things with each email message:

  • Accepts it, because the addressee is a user on the receiving machine.
  • Accepts it because this machine is set up to accept mail for the addressee.
  • Accepts it to forward it via other machines to a machine where the addressee is a user or to a machine which accepts mail for the addressee.
  • Refuses to accept the message, because the receiving machine does not have any knowledge of the addressee or the addressee's domain. (This does not usually happen when you send mail to your ISP, except perhaps if you accidentally attempt to send mail to a private IP address on your home LAN or a non-registered domain name.)
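The four outcomes above can be sketched as a small decision function. This is only a model of the list, not how any particular MTA is implemented; the names classify and Decision, the configuration sets, and the example.* domains are all assumptions made for illustration.

```python
from enum import Enum


class Decision(Enum):
    DELIVER_LOCALLY = "addressee is a user on this machine"
    HOLD_FOR_ADDRESSEE = "this machine accepts mail for the addressee"
    RELAY = "forward toward a machine that handles the addressee"
    REFUSE = "no knowledge of the addressee or the domain"


def classify(address, local_domain, local_users, accepted_domains, relay_domains):
    """Decide what a receiving server does with one addressee."""
    if "@" not in address:
        # A local address such as "root" has no domain part.
        user, domain = address, local_domain
    else:
        user, domain = address.rsplit("@", 1)
    domain = domain.lower()

    if domain == local_domain and user in local_users:
        return Decision.DELIVER_LOCALLY
    if domain in accepted_domains:
        return Decision.HOLD_FOR_ADDRESSEE
    if domain in relay_domains:
        return Decision.RELAY
    # Unknown domain, or an unknown user in the local domain.
    return Decision.REFUSE


print(classify("root", "home.example", {"root", "dad"},
               {"customer.example"}, {"example.org"}))
```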

In general, the magic that happens out on the Internet to get the message to its final destination is beyond the scope of this document. Maybe someday I'll write a document to address some of that in more detail. I will mention the following:

  • When any machine accepts email for delivery, the relaying process starts, and email is relayed from machine to machine until it reaches a machine on which the addressee is a user, a machine that is set up to accept email for the addressee, or a machine which knows that the addressee or his domain does not exist.
  • Internet mail handling is designed to make copies of messages as required for multiple addressees in an efficient way. When any SMTP server receives an email, it will make additional copies if necessary to send the mail to more than one recipient; however, if more than one recipient is on the next server in the relay, it will only send one copy -- the server that recognizes the need to send the message in two different directions will make the necessary copies. (This is my understanding, but I'm not absolutely sure this is what I read in RFC 821 -- RFC 2821 may be a better resource.)
  • Until the final destination, mail servers accept mail based on the domain name of an email address rather than the user plus domain name. What I'm trying to say is that, if you send email to a non-existent user at a valid domain address, the email will wend its way over the Internet until it gets to the machine just before the machine serving that domain. When that "second last" server attempts to send it to the server handling the domain, the server handling that domain will respond with a message saying that user is unknown. The second last server will now construct an undelivered message email (a bounce message) and send it back to the machine which originated the message.
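The "one copy per direction" idea in the second point can be sketched by grouping the envelope recipients by domain, on the simplifying assumption that each domain corresponds to one next server. The function name and addresses are hypothetical.

```python
from collections import defaultdict


def group_by_domain(recipients):
    """Group envelope recipients so one copy is sent per destination domain.

    Assumption: every recipient address contains an "@", and all
    recipients in the same domain are reached through the same next server.
    """
    groups = defaultdict(list)
    for address in recipients:
        domain = address.rsplit("@", 1)[-1].lower()
        groups[domain].append(address)
    return dict(groups)


copies = group_by_domain(["a@example.org", "b@example.net", "c@example.org"])
# Two copies leave this server: one for example.org (two recipients)
# and one for example.net (one recipient).
print(len(copies))
```

Each group corresponds to one SMTP transaction carrying a single copy of the message with several RCPT TO lines.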

Some Notes including Comments for Windows Users

The SMTP program on a person's machine is not named SMTP. Many email clients, especially GUI email clients, can "speak" the SMTP protocol to start a message on its way. Some GUI clients also use the IMAP protocol, which lets them read and organize mail that stays on, for example, the ISP's machine. IMAP itself does not send mail; outgoing messages are still started on their way using SMTP.

(Have I made it clear that Linux (command line) email clients like mutt, pine, elm do not use POP, SMTP, or IMAP, but only read and write mail to local mail "spools" in a user's home directory. And, am I sure about that?)

Similarly, the SMTP server is not named SMTP -- common programs are sendmail, postfix, and qmail. Such a program is also known as an MTA (Mail Transfer Agent).

There is a significant difference between the typical Windows and traditional Linux ways of handling email. On Linux, the typical machine incorporates an email server, like sendmail, postfix, or qmail. In Windows, the typical user's machine incorporates only an email client which can speak SMTP to send mail and POP3 or IMAP to receive it. Because of this, Windows users coming to the Linux world are often surprised by the complexity of handling email in Linux, and the terminology and concepts involved. They don't even know about this difference in approach, so simply mentioning it to them with a short overview can help them a lot. And, maybe, their email should be configured more like Windows (a GUI client using SMTP and POP3 or IMAP instead of a full-blown or partial mail server), unless they really need this additional functionality.

Another useful point for a Windows user -- when a machine is not connected to the Internet full time, the normal email servers (postfix, sendmail, qmail) cannot receive incoming mail directly, so a helper program like fetchmail or getmail is needed. These programs simply go out to the ISP and fetch mail destined for the server that is not connected continuously. Fetchmail, for example, presents it to that server on port 25, which is the normal "entrance" to an SMTP server.
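The fetch-and-hand-over idea can be sketched in a few lines of Python. This is not how fetchmail or getmail is implemented; it only shows the shape of the job. The function name, the default envelope sender, and the host names in the comments are assumptions. The pop and smtp arguments are expected to behave like connected poplib.POP3 and smtplib.SMTP objects from the Python standard library.

```python
def fetch_and_inject(pop, smtp, local_user, envelope_sender="postmaster@localhost"):
    """Move every message from a POP3 mailbox to a local SMTP server.

    pop  : object with list(), retr(n), and dele(n), like poplib.POP3
    smtp : object with sendmail(from_addr, to_addrs, msg), like smtplib.SMTP
    Returns the number of messages handed over.
    """
    # list() returns (response, [b"1 120", b"2 90", ...], octets).
    count = len(pop.list()[1])
    for number in range(1, count + 1):
        # retr() returns (response, [line, line, ...], octets); lines are bytes.
        _response, lines, _octets = pop.retr(number)
        raw_message = b"\r\n".join(lines)
        # Present the message to the local MTA, as if it had arrived over SMTP.
        smtp.sendmail(envelope_sender, [local_user], raw_message)
        # Mark the message for deletion on the ISP's server.
        pop.dele(number)
    return count


# Typical wiring (not executed here; host and credentials are placeholders):
#   import poplib, smtplib
#   pop = poplib.POP3_SSL("pop.example.net"); pop.user("name"); pop.pass_("secret")
#   smtp = smtplib.SMTP("localhost", 25)
#   fetch_and_inject(pop, smtp, "rhkramer"); pop.quit(); smtp.quit()
```

Passing the two connections in as arguments keeps the sketch free of network calls, so it can be tried with stand-in objects.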

Contributors

  • RandyKramer - 14 Feb 2002
  • <If you edit this page, add your name here, move this to the next line>
Topic revision: r4 - 2002-02-16 - RandyKramer
 