September 2008

Ask The Professor

What happens to lost e-mail?

This month's faculty expert, Samuel Guyer, assistant professor of computer science, responds:

The most common reason that e-mail fails to reach its intended destination is the same as with regular "snail mail": errors in the address render it undeliverable. Like the postal service, e-mail systems typically return such e-mail to the sender in a special message called a "bounce" message. Even with a correct address, however, e-mail can seem to disappear into the digital ether. To understand why this happens, it's useful to look at how e-mail works.

E-mail travels the Internet through a series of hops between server computers. Participating servers, called mail transfer agents, communicate using a scheme called the Simple Mail Transfer Protocol (SMTP), which dates back to 1982. SMTP is called a "push" protocol: at each hop, the server that currently holds a piece of e-mail contacts the next server in the chain to see if it can accept and deliver the e-mail. An important feature (or perhaps deficit) of SMTP is that once e-mail has been handed off, there is no systematic way to check that it reaches its ultimate destination.
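
The hop-by-hop hand-off can be sketched as a toy simulation in Python. This is not real SMTP: the `Server` class, the `push` function and the server names are hypothetical, invented only to show that each server deals with the next one in the chain and nothing further.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """A hypothetical stand-in for one mail server in the chain."""
    name: str
    up: bool = True
    held: list = field(default_factory=list)

    def accept(self, message: str) -> bool:
        """Take responsibility for the message if this server is running."""
        if not self.up:
            return False
        self.held.append(message)
        return True


def push(message: str, chain: list) -> str:
    """Push a message hop by hop and return the name of the last holder.

    Assumption: the first server in the chain is the sender's own server
    and already holds the message.
    """
    holder = chain[0]
    holder.held.append(message)
    for next_server in chain[1:]:
        # The current holder contacts only the next server in the chain.
        if not next_server.accept(message):
            break  # the message stays with the current holder
        holder.held.remove(message)  # handed off: the old holder keeps no copy
        holder = next_server
    return holder.name
```

In this sketch, a server that has handed a message off keeps no record of it, which mirrors why the sender has no systematic way to ask where the message ended up.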

Most failures in the e-mail transport process result in an automatically generated bounce message. E-mail can bounce for a number of reasons: the recipient's address is incorrect or defunct; the recipient's inbox is full (servers often impose an e-mail quota); the e-mail is too large (servers also often impose a per-message size limit); or a server somewhere along the way has failed (for example, because of a power outage in the recipient's city).
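
As a rough illustration, the checks behind a bounce might look like the Python sketch below. The function name, the reason strings and the 10-megabyte limit are all assumptions made for the example; real servers use their own limits and report failures with standardized status codes.

```python
from typing import Optional

MAX_MESSAGE_BYTES = 10_000_000  # assumed per-message size limit


def bounce_reason(address: str, size_bytes: int, mailboxes: dict,
                  server_up: bool = True) -> Optional[str]:
    """Return why a message would bounce, or None if it can be delivered.

    `mailboxes` maps each known address to a (bytes_used, quota_bytes) pair.
    """
    if not server_up:
        return "server unavailable"
    if address not in mailboxes:
        return "unknown address"
    if size_bytes > MAX_MESSAGE_BYTES:
        return "message too large"
    used, quota = mailboxes[address]
    if used + size_bytes > quota:
        return "mailbox full"
    return None
```

For example, with a hypothetical mailbox table in which `ann@example.edu` has used 900 of 1,000 bytes, a 500-byte message to her would come back as "mailbox full".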

If the recipient's server is down, the sender's server will hold the e-mail and attempt to resend it over a period of several hours or days. If the server currently holding the message goes down, however, other servers in the chain have no record of the e-mail and cannot generate a bounce message. This scenario is unusual because industrial-scale e-mail servers are built to be very robust, using features such as redundancy, fault tolerance and backup generators.
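
The hold-and-retry behavior can be sketched in a few lines of Python. The retry schedule below is an assumption chosen for illustration (actual schedules vary by server), and `try_send` is a hypothetical callback standing in for a real delivery attempt.

```python
from typing import Callable, Tuple

# Assumed waits, in hours, before each delivery attempt.
RETRY_WAITS_HOURS = (0, 1, 4, 24, 48)


def deliver_with_retries(try_send: Callable[[int], bool],
                         waits=RETRY_WAITS_HOURS) -> Tuple[str, int]:
    """Attempt delivery on a schedule, giving up after the last attempt.

    `try_send` receives the hours elapsed so far and returns True when the
    recipient's server accepts the message.
    """
    elapsed = 0
    for wait in waits:
        elapsed += wait
        if try_send(elapsed):
            return ("delivered", elapsed)
    # Every attempt failed, so the sender's server reports a bounce.
    return ("bounced", elapsed)
```

With this schedule, a recipient server that comes back three hours after the first attempt receives the message at the five-hour mark, on the third try. A server that never comes back produces a bounce after the final attempt.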

Another reason e-mail can disappear is related to "spam," also known as unsolicited bulk e-mail. Spam has become one of the scourges of the Internet: it is estimated that 85 to 90 percent of all e-mail sent is spam. In an effort to combat spam, many servers use sophisticated spam-filtering algorithms that attempt to identify and remove spam before it reaches the recipient.

Users can add another level of filtering using a technique called "whitelisting": they accept e-mail only from senders in their list of contacts. Spam is either deleted or filed without generating a bounce message, because we don't want spam senders (or "spammers") to know it was detected and because we don't want to inundate the e-mail system with spam bounce messages. This scheme backfires when legitimate e-mail is misclassified as spam: neither the sender nor the recipient may be aware of the lost e-mail.
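
A contact-list filter of this kind is simple to sketch in Python. The function and folder names here are hypothetical, and the sketch assumes contact addresses are stored in lowercase. Note that nothing is ever sent back to the sender.

```python
def sort_incoming(sender: str, contacts: set, spam_folder: list) -> str:
    """File a message by sender address; no bounce is generated either way."""
    if sender.lower() in contacts:
        return "inbox"
    # Unknown sender: file the message silently instead of bouncing it.
    spam_folder.append(sender)
    return "filed as spam"
```

A legitimate message from someone not yet in the contact list takes the second path, which is exactly the case in which neither party learns that it went astray.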

In general, our e-mail system is very reliable, despite the enormous volume of messages it carries and the tremendous stress it endures from spam and other nefarious uses. And while several attempts have been made to update it, it still lacks the ability to verify to the sender that the intended recipient received the message.
