28 Simple Mail Transfer Protocol (SMTP)

Prof. Bhushan Trivedi

Introduction

We have looked at three different application layer protocols so far: DNS, HTTP, and FTP. In this module, we will look at the protocol that carries a sender's mail to the receiver, the Simple Mail Transfer Protocol or SMTP. We will look at the structure of this protocol, the format of the mail, the roles of the SMTP client and server, and how the client communicates with the server.

Simple Mail Transfer Protocol (SMTP)

The email system of the Internet is a two-tier system in which the sender and the receiver communicate asynchronously; they are not required to be online at the same time. When a sender sends mail, it is stored at the server's end in a folder assigned to the receiving user, known as his mailbox, and whenever the receiver logs in, he fetches his mail from his mailbox. Thus there are two operations: the sender delivering the mail to the receiver's mailbox and the receiver reading it from his own mailbox. SMTP, the protocol discussed in this module, is used for the first operation.

Figure 30.1 describes the operation of sending mail. The sender interacts with a process known as the user agent. The user agent provides facilities for composing new outgoing mail, reading incoming mail, and other services such as forwarding and searching emails. The actual sending is done by a background process known as the spooler. Outgoing emails are placed in a spooling area, and the SMTP client running on the sender's machine picks them up from there and sends them across to the receiver's mailbox.

Figure 30.2 describes the receiving end of the mail. The SMTP server at the receiver's end, or at the webmail provider's end (for example, if the receiver's address is at Yahoo! or Gmail), accepts the mail and stores it in the receiver's mailbox. The receiver uses his own user agent to get the mail from his mailbox at his leisure, using another protocol and not SMTP.

Computer Networks and ITCP/IP Protocols 2

The role of SMTP is described in Figure 30.3. The SMTP client, running in a process known as a message transfer agent, establishes a TCP connection with a similar message transfer agent at the other end, where the SMTP server is running. The SMTP server usually does not run at the receiver's premises. It runs either at the ISP's premises or, when the receiver uses webmail, at the webmail provider's server. Figure 30.4 depicts a typical case where a sender named Lara sends a mail to a person named Gayle.

SMTP features

Let us list some of the features of SMTP.

1. SMTP works in asynchronous fashion as described above

2. Sender and receiver are not required to be online at the same point in time

3. The spooling process delivers mail when the connection is established. This mail transfer part, known as the mail transfer agent, is separate from the mail interface, known as the user agent.

4. The user agent is the part of the mail system that every sender interacts with. The mail transfer agent is the SMTP client at the sender and the SMTP server at the receiver. They establish a TCP connection and carry the mail to the other end.

5. The mail itself is an important component of the mail system. There is a structure attached to it which we will soon explore.

The mailing process

Let us take an example to understand the mailing process. Suppose Lara@abcict.org is sending a mail to Gayle@OBS.com. Lara first invokes her user agent to compose the mail. The user agent passes the mail to the SMTP client running on the user's desktop or laptop, or sometimes on the ISP's machine. Once this is done, the user agent's job is over and the message transfer agent takes over. The SMTP client at abcict.org asks TCP to connect to the SMTP server running at OBS.com so that it can deliver the mail to the mailbox called Gayle. Once the TCP connection is established, the two parties communicate to transfer this mail (and any other emails pending for that destination). Once abcict.org completes sending its emails, the roles may be reversed and OBS.com may send any emails its users have for abcict.org. Once both parties have done their job, the TCP connection is terminated. When Gayle logs into his mail account, he is notified of the new mail, and he uses another protocol (sometimes POP3, sometimes IMAP) to access his mailbox. Another server running on the machine where the mailboxes are managed, a POP3 server or an IMAP server, provides the mailbox contents to Gayle's user agent.
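The hand-over from user agent to SMTP client described above can be sketched with Python's standard library. This is only an illustration under stated assumptions: the function names build_mail and hand_over are made up for this example, and the host passed to hand_over is a placeholder, so the sketch never contacts a real server.

```python
import smtplib
from email.message import EmailMessage


def build_mail() -> EmailMessage:
    """Play the user agent: compose the mail Lara wants to send."""
    msg = EmailMessage()
    msg["From"] = "Lara@abcict.org"
    msg["To"] = "Gayle@OBS.com"
    msg["Subject"] = "The Second Book"
    msg.set_content("Hello Gayle.\nI will soon send you the second Book.\nregards\nLara\n")
    return msg


def hand_over(msg: EmailMessage, host: str, port: int = 25) -> None:
    """Play the SMTP client: open a TCP connection and transfer the mail.

    `host` is an assumption of this sketch; it stands for whichever
    SMTP server accepts mail on behalf of the recipient.
    """
    with smtplib.SMTP(host, port) as client:  # TCP connection is opened here
        client.send_message(msg)              # MAIL FROM, RCPT TO and DATA happen inside
    # leaving the with-block sends QUIT and closes the connection


mail = build_mail()
# hand_over(mail, "smtp.example.invalid")  # hypothetical host, deliberately not called
```

Once hand_over returns, the user agent's part is finished; reading the mail at the other end is left to POP3 or IMAP, as the text explains.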

If Lara or Gayle uses a webmail account, the process is a little different. Suppose Lara@yahoo.com sends a mail to Gayle@gmail.com. HTTP is used first, to fetch the web page that acts as the mail provider's user agent. For example, if you have a Yahoo! account, when you type mail.yahoo.com, your browser communicates with the Yahoo! server using HTTP. Once that connection is established, the Yahoo! mail user agent is displayed, which provides facilities to write a mail, forward a mail, and so on. Once Lara finishes writing the mail and presses the SEND button, it is handed to the SMTP client running at the Yahoo! server. The Yahoo! SMTP client communicates with the SMTP server located at Gmail, which delivers the mail to Gayle's mailbox there. When Gayle opens his browser and connects to mail.gmail.com, he is presented with his own user agent, which shows a notification that a mail has arrived from Lara. When he opens it, the mail is converted to HTML and delivered to his user agent.

The second phase of mail delivery in both cases, the ISP-based solution and the webmail solution, is depicted in Figures 30.5 and 30.6. Figure 30.5 describes Gayle receiving a mail on his ISP-based account, so he has to use a POP3 client to access that mail. In contrast, when Lara uses her webmail account, the communication between browser and server delivers the mail in HTML format.

Alias expansion

Many times the mail address belongs to a group, so a mail sent to that address is delivered to all members of the group. Another case arises with software like Microsoft Outlook or Apple Mail, where emails for multiple accounts are combined into one interface. Most mobile mail apps provide a similar feature. In that case, the user can decide which account the next mail is sent from and can see the emails from all of his accounts in one interface. Both cases are known as alias expansion. The outgoing mail process to a group is depicted in Figure 30.7.

You can see that the alias expansion takes place where the mail is constructed. Multiple TCP connections are established, one towards each member of the group, and each member receives an individual copy of the mail.

When a user has multiple mail accounts, another type of alias expansion takes place. Lara is also a director and has the mail address director@abcict.org in addition to her normal mail address Lara@abcict.org. She uses a user agent capable of alias expansion, where the emails of both accounts are copied to the single user agent. Figure 30.8 depicts the case.

Observe the case depicted in Figure 30.9. When a user sends mail to a group from a webmail account, a similar process takes place at the web server. A single copy of the mail travels from the sender's user agent to the web server. The SMTP client at the web server then connects to each receiver and sends a copy of the mail to each one of them.
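Expanding a group address into its members can be sketched in a few lines. The alias table, the group name authors@abcict.org, and the function names below are all hypothetical. Grouping the members by destination domain is a design choice of this sketch (one connection can carry several recipients for the same server); it is not something the figures prescribe.

```python
from collections import defaultdict

# Hypothetical alias table: group address -> addresses of its members.
ALIASES = {
    "authors@abcict.org": ["Gayle@OBS.com", "Ramnaresh@OBS.com", "Lara@abcict.org"],
}


def expand(recipients):
    """Replace each group address by its members, keeping order and dropping repeats."""
    seen, result = set(), []
    for address in recipients:
        for member in ALIASES.get(address, [address]):
            if member not in seen:
                seen.add(member)
                result.append(member)
    return result


def group_by_domain(recipients):
    """Collect the recipients that can share one connection to the same domain."""
    by_domain = defaultdict(list)
    for address in recipients:
        domain = address.rsplit("@", 1)[1].lower()
        by_domain[domain].append(address)
    return dict(by_domain)


expanded = expand(["authors@abcict.org", "Gayle@OBS.com"])
connections = group_by_domain(expanded)
```

Here Gayle appears both as a group member and as a direct recipient, yet receives only one copy because expand drops the repeat.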

Two parts of SMTP standard

The SMTP standard is divided into two parts. The first part describes how the mail is structured and the second part describes how client and server communicate with each other.

There are two RFCs for the purpose. RFC 2822 describes the format of the mail and RFC 2821 describes the communication between an SMTP client and an SMTP server. (The original RFCs were 821 and 822; they have been revised a few times, and RFC 2821 and 2822 were in turn superseded by RFC 5321 and 5322.)

Email Format

The SMTP email format is depicted in Figure 30.11. The mail contains two parts, the envelope and the message. The envelope contains the sender's and receiver's mail addresses. The message contains a header and a body. The header contains the names and addresses of the sender and receiver, and the body contains the actual message being sent. You can also see that the email format is quite similar to conventional mail, which is shown in the leftmost portion of the figure.

Having seen the email format, we turn to the second part of the standard, the communication between the client and the server, in the next section.
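The split between envelope and message can be made concrete with a small sketch. The Envelope class is a made-up helper for this illustration; the message itself is built with Python's standard email package, and the addresses are the ones used in this module.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Envelope:
    """Addresses used by the SMTP client and server to route the mail."""
    mail_from: str
    rcpt_to: list


# The message: a header (names and addresses meant for the reader) plus a body.
message = EmailMessage()
message["From"] = '"LARA Brian" <Lara@abcict.org>'
message["To"] = '"Gayle Chris" <Gayle@OBS.com>'
message["Subject"] = "The Second Book"
message.set_content("Hello Gayle.\nI will soon send you the second Book.\n")

# The envelope travels separately from the header, like the address written
# on the outside of a paper letter.
envelope = Envelope(
    mail_from="Lara@abcict.org",
    rcpt_to=["Gayle@OBS.com", "Ramnaresh@OBS.com"],
)

# In the text form of the message, a blank line separates header and body.
header_text, body_text = message.as_string().split("\n\n", 1)
```

Note that the envelope lists Ramnaresh although the To header does not; the two parts are filled in independently.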

Client-server communication in SMTP

A typical communication between an SMTP client and server is depicted in Figure 30.12. There are two communicating parties: S indicates the server and C indicates the client. Every line from the server contains two things: a three-digit number, which is the reply code meant for the client program, and a text message meant for human administrators to understand what is going on. The client's lines begin with a command keyword such as HELO or MAIL FROM. HTTP uses similar status codes, as we have already seen.

S: 220 OBS.com SMTP server ready

C: HELO abcict.org

S: 250 Hello abcict.org, I am glad to meet you

C: MAIL FROM:<Lara@abcict.org>

S: 250 Ok

C: RCPT TO:<Gayle@OBS.com>

S: 250 Ok

C: RCPT TO:<Ramnaresh@OBS.com>

S: 250 Ok

C: RCPT TO:<Chandarpol@OBS.com>

S: 550 No such user here

C: DATA

S: 354 End data with <CR><LF>.<CR><LF>

C: From: "LARA Brian" <Lara@abcict.org>

C: To: "Gayle Chris" <Gayle@OBS.com>

C: Cc: Ramnaresh@OBS.com

C: Subject: The Second Book

C: <a blank line>

C: Hello Gayle.

C: I will soon send you the second Book.

C: regards

C: Lara

C: .

S: 250 Ok

C: QUIT

S: 221 Bye {The server closes the connection}

Let us try to understand the communication. The first message says: I am the OBS.com server and I am ready. The client responds with: Hello, I am abcict.org. The server's 250 reply completes these introductory lines, which set the ball rolling. The client then says that there is a mail from Lara for three different recipients. The server responds with OK for each recipient that owns a mailbox on that server; otherwise it responds with 'No such user here'. A 'no such user' reply does not abort the transaction; the client continues and the mail is delivered to the rest of the mailboxes.

The client now sends the command DATA, indicating that the content of the mail follows. The server responds with its requirement: the client must end the mail with a line containing a single dot. This is specified as <CR><LF>.<CR><LF>, where CR is carriage return and LF is line feed. These two characters together mark the end of a line in a text file: carriage return goes back to the first column and line feed moves to the next line. A typical text file on Windows uses this format. The server wants the client to send the mail content one line after another and to terminate it with a line containing only a dot. The client does just that. The server responds with 250 Ok, which means the mail has been accepted. It is possible for the two sides to switch roles so that OBS.com sends any emails it has for abcict.org, but there is no such need here, and the client ends the session by sending QUIT. The server responds with 221 Bye and asks TCP to close the connection.
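The exchange of Figure 30.12 can be replayed against a toy server to see how each command leads to a reply code. This is a teaching sketch and not a real SMTP implementation: the class ToySmtpServer, its mailbox list, and its method names are assumptions of this example, and only the commands used in the figure are handled. The replies 503 and 500 do not appear in the figure; they are standard SMTP reply codes added so the sketch can answer out-of-order or unknown commands.

```python
class ToySmtpServer:
    """Answers one client line at a time with an SMTP-style reply."""

    def __init__(self, domain, mailboxes):
        self.domain = domain
        self.mailboxes = {name.lower() for name in mailboxes}
        self.sender = None
        self.recipients = []
        self.in_data = False
        self.lines = []
        self.delivered = []  # (sender, recipients, text) for every accepted mail

    def handle(self, line):
        """Return the reply, or None while the mail content is being collected."""
        if self.in_data:
            if line == ".":  # a line holding only a dot ends the mail
                text = "\n".join(self.lines)
                self.delivered.append((self.sender, self.recipients, text))
                self.sender, self.recipients = None, []
                self.lines, self.in_data = [], False
                return "250 Ok"
            self.lines.append(line)
            return None
        verb, _, arg = line.partition(" ")
        verb = verb.upper()
        if verb == "HELO":
            return f"250 Hello {arg}, I am glad to meet you"
        if verb == "MAIL" and arg.upper().startswith("FROM:"):
            self.sender = arg[5:].strip("<> ")
            return "250 Ok"
        if verb == "RCPT" and arg.upper().startswith("TO:"):
            address = arg[3:].strip("<> ")
            local, _, domain = address.partition("@")
            if domain.lower() == self.domain.lower() and local.lower() in self.mailboxes:
                self.recipients.append(address)
                return "250 Ok"
            return "550 No such user here"  # the transaction still continues
        if verb == "DATA":
            if self.sender is None or not self.recipients:
                return "503 Bad sequence of commands"
            self.in_data = True
            return "354 End data with <CR><LF>.<CR><LF>"
        if verb == "QUIT":
            return "221 Bye"
        return "500 Command not recognized"


server = ToySmtpServer("OBS.com", ["Gayle", "Ramnaresh"])
client_lines = [
    "HELO abcict.org",
    "MAIL FROM:<Lara@abcict.org>",
    "RCPT TO:<Gayle@OBS.com>",
    "RCPT TO:<Ramnaresh@OBS.com>",
    "RCPT TO:<Chandarpol@OBS.com>",
    "DATA",
    "Subject: The Second Book",
    "",
    "Hello Gayle.",
    ".",
    "QUIT",
]
replies = [server.handle(line) for line in client_lines]
codes = [reply[:3] for reply in replies if reply is not None]
```

The client program only needs the first three characters of each reply, which is why codes is built from reply[:3]; the rest of the line is for human readers.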

Internet media type

SMTP is a very old protocol. When it was designed, mail consisted only of 7-bit ASCII text. When something other than text is sent as an attachment, for example images, Microsoft Office files, or videos, these non-ASCII objects have to be converted to 7-bit ASCII before being sent to the other end. The IETF provided a solution based on Internet Media Types (IMT). There are 8 different types and thousands of subtypes. The eight types are listed in Figure 30.13. A name that starts with X- denotes a user-defined type. Such user-defined types can be used if both client and server support them.

Figure 30.13 Internet Media Types

The Internet media types are used in mail through MIME, the Multipurpose Internet Mail Extensions. Whenever data other than 7-bit ASCII is sent, MIME headers are needed to indicate so.

Figure 30.14 shows a typical case in which Lara sends an image to Gayle. SMTP mail can be sent only as 7-bit ASCII text, so MIME is called upon. MIME uses a method known as base64 encoding (discussed later) to convert the image, which was in JPEG format (see footnote 1). The Content-Type header indicates the type of the content. The MIME-Version header indicates that the sender is using version 1.0. The Content-Transfer-Encoding header states that the method used to convert the image file to 7-bit ASCII is base64.

When the receiver's user agent receives this mail, it converts the 7-bit ASCII text back into binary form and uses an application associated with JPEG files to open the image.

From: Lara@abcict.org

To: Gayle@OBS.com

MIME-Version: 1.0

Content-Type: image/jpeg

Content-Transfer-Encoding: base64

< a blank line>

………..binary data for the image in text form………

Figure 30.14 using MIME
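The headers of Figure 30.14 can be reproduced with Python's standard email package. This is a minimal sketch: the bytes in fake_jpeg are a placeholder and not a real picture, and the variable names are chosen only for this example.

```python
from email import message_from_string, policy
from email.message import EmailMessage

# Placeholder binary content standing in for the image file.
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + bytes(range(32))

msg = EmailMessage()
msg["From"] = "Lara@abcict.org"
msg["To"] = "Gayle@OBS.com"
# For binary content the library picks base64 and adds the MIME headers itself.
msg.set_content(fake_jpeg, maintype="image", subtype="jpeg")

wire_text = msg.as_string()  # what would be sent after the DATA command

# Receiver side: parse the text and turn the base64 body back into bytes.
received = message_from_string(wire_text, policy=policy.default)
recovered = received.get_content()
```

Printing wire_text shows the MIME-Version, Content-Type, and Content-Transfer-Encoding headers followed by a blank line and the image in text form.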

We have already seen that HTTP has some similarity with SMTP, the major one being the text format of the messages exchanged. There are other points to compare as well. Let us discuss them one after another.

Comparison between SMTP and HTTP

Here are a few points of comparison.

1. Both of them establish connection and then transfer files

2. Like HTTP 1.1, SMTP uses a single connection for multiple data transfer

3. SMTP transfers everything together and not object by object like HTTP

4. Both use headers and values to provide additional information to the receiving party

5. SMTP converts non-text data into 7-bit ASCII, HTTP sends them as it is

6. HTTP client fetches the data while SMTP client sends the data

7. SMTP needs MIME for converting non-Text into text, HTTP does not need any such mechanism.

1 JPEG, named after the Joint Photographic Experts Group, is a standard for storing images.

Base 64 encoding

Base 64 is one method to convert non-ASCII content into ASCII content.

Figure 30.15 showcases the process. The non-ASCII content is divided into groups of 6 bits each. Each 6-bit group is converted into a character using a simple mapping. 26 upper case letters, 26 lower case letters, and 10 digits make 62 elements; the two symbols + and / make it 64. They are numbered from 000000 to 111111 (6 bits) and mapped accordingly. Thus every 6-bit value is mapped to a unique ASCII character. The final step is to send the actual 7-bit ASCII value of that character. The first row of Figure 30.15 shows the input; the second row shows the character chosen by our 6-bit mapping. The ASCII values of those characters are shown in the final row, which is the output that is sent.
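The 6-bit mapping can be written out by hand and checked against Python's own base64 module. The function name encode_base64 is made up for this sketch. The trailing = padding, which fills the output to a multiple of four characters, is a detail of the standard encoding that the description above does not go into.

```python
import base64
import string

# Values 0..25 -> A..Z, 26..51 -> a..z, 52..61 -> 0..9, 62 -> +, 63 -> /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes by cutting the bit stream into 6-bit groups."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)  # fill the last group with zero bits
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    chars += "=" * (-len(chars) % 4)  # pad the output to a multiple of 4
    return "".join(chars)


sample = b"Man"
encoded = encode_base64(sample)
matches_library = encoded == base64.b64encode(sample).decode("ascii")
```

Three input bytes are 24 bits, which split into exactly four 6-bit groups, so "Man" needs no padding.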

One can object that this sends 8 bits for every 6 bits of data: a quarter of what is transmitted is overhead, so the encoded mail is about 33% larger than the original. Such a scheme could always be replaced by a better method which does not demand such a conversion. If SMTP cannot send anything other than 7-bit ASCII, one may find another protocol to do so. In our industry, however, a program that has been running for years is rarely replaced, and SMTP is one such program. It is used on billions of machines, and many application programs running on those computers have the logic of SMTP embedded in them. Replacing those application programs is not recommended for many reasons, including the sheer administrative overhead and the follow-up needed to make sure every stakeholder is aware of the changes. When there is a working solution with little overhead, administrators always prefer that.
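The size of the overhead is easy to measure. The sketch below encodes an arbitrary payload of 3000 zero bytes (the size is chosen only so the numbers come out round) and expresses the extra bytes in two ways, relative to the original data and relative to what is actually sent.

```python
import base64

payload = bytes(3000)                # arbitrary binary data, 3000 bytes
encoded = base64.b64encode(payload)  # 4 output characters per 3 input bytes

extra = len(encoded) - len(payload)
growth_over_original = extra / len(payload)  # how much larger the mail becomes
share_of_transmitted = extra / len(encoded)  # fraction of sent bytes that is overhead
```

The first figure is one third and the second is one quarter; both describe the same 8-bits-for-6 cost. Mail software also breaks the encoded text into lines, which adds a little more.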

Sometimes, though, one can reduce the overhead, especially when most of the content is ASCII and only a few characters are non-ASCII. In that case another method, known as quoted-printable encoding, is used instead. The idea is to convert only each non-ASCII character into a specific form that the receiver can read and convert back.

The 8-bit value of the non-ASCII character is divided into two halves of 4 bits each. Each half is mapped to its hex digit (from 0 to F), and the ASCII value of that hex digit is sent. Suppose the value 1100 1011 needs to be sent. 1100 is 0xC and 1011 is 0xB, so we send the ASCII values of these two characters, C and B. The receiver must be able to convert these two characters back to the original non-ASCII 8-bit value. A preceding = character, inserted at the time of conversion, tells it to do so, so the value is sent as =CB. This method adds more overhead for the converted character, as one 8-bit value needs three 7-bit values to transfer. However, if there are only a few non-ASCII values, it is better than base64.
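The conversion can be tried out with a simplified encoder. The function encode_qp is a sketch written for this example: it handles the = escape only and ignores the line-length and trailing-space rules of the full quoted-printable method. Decoding is checked with Python's standard quopri module.

```python
import base64
import quopri


def encode_qp(data: bytes) -> str:
    """Simplified quoted-printable: escape every byte that is not plain text."""
    out = []
    for byte in data:
        if byte == 32 or (33 <= byte <= 126 and byte != ord("=")):
            out.append(chr(byte))        # printable ASCII and space pass through
        else:
            out.append(f"={byte:02X}")   # '=' followed by two hex digits
    return "".join(out)


single = encode_qp(bytes([0b11001011]))  # the value used in the text
sample = b"Hello " + bytes([0xCB]) + b" world"
qp_text = encode_qp(sample)
round_trip = quopri.decodestring(qp_text.encode("ascii"))
base64_length = len(base64.b64encode(sample))
```

For this mostly-ASCII sample the quoted-printable text is shorter than the base64 text, which is the situation in which the method pays off.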

Filters and SPAM

Filters are part of the user agent. They filter incoming emails based on the user's instructions and on default rules set at installation time. Filtering is done on attributes of the mail such as the subject line, the sender's name, or specific words in the body. You can find such filters in most user agents, for example in Yahoo! and Gmail accounts.

The SPAM filter is one popular type of filter. SPAM filtering is done using AI-based techniques that are beyond the scope of this course. These techniques examine the mail and decide whether it is SPAM. When the user disagrees with a decision, they take the user's opinion into account to improve their learning.
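A rule-based filter of this kind can be sketched as a list of rules tried in order. Everything here is hypothetical: the Rule class, the folder names, and the classify function are made up for illustration and do not describe how any particular mail product implements its filters.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Rule:
    attribute: str  # "subject", "from" or "body"
    needle: str     # text to look for, compared without regard to case
    folder: str     # where a matching mail is filed


def classify(msg, rules, default="Inbox"):
    """Return the folder of the first rule that matches, else the default."""
    parts = {
        "subject": str(msg.get("Subject", "")),
        "from": str(msg.get("From", "")),
        "body": msg.get_content(),
    }
    for rule in rules:
        if rule.needle.lower() in parts[rule.attribute].lower():
            return rule.folder
    return default


rules = [
    Rule("from", "lara@abcict.org", "Lara"),
    Rule("subject", "book", "Books"),
]

mail = EmailMessage()
mail["From"] = "Ramnaresh@OBS.com"
mail["Subject"] = "The Second Book"
mail.set_content("The draft is attached.\n")

folder = classify(mail, rules)
```

Because the rules are tried in order, a mail from Lara about a book would be filed under "Lara" and not under "Books".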

Summary

In this module, we have seen how the SMTP protocol works. We looked at the two-tier model in which the sender sends the mail to the receiver's mailbox and the receiver uses another protocol to get the mail from his mailbox at his leisure. SMTP is based on two different RFCs: one describes the email structure and the other describes how an SMTP client talks to an SMTP server. We have also seen the differences between SMTP and HTTP, as well as how MIME is used with SMTP to convert non-ASCII data into 7-bit ASCII at the sender and back at the receiver.

