3 Complete Communication process

Prof. Bhushan Trivedi


 

Introduction

 

In this module, we will look at the complete communication process using the TCP/IP model and protocol stack. We have already seen how each layer functions and how the protocols in the TCP/IP suite process each incoming request from applications to transfer data across. We will take two real-world cases: the first is the HTTP protocol for web transfer and the second is the SMTP protocol for mail transfer.

 

It is critical that we understand, right at the beginning, how the entire communication is done. In later modules, we will dig deep and look at each of the components in detail. The idea of showing the entire communication at the beginning is to give you a complete picture of what a network does, a kind of ‘bird’s eye view’ of the entire subject. Once we have this vision, it becomes easier to see how each part fits in (see footnote 1).

 

The Entire Communication Process

 

Now let us look at the entire communication process considering all the layers together. The idea is to see how these layers work in sync to help the sender and the receiver communicate. Let us take the case of sending an email to the address bhtrivedi@glsuniversity.ac.in. Here ‘bhtrivedi’ is called a mailbox on the mail server of ‘glsuniversity.ac.in’. A mail server is one which manages all the mails of its registered users, and every mail server is capable of handling many registered users. Each user is given a mailbox, which is basically a folder to contain his emails. That mailbox is the personal place for that user to store both the mails he has written but not yet sent and the mails which have arrived from other senders. Thus every mail server caters to the mailing needs of all its registered users by providing each of them a mailbox and allowing him to use it for sending and receiving mails.

 

This arrangement is analogous to having a few mailboxes at the entry of an apartment building, one for each of its residents. The address glsuniversity.ac.in is analogous to the apartment address, while bhtrivedi is analogous to a physical mailbox. The glsuniversity.ac.in domain name was obtained by the university where the author works. Every employee of GLS University is given a unique mailbox under that domain name.
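The split between mailbox and domain can be sketched in a few lines of Python. This is only an illustration; the function name `split_address` is hypothetical and not part of any mail system.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split a mail address into (mailbox, domain) at the last '@'."""
    mailbox, sep, domain = address.rpartition("@")
    if not sep or not mailbox or not domain:
        raise ValueError(f"not a valid mail address: {address!r}")
    return mailbox, domain


mailbox, domain = split_address("bhtrivedi@glsuniversity.ac.in")
print(mailbox)  # the mailbox on the mail server
print(domain)   # the domain name that identifies the mail server
```

The domain part is what later gets mapped to an IP address; the mailbox part matters only to the mail server itself.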

 

Two figures, 3.1 and 3.2, depict the idea. Figure 3.1 indicates how the application layer takes the services of the transport layer for doing its job. The application (like FTP or SMTP) passes commands, according to the need, to the application layer. For example, if the user wants to download multiple files from a typical server, the application passes an mget command to the application layer.

1. Research in pedagogy has revealed that some of us are a typical type of learners known as global learners. It becomes easier (and sometimes essential) for them if the bird’s eye view of the subject is provided in the beginning. Even if you are not that type, it will strengthen your understanding of the subject.

The application layer then passes them to the transport layer. The transport layer relays them to the receiver’s transport layer, which passes them to the receiving application layer. The communication between the transport layer of the sender and the transport layer of the receiver is described in further detail in Figure 3.2.

 

 

When the computer’s mailing system receives the request to send the mail to the GLS University address, it has to find the mail server running at glsuniversity.ac.in and then find the mailbox under it. The mailing system running on the sender machine will send this mail to a mail server on the machine represented by glsuniversity.ac.in. The glsuniversity.ac.in domain name is usually associated with one or more machines. A machine dedicated to handling emails is usually provided in every such domain and is known as a mail server. That mail server is our target. In other words, the meaning of glsuniversity.ac.in, for us, is this mail server. That means bhtrivedi@glsuniversity.ac.in denotes the mailbox bhtrivedi situated on the mail server running at glsuniversity.ac.in. The name glsuniversity.ac.in is what users understand to be the server, but the actual address is a number, popularly known as an IP address. A service known as the domain name service or DNS maps the domain name to an IP address, and the process continues. The IP address of the recipient (in this case glsuniversity.ac.in) may be 1.2.3.4 (see footnote 2).
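To make the relation between the 32-bit value and an address such as 1.2.3.4 concrete, here is a minimal sketch. The helper name `to_dotted_decimal` is an assumption made for illustration; the standard library's `ipaddress` module does the same conversion and is shown for comparison.

```python
import ipaddress


def to_dotted_decimal(value: int) -> str:
    """Write a 32-bit address as four decimal bytes separated by dots."""
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


# Four bytes holding the decimal values 1, 2, 3 and 4.
value = 0b00000001_00000010_00000011_00000100
print(to_dotted_decimal(value))            # 1.2.3.4
print(str(ipaddress.IPv4Address(value)))   # 1.2.3.4
```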

 

Note that there may be more than one process running on the server. The mail server is just one of them. When we send something, how would the recipient TCP know that the message is for the mail server and not for anybody else? The problem is solved by providing a separate number for each service. This integer, which indicates the service, is known as the port number. The mail server’s port number is usually fixed and known globally. All machines should run their mail servers at port number 25 (see footnote 3).
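The role of the port number can be pictured as a lookup table on the receiving machine. The sketch below is a toy model, not how an operating system actually stores its listeners; the table `SERVICES` and the function `deliver` are hypothetical names.

```python
# Hypothetical table of services listening on one server machine.
SERVICES = {
    25: "mail server (SMTP)",
    53: "DNS server",
    80: "web server (HTTP)",
}


def deliver(dest_port: int) -> str:
    """Pick the process that should receive data addressed to dest_port."""
    return SERVICES.get(dest_port, "no service listening")


print(deliver(25))    # mail server (SMTP)
print(deliver(9999))  # no service listening
```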

 

Once the DNS provides the IP address to the application, our mail client at the application layer requests the TCP process (which is running below it in the TCP/IP protocol stack) to establish a connection with the specified IP address and port number. Armed with both these values, TCP generates a connection request segment whose header carries port numbers identifying the sending as well as the receiving process. The TCP segment contains many other things in the header, which we will ignore for the time being. TCP passes the segment to the IP layer below, along with the IP address obtained from DNS.

 

The sender (popularly known as a client) also has an IP address, which is entered when IP is configured on the system or is procured by other means. The sender’s address is not very useful for processing at the sender side, but the receiver’s address is. The sender’s address is kept to help the receiver know whom to respond to. Both of these addresses are inserted in the IP packet header.

 

The IP layer looks at the destination IP address and decides where to send the packet next. Note that the IP process takes the help of a routing table which it has already created by communicating with its neighbors. It is possible that our recipient is ten networks away on a specific path. The IP layer chooses the router on the next network as the next immediate receiver for this packet (see footnote 4). Then it prepares the header, including the sender’s and receiver’s IP addresses. The TCP segment received is embedded inside this packet. Then the IP process passes the packet to the network interface card (the Ethernet card or the wireless card, for example).

 

2. This is a 32-bit binary value: the first byte is the binary equivalent of decimal 1, the second byte is the binary equivalent of decimal 2, and so on. This representation is easier for humans to understand and process than a pure 32-bit binary value. It is known as dotted decimal notation.

 

3. This is a standard, but it is not compulsory for anybody to run their mail server on the standard port. They can run it on another port if all users are aware of that number. It is like producing non-standard sockets: only those who know the specification and build plugs accordingly can connect to those sockets.

 

4. Though IP considers the entire path, it only decides to send the packet to the next router; the next router decides the path further. This provides the router autonomy that we discussed earlier.

 

Additionally, the next immediate neighbor’s IP address is converted to a physical address (see footnote 5) and passed to the data link layer with the packet.

 

The NIC contains the data link layer, which generates a frame. The sender’s physical address is taken from the card itself (as it is the card’s own address) and the recipient’s address is taken as passed to it by the IP process. As in the other cases, the frame also contains many other fields which we ignore for now.
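The nesting described so far, application data inside a TCP segment inside an IP packet inside a frame, can be sketched with three small classes. This is a simplified model that keeps only the address fields mentioned in the text; the class names, the sender's addresses, and the sample data are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:           # built by TCP
    src_port: int
    dst_port: int
    data: bytes


@dataclass
class Packet:            # built by IP
    src_ip: str
    dst_ip: str
    payload: Segment


@dataclass
class Frame:             # built by the data link layer on the NIC
    src_mac: str
    dst_mac: str         # physical address of the next immediate neighbor
    payload: Packet


def encapsulate(data, src_port, dst_port, src_ip, dst_ip, src_mac, next_hop_mac):
    """Wrap application data layer by layer, top to bottom."""
    segment = Segment(src_port, dst_port, data)
    packet = Packet(src_ip, dst_ip, segment)
    return Frame(src_mac, next_hop_mac, packet)


# Hypothetical sender values; 1.2.3.4 and port 25 follow the mail example.
frame = encapsulate(b"mail data", 50000, 25, "9.8.7.6", "1.2.3.4",
                    "aa:aa:aa:aa:aa:01", "bb:bb:bb:bb:bb:02")
print(frame.payload.dst_ip, frame.payload.payload.dst_port)
```

Note that the frame's destination is the next hop's physical address, while the packet inside still carries the final receiver's IP address.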

 

The frame is now sent to the next immediate router. That router’s network interface card (an Ethernet card in most cases) receives the frame. It is possible that the card receives a frame not destined for it. There are a few reasons for this. It may happen when there is an error in the communication system. The second possibility is that the received frame is a broadcast or multicast frame. Broadcast means relaying a message to every machine in the network, while multicast means sending to a specific group of machines. If the card receives a frame whose destination address is that of a group the machine belongs to, the card has to accept that frame. Similarly, in the case of a broadcast address, the card must accept the frame.

 

When the physical layer receives the frame in the form of bits, it passes them up. The data link layer now interprets the received bit stream as a frame. It checks that the destination address is the card’s own address or a multicast or broadcast address. If it is, the content of the frame (i.e., the network layer data, the packet) is passed to the network layer.
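The acceptance check performed by the data link layer can be sketched as below. The sketch assumes Ethernet-style addresses, where the all-ones address is the broadcast address and a group (multicast) address has the lowest bit of its first byte set; the function names and sample addresses are hypothetical.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def is_multicast(mac: str) -> bool:
    """Ethernet group addresses have the lowest bit of the first byte set."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def accepts(own_mac: str, joined_groups: set[str], dst_mac: str) -> bool:
    """Decide whether the card passes a frame up to the network layer.

    joined_groups is expected to hold lowercase group addresses.
    """
    dst = dst_mac.lower()
    if dst == own_mac.lower() or dst == BROADCAST:
        return True
    return is_multicast(dst) and dst in joined_groups


groups = {"01:00:5e:00:00:01"}
print(accepts("aa:bb:cc:00:00:01", groups, "ff:ff:ff:ff:ff:ff"))
print(accepts("aa:bb:cc:00:00:01", groups, "aa:bb:cc:00:00:02"))
```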

 

The network layer now checks the IP address of the destination. If that address is not its own, it has to forward the packet. In the case of intermediate routers, the destination address is not their own address, so they have to route the packet to some other router. They now refer to the routing table. This table suggests the route for reaching a given destination. Once the router learns from the table where to send the packet, it constructs a new frame. The new frame contains the router’s own address as the sender’s address and the next router’s address as the receiver’s address. Then it is passed to the physical layer, which transmits it to the physical layer of the next router. Remember that the next router is decided at the network layer by looking at the routing table.
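A routing table lookup can be sketched with Python's standard `ipaddress` module. The table entries and next-hop addresses below are invented for illustration, and the rule of preferring the most specific matching network (longest prefix match) is an assumption added here; the text does not describe how the table is searched.

```python
import ipaddress

# Hypothetical routing table: (destination network, next-hop router).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "6.7.8.9"),    # default route
    (ipaddress.ip_network("1.2.0.0/16"), "10.0.0.2"),
    (ipaddress.ip_network("1.2.3.0/24"), "10.0.0.3"),
]


def next_hop(destination: str) -> str:
    """Return the next router for a destination, most specific entry first."""
    dst = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if dst in net]
    # The default route matches everything, so matches is never empty.
    _, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop


print(next_hop("1.2.3.4"))  # 10.0.0.3
print(next_hop("5.6.7.8"))  # 6.7.8.9
```

Each router on the path repeats this lookup with its own table, which is why only the next hop, not the whole path, is fixed at any one router.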

 

It is important to see that the router processes this message only up to the network layer and does not let it go further up. The reason is that the message is not for the router, and the transport layer data is of no use to the router. The only information needed is the receiver’s address, by looking at which the router can find out where to send the packet next. The router is able to route successfully with the data available at the network layer itself. That is why, when the message reaches the network layer of an intermediate router, it just takes a return journey back to the physical layer and does not go up.

5. The IP address of the immediate next recipient is converted to the recipient’s physical address by means of a process known as address resolution. In IPv4, address resolution is a complicated process based on typical message exchanges. IPv6 follows a simpler method of having an IP address which contains the physical address as part of it. Thus getting a physical address from an IP address requires only a simple extraction process.

 

Then the packet is received by the next router. The next router does similar processing: it finds the next router for the given destination from its routing table, constructs the frame, and sends it across. This process continues until the packet reaches the final destination.

 

When the network layer of the final destination (the network layer of the mail server running at glsuniversity) receives the packet, it concludes that the address is its own. Thus, instead of forwarding the packet, it passes the content of the packet up to the transport layer. The transport layer, upon receiving the segment, sends an acknowledgment on the reverse channel in a similar fashion. If the transport layer finds everything in order, it passes the data up to the application layer, and the application acts upon the command sent by the sender. In our case, the command comes from a mail client and is meant for a mail server. In the previous module we looked at an example of typical mail client commands and the responses from mail servers; this command can be any one of them. The server decides how to respond based on what it received. The response comes back in exactly the same fashion, each layer deciding what to do and processing the information into packets. If the server has to send a file back (because the sender wants to download a file), it sends the first block of that file now and continues until the whole file has been sent (see footnote 6).

 

One critical point is that the path of the server’s response is independent of the path of the request. It is therefore possible that the path along which the request arrived is different from the path along which the response travels.

 

In fact, the process is quite similar for other clients and servers. For example, if the sender has sent the command ls to a receiver which is a Telnet server, the receiver interprets that the sender wants a directory listing. It acts upon that command and prepares the directory listing. Then it sends the listing back on the reverse channel exactly as in the previous case (see footnote 7).

Another case of web page access

 

We have taken mail as one example. Now we will take another example, the web access process, to reiterate how each layer functions and how the entire communication process is managed. The idea is to look at some of the additional issues which we did not encounter in the previous case.

6. In fact, the server establishes another TCP connection for file transfer, but for now we will not elaborate on that.

7. One thing which we have not considered is that mail communication is asynchronous; that means the sender and the receiver may not be online at the same point in time to communicate. Mail systems are designed as a two-tier model to solve this problem. We are only talking about the first tier right now.

 

Suppose we are interested in searching for something and type https://www.google.com. The browser may send a typical HTTP message to the web server (see footnote 8).

 

GET / HTTP/1.1

 

Host: www.google.com

 

…..

 

The URL that we typed contains one part: the protocol followed by the name of the website, www.google.com. Sometimes, however, a URL contains two parts. For example, take the case of browsing the espncricinfo website. Our query to the web server is as follows.

 

http://www.espncricinfo.com/netstorage/summary.json, which contains the name of the website, www.espncricinfo.com, but additionally a directory and a typical JSON file under it.

 

This results in the following HTTP header:

 

GET /netstorage/summary.json HTTP/1.1

 

……

 

 

8. It is possible to capture HTTP headers from browsers. For example, Firefox allows users to install an add-in called “Live HTTP headers”. When that is enabled, users can start capturing headers for any web activity thereafter. Network analyzers like Wireshark also provide that information.

 

The first part, which exists in both cases, indicates the protocol (https, technically HTTP over SSL or TLS, in the first case) and the domain name of the server (google in the first case and espncricinfo in the second). The second part, which is not mentioned in the first case, is assumed by default to be a typical page (default.html), while in the second case it is explicitly specified to be summary.json. The browser requests the DNS to get the IP address associated with the web server domain name in the first part, and once that is obtained, the HTTP header is created with the remaining part as its data. The DNS processing is depicted in Figure 3.3; the client gets the IP address by asking a DNS server. The browser, which is running at the application layer, constructs the message with the HTTP header. The final part (HTTP/1.1) indicates the version of the HTTP request that is being prepared. We will not look at the details of this command or at the exact format of the HTTP header. Once this HTTP request is constructed, it is passed to TCP.
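The way a browser turns a URL into the request lines shown earlier can be sketched with the standard `urllib.parse` module. This is a minimal sketch: the function name `build_request` is hypothetical, a real browser sends many more header lines, and an empty path is mapped to "/" as in the first example.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Build a bare-bones HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"          # nothing after the host means "/"
    return f"GET {path} HTTP/1.1\r\nHost: {parts.hostname}\r\n\r\n"


print(build_request("https://www.google.com"))
print(build_request("http://www.espncricinfo.com/netstorage/summary.json"))
```

The host name extracted here is also what would be handed to DNS to obtain the server's IP address.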

 

TCP is also given the IP address obtained, as an argument, along with port number 80 (which is the default port for a web server using plain HTTP; HTTPS uses port 443 by default). It constructs its own header with specific port numbers: one for the sender (the port number associated with the browser, a typical 16-bit number given by the OS) and port 80 for the receiver. Now TCP constructs a new segment. It is a kind of request to establish a connection, handed to IP together with the IP address obtained as an argument. Assume that IP address to be 5.6.7.8. This TCP segment is called the CR or connection request segment. It is a request from the browser to the web server.
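As a rough picture of what such a connection request looks like on the wire, the sketch below packs a 20-byte TCP header with `struct`. The field layout and the SYN flag come from the TCP standard rather than from this module, which deliberately skips those details; the sender's port 50000, the sequence number, and the window size are arbitrary, and the checksum is left at zero instead of being computed.

```python
import struct

SYN = 0x02  # flag bit that marks a connection request


def connection_request(src_port: int, dst_port: int, seq: int) -> bytes:
    """Pack a minimal TCP header (no options, checksum not computed)."""
    offset_and_flags = (5 << 12) | SYN   # header length of 5 words, SYN set
    return struct.pack(
        "!HHIIHHHH",
        src_port,          # sender's port, chosen by the OS
        dst_port,          # receiver's port, 80 for a web server
        seq,               # initial sequence number
        0,                 # acknowledgment number, unused in the first segment
        offset_and_flags,
        65535,             # advertised window (arbitrary here)
        0,                 # checksum placeholder
        0,                 # urgent pointer
    )


header = connection_request(50000, 80, 1000)
src, dst = struct.unpack("!HH", header[:4])
print(len(header), src, dst)
```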

 

Now the IP process has both the TCP segment and the IP address of the receiver. It already has the IP address of the machine where it is running (the sender’s IP address), so it has all the information needed to construct the header. It does so and embeds the TCP segment into it. This is the IP packet. It contains the sender’s and receiver’s IP addresses, a few other fields, and the TCP segment embedded in it.

 

Now the IP packet is passed to the next layer, the data link layer. As we have already discussed, the receiver’s address at the data link layer is not that of the final receiver; it is the next immediate neighbor’s address. The receiver’s IP address helps the IP process to find the next immediate neighbor’s IP address using a routing table. If the next immediate router’s IP address is 6.7.8.9, then that router is the receiver of the frame. The data link layer uses the physical address (which is also called the MAC address) of that receiver and not its IP address.

 

Let us take a detour to discuss these two types of addresses. The IP address is a global address, unique in the entire Internet, while the physical address is a local address which may not be unique globally. The physical address represents the network card and thus depends on the type of that card. For example, if the machine has an Ethernet card, it will have an Ethernet address as its physical address. As an Ethernet frame is to be constructed, to be carried by the Ethernet network, we must use an Ethernet address and not the IP address. As we have mentioned earlier, we need a mechanism of address resolution to get the physical address of a machine from the IP address it possesses.
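The result of address resolution can be thought of as a small table kept by the sender, mapping a neighbor's IP address to its physical address. The sketch below only models the lookup, not the message exchange that fills the table; the table name, the function name, and the MAC address are hypothetical.

```python
# Hypothetical cache on the sender: neighbor IP address -> physical address.
RESOLVED = {"6.7.8.9": "bb:bb:bb:bb:bb:02"}


def resolve(next_hop_ip: str) -> str:
    """Look up the physical address already learned for a neighbor."""
    try:
        return RESOLVED[next_hop_ip]
    except KeyError:
        # A real system would now start an address resolution exchange.
        raise LookupError(f"no physical address known for {next_hop_ip}") from None


print(resolve("6.7.8.9"))
```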

 

Once the physical address of the recipient is obtained, with the sender’s address (its own address) already at hand, the frame is ready to be constructed, and off it goes to the physical layer.

 

Upon receiving the frame, the physical layer sends it bit by bit over the wire (or a wireless channel, as the case may be) to the next immediate router. The physical layer at the other end (of an intermediate router) passes the frame to the data link layer after the entire frame is received.

 

The same sequence of events happens at each intermediate router, and it continues like that until the end. When the TCP of the receiver receives the message, it sends back a response in a typical fashion. When the sender receives that response, it sends a confirmation back. This is known as the three-way handshake. Once that is done, the first actual data transmission begins. TCP constructs a new segment, encapsulates the HTTP request in it behind the TCP header, and passes it, with the IP address as an argument, to the next layer (IP); eventually it is received at the receiving TCP. Once that is done, the HTTP request is given to the HTTP server (a web server). A similar three-way handshake happens every time TCP establishes a connection to another TCP. It happens in the case of mail transfer too; we ignored it there for simplicity.
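The three messages of the handshake can be listed in a short sketch. The names SYN and ACK and the rule that each side acknowledges the other's sequence number plus one come from the TCP standard, not from this module; the function name and the starting sequence numbers are arbitrary choices for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return (sender, flags, seq, ack) for each of the three segments."""
    return [
        ("client", "SYN", client_isn, 0),                   # connection request
        ("server", "SYN+ACK", server_isn, client_isn + 1),  # response
        ("client", "ACK", client_isn + 1, server_isn + 1),  # confirmation
    ]


for sender, flags, seq, ack in three_way_handshake(1000, 7000):
    print(f"{sender:6} {flags:8} seq={seq} ack={ack}")
```

Only after the third segment does the client send the segment carrying the HTTP request itself.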

 

We have discussed two different examples to describe the connection‐oriented communication process. What if the process is connectionless? Let us take one more example to do so.

 

Connectionless transfer

 

Now let us look at a typical example of a DNS client asking for the IP address of a domain name, www.google.com. The DNS client, running at the application layer, constructs a DNS request and gives it to UDP. Unlike TCP, UDP does not do much with the segment; it simply constructs a segment with port number 53 as the receiver’s port and its own port as the sender’s port. It passes that segment on with the IP address of the DNS server (which is obtained using other methods; usually the client is configured with that value when it is installed). The IP process works exactly as before, and the process goes on in exactly the same fashion until the segment reaches the UDP at the other end.
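A DNS request of this kind is small enough to build by hand. The sketch below follows the standard DNS message layout (a 12-byte header followed by one question), which this module does not cover; the query identifier is arbitrary, and the server address in the commented-out sending code is a placeholder, not a real DNS server.

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query asking for the IPv4 address (type A) of name."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)   # type A, class IN
    return header + question


query = build_dns_query("www.google.com")
print(len(query))

# Sending it needs network access and a real server address:
# import socket
# with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
#     sock.sendto(query, ("192.0.2.53", 53))   # placeholder address, port 53
```

There is no connection setup before `sendto`; the request is simply handed to UDP and sent.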

 

Unlike TCP, the receiving UDP does not send an acknowledgment back to the sending UDP upon receipt. It just passes the data up.

 

What if the packet, traveling from one router to another, is lost in transit? In the case of connection-oriented traffic, TCP remembers every segment it has sent, and if the acknowledgment of a segment does not come back in the stipulated time, it resends that segment. That means IP gets the segment again and has another chance to send it across. In the case of UDP, there is no second chance, as the sender’s UDP does not do that type of bookkeeping, nor does it run a timer to find out whether an acknowledgment is missing.
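The difference can be sketched with a toy model in which the network loses the first few copies of a segment. Everything here is a simplification: the parameter `delivered_on_attempt` stands in for an unreliable network, real TCP uses timers rather than a loop counter, and the function names are hypothetical.

```python
def send_reliably(segment: str, delivered_on_attempt: int, max_attempts: int = 5) -> int:
    """TCP-like sender: resend until acknowledged; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        acknowledged = attempt >= delivered_on_attempt  # earlier copies were lost
        if acknowledged:
            return attempt
        # No acknowledgment in the stipulated time: loop and resend.
    raise TimeoutError(f"no acknowledgment for {segment!r}")


def send_unreliably(segment: str, delivered_on_attempt: int) -> bool:
    """UDP-like sender: a single attempt, no timer and no bookkeeping."""
    return delivered_on_attempt == 1


print(send_reliably("HTTP request", delivered_on_attempt=3))   # 3
print(send_unreliably("DNS query", delivered_on_attempt=3))    # False
```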

 

However, we have left a few questions unanswered. For example, how does the DNS convert a name to an IP address? How exactly does TCP determine the timeout value and decide the retransmission time? What are the other fields of the frame, the IP packet, and the TCP segment? We will discuss these questions and their answers in due course.

 

We have discussed TCP and UDP, and clients and servers. The Internet also operates in peer-to-peer mode. How does the peer-to-peer way of communication differ from the client-server way of communication? We discuss the answer to that critical question, and a few other important things, next.

 

Peer to peer communication

 

Web page access based on the HTTP protocol and mail access based on the SMTP protocol are examples of client-server communication. Here we have a client at one end and a server at the other end. Both parties have a dedicated role to play. The client requests a service and the server responds with that service. The server is ready to accept (and sometimes reject) incoming requests from clients. When a client tries to connect, the server, which is running at a specific location (the port number) and is ready to receive that request, receives the request and responds. The server never initiates a request on its own. It is the client who always initiates the communication. Both clients and servers are computing devices which may or may not be part of the same network.

 

It is possible to have a case where such roles are not strictly defined. All the connected parties can initiate requests and respond to anybody’s requests. The communication, in this case, has no fixed structure and there are no clear-cut roles of client and server. A communicating device can act as a client at one point and as a server at another, or as both at the same point in time. It is possible that one node is downloading a song from another node after asking for it, and thus acting as a client, while also delivering another song to some other node, accepting that node’s request like a server.
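The idea that one node plays both roles can be sketched with a single class that both serves and requests. This is a toy model with in-memory calls instead of network messages; the class name `Peer`, the node names, and the song titles are invented for illustration.

```python
class Peer:
    """A node that can act as a client and as a server."""

    def __init__(self, name: str, files: dict[str, bytes]):
        self.name = name
        self.files = dict(files)

    def serve(self, title: str):
        """Server role: answer another peer's request."""
        return self.files.get(title)

    def request(self, other: "Peer", title: str):
        """Client role: ask another peer for a file and keep a copy."""
        content = other.serve(title)
        if content is not None:
            self.files[title] = content
        return content


a = Peer("A", {"song1": b"la la"})
b = Peer("B", {"song2": b"do re"})
b.request(a, "song1")   # B acts as a client, A as a server
a.request(b, "song2")   # the roles are reversed
print(sorted(a.files), sorted(b.files))
```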

 

This kind of communication, which assumes both parties as equal, is called peer-to-peer, or P2P for short. The idea of such peer-to-peer communication has become so popular that it has surpassed the client-server communication already. The very same Internet and TCP/IP protocol stack provides us client-server communication when we use protocols like HTTP, but provides P2P communication when we use Torrent-like protocols.

 

P2P networks came into existence from the need for a normal user to share his content, for example videos and photos. P2P systems have unfortunately been used for many illegal activities, like sharing licensed songs, and have been criticized a lot for it, but they have many good applications as well. The critical difference between a client-server system and a P2P system is that while the servers in client-server systems are designed to run 24*7 and are closely monitored and enhanced to respond to clients’ requests, P2P systems are built over normal user nodes and have no central point of control and no guaranteed service.

 

This characteristic of P2P networks makes them very good at scaling. That means increasing the number of users does not increase the demand for bandwidth or processing from servers. On the other hand, it is not easy for the system to achieve optimum performance when users have different upload and download speeds. One very serious problem with P2P networks is that all users have intermittent connections. Unlike dedicated web servers, which remain online all the time and are ready to handle many concurrent connections, normal users may join and leave the network at any time and cannot handle more than a few connections.

 

One more point sometimes made in favor of P2P networks is that the information shared is under more user control and thus user privacy issues are better handled. However, many incidents have proved that users who are oblivious to simple privacy-related precautions can compromise any system and leave it vulnerable to any attack based on privacy compromise.

 

Figures 3.4 and 3.5 summarize our discussion about client-server and peer-to-peer communication.

Standardization

 

Who decides that TCP should have this segment structure, how frames are generated and how their content is designed, and, when network cards actually convert bits into signals, how they choose a particular method when there are possibly an infinite number of ways to do it?

 

The answer lies in providing standardization. There are quite a few reasons for providing standardization.

 

First, there are multiple vendors of any networking device and all of them have their own idea about how things are to be carried out. If all of them prepare devices using their own ideas, it will result in chaos. If a router from vendor 1 processes a packet differently from a router from vendor 2, no system can work.

 

Second, when a manufacturer produces a device, for example a wireless card, it must be sure that a machine which installs this card can talk to an access point manufactured by another company. This is observed in all other fields; for example, an electric socket is designed so that a plug manufactured by any other company can fit into it. Not only are the mechanical standards met (so they fit perfectly), but the electrical standards are also met (so current passing from the socket is properly received by the plug), and so are the other standards (for example, one of the wires in a household plug will be a neutral wire, one of them a ground wire, and the last one carries the current).

 

Similarly, a wireless card must fit exactly into the slot provided on the laptop. It should also have the number and sequence of pins expected by the laptop. Thus, with standards in place, people who manufacture devices can freely design their devices without worrying about interoperability issues.

 

Several major bodies act in standardizing networking equipment and protocols. The first is IEEE (Institute of Electrical and Electronics Engineers), which addresses the lowermost two layers and thus has some say in building Ethernet and wireless cards. Another player is ISO, the International Organization for Standardization. Part of the standardization is done by NIST (National Institute of Standards and Technology), which helps to standardize many Internet protocols for security, like IPsec and SSL. The IETF or Internet Engineering Task Force helps to standardize the Internet protocols; SMTP, FTP, TCP, and IP are designed and managed by the IETF. The International Telecommunication Union or ITU has a branch, ITU-T (its telecommunication standardization body), which helps to standardize many communication protocols that we will encounter, especially those which are used in LANs.

 

Summary

 

In this module, we have looked at the complete communication process, including connection-oriented and connectionless communication. How an entire communication process involves each of the layers, and what exactly the role of each layer is, was illustrated by two examples involving HTTP and SMTP. We have seen the job of each layer and how the process is managed by all layers collectively. P2P communication is where the roles of client and server are not predefined. There are multiple players in standardizing the TCP/IP protocols.

