10 Network Protocols: Concept, Types –TCP / IP, OSI. Other Protocols
Manoj Kumar
Table of Contents
1. Objectives
2. Introduction
3. Definition
4. Concept of Layering
5. ISO‐OSI Model
5.1 Physical Layer
5.2 Data Link Layer
5.3 Network Layer
5.4 Transport Layer
5.5 Session Layer
5.6 Presentation Layer
5.7 Application Layer
6. TCP / IP Model
6.1 SMTP
6.2 Telnet
6.3 FTP
6.4 HTTP
7. Z39.50
8. Harvesting Protocols OAI‐PMH
1. Objectives
The objective of this module is to teach students the basic concepts of protocols in networks and the Internet. A few standard protocol models, such as OSI and TCP/IP, are discussed in detail. After completing the module, the student will be able to understand the basic principles of data communication and the protocols and standards used on the Internet and other networks. Protocols used for data exchange are also discussed in this module.
2. Introduction
A protocol is a common set of rules agreed upon for communication over a network. Even a novice user who uses a computer only for basic needs comes across many protocols, often without knowing it. For example, to browse a website using a standard browser, a user generally types a web address starting with ‘http://’. In simple words, HTTP is a protocol for transferring web pages so that they can be displayed in a browser. In a few popular browsers, if no protocol is mentioned, ‘http’ is assumed as the default. The basic protocol suite of the Internet architecture is TCP/IP. Mail programs use SMTP, and file transfer on the Internet uses FTP. Protocols are required for transferring data when a network is built from different underlying technologies. These protocols are predefined and can be adopted before a network is implemented.
3. Definition
Data transmission takes place between networks that may differ in topology, architecture, standards and underlying technologies. Data is broken into blocks called “packets”, which travel from a node in one network to a distant network that may use a different technology or protocol. It is important that the path of the travelling packets can be interpreted on each network. Dissimilar networks must therefore agree on common rules and standards for handling the data received from one network and forwarding it to the intended destination. This set of rules is called a ‘protocol’. All networks (LAN, WAN, MAN etc.) and the Internet have agreed upon common rules and standards for data exchange. A network protocol is a code of conduct: a set of rules and conventions for sending information over a network. Such standard procedures for regulating data transmission between computers are essential for sending data across dissimilar networks. The popular models are ISO-OSI, TCP/IP, SNA and DNA. The two major models will be discussed in this module.
When the concept of networking was introduced, many major companies tried to devise their own communication standards. The standards introduced by private IT network companies were implemented on their own networks and became their proprietary protocols. IBM (International Business Machines Corp.) used SNA (Systems Network Architecture) to connect its devices into a network. Meanwhile, DEC (Digital Equipment Corporation) used DNA (Digital Network Architecture) to connect its devices. These two major network architectures could also be implemented by others, but they remained proprietary. Later, in 1983, an open protocol model was introduced by the International Organization for Standardization (ISO), based on a seven-layer approach to communication from one machine to another. This model is called Open Systems Interconnection (ISO-OSI), and it emerged as an open standard for connecting computers to create networks. It is a sophisticated model which distinctly defines the role of each layer. Two network models will be discussed in this module.
4. Concept of Layering
Computer networks are designed in a modular way so that the network system can be handled easily and efficiently. The functions of the network are divided into a series of layered modules; the number of layers depends on the model used. Each layer offers particular services to the layer immediately above it and shields that layer from the details of how those services are actually implemented. Each layer therefore has its own set of protocols. A particular layer on one machine communicates only with the corresponding layer on another machine, using the protocols of that layer.
Each layer is designed to carry out its functions independently of the other layers. The major functions performed at the various layers include modulation/demodulation, encoding, framing, error detection and correction, medium access control, routing, reliability, in-order delivery, multiplexing and de-multiplexing, quality of service (QoS), security, compression, naming, addressing and application handling.
The main reasons for using the concept of layered protocols in network design are as follows:
The protocols of a network are fairly complex. Designing them in layers makes their implementation more manageable.
Layering of protocols provides well-defined interfaces between the layers, so that a change in one layer does not affect the adjacent layers. The various functions can be partitioned and implemented independently, so that each one can be changed as technology improves without the others being affected. For example, a change in the routing algorithm of a network control program should not affect the functions of message sequencing, which is located in another layer of the network architecture.
Layering of protocols also allows interaction between functionally paired layers in different locations. This concept aids in permitting the distribution of functions to remote nodes.
The terms protocol suite, protocol family and protocol stack are used to refer to the collection of protocols (of all layers) of a particular networking system.
5. ISO-OSI Model
ISO stands for International Organization for Standardization and OSI stands for Open Systems Interconnection. The ISO-OSI model is a fundamental and popular model, introduced in 1978 and revised in 1984. It organises the communication process into structured layers, as shown in the figure. There are seven layers in the model, so it is called the 7-layer model. The model acts as a frame of reference in the design of communications and networking products. The OSI model carries the communication from the physical layer at media level all the way to the application layer. As shown in the figure, message traffic moves up and down the OSI model, depending on the purpose of the message. If host ‘A’ wants to communicate with host ‘B’, the communication passes from the application layer of ‘A’ to the application layer of ‘B’ through all the other layers. A virtual channel is created from one host to the other, and each layer communicates with the corresponding layer in the other system. On the sending side, each layer adds its own header information to the data it receives. Except at the physical layer, there is no direct physical communication between corresponding layers. The physical layers of the two hosts are connected directly through a physical medium, which can be cable, fibre, wireless or radio frequency signals.
The OSI model works by establishing a set of rules and standards for communication between the layers. These rules ensure that different products can communicate with each other because they are developed around the same guidelines. The picture below shows how messages are transferred between the different layers of the OSI model.
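The way each layer adds its own header on the sending host, and the matching layer removes it on the receiving host, can be sketched in a few lines of Python. This is only an illustration: the bracketed text tags stand in for real binary headers, the function names are hypothetical, and the physical layer is left out because it transmits the resulting bits rather than adding a header.

```python
# Illustrative only: real headers are binary structures, not text tags.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]

def encapsulate(message: str) -> str:
    """Wrap the message in one header per layer, top layer first."""
    data = message
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data

def decapsulate(frame: str) -> str:
    """Strip the headers in reverse order, as the receiving host would."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data

frame = encapsulate("hello")
print(frame)               # outermost header belongs to the data link layer
print(decapsulate(frame))  # the original message is recovered
```

The header added last (data link) is the first one the receiver sees, which is why the receiving side walks through the layers in the opposite order.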
5.1 Physical Layer
The physical layer of the OSI model defines media specifications such as connector and interface specifications, as well as the medium (cable) requirements. Through the physical medium, information is passed as signals which are coded as bit streams. Electrical, mechanical, functional and procedural specifications are provided for sending a bit stream on a computer network. Equipment such as hubs and cables operates at the physical layer.
5.2 Data Link Layer
The data link layer allows a device to access the network to send and receive messages. It is responsible for providing a physical address (MAC address) so that the device’s data can be sent on the network. This layer also has error detection facilities. Networking components that function at layer 2 include NICs (network interface cards), Ethernet and Token Ring switches, and bridges. A layer 2 switch uses the MAC address to filter and forward traffic, helping to relieve congestion and collisions on a network segment.
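The error detection facility mentioned above can be illustrated with a checksum appended to each frame. The sketch below uses Python's standard zlib.crc32 function; Ethernet also uses a 32-bit CRC as its frame check sequence, but the frame layout and function names here are simplified assumptions, not the real Ethernet format.

```python
import zlib

def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")

def frame_is_valid(frame: bytes) -> bool:
    """Recompute the checksum and compare it with the one in the frame."""
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received

frame = make_frame(b"library catalogue record")
# Simulate a transmission error by flipping one bit of the first byte.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(frame_is_valid(frame))      # True
print(frame_is_valid(corrupted))  # False
```

A receiver that finds a mismatch simply discards the frame; detecting the error is the job of this layer, while recovering the lost data is left to higher layers.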
5.3 Network Layer
At the network layer, the basic unit of transfer is a datagram that is wrapped (encapsulated) in a frame. The datagram is itself composed of a header and a data field. On WANs there is usually at least one router between two computers. The connection between two neighbouring routers on the link layer is always direct. The router unpacks the datagram from a frame, only to wrap it again into a different frame (or, more generally, into a frame of a different link protocol) before sending it out on a different line. The network layer does not see the appliances on the physical and link layers (modems, repeaters, switches, etc.). The network layer also does not care what kind of link protocols are used on the routes between the source and the destination.
An internetwork addressing scheme assigns each network and each node a unique address. The network layer supports multiple data link connections. In the Internet Protocol suite, IP is the network layer internetworking protocol, and it uses IP addresses. In the IPX/SPX suite, IPX is the network layer protocol.
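As a small illustration of network-layer addressing, Python's standard ipaddress module can test whether two IP addresses belong to the same network; if they do not, the datagram has to pass through at least one router. The addresses used below come from ranges reserved for documentation, and the function name is a hypothetical one chosen for this sketch.

```python
import ipaddress

def same_network(addr_a: str, addr_b: str, network: str) -> bool:
    """Return True if both addresses fall inside the given network."""
    net = ipaddress.ip_network(network)
    return (ipaddress.ip_address(addr_a) in net
            and ipaddress.ip_address(addr_b) in net)

local_net = "192.0.2.0/24"  # example network, documentation range

# Both hosts are on the local network: direct delivery is possible.
print(same_network("192.0.2.10", "192.0.2.200", local_net))   # True

# The second host is on another network: a router is needed.
print(same_network("192.0.2.10", "198.51.100.7", local_net))  # False
```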
5.4 Transport Layer
The transport layer provides end-to-end communication services and ensures that data is reliably delivered between the end systems. This layer also contributes to the quality of service (QoS) of the network. Both end systems establish a connection and engage in a dialog to track the delivery of packets across the internetwork. The protocol also regulates the flow of packets to accommodate slow receivers and ensures that the transmission is not completely halted if a disruption in the link occurs. TCP and SPX are transport layer protocols.
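One way a transport protocol can track delivery is by numbering the data it sends, so that the receiver can put segments back in order, drop duplicates and notice gaps. The following Python sketch shows the idea with byte-offset sequence numbers, loosely in the style of TCP; it is a simplified model with a hypothetical function name, not an implementation of TCP itself.

```python
def reassemble(segments):
    """Rebuild the byte stream from (sequence_number, data) pairs.

    Duplicates are ignored and delivery stops at the first gap.
    Returns the in-order data and the next sequence number expected,
    which a receiver could report back as an acknowledgement.
    """
    received = {}
    for seq, data in segments:
        if data:                          # skip empty segments
            received.setdefault(seq, data)  # keep first copy only
    stream = b""
    next_seq = 0
    while next_seq in received:
        chunk = received[next_seq]
        stream += chunk
        next_seq += len(chunk)
    return stream, next_seq

# Segments arrive out of order, and one is duplicated.
arrived = [(5, b" worl"), (0, b"hello"), (5, b" worl"), (10, b"d")]
print(reassemble(arrived))                       # (b'hello world', 11)

# The segment starting at offset 5 is missing, so delivery stops there.
print(reassemble([(0, b"hello"), (10, b"d")]))   # (b'hello', 5)
```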
5.5 Session Layer
The session layer coordinates the exchange of information between systems by using conversational techniques, or dialogs. Dialogs are not always required, but some applications may require a way of knowing where to restart the transmission of data if a connection is temporarily lost, or may require a periodic dialog to indicate the end of one data set and the start of a new one.
5.6 Presentation Layer
The presentation layer is responsible for how an application formats the data to be sent out onto the network. This layer basically allows an application to read (or understand) the message. Functionalities include encryption and decryption of a message for security, compression and expansion of a message so that it travels efficiently, graphics formatting, content translation, system-specific translation etc.
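Two of the functions listed above, content translation and compression, can be sketched with Python's standard library: text is translated into a common byte encoding (UTF-8 is assumed here) and compressed before sending, and the receiver reverses both steps. The function names are illustrative only.

```python
import zlib

def to_wire(text: str) -> bytes:
    """Encode text as UTF-8 bytes, then compress it for transmission."""
    return zlib.compress(text.encode("utf-8"))

def from_wire(payload: bytes) -> str:
    """Expand the received bytes and decode them back into text."""
    return zlib.decompress(payload).decode("utf-8")

message = "Network protocols. " * 50
payload = to_wire(message)

print(len(message.encode("utf-8")), "bytes before compression")
print(len(payload), "bytes on the wire")
print(from_wire(payload) == message)  # True
```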
5.7 Application Layer
The application layer provides an interface for the end user operating a device connected to a network. It defines the format in which data should be received from or handed over to the applications. The application layer is the layer the user sees while using network applications.
Examples of functionality provided at this layer include:
• Support for file transfers (FTP)
• Remote Connections (Telnet)
• Ability to print on a network
• Electronic mail (E-Mail) and Electronic messaging
• Browsing the World Wide Web (HTTP)
A few of these examples are discussed separately later.
6. TCP / IP Model
The reference model was named after two of its main protocols, TCP (Transmission Control Protocol) [12] and IP (Internet Protocol). TCP/IP is based on a four-layer reference model with many protocols included in it. All protocols that belong to the TCP/IP protocol suite are located in the top three layers of this model. Each layer of the TCP/IP model corresponds to one or more layers of the seven-layer Open Systems Interconnection (OSI) reference model proposed by the International Organization for Standardization (ISO). The following table describes the various layers and their responsibilities:
Ref: http://technet.microsoft.com/en‐us/library/
Core protocols in the TCP/IP suite include Address Resolution Protocol (ARP), Internet Protocol (IP), Internet Control Message Protocol (ICMP), Internet Group Management Protocol (IGMP), User Datagram Protocol (UDP) and Transmission Control Protocol (TCP).
Application protocols in the suite include SMTP, Telnet, FTP, HTTP, DNS etc. A few popular application protocols are discussed below:
6.1 SMTP
SMTP stands for Simple Mail Transfer Protocol, which allows software to transmit email over the Internet. Most email software is designed to use SMTP when sending email; it only works for outgoing messages. When people set up their email programs, they typically have to give the address of their Internet service provider’s (ISP) SMTP server for outgoing mail (for example, smtp.gmail.com). There are two other protocols, POP3 and IMAP, that are used for retrieving and storing email.
The SMTP server understands very simple text commands like HELO, MAIL, RCPT and DATA. The most common commands are:
HELO ‐ introduce yourself
EHLO - introduce yourself and request extended mode
MAIL FROM: - specify the sender
RCPT TO: - specify the recipient
DATA - specify the body of the message (To, From and Subject should be the first three lines.)
RSET - reset
QUIT ‐ quit the session
HELP ‐ get help on commands
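The commands above are normally issued in a fixed order during one mail session. The Python sketch below only builds the sequence of lines a client would send for a single message; it does not contact any server. The host name and addresses are made-up examples, and the function name is hypothetical. The line containing a single full stop marks the end of the message data.

```python
def smtp_commands(client_host, sender, recipients, subject, body):
    """Return the lines a client would send, in order, for one message."""
    lines = [f"HELO {client_host}", f"MAIL FROM:<{sender}>"]
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    # To, From and Subject come first, followed by a blank line and the body.
    lines += [
        f"To: {', '.join(recipients)}",
        f"From: {sender}",
        f"Subject: {subject}",
        "",
        body,
        ".",  # a line with only a full stop ends the message
    ]
    lines.append("QUIT")
    return lines

session = smtp_commands(
    "client.example.org",
    "student@example.org",
    ["librarian@example.org"],
    "Book request",
    "Please reserve the networking textbook.",
)
for line in session:
    print(line)
```

In practice a mail program would wait for the server's reply after each command before sending the next one; that exchange is omitted here to keep the sketch short.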
6.2 Telnet
Servers are often hosted in a hosting centre at a distant location and need to be managed from outside. The Telnet protocol is used for such remote connections to a system. A Telnet client is a terminal emulation program for TCP/IP networks such as the Internet. One can enter commands through the Telnet program, and they will be executed as if the user were entering them directly on the server console. This makes it possible to control the server and to communicate with other servers on the network. To start a Telnet session, you must log in to a server by entering a valid username and password.
The Telnet protocol uses port number 23 on a computer system to allow connections from remote clients. This protocol is very useful for managing servers remotely. A computer protected by a firewall can also be configured to allow connections only from known hosts. There are many freely available Telnet programs which can be installed on the client machine to connect to servers remotely. A Telnet program uses the IP address of the host machine to connect (as shown in the picture). Popular operating systems provide default command-line telnet programs. ‘PuTTY’ is one of the popular freeware Telnet clients. PuTTY also supports other protocols such as SSH, and its companion tools support secure file transfer.
Remote connection program ‘Putty’ for telnet connection
6.3 FTP
FTP (File Transfer Protocol) is used to transfer files from one host to another, and it is a very popular service on the Internet. An FTP server program has to be installed on the host that acts as the FTP server. FTP transfers data over TCP/IP. It uses port numbers 20 and 21, and authentication with a username and password is required to get access to an FTP server. An ‘anonymous’ FTP configuration is also possible, so that any user can transfer (normally download) data from the FTP server. Software is often shared through FTP servers because of the large size of the files.
6.4 HTTP
The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems, and it is the protocol used for web pages. HTTP has been in use by the World-Wide Web (WWW) global information initiative since 1990. HTTP/1.0 is defined by RFC 1945. An RFC (Request for Comments) is the approved technical documentation for Internet protocols and models.
HTTP is a request/response protocol. Over a connection with a server, a client sends a request in the form of a request method, URI and protocol version, followed by a MIME-like message containing request modifiers, client information and possibly body content. The server responds with a status line, including the message’s protocol version and a success or error code, followed by a MIME-like message containing server information, entity meta-information and possibly entity-body content. HTTP communication usually takes place over TCP/IP connections. The default port is TCP 80, but other ports can be used (e.g. 8080).
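The request line and status line described above are plain text, so they are easy to build and read. The Python sketch below assembles a minimal HTTP/1.0 request and splits a status line into its three parts. The host name is a made-up example, the function names are hypothetical, and no network connection is opened.

```python
def build_request(method: str, path: str, host: str) -> str:
    """Build a request: method, URI and version, then one header line."""
    return (f"{method} {path} HTTP/1.0\r\n"
            f"Host: {host}\r\n"
            "\r\n")  # an empty line ends the request headers

def parse_status_line(line: str):
    """Split a status line into protocol version, status code and reason."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason

print(build_request("GET", "/index.html", "www.example.org"))
print(parse_status_line("HTTP/1.0 200 OK"))
print(parse_status_line("HTTP/1.0 404 Not Found"))
```

A status code in the 200 range reports success, while codes such as 404 report an error, which is the "success or error code" the server returns in its first line.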
Since this protocol is used for transmitting web pages, the URL in the browser address bar starts with ‘http://’. If no protocol is specified in the address bar, modern browsers assume ‘http’, which makes it the default protocol for browsers.
7. Z39.50
Z39.50 is an international standard client-server, application-layer communications protocol for searching and retrieving information from a database over a TCP/IP computer network. It is covered by ANSI/NISO standard Z39.50 (ISO standard 23950) and is maintained by the Library of Congress.
Z39.50 is widely used in library environments and is often incorporated into integrated library systems and personal bibliographic reference software. Interlibrary catalogue searches for interlibrary loan are often implemented with Z39.50 queries. It supports a number of actions, including search, retrieval, sort, and browse. Searches are expressed using attributes, typically from the bib‐1 attribute set, which defines six attributes to be used in search of information on the server computer: use, relation, position, structure, truncation, completeness.
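To make the idea of attribute-based searching concrete, the Python sketch below writes a search term together with its bib-1 attributes in Prefix Query Format (PQF), a textual notation used by some Z39.50 toolkits. PQF is not part of the discussion above and is used here only as an assumed notation. The attribute type numbers follow the order in which the six attributes are listed; the specific values (4 for title and 1003 for author under the use attribute, 1 for right truncation) should be checked against the published bib-1 attribute set. No server is contacted.

```python
# Attribute types of the bib-1 attribute set, numbered in the listed order.
BIB1_TYPES = {
    "use": 1, "relation": 2, "position": 3,
    "structure": 4, "truncation": 5, "completeness": 6,
}

# Assumed example values for the "use" attribute; verify before relying on them.
USE_TITLE = 4
USE_AUTHOR = 1003

def pqf_term(term: str, **attributes: int) -> str:
    """Write a search term with its bib-1 attributes in PQF notation."""
    parts = [f"@attr {BIB1_TYPES[name]}={value}"
             for name, value in attributes.items()]
    return " ".join(parts + [f'"{term}"'])

# Search for a title.
print(pqf_term("network protocols", use=USE_TITLE))
# Search for an author name, truncated on the right.
print(pqf_term("kumar", use=USE_AUTHOR, truncation=1))
```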
8. Harvesting Protocols OAI-PMH
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) provides an application-independent interoperability framework based on metadata harvesting. There are two classes of participants in the OAI-PMH framework: (1) Data Providers administer systems that support the OAI-PMH as a means of exposing metadata; and (2) Service Providers use metadata harvested via the OAI-PMH as a basis for building value-added services.
A harvester is a client application that issues OAI-PMH requests. A harvester is operated by a service provider as a means of collecting metadata from repositories. A repository is a network-accessible server that can process OAI-PMH requests. A repository is managed by a data provider to expose metadata to harvesters. The OAI-PMH distinguishes between three distinct entities related to the metadata made accessible by the OAI-PMH, i.e. resource, item and record. A resource is the object or “stuff” that metadata is “about”. The nature of a resource, whether it is physical or digital, or whether it is stored in the repository or is a constituent of another database, is outside the scope of the OAI-PMH. An item is a constituent of a repository from which metadata about a resource can be disseminated. That metadata may be disseminated on-the-fly from the associated resource, cross-walked from some canonical form, actually stored in the repository, etc. A record is metadata in a specific metadata format. A record is returned as an XML-encoded byte stream in response to a protocol request to disseminate a specific metadata format from a constituent item. The following picture shows how metadata is harvested into a repository from various databases and harvesters.
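The harvester's side of this exchange can be sketched in Python. An OAI-PMH request is an ordinary HTTP request whose parameters name a verb such as ListRecords and a metadata format such as oai_dc, and the response is XML. In the sketch below the repository address and the sample response are invented for illustration, the function names are hypothetical, and nothing is fetched from the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the URL a harvester would request to list records."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def record_identifiers(xml_text: str):
    """Extract the item identifiers from a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(OAI_NS + "identifier")]

# Hypothetical repository address and a shortened, invented response.
print(list_records_url("http://repository.example.org/oai"))

sample_response = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:repository.example.org:1</identifier></header></record>
    <record><header><identifier>oai:repository.example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
print(record_identifiers(sample_response))
```

A real response also carries the metadata of each record and other protocol elements; only the record headers are shown here to keep the example short.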