25 Application Layer, Domain Name Service
Prof. Bhushan Trivedi
Introduction
With this module, our journey to the topmost layer of the TCP/IP stack begins; we will discuss the application layer over four modules. We start with the first application layer service, known as DNS, the Domain Name System (also called the domain name service). Whenever we type a URL or an email address, the domain name in it must be converted to an IP address before TCP can initiate communication with the other party. That conversion is carried out by this service. DNS is three things rolled into one. It is a distributed database that contains pairs of domain names and IP addresses and a few more things; it is a system that enables applications such as browsers to get the IP addresses of domain names; and it is a hierarchical collection of domain name databases managed by servers specially designed for the purpose. The meaning will be clear from the context, so we will continue using the word DNS for all three.
Application layer
The application layer is the one that interacts with the user, so it is critical. For example, when we run a browser or a web server, run a mail client or a mail server, start an FTP client to download or upload a file, or type any URL (which implicitly invokes a DNS client), we are interacting with the application layer. Every application runs at the application layer and takes advantage of the lower layers, from transport down to physical, to communicate with the application at the other end. We have also learned that different applications demand different types of service, and thus either TCP or UDP is chosen for them at the transport layer. For building applications, a socket API is normally used. This module does not cover how that API is used, but one can refer to Reference 1 for an introductory text. For a detailed description, one can refer to many other books and web resources on socket programming in C, C++, Java, Python and many other languages.
One characteristic of the application layer is that the variety of applications available is huge and increasing every day. Most other layers offer only one or two choices, but the application layer has DNS, SMTP, HTTP, FTP and hundreds of other applications. The application layer has so many applications because users want many services, and a different application is usually designed for each service. Users also prefer to have choices, and thus a single application such as a browser comes in many varieties. However different they are, users expect them to work in a similar fashion. For example, one user may choose Safari, another Internet Explorer and a third Firefox; all of them must show a website to the user in a similar fashion and provide similar services.
All in all, the application layer demands a very different approach. In this and the three subsequent modules, we will discuss four different applications and learn about the complexities they must handle to solve the problems they are designed for. In this module, we begin with the first application, the domain name service.
Domain Name System (DNS)
People are good at remembering names, but computers are better at handling numbers. This mismatch has, in the past, demanded many solutions for converting a string representation of something into a number and vice versa. DNS is one such solution. Users remember email addresses and website names as strings, but computers know them as IP addresses. DNS is designed primarily for converting a domain name (the host part of a URL and the part of an email address after the @ sign are examples of domain names) into an IP address. Apart from that conversion, DNS does a few more things, which we will discuss in this module.
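From an application's point of view, this conversion is usually a single call to the operating system's resolver. The following is a minimal sketch in Python, assuming the standard socket module; the function name resolve_ipv4 is our own and only IPv4 addresses are requested.

```python
import socket


def resolve_ipv4(name: str) -> list[str]:
    """Return the IPv4 addresses the system resolver reports for name."""
    # getaddrinfo hands the name to the operating system's resolver,
    # which consults local files and the configured DNS servers.
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep the order, drop duplicates
            addresses.append(ip)
    return addresses
```

Calling resolve_ipv4 with a real host name on a machine with Internet access returns whatever addresses are currently published for that name, so no fixed output is shown here. A name that is already in numeric form, such as "127.0.0.1", is returned unchanged without any lookup.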
The requirements for such a domain name system are listed below.
1. The DNS must be efficient and respond in real time. As users frequently type URLs, it is important for DNS to respond fast so users do not feel the delay.
2. The Internet continues to evolve: new networks are added, older ones vanish, lines come up and go down frequently, and networks, especially wireless networks, continue to lose packets. Despite all this, the DNS must continue functioning.
3. The users may generate packets from anywhere in the world in an unpredictable fashion. The DNS should respond despite such unpredictability in the traffic.
4. While the DNS should work as per the demand of the users, it should not introduce additional overhead that slows down the functioning of the Internet.
5. Another important point concerns network bandwidth. When a DNS client asks a DNS server for a conversion, the information should preferably be available from a nearby server so that the process does not introduce additional network traffic.
6. Apart from the IP address, some other information sometimes needs to be stored and provided when needed, for example the mail server or an FTP server of a specific domain, an alternate name of the domain, or details about the administrator of the domain.
7. Sometimes domains are so small that dedicating an entire server to one of them is overkill. Ideally, a single server should be able to handle multiple domains.
8. The DNS should be secure; one should not be able to poison it. That means that if somebody types the URL of HDFC Bank, the IP address of that very bank should be returned and not that of a hacker.
The DNS provides all of the above services except the final one. To provide security, the IETF has introduced an extension called DNSSEC. We will briefly discuss DNSSEC at the end of this module.
Domain Names
Addressing every individual computer and other device across the world in a unique fashion is a prerequisite for any communication system, and the Internet is no exception. DNS is designed to make sure that every network and every node has a unique name, and it is organized in a hierarchical fashion. An example of a domain name is org, which indicates the domain known as org or organization. Another example is abcict.org, which indicates a domain known as abcict defined under org. Yet another example is vishwanath.abcict.org, a domain defined under abcict. In the domain name vishwanath.abcict.org, the different labels are separated by dots; vishwanath, abcict and org are all called labels. Figure 27.1 describes the domain name system. It includes our contrived domain name abcict.org. Additionally, it includes well-known examples of domain names such as oracle.com and microsoft.com.
The labels that construct a domain name are connected by dots (.). Figure 27.2 depicts the labels of the domain name ict.gls.ac.in written with its final dot. That last dot is significant. The root, served by the root servers at the helm of the domain name system, is denoted by a null label, and the last dot is actually followed by that null string.
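The role of the final dot can be seen by splitting a name at its dots. The sketch below is only an illustration; the function names labels_of and path_from_root are our own. When the name ends with a dot, the split produces an empty string as the last element, which corresponds to the null label of the root.

```python
def labels_of(domain_name: str) -> list[str]:
    """Split a domain name into its labels, most specific label first."""
    return domain_name.split(".")


def path_from_root(domain_name: str) -> list[str]:
    """Return the labels in the order the hierarchy is walked, root first."""
    return list(reversed(labels_of(domain_name)))


print(labels_of("ict.gls.ac.in."))       # ['ict', 'gls', 'ac', 'in', '']
print(path_from_root("ict.gls.ac.in."))  # ['', 'in', 'ac', 'gls', 'ict']
```

Without the final dot, the empty root label does not appear in the list.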
Partially and fully qualified domain names
When a domain name is not written in its complete form including the final dot, it is called a partially qualified domain name or PQDN. Only when it is written completely, including the final dot, is it called an FQDN or fully qualified domain name. Examples of PQDNs are microsoft.com, abcict.org, abcict, microsoft, org and com. Examples of FQDNs, each ending with the final dot, are microsoft.com., abcict.org. and vishwanath.abcict.org.
The domain name system is not designed to work with PQDNs, but users tend to provide them for the sake of convenience. The program that reads the domain name from the user and passes it to the DNS server is known as the resolver. One of the most useful operations of the resolver is to convert a PQDN into the most logical FQDN. After converting a PQDN to an FQDN, the resolver does its main job: getting the IP address associated with that particular domain name. This process is known as resolution. Resolution happens in two different ways, one called recursive and the other iterative. Both are described next.
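Before turning to the two resolution styles, the PQDN-to-FQDN step can be sketched as follows. This is a simplified illustration, not the rule set of any real resolver: the default local domain abcict.org. and the heuristic "a single label is a local host, several labels only lack the final dot" are assumptions made for the example.

```python
def is_fqdn(name: str) -> bool:
    """A fully qualified domain name ends with the final dot."""
    return name.endswith(".")


def to_fqdn(name: str, local_domain: str = "abcict.org.") -> str:
    """Complete a PQDN; local_domain is a hypothetical default suffix."""
    if is_fqdn(name):
        return name
    if "." in name:
        # Several labels are present: assume only the final dot is missing.
        return name + "."
    # A single label: assume it names a host inside the local domain.
    return name + "." + local_domain


print(to_fqdn("abcict.org"))      # abcict.org.
print(to_fqdn("vishwanath"))      # vishwanath.abcict.org.
print(to_fqdn("microsoft.com."))  # microsoft.com.
```

Real resolvers typically make this choice configurable, for example through a list of suffixes to try in order.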
Iterative and recursive resolution
Normally every network contains a local DNS server, and the resolver sends its query for a domain name to that local server. If the local DNS server already has the answer for the domain name, it responds directly. Whenever a local DNS server is installed, one entry is provided manually: the IP address of at least one of the root servers (there are 13 named root servers in total, labelled A to M). So, if nothing else, the local server has the address of a root server and can start the process from there. Let us take an example to understand the process.
Suppose the sender wants to learn the IP address of vishwanath.abcict.org. The sender passes the name to the local DNS server. The local server contacts a root server and asks for the IP address of the org server. As org is directly under the root, the root server has the IP address of (one of) the org servers and returns it to the local server. The local server now sends a query to the org server to get the address of the server for the abcict domain, which is under org. When that address is obtained, the local server communicates with the abcict server and asks for the IP address of vishwanath. The abcict server responds with that value, the local server passes it on, and thus the sender gets the required address. This is iterative resolution: the local server itself contacts each server in turn.
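The walk from the root down to the abcict server can be simulated with a few dictionaries. This is only a sketch under stated assumptions: the server names, the table layout and the address 192.0.2.10 (taken from a range reserved for documentation) are invented for illustration, and real name servers exchange DNS messages rather than dictionary lookups.

```python
# Each simulated name server maps a label either to a referral (the name of
# the next server to ask) or to a final address. All values are illustrative.
SERVERS = {
    "root": {"org": ("referral", "org-server")},
    "org-server": {"abcict": ("referral", "abcict-server")},
    "abcict-server": {"vishwanath": ("address", "192.0.2.10")},
}


def resolve_iteratively(domain_name: str) -> tuple[str, list[str]]:
    """Return (address, servers the local server contacted, in order)."""
    labels = domain_name.rstrip(".").split(".")
    server = "root"  # the one address every local server starts with
    contacted: list[str] = []
    # Walk the labels from the top of the hierarchy downwards.
    for label in reversed(labels):
        contacted.append(server)
        kind, value = SERVERS[server][label]
        if kind == "address":
            return value, contacted
        server = value  # referral: the local server itself asks the next one
    raise LookupError(f"no address found for {domain_name}")


address, contacted = resolve_iteratively("vishwanath.abcict.org.")
print(address)    # 192.0.2.10
print(contacted)  # ['root', 'org-server', 'abcict-server']
```

Every query in the trace is sent by the local server, which is what makes the resolution iterative.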
A recursive query operates a little differently. The local server sends the query to the root server, the root server passes it to the org server, and the org server passes it to the abcict server. The abcict server returns the address to the org server, and the answer travels back along the same path in reverse until it reaches the local server. Finally, the local server sends that information to the sender.
There are two things to note. First, it is possible for part of a resolution to be iterative and part of it recursive; root servers, for instance, normally do not provide recursive responses. Second, one may ask for a recursive or an iterative query at any stage, and it is up to the other party whether to provide the type of resolution asked for. The servers along the line are sometimes known as DNS servers but are usually called name servers for short.
The domain name system is designed in a hierarchical fashion. Let us see why.
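The recursive chain can be simulated in the same spirit. In this sketch each server forwards the query itself, and the trace records who asked whom. As before, the server names, the table layout and the address 192.0.2.10 are assumptions made purely for illustration.

```python
# Illustrative server tables: a label maps to a referral or a final address.
SERVERS = {
    "root": {"org": ("referral", "org-server")},
    "org-server": {"abcict": ("referral", "abcict-server")},
    "abcict-server": {"vishwanath": ("address", "192.0.2.10")},
}


def ask(asker: str, server: str, labels: list[str],
        trace: list[tuple[str, str]]) -> str:
    """Ask server to resolve the remaining labels (top-level label last)."""
    trace.append((asker, server))
    kind, value = SERVERS[server][labels[-1]]
    if kind == "address":
        return value
    # Recursive behaviour: this server forwards the query to the next server
    # itself and relays the answer back to whoever asked it.
    return ask(server, value, labels[:-1], trace)


def resolve_recursively(domain_name: str) -> tuple[str, list[tuple[str, str]]]:
    """Return (address, list of (asker, server) pairs in query order)."""
    labels = domain_name.rstrip(".").split(".")
    trace: list[tuple[str, str]] = []
    address = ask("local", "root", labels, trace)
    return address, trace


address, trace = resolve_recursively("vishwanath.abcict.org.")
print(address)  # 192.0.2.10
print(trace)
# [('local', 'root'), ('root', 'org-server'), ('org-server', 'abcict-server')]
```

Here the local server sends only one query; every later query is sent by the server that received the previous one, and the answer returns along the same chain.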
Need for hierarchy
Let us briefly look at the advantages of the hierarchical structure of the DNS.
1. If we kept a single flat file to store all domain names, the number of domain names would be so large that it would be impractical to store them in a single database at a single location.
2. Even if we somehow managed to enter all this data in a single database, searching for a domain name in real time would be next to impossible.
3. Even if both of the above problems were solved, keeping all the information at one location would be difficult. Every URL and every email address typed would need to refer to this database, and the network traffic to that location would become so huge that it would jam all lines going to that single DNS server.
4. The data becomes more organized in a hierarchical fashion. All domain names of an organization abcict.org are listed under it, and all domain names of another organization called aesict.org are listed under that. The root servers are located at the top of the hierarchy; at the next level we have the top-level domain servers such as com, org, in and jp; and at the leaves we have local servers.
5. Name resolution can proceed in parallel in different branches of the domain name system. For example, while one local server is busy getting the IP address of ict.abc.ac.in, another local server might be working on getting the address of ict.aes.ac.in.
6. One big advantage of the DNS is that the database remains quite static; the IP addresses of servers seldom change. A server can keep them in its own memory after a successful resolution and use them for some length of time. This process is known as caching and is quite useful in reducing the latency of subsequent accesses to the same domain name. Most local servers are designed to cache the IP addresses of frequently accessed domain names. Usually, every machine also stores some frequently used domain names in local files. Most resolvers can read these files and load them into the local cache so that these names need not be fetched over the Internet. Such a design takes a lot of load off the servers, as most queries are resolved locally.
7. Another critical design advantage is that every domain controls the subdomains below it. Such a design frees a single server from managing everything and is also fault tolerant: if one server goes down, only the domains it handles are affected, and the rest of the DNS can continue functioning.
8. Another advantage of giving clear, autonomous control to every domain is that each domain can evolve on its own and grant unique subdomains under it without any trouble.
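The caching idea in point 6 can be sketched as a small wrapper around any lookup function. The sketch makes several assumptions: the class name ResolverCache, the fixed lifetime ttl_seconds for every entry, and the injectable clock are our own choices for illustration; in the real DNS the lifetime of a cached answer is supplied along with the record.

```python
import time
from typing import Callable


class ResolverCache:
    """Remember answers for a limited time to avoid repeated lookups."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup      # the slow path, e.g. a full resolution
        self._ttl = ttl_seconds    # how long a cached answer stays valid
        self._clock = clock        # injectable so that expiry can be tested
        self._entries: dict[str, tuple[str, float]] = {}
        self.misses = 0            # number of times the slow path was used

    def resolve(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]        # fresh cached answer, no network traffic
        self.misses += 1
        address = self._lookup(name)
        self._entries[name] = (address, now + self._ttl)
        return address
```

A second request for the same name within the lifetime is answered from memory; once the lifetime has passed, the next request goes back to the lookup function and refreshes the entry.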
As the information is stored in the form of a distributed database, this design has a few additional advantages.
Distributed database
As the database is designed in a distributed fashion, every domain contains a pointer to every subdomain directly below it.
Look at the four tables, 27.1 to 27.4, describing a very simplified version of the database for the hierarchical domains in, ac, ABC and ict. Each higher-level domain contains a pointer to the subdomain under it. A subdomain may contain a few additional records apart from the pointers to all of its own subdomains. What are the advantages of the distributed database design? Here is a quick account.
1. It gives every owner of a domain clear control over that domain. The authority to manage a domain is given to its owner, who decides the next level of members by allowing their entries in the domain's database.
2. Each domain has all the information about its members as well as pointers to its next level, and thus can always respond to DNS queries based on either of them.
3. Most queries are about machines in the same domain and are resolved locally, so network latency, the load on the network, and processing at the name servers are all reduced, resulting in a much better user experience.
4. A distributed design is much more robust than a single-server design, as it has no single point of failure. In October 2002 there was a serious distributed denial-of-service attack that attempted to shut the DNS root servers down. Nine of the 13 servers were affected, but thanks to the distributed design the rest kept the Internet running, and most users did not notice anything unusual.
5. Maintaining a distributed database is easier than maintaining one very large database.
6. Both the resolver and the local DNS server are part of the same database, which is why queries that can be resolved at the local DNS level are resolved immediately.
7. Apart from the one mandatory root server address, local servers usually also have the addresses of all TLD servers cached, and thus they hardly ever need to communicate with the root servers.
Zone
The name server that keeps track of all the names of a domain is called the authoritative server of that domain. The size of a domain depends on the users' interests and on its logical division. For example, a university domain is logically divided into multiple departments such as English, Mathematics and Chemistry. Not every department is of the same size. The Chemistry department might have 2000 computers connected, while the Mathematics department might have only two. It is logical to dedicate a name server to the Chemistry domain, but doing so for Mathematics is overkill. It is possible to organize servers in an administratively convenient fashion and keep a single server that tracks both the Chemistry and Mathematics departments together, or even the English department along with them. Exactly the opposite requirement arises in some other cases. Companies like Microsoft and Oracle have branches in many countries, and bodies like IEEE and ACM have chapters all around the world. The members of these domains are spread across the globe, so keeping a single name server is not a very efficient solution. In such cases, a name server is usually assigned to manage a part of the domain, and thus multiple servers are needed for a single domain.
A part of a domain, or a collection of multiple domains, that is controlled by a single name server is known as a zone. The name server stores complete information about all the domains it controls in a file called a zone file. Figure 27.7 shows an example of a zone. For example, the Department of Atomic Energy (dae), located under gov.in, and the President of India, located under nic.in, are managed as a single zone.
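One way to picture zones is as a table from the top of each zone to the name server in charge of it; the server responsible for a given name is then the one whose zone matches the longest suffix of that name. The sketch below follows the university example, with Chemistry on its own server and the other departments sharing one. The names under univ.example. and the server names are hypothetical, chosen only for illustration.

```python
# Hypothetical zones: the key is the top of the zone, the value is the
# name server that holds the zone file for it.
ZONES = {
    "chemistry.univ.example.": "ns-chemistry",
    "univ.example.": "ns-shared",  # Mathematics, English and the rest
}


def authoritative_server(name: str, zones: dict[str, str] = ZONES) -> str:
    """Return the server whose zone is the longest matching suffix of name."""
    labels = name.rstrip(".").split(".")
    # Dropping labels from the left tries the longest suffix first.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:]) + "."
        if candidate in zones:
            return zones[candidate]
    raise LookupError(f"no zone covers {name}")


print(authoritative_server("lab1.chemistry.univ.example."))  # ns-chemistry
print(authoritative_server("pc1.math.univ.example."))        # ns-shared
```

The same table shape also covers the opposite case in the text: a large organization would simply list several parts of its domain, each mapped to a different server.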
Domain name registration
Whenever one establishes a new company or organization, one must get a domain name for it. The first step is to find out whether the desired domain name is already owned by somebody else. Normally there are ISPs that provide web space and also act as agents for obtaining the domain name we want. We can check on their websites whether the domain name we want is available. Once we have a unique domain name, we need to register it under the higher-level domain. For example, if we want to register it under com, we need to ask the server that manages the com database to check the uniqueness of the name and to register it. The ISPs usually provide both of these services. A non-profit organization known as ICANN (Internet Corporation for Assigned Names and Numbers) is responsible for providing domain names. It is impossible for ICANN to handle requests coming from all over the world, and thus it has appointed many registrars for that process. When we make a request, the ISP usually communicates with a registrar, who in turn communicates with the owner of the domain under which our name is to be registered. Our domain name is then inserted in the parent domain's database. Our own domain database is constructed, and usually two entries are supplied in it: one for the web server (www) and one for the mail server. With these two entries in place, people who want to access our website can reach it by the name www.OurCompanyName.OurParentDomain and mail our employees at EmployeeName@OurCompanyName.OurParentDomain.
For example, if we have registered abcict.org, we can now have our web server's IP address associated with the domain name www.abcict.org and have all mailboxes of our institute, of the form vishwanath@abcict.org, working. Let me remind you that abcict.org plays two different roles in these two names: in the first it is the domain name of the institute, and in the second it identifies the mail server. Figure 27.8 illustrates these two critical first steps of the registration process. The user here sends a request for a new company xyz.co.in, and the registrar registers xyz under co.in. Additionally, the registrar enters www.xyz.co.in as well as the mail server's domain name for xyz.co.in.
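The registration steps for xyz under co.in can be sketched as a uniqueness check followed by the creation of the first two entries. This is only an illustration of the bookkeeping: the class ParentDomain, the host name mail for the mail server and the addresses (from a range reserved for documentation) are assumptions, and real registration involves registrars and registry databases rather than a single object.

```python
class ParentDomain:
    """A parent domain such as co.in. that registers names below it."""

    def __init__(self, name: str) -> None:
        self.name = name  # written as an FQDN, e.g. "co.in."
        self.children: dict[str, dict[str, str]] = {}

    def is_available(self, label: str) -> bool:
        """A name can be registered only if nobody owns it yet."""
        return label not in self.children

    def register(self, label: str, web_ip: str, mail_ip: str) -> dict[str, str]:
        """Insert label under this domain and create its first two entries."""
        if not self.is_available(label):
            raise ValueError(f"{label}.{self.name} is already registered")
        domain = f"{label}.{self.name}"
        records = {
            f"www.{domain}": web_ip,    # entry for the web server
            f"mail.{domain}": mail_ip,  # entry for the mail server (assumed name)
        }
        self.children[label] = records
        return records


co_in = ParentDomain("co.in.")
records = co_in.register("xyz", web_ip="192.0.2.80", mail_ip="192.0.2.25")
print(records["www.xyz.co.in."])  # 192.0.2.80
print(co_in.is_available("xyz"))  # False
```

A second attempt to register xyz under co.in. raises an error, which mirrors the uniqueness check made before registration.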
Types of servers
Roughly, name servers can be categorized into three types. First are the root servers: there are 13 of them, each operated in the form of a cluster. At the next level are the TLD (top-level domain) servers, which are located across the world. Finally, there are the other servers, including the local servers.
Figure 27.9 depicts the root servers and other servers. You can see that the ABC servers are connected to two different servers, ORG as well as AC. This happens when an institute has more than one domain name. Other servers may also hold information about authoritative servers in cached form.
Dynamic DNS
Solutions like DHCP (Dynamic Host Configuration Protocol) may provide a different IP address every time a node joins the network. When a server or a node is assigned different IP addresses at different times, its DNS record becomes invalid. The solution to this problem is to connect DHCP with DNS. Whenever DHCP assigns an IP address to a machine with a specific domain name, it also informs the DNS, which updates its own database accordingly in a dynamic fashion. This solution is known as Dynamic DNS.
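The link between DHCP and DNS can be sketched with two small classes. This is a toy model under stated assumptions: the class and method names are our own, addresses come from a range reserved for documentation, and lease expiry and release are left out.

```python
class DnsServer:
    """Holds name-to-address records that can be updated at run time."""

    def __init__(self) -> None:
        self.records: dict[str, str] = {}

    def update(self, name: str, ip: str) -> None:
        self.records[name] = ip  # overwrite any stale address

    def lookup(self, name: str) -> str:
        return self.records[name]


class DhcpServer:
    """Hands out addresses from a pool and informs the DNS of each lease."""

    def __init__(self, pool: list[str], dns: DnsServer) -> None:
        self._free = list(pool)
        self._dns = dns

    def lease(self, host_name: str) -> str:
        ip = self._free.pop(0)           # next unused address in the pool
        self._dns.update(host_name, ip)  # the dynamic DNS step
        return ip


dns = DnsServer()
dhcp = DhcpServer(["192.0.2.101", "192.0.2.102"], dns)
dhcp.lease("laptop.abcict.org.")
print(dns.lookup("laptop.abcict.org."))  # 192.0.2.101
dhcp.lease("laptop.abcict.org.")         # the node joins again later
print(dns.lookup("laptop.abcict.org."))  # 192.0.2.102
```

Because every lease triggers an update, the DNS record always points at the address most recently assigned to the host.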
Processing the DNS
A DNS server can generate its responses on the fly. Normally, however, a static record is stored in the database and delivered as and when a query arrives. The process is depicted in Figure 27.11. The information is stored in the form of resource records, which we will explore further in the next module.
DNSSEC
Normal DNS records are stored and sent in plain form and carry no proof of origin, and thus are vulnerable to forgery. It should not be possible for anybody to change DNS records, or to pass off fake ones, without authorization. DNSSEC, or DNS security, is designed to provide these two services: it lets a receiver check that a record really comes from the owner of the domain and that it has not been modified. Note that DNSSEC does not hide the records from third parties; it protects their authenticity, not their secrecy. DNSSEC uses a method known as public key cryptography, which uses two different keys. The private key, the first key, is available only to the owner of the domain. The public key, the other key, is available to all. The owner signs the DNS records using the private key, and anybody can use the owner's public key to check the signature and thus the authenticity of a DNS record. Modifying a DNS record or generating a fake one requires the owner's private key, so a hacker cannot do it easily. In the next module, we will briefly look at some records related to DNSSEC. References 1 and 2 have more details on the underlying cryptographic process.
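The private-key and public-key check described above can be illustrated with a toy RSA-style signature. This is strictly a teaching sketch, not how DNSSEC is implemented: the key is built from two small, publicly known primes and is therefore completely insecure, the record strings are invented, and real deployments use standardized algorithms with far larger keys. The modular inverse via pow requires Python 3.8 or later.

```python
import hashlib

# Toy key pair built from two well-known Mersenne primes (insecure on purpose).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                              # part of both keys
E = 65537                              # public exponent: (N, E) is the public key
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent: kept by the owner


def digest(record: str) -> int:
    """Hash the record text and reduce it to a number smaller than N."""
    raw = hashlib.sha256(record.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(record: str) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(record), D, N)


def verify(record: str, signature: int) -> bool:
    """Anybody who knows the public key (N, E) can run this check."""
    return pow(signature, E, N) == digest(record)


record = "www.abcict.org. A 192.0.2.80"
signature = sign(record)
print(verify(record, signature))  # True
```

If an attacker changes the address in the record but reuses the old signature, the check fails, because producing a matching signature for the altered record would require the private exponent.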
Summary
In this module, we looked at the application layer, the variety of applications that run at it, and what dealing with multiple types of applications involves. We have also seen how DNS works and the advantages of its hierarchical design and distributed database. We have also briefly discussed Dynamic DNS and DNSSEC.
References
1. Bhushan Trivedi, Computer Networks, Oxford University Press
2. Bhushan Trivedi, Data Communication and Networking, Oxford University Press