5 Services and packet classification
Prof. Bhushan Trivedi
Introduction
We have looked at some very important principles of communication using packet switching, a method used by the Internet and other networks. We have also briefly described how multiplexing is done at each layer to provide different services; we will explore that further in this chapter to see how the concept changed over a period of time and why a new mechanism, called packet classification, is making inroads. We have also discussed how multi-layer switches are used instead of routers in high-speed networks. We will see that packet classification is the go-to solution when high-speed networks replace the conventional layered multiplexing and the conventional service model of layers.
The conventional multiplexing-demultiplexing at layers
Let us begin this section with an example. When an Ethernet frame arrives at the physical layer (the Ethernet card attached to our laptop or desktop), it contains a field called TYPE which tells us whether the recipient is IPv4 or IPv6. Based on the value of TYPE, the network card sends the content of the frame to either the IPv4 process or the IPv6 process. When the packet goes up, IPv4 contains a field called Protocol and IPv6 contains a field called Next Header, which describes whether the recipient is TCP, UDP or SCTP; based on that value, the receiving transport protocol process is determined and the segment is passed to TCP, UDP or SCTP accordingly. Once it reaches the appropriate transport layer, for example TCP, the segment contains a port number which determines whether the receiving application is SMTP, FTP, HTTP, etc.
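The chain of lookups above can be sketched in a few lines. This is a minimal illustration, not a real parser: the field values (EtherType 0x0800/0x86DD, IP protocol numbers 6/17/132, and the well-known port numbers from the text) are real, but the frame is modeled as a simple dictionary.

```python
# Conventional layer-by-layer demultiplexing, one lookup per layer.
ETHERTYPE = {0x0800: "IPv4", 0x86DD: "IPv6"}     # Ethernet TYPE field
IP_PROTO  = {6: "TCP", 17: "UDP", 132: "SCTP"}   # IPv4 Protocol / IPv6 Next Header
PORT      = {20: "FTP", 25: "SMTP", 80: "HTTP"}  # well-known ports

def demultiplex(frame):
    """Walk the headers one layer at a time, as a conventional stack does."""
    network   = ETHERTYPE[frame["type"]]         # layer 2 -> layer 3
    transport = IP_PROTO[frame["protocol"]]      # layer 3 -> layer 4
    app       = PORT[frame["dst_port"]]          # layer 4 -> application
    return network, transport, app

frame = {"type": 0x0800, "protocol": 6, "dst_port": 80}
print(demultiplex(frame))  # ('IPv4', 'TCP', 'HTTP')
```

Note that each lookup can only happen after the previous layer has been opened; this sequential dependence is exactly what classification later removes.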
The values of these fields, TYPE, Protocol and port number, are set when the sender constructs the packet. When the sending application, for example an HTTP client, decides to send a packet and consults TCP, TCP automatically sets the receiver application's port number to 80. The sender is also allowed to pick a specific port number and override TCP's default decision. Once TCP constructs the segment and passes it to the IP layer, it also passes the recipient's IP address, received from DNS, to the IP process. If that address is an IPv4 address, TCP passes the segment to the IPv4 process, otherwise to the IPv6 process. In the first case the IPv4 header's Protocol field is set accordingly; in the second, the IPv6 Next Header field. When the IPv4 or IPv6 process passes the packet to the network adapter, the adapter (the data link layer buried within) sets the TYPE value based on which process passed it.
The complex part of the situation is that multiple communications may be going on simultaneously. For example, when we log in to our desktop in the morning, we may telnet to an office server, start downloading a file from an FTP server, connect to a Windows server for some processing, and start browsing the Internet. Our TCP multiplexes all these connections, sending one segment to the FTP server, another to the Windows server, and so on. Some of the traffic may also travel over UDP, which must multiplex its own traffic as well. The IP process running on our desktop carries both TCP and UDP traffic to the other end. In the same way, our Ethernet or wireless card multiplexes IPv4 and IPv6 traffic. It is quite possible that the Internet, FTP and Windows servers run on the same physical machine, whose TCP/IP stack follows a similar process. In that case, our desktop multiplexes while sending and the server demultiplexes while receiving1. The process reverses when the server replies: the TCP/IP stack at the server multiplexes and our desktop demultiplexes.
The above example illustrates three things.
1. There is a field at every layer which indicates, for whatever content the unit carries, which upper layer it should be passed to at the receiver's end.
2. There are multiple choices available at every layer, and the receiver must choose what the sender chose. For example, if the sender has sent an IPv4 packet, the Ethernet TYPE field2 indicates so, and the receiver's Ethernet card delivers it to the receiver's IPv4 process.
3. The sender's layers collect input from various sources; for example, TCP collects data from FTP, SMTP, HTTP and other applications running at the time. The receiver, upon receipt, sorts the data out to the respective recipients; for example, the receiver's TCP distributes incoming data to the intended applications, giving mail data to SMTP, web data to HTTP, and so on.
For years, this multiplexing and demultiplexing process has provided services between layers and has proven quite successful. It is elegant, clear, and solves the problem at hand. However, in recent years there has been an upsurge in truly high-speed LANs (10 Gb/s is now common in Ethernet). At the same time, classified services like 2G and 3G were introduced, along with customer categories like Gold and Silver, and the demand for a specific quality of service increased accordingly.
The problem with the conventional model is that it is inappropriate for routers to decide on anything other than IP header information. Information like the 2G or 3G service, or the application type (based on port number), is stored in the transport and application headers of the packet, which are not processed by routers3. Thus, if we need to process packets based on their service types, it is not possible in the conventional model. We must have an alternate solution.

Another point: whatever solution we choose must be able to work at the current speed of the network. However elegant the demultiplexing process is, it cannot be done fast, as it happens at each layer in a linear fashion and the independence of layers prohibits communication between them. We need an alternate solution with both better performance and the ability to assess any part of the packet, irrespective of the layer it belongs to, in whatever way access to the data is needed.

1 In fact, the scenario is more complex than this, considering that we may be sending to multiple servers and a given server receives from many senders, not only us.
2 There are similar fields in the communication processes used by various wireless networks, so the discussion does not change even if we are using a wireless adapter instead of an Ethernet card.
3 Refer to our discussion on how routers route packets. The processing only happens at the router's lowest three layers: physical, data link and network.
Figure 5.1 describes how port numbers are used by the transport layer for multiplexing; typical port numbers like 20 for FTP and 80 for HTTP are illustrated. Figure 5.2 shows how transport and network layer multiplexing are carried out together. Figure 5.3 illustrates the network layer multiplexing process for a simple case where two communicating parties are connected by just one communication line; it shows how that single line between the IP processes multiplexes the TCP and UDP traffic. Figure 5.4 illustrates how the receiver decides, on the basis of the port number, which transport layer process should receive the content of the packet. Figure 5.5 illustrates how Ethernet decides, based on typical TYPE field values, the receiver of the frame content.
Network classification answers both requirements. Let us see how it works.
Network Classification
Network classification has two critical problems to address: it has to classify incoming packets based on information from ANY layer, and it has to do so FAST. The solution is provided by an additional layer called the classification layer.

Network classification solves this problem using a simple trick. It maps the incoming packet onto an array of bytes, so that every element of the packet becomes addressable randomly (directly). Second, the classification layer has a clear set of rules for classification based on byte values at specific positions. For example, if the 125th byte of the packet indicates whether the packet is to be sent using the 2G, 3G or 4G scheme, the classification layer just looks at that specific byte and decides. Because the packet is an array, fetching the 125th byte is done directly, without any need to fetch any other byte. The classification layer does not do any demultiplexing, nor does it open the packet layer by layer; it just looks at the specific byte and decides, or sometimes looks at multiple specific bytes at specific positions together and then decides.
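The idea can be shown in a couple of lines. This is a sketch only: the offset 125 and the byte encoding of 2G/3G/4G are the illustrative assumptions from the paragraph above, not any standard.

```python
# Classification by direct byte access: no layer-by-layer demultiplexing.
SERVICE = {1: "2G", 2: "3G", 3: "4G"}   # hypothetical encoding at byte 125

def classify(packet: bytes) -> str:
    # Random access into the packet-as-array; nothing else is read.
    return SERVICE[packet[125]]

pkt = bytearray(200)
pkt[125] = 2
print(classify(bytes(pkt)))  # '3G'
```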
The next question you might ask is: who decides these byte positions? How do the designers know? The network administrators decide, based on many things. For example, if they want to know the sender's IP address, the information they have about the structure of the packet makes it possible to calculate its offset from the beginning of the packet. Once they know the offset, they know which byte to start from when reading the four-byte address. This process is not always easy; some fields require the length of the IP header, which is itself stored in another field. In that case the rule might need to read the header length from that field, compute the proper offset, and only then access the field starting from that particular byte.
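As a sketch of this offset arithmetic, assume an untagged Ethernet frame (14-byte header) carrying IPv4. The source address sits at a fixed offset inside the IP header, but anything after the variable-length options field, such as the transport port numbers, requires reading the header-length (IHL) field first, exactly as described above.

```python
ETH_HDR = 14  # destination MAC (6) + source MAC (6) + TYPE (2)

def field_offsets(packet: bytes):
    """Return byte offsets, from the packet start, of the IPv4 source
    address and the first transport-layer port field."""
    ihl = packet[ETH_HDR] & 0x0F      # low nibble of first IP byte, in 32-bit words
    ip_len = ihl * 4                  # IP header length in bytes
    src_ip_off   = ETH_HDR + 12      # fixed: source address precedes the options
    src_port_off = ETH_HDR + ip_len  # variable: first field after the IP header
    return src_ip_off, src_port_off

pkt = bytearray(64)
pkt[ETH_HDR] = 0x45                   # version 4, IHL 5 -> 20-byte header
print(field_offsets(bytes(pkt)))      # (26, 34)
```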
Another critical point is: where can we place this classification layer? Let us go back to our discussion of the TCP/IP protocol stack. The receiver receives the frame at the physical layer. Once the data link layer processes the frame, it sends the content up, and the data enters the TCP/IP protocol stack from beneath the IP layer; in the normal case, the entry point to the TCP/IP model is the IP layer. When the classification layer resides below the IP layer, it can process the packet before any TCP/IP layer does. Thus, the classification layer sits below the IP layer.
Multi-layer switches
We mentioned layer-2 (data link layer) switches in the previous module and stated that switches for other layers, like layer-3 (network or IP layer), layer-4 (transport or TCP/UDP layer) and layer-5 (application layer), are also used. How do these switches work? They use network classification. They cannot use conventional demultiplexing, as the information they are looking for (e.g. the port number) is not available at their layer.
An interesting option for LANs is the virtual LAN. A virtual LAN, unlike a conventional LAN, can be grouped logically. Conventional LANs deploy their nodes in physical proximity, for example in a lab or a specific department, and one or more switches connect all nodes of that particular LAN. Thus for a switch, all ports, and the machines connected to them, belong to a single LAN. A virtual LAN, on the contrary, can have members cutting across locations and departments. It is possible that ports 1, 3, 4 and 5 connect to the finance department's network, ports 2 and 6 belong to sales, while ports 7 and 8 connect to nodes belonging to the accounts department.
For more about how VLANs manage this, and about IEEE 802.1Q, the standard describing the Ethernet version of VLANs, please refer to Ref [1] and Ref [2]. We will introduce VLANs in module 15; here we only look at them as an example of network classification.
The implementation of a VLAN demands that a few ports of the switch belong to one LAN while other ports belong to another. A frame may be destined for a specific LAN, and it will have to be forwarded only to the ports of that LAN. A conventional Ethernet frame does not have any field supporting such processing, so an additional field, called the VLAN tag, is provided, and it is this field that helps in classification. VLAN-aware switches can send broadcast traffic (to all members of a given network) and multicast traffic (to specific groups whose members may belong to multiple networks) only to the ports belonging to those LANs, and not to all ports. The VLAN switch is thus an example of network classification used at layer 2: classification lets the VLAN-aware switch look at each destination (physical) port in parallel, decide whether to forward the frame to that port or not, and send the frame only to the ports which are part of the network.
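A minimal sketch of VLAN-aware broadcast forwarding follows. The port-to-VLAN assignment mirrors the finance/sales/accounts example above and is purely illustrative.

```python
# A frame tagged for a VLAN is flooded only to that VLAN's member ports.
VLAN_PORTS = {
    "finance":  {1, 3, 4, 5},
    "sales":    {2, 6},
    "accounts": {7, 8},
}

def broadcast_ports(vlan_tag: str, ingress_port: int) -> set:
    """Flood to every member port of the tagged VLAN, except the port
    the frame arrived on."""
    return VLAN_PORTS[vlan_tag] - {ingress_port}

print(sorted(broadcast_ports("finance", 3)))  # [1, 4, 5]
```

A frame from port 3 tagged "finance" thus never reaches the sales or accounts ports, even though they hang off the same physical switch.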
In the case of a layer-3 switch, IP header fields like the source and destination IP addresses are used to make the decision; in the case of a layer-4 switch, the TCP or UDP source and destination port numbers are used. Kindly note the difference between a transport layer port (only a number, a software entity) and a physical layer port (a connector socket that is part of a switch, a physical entity).
When multi-layer switches perform these jobs, the process happens in hardware. The architecture contains three modules.

The first is, obviously, the classifier: it categorizes incoming packets based on rules and sends them out along specific paths. It contains the rules and very fast logic to apply them to incoming packets.

The second is the module which processes the packets according to the matching rule; for example, it might attach a specific VLAN tag to an outgoing packet. We call it the rule applier.

The third is the configuration interface to the classifier. Using this interface, the administrator adds rules, modifies them, chooses the path for each rule, and so on.
Look at figure 5.6 closely. The packet arrives at the LAN card, which processes it and gives it to the classifier. The classifier applies its rules and, according to which rule matches, sends the packet for different processing. After the processing, the packet travels further. The configuration module directly manages the classifier: it provides rules, modifies and deletes them if need be, and specifies the type of processing to perform when a match occurs.
Interestingly, not all traffic arriving at the classifier demands classification. It is quite possible that some packets come from conventional routers and demand the conventional layer-based demultiplexing process. The classifier, as it examines packets before any other TCP/IP layer does, must provide a solution for this. The problem is solved by providing an additional rule which forwards such a packet up to the IP layer; once the IP layer has it, it processes the packet in the conventional way we have already studied. Thus one of the rules to match is "does this packet need conventional processing?", and the action, in that case, is to pass it up to the IP layer.
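The three modules, and the default pass-to-IP rule, can be sketched as a toy classifier. All offsets, match values and path names here are illustrative assumptions; a real classifier performs the matching in hardware, and in parallel rather than in a loop.

```python
class Classifier:
    def __init__(self):
        self.rules = []  # (offset, value, action) in priority order

    def add_rule(self, offset, value, action):
        """The configuration interface: administrators add rules here."""
        self.rules.append((offset, value, action))

    def classify(self, packet: bytes) -> str:
        """Classifier + rule applier: return the processing path for a packet."""
        for offset, value, action in self.rules:
            if packet[offset] == value:
                return action
        # Default rule: conventional layer-based processing.
        return "pass-to-IP"

clf = Classifier()
clf.add_rule(125, 2, "3G-queue")        # hypothetical service byte
clf.add_rule(15, 0x81, "VLAN-process")  # hypothetical VLAN-related match

pkt = bytearray(200); pkt[125] = 2
print(clf.classify(bytes(pkt)))   # '3G-queue'
print(clf.classify(bytes(200)))   # 'pass-to-IP'
```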
Hardware solutions for implementation
We have answered both questions: we speed up the process by providing a direct check, instead of looking at each header sequentially, layer by layer, to find where the packet should be heading; and we allow arbitrary rules which may include information from any layer (or outside the scope of any layer, for example segregating 2G and 3G traffic based on some specific bit value in the application data).
However, the speed of current networks demands much faster processing. A simple observation: on a 10 Gb/s line, a full-size 1518-byte Ethernet frame arrives in about 1.2 microseconds, so each frame must on average be processed within roughly that time. If multiple fields are to be examined for different actions, this cannot be done sequentially in such a short window.
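The time budget above is simple arithmetic: a frame of b bytes on a line of r bits per second occupies the wire for 8b/r seconds.

```python
def frame_time_us(frame_bytes: int, line_rate_bps: float) -> float:
    """Time, in microseconds, that one frame occupies the line."""
    return frame_bytes * 8 / line_rate_bps * 1e6

# Full-size and minimum-size Ethernet frames on a 10 Gb/s line.
print(round(frame_time_us(1518, 10e9), 2))  # 1.21 microseconds
print(round(frame_time_us(64, 10e9), 3))    # 0.051 microseconds
```

The minimum-frame case shows the worst-case budget: with back-to-back small frames, the per-packet processing window shrinks to tens of nanoseconds.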
High-speed routers deploy many techniques at the hardware level to speed up the process. Most of them use some kind of parallelism, and the classification process uses these techniques to the full. We cannot describe all those methods in detail here, but one technique, based on Content Addressable Memory and Ternary Content Addressable Memory, is worth noting. To understand how it is used, let us take a detour and understand these two terms first.
Content Addressable Memory (CAM)
Conventional memory works like this: the CPU decides to fetch some content stored in some part of the memory, finds the address of that content, and sends that address to the memory unit, which obliges with the content. This normal memory access mode is quite suitable for most operations.
Unfortunately, for high-speed search operations it is not appropriate, because here the memory address is not what we are given: we have an input and we are interested in finding content in the memory that matches that input. Take, for example, the switching program running in a layer-2 switch. It has the MAC address of the incoming frame, and it has to find that MAC address in a table. Remember our discussion of switches in the previous module: when a switch receives a frame, it looks at the destination address and decides the port to which the device with that address is attached. The switch keeps a MAC address table holding each MAC address together with the port on which the device with that address is attached. A conventional search for a given MAC address has to look at every possible location in the MAC address table in a linear fashion; in the worst case, we end up searching the entire table to find the matching entry.
Content Addressable Memory (CAM) comes to the rescue in such cases. When the CPU supplies the input, the CAM can search its entire memory in a single operation to see if there is a match (or multiple matches), and it returns all addresses which contain that content. This is ideal for the search operation: the answer comes in a single operation. The mechanism addresses memory locations by their content, hence the name.
CAMs are quite common in network hardware. Consider a layer-2 switch which, on receipt of a frame, needs to forward it based on the MAC address. At current network speeds, even with only a handful of entries in the MAC table, a linear search could not keep up with the incoming flow. Binary CAMs are CAMs which search for fixed patterns of 0s and 1s; for example, 00000000 11110000 00011110 11001100 11110000 00011110 is a valid 48-bit search value for the binary CAMs usually found with Ethernet. When such an input is provided, it is searched in one go. This removes the latency of scanning the entire table, and the output port for the incoming frame is decided in a single operation.
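The contrast can be modeled in software. A real CAM compares every stored entry against the key in parallel, in one hardware cycle; the closest software analogue is a hash table, used below only to model the "single operation" behaviour. The MAC addresses and port numbers are made up.

```python
# MAC table: (address, output port) pairs, as a layer-2 switch would keep.
mac_table = [("aa:bb:cc:00:00:01", 1),
             ("aa:bb:cc:00:00:02", 2),
             ("aa:bb:cc:00:00:03", 3)]

def linear_lookup(mac):
    """Conventional RAM search: walk every entry until a match is found."""
    for entry, port in mac_table:
        if entry == mac:
            return port
    return None

# CAM model: content-keyed, single-step lookup.
cam = dict(mac_table)

print(linear_lookup("aa:bb:cc:00:00:03"))  # 3, after scanning all entries
print(cam["aa:bb:cc:00:00:03"])            # 3, in one step
```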
Binary CAMs were found to be quite useful for searching MAC addresses but routers have one more problem which prohibits their use in routing tables.
The Router’s dilemma and need for Ternary CAM
Unlike layer-2 switches, where there is a fixed number of interconnections and only the MAC addresses of directly connected devices are part of the MAC address table, routers need to store a much larger number of entries.

Let us try to understand why, going back to what we discussed in previous modules. A data link layer connects with its neighbors and exchanges frames with them; it only needs to remember its neighbors and nothing else. On the contrary, the IP or network layer can have any valid node anywhere in the world as a destination or source address. This is because a data link layer transmits only to its neighbors, and frames travel only between them, while a network layer header carries the actual sender and ultimate destination, which can be anywhere. Thus a MAC address table holds the MAC addresses of all immediate neighbors, but a routing table needs many more entries. Those entries are normally aggregated.
Kindly observe the first column of figure 5.7 closely. The problem with this situation is that we do not really have to match the address AS IS; we are only looking for a specific pattern, a sub-part of the input address, selected by a mask. Representing such an entry using a binary CAM is not possible.
Let us elaborate. The routing table in figure 5.9 indicates that anything whose first byte is 11 (the remaining 24 bits can be anything) should match the first routing table entry, and anything whose first two bytes have the values 192 and 12 should match the second entry. Binary CAMs cannot specify 'anything'. If binary CAMs are to be used, routers can only deploy the first version of the routing table, depicted in figure 5.8.
Another innovation, the ternary CAM, comes to the rescue. In a ternary CAM we can specify X as a wildcard which can assume the value 0 or 1. That means that when we enter 10XX11 in the routing table, it matches 100011, 101011, 100111 and 101111, i.e. all values of XX. Thus, in the above case, we can have the entries 11.XX…XX (the Xs amounting to 24 bits) and 192.12.XX…XX (the Xs amounting to 16 bits)4. This value X, meaning either zero or one, is sometimes called the 'don't care' bit. As each bit can now be represented as 0, 1 or X, the memory is called ternary CAM. Now we can have a small routing table with aggregated entries, and we can search all entries together, using the ternary CAM, for every incoming packet.
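A ternary CAM entry can be modeled in software as a (value, mask) pair: positions where the mask bit is 1 must match, and positions where it is 0 are the X, don't-care, positions. The 6-bit example below is the 10XX11 pattern from the text.

```python
def tcam_match(key: int, value: int, mask: int) -> bool:
    """True if `key` matches the ternary entry (value, mask):
    only the bits selected by the mask are compared."""
    return (key & mask) == (value & mask)

# 10XX11: the two X positions get mask bit 0.
value, mask = 0b100011, 0b110011

matches = [k for k in range(64) if tcam_match(k, value, mask)]
print([bin(k) for k in matches])  # 0b100011, 0b100111, 0b101011, 0b101111
```

A hardware TCAM evaluates this comparison for every stored entry simultaneously; the loop here only enumerates which keys the single entry would accept.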
The same ternary CAM helps us implement the classifier. Classifiers, unlike routers, may have to match arbitrary bit patterns for arbitrary actions. It is quite possible that a specific bit pattern is to be matched in the incoming packet. Some examples are in order.
1. the sender's IP address begins with 12.0
2. either the 150th bit or the 201st bit is 1
3. the IP address begins with 125 and the port number is between 200 and 500
4 These decimal values are actually represented by their binary equivalents in the implementation. The dotted decimal notation is used for the human reader's convenience.
The classifier has to pick up a specific set of bits from the input packet and decide which pattern it matches. You can see how a ternary CAM is useful here.
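As a sketch, here is a software version of a compound rule like example 3 above: match when the source IP begins with 125 and the destination port lies in 200..500. The offsets assume an untagged Ethernet frame with a 20-byte IPv4 header and TCP on top; a hardware classifier would express the port range as several ternary (value, mask) entries rather than an arithmetic comparison.

```python
def rule3(packet: bytes) -> bool:
    """Source IP begins with 125 AND destination port in 200..500."""
    src_ip_first = packet[26]                        # first byte of source IP
    dst_port = int.from_bytes(packet[36:38], "big")  # TCP destination port
    return src_ip_first == 125 and 200 <= dst_port <= 500

pkt = bytearray(64)
pkt[26] = 125
pkt[36:38] = (443).to_bytes(2, "big")
print(rule3(bytes(pkt)))  # True
```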
A classifier may decide to do something irrespective of the rules designed for conventional forwarding. For example, it might identify all packets coming from 192.28.* and send all of them to outgoing line number 5, without bothering about anything else, especially the destination address. Remember that normal routing takes place on the basis of the destination address, so we are asking for something quite untraditional. Such cases arise when a special customer's data (identified by source addresses drawn from that customer's network address) is to be routed over a different route, to give a specific quality of service (for example, to provide 3G to that customer).
One can now understand why the classifier layer operates below the IP layer: if the classifier takes the decision and routes an incoming packet itself, the IP layer may never need to look at that packet. On the contrary, if the classifier decides that the packet needs conventional forwarding, it passes the packet up and the IP layer takes over.
The process of classification
Using ternary CAM, the process of classification can be made much faster. Let us see how it works.
- The incoming frame arriving at the network card passes through the first two layers; the packet is extracted and delivered to the classification layer.
- The classification layer checks ALL of its rules SIMULTANEOUSLY, using the specified values of the packet content and the ternary CAM-based search circuitry.
- Based on the classifier's output, a specific path is taken and the packet is processed accordingly.
The method above also enables something called cross-layer optimization. Let us take an example to understand this term. The conventional layering system disallows a transport layer from having information about other layers. Yet the best size for a transport layer segment is one that fits perfectly in an IP packet which in turn fits perfectly in the biggest possible frame of the given network. For example, the maximum frame size for normal Ethernet is 1518 bytes, of which 18 bytes are header and trailer, so it can carry an IP packet of 1500 bytes. If the IP header is of its normal 20-byte size, the best transport layer segment size is 1480 bytes including its header. If TCP is aware of this information, it can download or upload a file in chunks of 1460 bytes (a 20-byte TCP header makes it a 1480-byte TCP segment).
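The size arithmetic from the paragraph above, written out:

```python
ETH_MAX, ETH_OVERHEAD = 1518, 18  # Ethernet maximum frame, header + trailer
IP_HDR, TCP_HDR = 20, 20          # typical headers, no options

mtu = ETH_MAX - ETH_OVERHEAD      # largest IP packet: 1500 bytes
seg = mtu - IP_HDR                # largest TCP segment: 1480 bytes
mss = seg - TCP_HDR               # application data per segment: 1460 bytes
print(mtu, seg, mss)              # 1500 1480 1460
```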
If layers are not allowed to communicate, such an optimization is impossible.
Another point in favor of classification is that it gives a service provider room for quality of service. For example, a 2G and a 3G customer belonging to the same network can be offered different services based on some indicator of the service they paid for. One option is to keep a table of customers' IP addresses and identify each customer by the source address extracted from the packet; once that identification is in place, the corresponding service can be provided. More complex cases also look at the monthly charges paid, or at whether the customer has exceeded the allowed data volume, which demands further processing. Classification enables ISPs to deploy all these checks simultaneously, at very high speed.
Classification is a great help, but it does not come without cost. The CAM itself adds a lot of overhead. As it is designed to search its entire memory in a single shot, it must have its own content-based search circuitry to find a matching entry. This extra circuitry increases the size of the CAM, and thus its manufacturing cost, compared with conventional memory. Power dissipation also increases, as every memory cell is probed for a match on every search.
Thus, for every router, it is critical to decide how many entries the classification buffer will hold, and a cap is usually introduced based on the expected overhead.
We will discuss classification once again when we look at Software Defined Networking in modules 31, 32 and 33. To learn more about how network classification helps with MPLS (Multi-Protocol Label Switching, a method to provide QoS, popular among high-end ISPs), please refer to Ref [3].
Summary
We have looked at network classification, a very fast method to search for multiple matches in parallel. High-speed networks and QoS demands require us to forgo conventional demultiplexing techniques and provide a better solution; network classification answers this puzzle. It operates on the packet as an array of bytes and uses hardware solutions like ternary CAM to check, for all possible options together, whether a given pattern exists in the packet. This improves speed and also enables cross-layer optimization, with searches based on byte values of the packet irrespective of the layer that value belongs to. Most, if not all, implementations forgo strict layer independence and opt for optimizations like this.
References
[1] Computer Networks, Bhushan Trivedi, Oxford University Press
[2] Data Communication and Networking, Bhushan Trivedi, Oxford University Press
[3] Internetworking with TCP/IP, Douglas Comer, Pearson