28 Hash and MAC Algorithms

epgp books

 

 

 

 

Learning Objectives

 

  • To introduce general ideas behind cryptographic MAC function
  • To distinguish between two categories of MAC function: those with a compression function made from scratch and those with a block cipher as the compression function
  • To discuss the structure of HMAC as an example of a cryptographic MAC function with a compression function made from scratch
  • To discuss the structure of CMAC as an example of a cryptographic MAC function with a block cipher as the compression function

INTRODUCTION

 

One of the most fascinating and complex areas of cryptography are that of message authentication and the related area of digital signatures. We now consider how to protect message integrity (ie protection from modification), as well as confirming the identity of the sender. Generically this is the problem of message authentication, and in ecommerce applications is arguably more important than secrecy. Message Authentication is concerned with: protecting the integrity of a message, validating identity of originator,& non-repudiation of origin (dispute resolution). There are three types of functions that may be used to produce an authenticator: a hash function, message encryption, message authentication code (MAC). Hash functions, and how they may serve for message authentication. The remainder of this section briefly examines the remaining two topics. The remainder of the chapter elaborates on the topic of MACs. Mmessage authentication is concerned with these protecting the integrity of a message, validating identity of originator, non-repudiation of origin (dispute resolution),this will consider the security requirements here three alternative functions are used hash function ,message encryption, message authentication code (MAC)

  1. Message security Requirements

The message security requirements are disclosure, traffic analysis, masquerade, content modification, sequence modification, timing modification, source repudiation, destination repudiation.

In the context of communications across a network, the attacks listed above can be identified, with more detail given in the text. The first two requirements (Disclosure: Release of message contents; and Traffic analysis: Discovery of the pattern of traffic between parties) belong in the realm of message confidentiality, and are handled using the encryption techniques already discussed. Measures to deal with items 3 through 6 (Masquerade: Insertion of messages into the network from a fraudulent source; Content modification: of the contents of a message; Sequence modification: to a sequence of messages between parties; and Timing modification: Delay or replay of messages) are generally regarded as message authentication. Mechanisms for dealing specifically with item 7 (Source repudiation: Denial of transmission of message by source) come under the heading of digital signatures. Generally, a digital signature technique will also counter some or all of the attacks listed under items 3 through 6. Dealing with item 8 (Destination repudiation: Denial of receipt of message by destination) may require a combination of the use of digital signatures and a protocol designed to counter this attack. The message authentication is a procedure to verify that received messages come from the alleged source and have not been altered. Message authentication may also verify sequencing and timeliness. A digital signature is an authentication technique that also includes measures to counter repudiation by the source.

2. Symmetric Message Encryption

 

Message encryption by itself can provide a measure of authentication. The analysis differs for symmetric and public-key encryption schemes. If use symmetric encryption, If no other party knows the key, then confidentiality is provided. As well, symmetric encryption provides authentication as well as confidentiality, since only the other party can have encrypted a properly constructed message in Figure

 

1.  Here, the cipher text of the entire message serves as its authenticator, on the basis that only those who know the appropriate keys could have validly encrypted the message. This is provided you can recognize a valid message (i.e. if the message has suitable structure such as redundancy or a checksum to detect any changes).

3.Public-Key Message Encryption

 

The public-key techniques can use a digital signature which can only have been created by key owner to validate the integrity of the message contents. To provide both confidentiality and authentication, A can encrypt M first using its private key, which provides the digital signature, and then using B’s public key, which provides confidentiality in Figure 2. The disadvantage of this approach is that the public-key algorithm, which is complex, must be exercised four times rather than two in each communication. If public-key encryption is used the encryption provides no confidence of sender, since anyone potentially knows public-key, however if sender signs message using their private-key then encrypts with recipients public key have both secrecy and authentication, again need to recognize corrupted messages but at cost of two public-key uses on message

 

 

  1. Message Authentication Code (MAC)

An alternative authentication technique involves the use of a secret key to generate a small fixed-size block of data, known as a cryptographic checksum or MAC that is appended to the message. This technique assumes that two communicating parties, say A and B, share a common secret key K. A MAC function is similar to encryption, except that the MAC algorithm need not be reversible, as it must for decryption.

The message authentication code generated by an algorithm that creates a small fixed-sized block depending on both message and some key like encryption though need not be reversible ,it is appended to message as a signature .The receiver performs same computation on message and checks it matches the MAC provides assurance that message is unaltered and comes from sender

 

An alternative authentication technique involves the use of a secret key to generate a small fixed- size block of data, known as a cryptographic checksum or MAC that is appended to the message. This technique assumes that two communicating parties, say A and B, share a common secret key K. When A has a message to send to B, it calculates the MAC as a function of the message and the key: MAC = C(K, M). The message plus MAC are transmitted to the intended recipient. The recipient performs the same calculation on the received message, using the same secret key, to generate a new MAC. The received MAC is compared to the calculated MAC Figure 3. If we assume that only the receiver and the sender know the identity of the secret key, and if the received MAC matches the calculated MAC, then the receiver is assured that the message has not been altered, is from the alleged sender, and if the message includes a sequence number then the receiver can be assured of the proper sequence because an attacker cannot successfully alter the sequence number. A MAC function is similar to encryption. One difference is that the MAC algorithm need not be reversible, as it must for decryption. In general, the MAC function is a many-to-one function. The process depicted on the previous slide provides authentication but not confidentiality, because the message as a whole is transmitted in the clear. Confidentiality can be provided by performing message encryption either after or before the MAC algorithm. In both these cases, two separate keys are needed, each of which is shared by the sender and the receiver. Typically, it is preferable to tie the authentication directly to the plaintext, so the method is used. Can use MAC in circumstances where just authentication is needed (or needs to be kept), see text for examples (e.g. such as when the same message is broadcast to a number of destinations; when one side has a heavy load and cannot afford the time to decrypt all incoming messages; or do not need to keep messages secret, but must authenticate messages). Finally, note that the MAC does not provide a digital signature because both sender and receiver share the same key. The MAC provides authentication it can also use encryption for secrecy, it generally use separate keys for each and can compute MAC either before or after encryption.MAC is used sometimes only when authentication is needed or in need authentication to persists longer than the encryption. The MAC is not a digital signature.

5. Properties and Requirements of MAC

 

A        MAC (also known as a cryptographic checksum, fixed-length authenticator, or tag) is generated by a function C. The MAC is appended to the message at the source at a time when the message is assumed or known to be correct. The receiver authenticates that message by re-computing the MAC. The MAC function is a many-to-one function, since potentially many arbitrarily long messages can be condensed to the same summary value, but don’t want finding them to be easy. a MAC is a cryptographic checksum MAC = CK(M) condenses a variable-length message M ,using a secret key K and to a fixed-sized authenticator ,it is a many-to-one function potentially many messages have same MAC but finding these needs to be very difficult. In assessing the security of a MAC function, we need to consider the types of attacks that may be mounted against it. Hence it needs to satisfy the listed requirements. The first requirement deals with message replacement attacks, in which an opponent is able to construct a new message to match a given MAC, even though the opponent does not know and does not learn the key. The second requirement deals with the need to thwart a brute-force attack based on chosen plaintext.

The final requirement dictates that the authentication algorithm should not be weaker with respect to certain parts or bits of the message than others

6. Security in MAC

 

Just as with symmetric and public-key encryption, we can group attacks on hash functions and MACs into two categories: brute-force attacks and cryptanalysis. A brute-force attack on a MAC is a more difficult undertaking than a brute-force attack on a hash function because it requires known message-tag pairs. The strength of a hash function against brute-force attacks depends solely on the length of the hash code produced by the algorithm, with cost O(2^m/2). A brute-force attack on a MAC has cost related to min(2^k, 2^n), similar to symmetric encryption algorithms. It would appear reasonable to require that the key length and MAC length satisfy a relationship such as min(k, n) >= N, where N is perhaps in the range of 128 bits. As with encryption algorithms, cryptanalytic attacks on hash functions and MAC algorithms seek to exploit some property of the algorithm to perform some attack other than an exhaustive search. The way to measure the resistance of a hash or MAC algorithm to cryptanalysis is to compare its strength to the effort required for a brute-force attack. That is, an ideal hash or MAC algorithm will require a cryptanalytic effort greater than or equal to the brute-force effort. There is much more variety in the structure of MACs than in hash functions, so it is difficult to generalize about the cryptanalysis of MACs. Further, far less work has been done on developing such attacks.

7. Keyed Hash Functions as MACs

 

In recent years, there has been increased interest in developing a MAC derived from a cryptographic hash function, because they generally execute faster in software than symmetric block ciphers, and because code for cryptographic hash functions is widely available. A hash function such as SHA was not designed for use as a MAC and cannot be used directly for that purpose because it does not rely on a secret key. There have been a number of proposals for the incorporation of a secret key into an existing hash algorithm, originally by just pre-pending a key to the message. Problems were found with these earlier, simpler proposals, but they resulted in the development of HMAC.

 

8. HMAC Design Objectives

 

RFC 2104 lists the following design objectives for HMAC:

 

●     To use, without modifications, available hash functions. In particular, hash functions that perform well in software, and for which code is freely and widely available.

  • To allow for easy replace ability of the embedded hash function in case faster or more secure hash functions are found or required.
  • To preserve the original performance of the hash function without incurring a significant degradation.
  • To use and handle keys in a simple way.
  • To have a well understood cryptographic analysis of the strength of the authentication

Mechanism based on reasonable assumptions about the embedded hash function. The first two objectives are important to the acceptability of HMAC. HMAC treats the hash function as a”black box.” This has two benefits. First, an existing implementation of a hash function can be used as a module in implementing HMAC. In this way, the bulk of the HMAC code is prepackaged and ready to use without modification. Second, if it is ever desired to replace a given hash function in an HMAC

 

Implementation all that is required is to remove the existing hash function module and drop in the new module. This could be done if a faster hash function were desired. More important, if the security of the embedded hash function were compromised, the security of HMAC could be retained simply by replacing the embedded hash function with a more secure one (e.g., replacing SHA with Whirlpool).

 

The last design objective in the preceding list is, in fact, the main advantage of HMAC over other proposed hash-based schemes. HMAC can be proven secure provided that the embedded hash function has some reasonable cryptographic strength. We return to this point later in this section, but first we examine the structure of HMAC.

9. HMAC Algorithm

 

Overall operation of HMAC. Define the following terms:

 

H = embedded hash function (e.g., MD5, SHA-1, RIPEMD-160)

IV = initial value input to hash function

M = message input to HMAC(including the padding specified in the embedded hash function)

Yi = ith block of M, 0 i (L – 1)

L = number of blocks in M

b = number of bits in a block

n = length of hash code produced by embedded hash function

K= secret key recommended length is n; if key length is greater than b; the key is input to

the hash function to produce an n-bit key

K+ = K padded with zeros on the left so that the result is b bits in length ipad = 00110110 (36 in hexadecimal) repeated b/8 times opad = 01011100 (5C in hexadecimal) repeated b/8 times

 

Then HMAC can be expressed as follows:

 

HMAC(K,M) = H[(K+  opad)||H[(K+  ipad)||M]] In words,

 

1.     Append zeros to the left end of K to create a b-bit string K+(e.g., if K is of length 160 bits and b = 512 then K will be appended with 44 zero bytes 0 x 00).

2.     XOR (bitwise exclusive-OR) K+ with ipad to produce the b-bit block Si.

3.     Append M to Si.

4.     Apply H to the stream generated in step 3.

  1.    XOR K+ with opad to produce the b-bit block So
  1. Append the hash result from step 4 to So
  1. Apply H to the stream generated in step 6 and output the result.
  1. XOR K+ with opad to produce the b-bit block So
  1. Append the hash result from step 4 to So
  1. Apply H to the stream generated in step 6 and output the result.

Note that the XOR with ipad results in flipping one-half of the bits of K. Similarly, the XOR with opad results in flipping one-half of the bits of K, but a different set of bits. In effect, by passing Si and So through the compression function of the hash algorithm, we have pseudo randomly generated two keys from K. HMAC should execute in approximately the same time as the embedded hash function for long messages. HMAC adds three executions of the hash compression function (for Si,So and the block Produced from the inner hash). A more efficient implementation is possible.

Two quantities are precomputed:

    f(IV, (K+       ipad))

 

f(IV, (K+       opad))

 

Where f(cv, block) is the compression function for the hash function, which takes as arguments a chaining variable of n bits and a block of b bits and produces a chaining variable of n bits. These quantities only need to be computed initially and every time the key changes. In effect, the precomputed quantities substitute for the initial value (IV) in the hash function. With this implementation, only one additional instance of the compression function is added to the processing normally produced by the hash function. This more efficient implementation is especially worthwhile if most of the messages for which a MAC is computed are short.

9.1 Security of HMAC

 

The security of any MAC function based on an embedded hash function depends in some way on the cryptographic strength of the underlying hash function. The appeal of HMAC is that its designers have been able to prove an exact relationship between the strength of the embedded hash function and the strength of HMAC.

 

The security of a MAC function is generally expressed in terms of the probability of successful forgery with a given amount of time spent by the forger and a given number of message-MAC pairs created with the same key. In essence, it is proved in for a given level of effort (time, message-MAC pairs) on messages generated by a legitimate user and seen by the attacker, the probability of successful attack on HMAC is equivalent to one of the following attacks on the embedded hash function:

  1. The attacker is able to compute an output of the compression function even with an IV that is random, secret, and unknown to the attacker.
  2. The attacker finds collisions in the hash function even when the IV is random and secret.
  3. In the first attack, we can view the compression function as equivalent to the hash function applied to a message consisting of a single b-bit block. For this attack, the IV of the hash function is replaced by a secret, random value of n bits. An attack on this hash function requires either a brute-force attack on the key, which is a level of effort on the order of 2n or a birthday attack, which is a special case of the second attack, discussed next.

     

    In the second attack, the attacker is looking for two messages M and M’ that produce the same hash: H (M) = H(M‘). This is the birthday attack. We have shown that this requires a level of effort of 2n/2 for a hash length of n. On this basis, the security of MD5 is called into question, because a level of effort of 264 looks feasible with today’s technology. Does this mean that a 128-bit hash function such as MD5 is unsuitable for HMAC? The answer is no, because of the following argument. To attack MD5, the attacker can choose any set of messages and work on these off line on a dedicated computing facility to find a collision. Because the attacker knows the hash algorithm and the default IV, the attacker can generate the hash code for each of the messages that the attacker generates. However, when attacking HMAC, the attacker cannot generate message/code pairs off line because the attacker does not know K. Therefore, the attacker must observe a sequence of messages generated by HMAC under the same key and perform the attack on these known messages. For a hash code length of 128 bits, this requires 264 observed blocks (272bits) generated using the same key. On a 1-Gbps link, one would need to observe a continuous stream of messages with no change in key for about 150,000 years in order to succeed. Thus, if speed is a concern, it is fully acceptable to use MD5 rather than SHA-1 as the embedded hash function for HMAC.

    10 CMAC

     

    The Data Authentication Algorithm defined in FIPS PUB 113, also known as the CBC-MAC (cipher block chaining message authentication code). This cipher-based MAC has been widely adopted in government and industry. Demonstrated that this MAC is secure under a reasonable set of security criteria, with the following restriction. Only messages of one fixed length of mn bits are processed, where n is the cipher block size and m is a fixed positive integer. As a simple example, notice that given the CBC MAC of a one-block message X, say T = MAC(K, X), the adversary Immediately knows the CBC MAC for the two-block message X||(X T)since this is once again T.

     

    Black and Rogaway demonstrated that this limitation could be overcome using three keys: one key of length k to be used at each step of the cipher block chaining and two keys of length n, where k is the key length and n is the cipher block length. This proposed construction was refined by Iwata and Kurosawa so that the two n-bit keys could be derived from the encryption key, rather than being provided separately. This refinement has been adopted by NIST cipher-based message authentication code (CMAC) mode of operation, for use with AES and triple DES. It is specified in NIST Special Publication 800-38B.

     

    First, let us consider the operation of CMAC when the message is an integer multiple n of the cipher block length b. For AES, b = 128 and for triple DES, b = 64. The message is divided into n blocks, M1, M2,…,Mn. The algorithm makes use of a k-bit encryption key K and an n-bit constant K1. For AES, the key size k is

    128, 192, or 256 bits; for triple DES, the key size is 112 or 168 bits. CMAC is

     

    calculated as follows

     

    C1 = E(K,M1)

     

    C2 = E(K,[M2  C1])

     

    C3 = E(K,[M3      C2])

       ·

     

    ·

     

    ·

     

    Cn = E(K,[Mn  Cn1  K1])

     

    = MSBTlen(Cn) where

     

    T = message authentication code, also referred to as the tag

     

    Tlen = bit length of T

     

    MSBs(X) = the s leftmost bits of the bit string X

  4. Figure 7 Cipher-Based Message Authentication Code (CMAC)

    If the message is not an integer multiple of the cipher block length, then the final block is padded to the right (least significant bits) with a 1 and as many 0s as necessary so that the final block is also of length b. The CMAC operation then precedes as before, except that a different n-bit key K2 is used instead of K1. The two n-bit keys are derived from the k-bit encryption key as follows:

    • L = E(K, 0n)

           K1 = L · x

     

        K2 = L · x2 = (L · x) · x

where multiplication (·) is done in the finite field (2n) and x and x2 are first and second order polynomials that are elements of GF(2n) Thus the binary representation of x consists of n – 2 zeros followed by 10; the binary representation of x2 consists of n – 3 zeros followed by 100. The finite field is defined with respect to an irreducible polynomial that is lexicographically first among all such polynomials with the minimum possible number of nonzero terms. For the two approved block sizes, the polynomials are and x64 + x4 + x3 + x + 1 and x128 + x7 + x2 + x + 1. To generate K1 and K2 the block cipher is applied to the block that consists entirely of 0 bits. The first subkey is derived from the resulting cipher text by a left shift of one bit, and, conditionally, by XORing a Constant that depends on the block size. The second sub key is derived in the same manner from the first sub key. This property of finite fields of the form GF(2n).

Summary

 

General ideas behind cryptographic MAC function is explored

  • The difference between two categories of MAC function: those with a compression function made from scratch and those with a block cipher as the compression function is studied
  • The structure of HMAC as an example of a cryptographic MAC function with a compression function is explored
  • The structure of CMAC as an example of a cryptographic MAC function with a the compression function is studied
you can view video on Hash and MAC Algorithms