CodingBison

For hash tables, the first key requirement is to have a good hashing function. This is important because it is likely that many keys can be mapped to the same index in the hash table, something often referred to as a collision. To reduce collision, it is recommended to use a prime number that is closer to the size of the hash table. In the above example, if we had to hash numbers from 0 to 100, then using 97 would provide a uniform generation of indices and hence should lead to less collisions.

There are two common methods used in hashing functions: (a) Division method and (b) Multiplication method.

Division Method

In division method, we take the input number (or the number generated from the input string) and divided it by a constant (that can often be a prime number close to the size of the hash table). Thus, if we have the input number of Num and if the constant is C, then we get the hash index by taking the modulo of the two. The modulo ensures that the hash index always belongs to the range [0, C-1].

 hash_index = Num modulo C
 or 
 hash_index = Num % C 

The hashing functions, which we used earlier in this section, were based on division method for finding the hash index: "ret_hash % size". Here, "size" passed was the size of the table.

Multiplication Method

In multiplication method, we take a number and multiply it by a fraction, let us say, Fr -- the resulting number would be another fraction. Typically, the fraction can be a recurring fraction or a long fraction. A recommended value is often (sqrt(5) -1)/2, which is the fraction part of a ratio known as a golden ratio (the ratio is (sqrt(5) + 1)/2). Next, we take a modulo with 1 for the product of this fraction and the input number -- this gives us another fractional number. Next, we multiply this new fraction with a constant C, which is (again) the size of the hash table. Multiplying C with a fraction ensures that the hash index belongs to the range [0,C-1]. Lastly, we take the floor of the product to get rid of the fractional part.

Here is a new hashing function that is based on multiplication method (must include "math.h" library for the floor() function):

 int hash (char *input_string, int size) {

     double ret_hash = 0;
     int hash_index = 0;
     char *c = input_string;
     int intFr; 

     /* This is fractional part of golden ratio */
     double Fr = (sqrt(5) - 1)/2; 
     printf("Fr: %f \t", Fr);

     while (*c != '\0') {
         ret_hash += *(c++);
     }   

     Fr = ret_hash * Fr;

     /* Get Fractional part of Fr (equivalent of mod 1) */
     intFr = (int)Fr;
     Fr = Fr - intFr;
     printf("(Num*Fr)mod 1: %f \t", Fr);

     /* Get the floor of the new number */
     hash_index = floor(size * Fr);
     printf("hash_index: %d \n", hash_index);

     return (hash_index);
 }




comments powered by Disqus