Hash Function

Introduction

Hash function is used to convert a string(or any other type) into an integer smaller than hash size and bigger or equal to zero. A good hash function can avoid collision as less as possible. A widely used hash function algorithm is using a magic number 33, consider any string as a 33 based big integer like follow:

hashcode(abcd)=(ascii(a)333+ascii(b)332+ascii(c)331+ascii(d)330)modhash_size=(97333+98332+99331+1001)modhash_size=3595978modhash_size\begin{aligned} hashcode(``abcd'') & = (ascii(a) * 33^3 + ascii(b) * 33^2 + ascii(c) * 33^1 + ascii(d) * 33^0) \bmod hash\_size \\ & = (97 * 33^3 + 98 * 33^2 + 99 * 33^1 + 100*1) \bmod hash\_size \\ & = 3595978 \bmod hash\_size \end{aligned}

where HASH_SIZE is the capacity of the hash table (you can assume a hash table is like an array with index 0 ~ HASH_SIZE-1).

Question (LC.128)

Given a string as a key and the size of hash table, return the hash value of this key.

You do not have to handle collisions.

Example

Input: key = "abcd", size = 100
Return: 78

Analysis

The algorithm is given. Now the hard thing is to implement it carefully.

Fail #1

public int hashCode(char[] key,int HASH_SIZE) {
    int hashValue = 0;
    for (int i = 0; i < key.length; i++) {
        hashValue += key[i] * Math.pow(33, key.length-1-i);
    }
    return hashValue % HASH_SIZE;
}

Asking the input size for generating value questions. "ubuntu", 1007 will overflow the hashValue.

Fail #2

public int hashCode(char[] key,int HASH_SIZE) {
    long hashValue = 0;
    for (int i = 0; i < key.length; i++) {
        hashValue += key[i] * Math.pow(33, key.length-1-i);
    }
    return (int) (hashValue % HASH_SIZE);
}

Ok. Quick and dirty fix since long can hold more digits. "abcdefghijklmnopqrstuvwxyz", 2607 overflows the hashValue again ᶘ ᵒ㉨ᵒᶅ.

Fail #3

Well then, can we distribute the %?

public int hashCode(char[] key, int HASH_SIZE) {
    long hashValue = 0;
    for (int i = 0; i < key.length; i++) {
        hashValue += (key[i] * Math.pow(33, key.length-1-i)) % HASH_SIZE;
    }
    return (int) (hashValue);
}

Heck no. Revisit modular multiplication.

Passing Code

Op formulas for modular divisions (a * b) % m = a % m * b % m and (a * b) % m = ((a % m) * (b % m)) % m. [33 [33 [33 [33*0 + a] + b] + c] +d]

public int hashCode(char[] key,int HASH_SIZE) {
    long hashValue = 0;
    for (int i = 0; i < key.length; i++) {
        hashValue = 33 * hashValue + key[i];
        hashValue %= HASH_SIZE;
    }
    return (int) (hashValue);
}

ʕノ•ᴥ•ʔノ

References

Last updated