Hash

The goal

To illustrate linked structures using the hash table and to show how to use classes of the Java Collections Framework in the implementation of hash tables.

Etymology

Our use of hash is more related to the dish: the hash chops up information.

A little explanation

Also, try the YouTube video.

Ways of using the array

A driving reader

package edu.unca.cs.csci202;

import java.util.Scanner ;

public class HashIt {

    public static void main(String[] args) {
        Scanner stdin = new Scanner(System.in) ;
        while (stdin.hasNext()) {
            String line = stdin.nextLine() ;
            try {
                Scanner lineIn = new Scanner(line) ;
                String firstWord = lineIn.next() ;
                if (firstWord.equals("?")) {
                    String key = lineIn.next() ;
                    System.out.println("Looking for " + key) ;
                    System.out.println("Hash of "+key+" is "+key.hashCode()) ;
                } else {
                    String key = firstWord ;
                    long value = lineIn.nextLong() ;
                    System.out.println("Associating "+key+" with "+value) ;
                    System.out.println("Hash of "+key+" is "+key.hashCode()) ;
                }
            } catch(Exception e) {
                System.err.println("Something bad with " + line) ;
                return ;
            }
        }
    }
}

Closed hashing

Creating the arrays

Choose an array size and a hash function.

    static private final int HASHSIZE = 2013 ;

    private static int getSlot(String key) {
        int hashIndex = (key.hashCode()&0x7FFFFFFF)%HASHSIZE ;
        return hashIndex ;
    }
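The mask in getSlot is there because hashCode can return a negative int, and Java's % operator keeps the sign of its left operand. The following is a small sketch of the difference. The class name SlotDemo and the method unmaskedSlot are made up for this illustration, and the package line is omitted so the file stands alone.

```java
class SlotDemo {
    static final int HASHSIZE = 2013;

    // Same computation as getSlot: clear the sign bit, then reduce.
    static int getSlot(String key) {
        return (key.hashCode() & 0x7FFFFFFF) % HASHSIZE;
    }

    // Hypothetical variant without the mask, for comparison only.
    // A negative hash code gives a remainder that is zero or negative.
    static int unmaskedSlot(String key) {
        return key.hashCode() % HASHSIZE;
    }

    public static void main(String[] args) {
        // "key100" happens to have a negative hash code.
        for (String key : new String[] {"a", "key100"}) {
            System.out.println(key + ": hashCode=" + key.hashCode()
                + " masked=" + getSlot(key)
                + " unmasked=" + unmaskedSlot(key));
        }
    }
}
```

A negative slot would throw ArrayIndexOutOfBoundsException as soon as it is used as an index, which is why the masked form is the one to use.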

Define and initialize two arrays, keys and values.

Inside main, add some code to store and retrieve the values. This will be about two lines each. Don’t worry about storing over an existing key.
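Here is one possible shape for this step, offered as a sketch and not as the expected answer. The class name ClosedHashSketch and the helper methods store and retrieve are assumptions made for the illustration. In your program the same few lines would sit directly in the two branches of main.

```java
class ClosedHashSketch {
    static final int HASHSIZE = 2013;
    static final String[] keys = new String[HASHSIZE];   // null marks an empty slot
    static final long[] values = new long[HASHSIZE];

    static int getSlot(String key) {
        return (key.hashCode() & 0x7FFFFFFF) % HASHSIZE;
    }

    // Hypothetical helper: store without any collision handling.
    static void store(String key, long value) {
        int slot = getSlot(key);
        keys[slot] = key;        // overwrites whatever key was there
        values[slot] = value;
    }

    // Hypothetical helper: returns null when the slot does not hold this key.
    static Long retrieve(String key) {
        int slot = getSlot(key);
        if (key.equals(keys[slot])) {
            return values[slot];
        }
        return null;
    }

    public static void main(String[] args) {
        store("apple", 3L);
        System.out.println("apple -> " + retrieve("apple"));
        System.out.println("pear -> " + retrieve("pear"));
    }
}
```

Because nothing handles collisions yet, two keys that land in the same slot will silently replace each other.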

Handling collisions

If a slot is full, just go to the next free one. Be sure to wrap around at the end of the array.

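    private static int getSlot(String key) {
        int hashIndex = (key.hashCode()&0x7FFFFFFF)%HASHSIZE ;
        // Stop at the first slot that is empty or already holds key.
        // Note that this loop never ends if the array is full and key is absent.
        while (keys[hashIndex] != null && !key.equals(keys[hashIndex])) {
            if (++hashIndex == HASHSIZE) {
                hashIndex = 0 ;    // wrap around to the start of the array
            }
        }
        return hashIndex ;
    }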

Modify your program to use linear probing. Make sure you can add colliding keys.

To test your code, set HASHSIZE to a small number, say 4.
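A minimal sketch of such a test, assuming HASHSIZE is 4. The single-character keys "a", "e", "i" and "m" have hash codes 97, 101, 105 and 109, which all leave remainder 1 when divided by 4, so every one of them collides at slot 1. The class name ProbeSketch and the helpers store and retrieve are made up for this illustration.

```java
class ProbeSketch {
    static final int HASHSIZE = 4;
    static final String[] keys = new String[HASHSIZE];   // null marks an empty slot
    static final long[] values = new long[HASHSIZE];

    // Linear probing: walk forward from the home slot, wrapping at the end.
    static int getSlot(String key) {
        int hashIndex = (key.hashCode() & 0x7FFFFFFF) % HASHSIZE;
        while (keys[hashIndex] != null && !key.equals(keys[hashIndex])) {
            if (++hashIndex == HASHSIZE) {
                hashIndex = 0;
            }
        }
        return hashIndex;
    }

    static void store(String key, long value) {
        int slot = getSlot(key);
        keys[slot] = key;
        values[slot] = value;
    }

    static Long retrieve(String key) {
        int slot = getSlot(key);
        if (keys[slot] == null) {
            return null;             // probing ended on an empty slot
        }
        return values[slot];
    }

    public static void main(String[] args) {
        store("a", 1L);   // home slot 1 is free
        store("e", 2L);   // slot 1 is taken, moves to slot 2
        store("i", 3L);   // slots 1 and 2 are taken, moves to slot 3
        for (int i = 0; i < HASHSIZE; i++) {
            System.out.println(i + ": " + keys[i] + " = " + values[i]);
        }
    }
}
```

Storing "m" next would wrap around to slot 0 and fill the array. After that, looking up a key that is not stored would loop forever, so keep at least one slot empty while testing.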

Plus and minus

Closed hashing works well if there are plenty of open slots. If there are twice as many slots as needed, the average number of probes will be about two.

It is also possible to create a larger array when the table gets crowded and to repopulate it with the present keys and values.
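Growing the table could look something like the sketch below. It assumes a policy of doubling whenever the table would become more than half full. The policy, the class name GrowSketch, the count field and the methods grow, store and retrieve are all choices made for this illustration. Every key has to be reinserted because its slot depends on the array length.

```java
class GrowSketch {
    static String[] keys = new String[4];    // null marks an empty slot
    static long[] values = new long[4];
    static int count = 0;                    // number of keys stored

    // Linear probing, sized by the current array length.
    static int getSlot(String key) {
        int hashIndex = (key.hashCode() & 0x7FFFFFFF) % keys.length;
        while (keys[hashIndex] != null && !key.equals(keys[hashIndex])) {
            if (++hashIndex == keys.length) {
                hashIndex = 0;
            }
        }
        return hashIndex;
    }

    // Double the arrays and reinsert every stored pair.
    static void grow() {
        String[] oldKeys = keys;
        long[] oldValues = values;
        keys = new String[2 * oldKeys.length];
        values = new long[2 * oldValues.length];
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != null) {
                int slot = getSlot(oldKeys[i]);   // slot in the new array
                keys[slot] = oldKeys[i];
                values[slot] = oldValues[i];
            }
        }
    }

    static void store(String key, long value) {
        if (2 * (count + 1) > keys.length) {
            grow();                               // keep the table at most half full
        }
        int slot = getSlot(key);
        if (keys[slot] == null) {
            keys[slot] = key;
            count++;
        }
        values[slot] = value;
    }

    static Long retrieve(String key) {
        int slot = getSlot(key);
        return keys[slot] == null ? null : Long.valueOf(values[slot]);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            store("k" + i, i);
        }
        System.out.println(count + " keys in " + keys.length + " slots");
    }
}
```

Since the table never gets more than half full, the probing loop always reaches an empty slot and cannot run forever.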

Open hashing with chaining

You’ll need a data structure to represent the nodes.

package edu.unca.cs.csci202 ;

class StringLongNode {
   
    public String key ;
    public long value ;
    public StringLongNode next ;

    public StringLongNode() {
        this(null, 0L, null) ;
    }

    public StringLongNode(String key, long value) {
        this(key, value, null) ;
    }
    
    public StringLongNode(String key, long value, StringLongNode next) {
        this.key = key ;
        this.value = value ;
        this.next = next ;
    }
}

Unordered list

For an unordered list, you need a method with the following header.

    private static StringLongNode lookupKeyInChain(String key, int slot) ;

You will call lookupKeyInChain with a statement similar to the following.

    StringLongNode matchNode = lookupKeyInChain(key,slotNum);

Create lookupKeyInChain. It should return null if key is not found in the chain and return the appropriate StringLongNode if it is.

Modify main to use lookupKeyInChain.
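One way this could look is sketched below. It is an illustration under a few assumptions: the array is called chains, an empty chain is a null entry, new nodes are pushed on the front of the chain, and a helper named store stands in for the code in main. A trimmed copy of StringLongNode is included so the file stands alone, and the methods are left package-private so they are easy to call from a test.

```java
class StringLongNode {
    String key;
    long value;
    StringLongNode next;

    StringLongNode(String key, long value, StringLongNode next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }
}

class UnorderedChainSketch {
    static final int HASHSIZE = 4;
    // One chain per slot; null means the chain is empty.
    static final StringLongNode[] chains = new StringLongNode[HASHSIZE];

    static int getSlot(String key) {
        return (key.hashCode() & 0x7FFFFFFF) % HASHSIZE;
    }

    // Walk the chain for this slot and return the matching node, or null.
    static StringLongNode lookupKeyInChain(String key, int slot) {
        for (StringLongNode node = chains[slot]; node != null; node = node.next) {
            if (key.equals(node.key)) {
                return node;
            }
        }
        return null;
    }

    // Hypothetical helper: update the value in place or add a new node.
    static void store(String key, long value) {
        int slotNum = getSlot(key);
        StringLongNode matchNode = lookupKeyInChain(key, slotNum);
        if (matchNode != null) {
            matchNode.value = value;
        } else {
            chains[slotNum] = new StringLongNode(key, value, chains[slotNum]);
        }
    }

    public static void main(String[] args) {
        store("a", 1L);
        store("e", 2L);    // "a" and "e" both hash to slot 1
        StringLongNode matchNode = lookupKeyInChain("a", getSlot("a"));
        System.out.println("a -> " + (matchNode == null ? "not found" : matchNode.value));
    }
}
```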

Ordered lists and dummy nodes

Sometimes programmers keep the elements of the linked list sorted and use dummy nodes to simplify the programming of the lists. Sorting the lists gives a rather small performance improvement, especially when compared to just doubling the size of the arrays; however, it is a good exercise in improving programming skills.

Also, it makes a little more sense in C++ than it does in Java.

Dummy nodes

We are going to use StringLongNode objects with a null key to be dummy nodes. The value field of a dummy node is never used, so you can set it to any value you desire.

Be sure your program has allocated an array, chains, of StringLongNode objects and has “filled” each element with a dummy node.

A modified lookup method

We want to add a new method called lookupKey with the following method header.

    private static StringLongNode lookupKey(String key) ;

The first action of lookupKey will be to use the hash function to find the chain. From now on, assume that the keys in the chain are sorted. Otherwise, this description will make little sense.

If the search key is in the chain, continue to return the StringLongNode containing the search key. If the chain contains only a dummy node or if the search key is less than every key in the chain, return the dummy node. Otherwise, return the last StringLongNode with a key less than the search key.

Use the compareTo method of String to compare keys.

Note that lookupKey should never return null. However, it may return the dummy node.

Write the lookupKey method. One of the odd things about this method is that it will often be looking one ahead of the present node.

If you don’t have something like presNode.next.key in your program, it might not be right.

Look at this code very carefully. It should be no more than six lines long.
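For comparison with your own attempt, here is one possible sketch of lookupKey. It assumes the array is named chains, that every chain starts with a dummy node whose key is null, and that the keys after the dummy are in ascending order. The class name SortedChainSketch is made up, a trimmed copy of StringLongNode is included so the file stands alone, and the method is left package-private for easy testing.

```java
class StringLongNode {
    String key;
    long value;
    StringLongNode next;

    StringLongNode(String key, long value, StringLongNode next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }
}

class SortedChainSketch {
    static final int HASHSIZE = 4;
    static final StringLongNode[] chains = new StringLongNode[HASHSIZE];

    static {
        // Fill every slot with a dummy node (null key) so no chain is ever null.
        for (int i = 0; i < HASHSIZE; i++) {
            chains[i] = new StringLongNode(null, 0L, null);
        }
    }

    static int getSlot(String key) {
        return (key.hashCode() & 0x7FFFFFFF) % HASHSIZE;
    }

    // Returns the node holding key if present; otherwise the last node whose
    // key is less than key, or the dummy node if there is no such node.
    static StringLongNode lookupKey(String key) {
        StringLongNode presNode = chains[getSlot(key)];
        while (presNode.next != null && presNode.next.key.compareTo(key) <= 0) {
            presNode = presNode.next;    // the next key is not past the search key
        }
        return presNode;
    }

    public static void main(String[] args) {
        // Hand-built sorted chain in slot 1: dummy -> "a" -> "i" -> "m".
        chains[1].next = new StringLongNode("a", 1L,
            new StringLongNode("i", 2L, new StringLongNode("m", 3L, null)));
        System.out.println("lookup i gives " + lookupKey("i").key);
        System.out.println("lookup e gives " + lookupKey("e").key);
    }
}
```

The dummy node's null key is never dereferenced, because the loop only ever inspects presNode.next.key after checking that presNode.next is not null.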

Modify the key lookup code, the handling of the '?', so that it calls lookupKey.

Test your program by looking up keys, but do not try to enter a new key-value pair. Your program should always say the key is not found.

Watch out for null pointer exceptions.

If the key lookup case is working, it’s time to try out the key update case.

Modify the key update case. Just think about the two cases: Either the key is or is not stored in the node returned by lookupKey. It’s harder if it is not.

Of course, test out your code.
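The two cases might be handled along the lines of the sketch below. It is only an illustration. The helpers store and retrieve and the class name SortedStoreSketch are made up, the array is assumed to be named chains with a dummy node (null key) at the head of each chain, and the lookupKey shown is one possible version that follows the description above.

```java
class StringLongNode {
    String key;
    long value;
    StringLongNode next;

    StringLongNode(String key, long value, StringLongNode next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }
}

class SortedStoreSketch {
    static final int HASHSIZE = 4;
    static final StringLongNode[] chains = new StringLongNode[HASHSIZE];

    static {
        for (int i = 0; i < HASHSIZE; i++) {
            chains[i] = new StringLongNode(null, 0L, null);   // dummy node
        }
    }

    static int getSlot(String key) {
        return (key.hashCode() & 0x7FFFFFFF) % HASHSIZE;
    }

    // The node holding key, or else the node a new key should follow.
    static StringLongNode lookupKey(String key) {
        StringLongNode presNode = chains[getSlot(key)];
        while (presNode.next != null && presNode.next.key.compareTo(key) <= 0) {
            presNode = presNode.next;
        }
        return presNode;
    }

    // Hypothetical helper for the key update case.
    static void store(String key, long value) {
        StringLongNode node = lookupKey(key);
        if (key.equals(node.key)) {
            node.value = value;      // easy case: the key is already stored
        } else {
            // Harder case: splice a new node in right after its predecessor.
            node.next = new StringLongNode(key, value, node.next);
        }
    }

    // Hypothetical helper for the key lookup case; null means not found.
    static Long retrieve(String key) {
        StringLongNode node = lookupKey(key);
        return key.equals(node.key) ? Long.valueOf(node.value) : null;
    }

    public static void main(String[] args) {
        store("m", 3L);
        store("a", 1L);
        store("i", 2L);              // all three keys hash to slot 1
        for (StringLongNode n = chains[1].next; n != null; n = n.next) {
            System.out.println(n.key + " = " + n.value);
        }
    }
}
```

Because key.equals(node.key) is false when node.key is null, the dummy node falls into the insert case without any special test.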

The standard stack operations