Monday, December 22, 2014

Hash table Java implementation

I remember one company has an interview question about implementing a hash table. I couldn't find a good one online (except the Java src, obviously), so I decide to write my own.

However, if you are really interested in how Java implements the Hashtable, go to this website. My friend made this, and I only realized today that it is the best website if you are interested in any source code.

The Entry class
I refer to the Java Hashtable Entry class. There are two key objects in the class: the key, and the value (well, it's map...). It is a private inner class because I don't need to access the Entry class outside the Hashtable class. It is basically a Java Bean which contains two constructors, an object representing the key and an object representing the value of the data.

The hash method
Java uses hashCode() method. I referred to this blog, which uses Java's toString() method.

Separating chaining
I use LinkedList implementation. I still keep the load factor and resize() method presented in Java, just for learning purpose. However, I would strongly recommend you only preserve one. The LinkedList implementation is definitely more straightforward, since it doesn't require the copy of the table. However, use a constant size table means the search of the element will take longer time (O(n) if your hash function is not a good one).

The code
import java.util.LinkedList;


public class HashTable {
 private int table_size;
 private Object[] table;
 //coefficient of map usage
 //Actually, I use linkedlist for separate chaining, 
 //thus I don't really have to resize the table,  
 //but for the learning purpose, let's keep it here
 private float loadFactor = 0.75f;
 private int numElements; 
 
 
 public HashTable() {
  this.table_size = 11;
  table = new Object[this.table_size];
 }
 public HashTable(int table_size) {
  if (table_size <= 0)
   throw new IllegalArgumentException("Illegal Capacity" + table_size);
  this.table_size = table_size;
  table = new Object[this.table_size];
 }
 public int size() {
  return this.numElements;
 }
 public boolean isEmpty() {
  return this.numElements == 0;
 }
 
 @SuppressWarnings("unchecked")
 public boolean containsKey(Object key) {
  boolean result = false;
  int hash = this.hash(key);
  if (this.table[hash] != null) {
   Entry node = new Entry();
   node.setKey(key);
   if (((LinkedList)this.table[hash]).indexOf(node) > -1) {
    result = true;
   }
  }
  return result;
 }
 //since we are doing resize, let's keep an eye on the maximum size too
 private static final int MAX_table_size = Integer.MAX_VALUE - 8;
 private void resize()
 {
  int oldSize = table_size;
  if (oldSize == MAX_table_size)
    return;
  // use the java method
  int newSize = (oldSize << 1) + 1;
  if (newSize >= MAX_table_size) {
   newSize = MAX_table_size;
  }
    
  Object[] newTable = new Object[newSize];
  for (int i=0; i)this.table[position]).add(node);
   this.numElements++;
  }
  
  else
  {
   int index = ((LinkedList)this.table[position]).indexOf(node);
   //append the element to the linkedlist
   if (index == -1) {
    ((LinkedList)this.table[position]).add(node);
    this.numElements++;
   }
   //find the node in the linkedList, 
   //set the value
   else {
    ((LinkedList)this.table[position]).get(index).setValue(node.value);
   }
  }
 }
 
 //Get values
 @SuppressWarnings("unchecked")
 public Object get(Object key)
 {
  int hasVal = this.hash(key);
  if (this.table[hasVal] == null)
   throw new Error("no such key!");
  
  Entry node = new Entry();
  node.setKey(key);
  int index = ((LinkedList)this.table[hasVal]).indexOf(node);
  return ((LinkedList)this.table[hasVal]).get(index).getValue();
  
 }
 
 //remove  pairs
 @SuppressWarnings("unchecked")
 public void remove (Object key) {
  int hashVal = this.hash(key);
  if (this.table[hashVal] != null) {
   Entry node = new Entry();
   node.setKey(key);
   if (((LinkedList)this.table[hashVal]).indexOf(node) > -1)  {
          ((LinkedList)this.table[hashVal]).remove(node);
          this.numElements--;
   }
  }
 }
 public void clear() {
  for (int index = this.table.length; --index >= 0;) {
   this.table[index] = null;
  }
  numElements = 0;
 }
 
 @SuppressWarnings("unchecked")
 public String toString() {
  StringBuffer sb = new StringBuffer();
  sb.append(System.getProperty("line.separator"));
  sb.append("{");
  sb.append(System.getProperty("line.separator"));
  for (int i = 0; i < this.table.length; i++) {
   if (this.table[i] != null) {
    //"\t" tab
    sb.append("  " + (LinkedList) this.table[i]);
    sb.append(System.getProperty("line.separator"));
   }
  }
  sb.append("}");
  return sb.toString();
 }

 private int hash(Object key) {
  //Start with a base, just so that it's not 0 for empty strings
     int result = 27;
     
     String inputString = key.toString().toLowerCase();
     
     char[] characters = inputString.toCharArray();
     for (int i = 0; i < characters.length; i++)
     {
      char currentChar = characters[i];
      if (currentChar == 'a' || currentChar == 'b' || currentChar == 'c' ||
          currentChar == 'e' || currentChar == 'e' || currentChar == 'f')
       result += Integer.parseInt("" + currentChar,16);//convert to hexidecimal
      int j = (int)currentChar;
      result += j;
     }
     result ^= (result >>> 20) ^ (result >>> 12);
     return result % this.table_size;  
 }
 //I maintained the Entry name, 
 //but since it's a simplified implementation
 //I am not going to implement Map.Entry
 //http://www.codatlas.com/project/L_fXVCOhW4_lzXEd3R5DNQ__/src/share/classes/java/util/Hashtable.java?keyword=hashtable&line=951
 private static class Entry
 {
  private Object key;
  private Object value;
  protected Entry() {
   this.key = null;
   this.value = null;
  }
  protected Entry(Object key, Object value) {
   this.key = key;
   this.value = value;
  }
   
  public Object getKey() {
   return key;
  }
  public Object getValue() {
   return value;
  }
  public void setValue(Object value) {
   if (value == null)
    throw new NullPointerException();
   this.value = value;
  }
  public void setKey(Object key) {
   if (key == null)
    throw new NullPointerException();
   this.key = key;
  }
  public boolean equals(Object obj) {
   //The instanceof keyword can be used to test 
   //if an object is of a specified type.
   if (!(obj instanceof Entry))
    return false;
   Entry node = (Entry)obj;
   //only based on the key alone assuming there can't be 
   //two entries with the same key in the table
   return (this.key == null ? node.getKey() == null : key.equals(node.getKey())); 
  }
  
  public String toString() {
   return this.key.toString() + " has value " + this.value.toString();
  }
 }
 
}

*********************************************************************************

Some notes when I try to understand the Java Hashtable src:
Serialization: the process of making the object's state is persistent. That means the state of the object is converted into stream of bytes and stored in a file.
De-serilization: bring the object's state from bytes.

Transient: the variable should not be serialized. It the variable is declared as transient, then it will not be persisted.
http://www.javabeat.net/what-is-transient-keyword-in-java/

Native: It marks a method, that it will be implemented in other languages, not in Java. It works together with JNI (Java Native Interface).
Native methods were used in the past to write performance critical sections but with Java getting faster this is now less common. Native methods are currently needed when
  • You need to call a library from Java that is written in other language.
  • You need to access system or hardware resources that are only reachable from the other language (typically C). Actually, many system functions that interact with real computer (disk and network IO, for instance) can only do this because they call native code.

No comments:

Post a Comment