Friday, March 13, 2015

Kruskal Algorithm for minimum spanning tree

I am always obsessed with graphs, so I am trying to study every common algorithm that relates to graph.

First, a tree, is any graph that doesn't contains any cycles. Some properties of trees include:

  • A graph is a tree iff there is one and only one path joining any two of its vertices.
  • A connected graph is a tree iff every one of its edges is a bridge.
  • A connected graph is a tree iff it has N nodes and N - 1 edges.


Second, the minimum spanning tree, is the spanning tree, which is a subgraph that connects all vertices together, that has the minimum total weight in a weighted edge graph.

The Kruskal's Algorithm is a greedy algorithm that finds the minimum spanning tree. Each time it pulls out the edge with the minimum weight that hasn't connected the existing tree with a new node, a disjoint data set/union find set is used to avoid cycles (otherwise it will not be a tree): join the two nodes once the edge is added to the minimum spanning tree, so it cannot be joined again.

The algorithm itself is not hard to understand, it still takes me sometime to debug my disjoint data set code.


public class UnionFind {
 //store the roots of each up-tree
 List elements;
 List roots;
 List rank;
 //number of groups in the set
 int count;
 public UnionFind(){
  elements = new ArrayList();
  roots = new ArrayList();
  rank = new ArrayList();
  count = 0;
 }
 public UnionFind(int size){
  elements = new ArrayList(size);
  roots = new ArrayList(size);
  rank = new ArrayList(size);
  count = 0;
 }
 public void add(T ele){
  elements.add(ele);
  roots.add(ele);
  rank.add(1);
  count++;
 }
 private int indexOf(T ele){
  return elements.indexOf(ele);
 }
 public boolean contains(T ele){
  return elements.contains(ele);
 }
 public void union(T ele1, T ele2){
  if(!contains(ele1) || !contains(ele2)){
   System.out.println("Missing one or more elements!");
   return;
  }
  T root1 = find(ele1);
  T root2 = find(ele2);
  if(root1 == root2){
   System.out.println("Already in the same group!");
   return;
  }
  int index1 = indexOf(root1);
  int index2 = indexOf(root2);
  
  int rank1 = rank.get(index1);
  int rank2 = rank.get(index2);
  if(rank1 > rank2){
   //roots.set(index2, find(ele1));
   roots.set(index2, ele1);
   //rank.set(index1, rank1 + rank2);
   //find(ele2);
  }
  else if (rank1 < rank2){
   //roots.set(index1, find(ele2));
   roots.set(index1, ele2);
   //rank.set(index2, rank1 + rank2);
   //find(ele1);
  }
  else{
   roots.set(index2, ele1);
   rank.set(index1, rank.get(index1) + 1);
  }
  count--;
 }
 public boolean isUnion(T ele1, T ele2){
  return find(ele1) == find(ele2);
 }
 /**
  * if the root of the element is not the same as
  * root of the root of the element
  * recursively set the root of the element to the
  * root of the whole tree
  * @param ele
  * @return
  */
 public T find(T ele){
  if(!contains(ele))
   throw new NullPointerException("Element doesn't exist!");
  int index = indexOf(ele);
  int index2 = indexOf(roots.get(index));
  if(roots.get(index) != roots.get(index2)){
   roots.set(index, find(roots.get(index2)));
  }
  return roots.get(index); 
 }
}
public class Kruskal {
 private double weight;
 private Queue mst;
 public Kruskal(Graph G){
  mst = new LinkedList();
  weight = 0.0;
  PriorityQueue pq = new PriorityQueue();
  for (Edge e : G.edges)
   pq.add(e);
  UnionFind uf = new UnionFind(G.getSize());
  for(Node n : G.vertices.values())
   uf.add(n);
  while(!pq.isEmpty() && mst.size() < G.getSize() - 1){
   Edge e = pq.poll();
   List nodes = (List) e.getNodes();
   if(!uf.isUnion(nodes.get(0), nodes.get(1))){
    uf.union(nodes.get(0), nodes.get(1));
    mst.add(e);
    weight += e.getWeight();
   }
  }
  for(Node n : uf.elements)
   System.out.print(n.label + " ");
  System.out.println();
  for(Node n : uf.roots)
   System.out.print(n.label + " ");
  System.out.println();
  for(int n : uf.rank)
   System.out.print(n + " ");
  System.out.println();
  
  //test assumption
  //When the system runs the assertion, 
  //it evaluates Expression1 and if it is false throws an 
  //AssertionError with no detail message
  assert check(G);
 }
 /**
  * check optimality conditions
  * @param G
  * @return
  */
 private boolean check(Graph G){
  //check total weight
  double total = 0.0;
  for(Edge e : mst)
   total += e.getWeight();
  double epsilon = 1e-12;
  if(Math.abs(total - weight) > epsilon){
   System.err.format("Weight of edges does not equal to weight: %f vs. %f\n",total, weight);
   return false;
  }
  //if acyclic
  UnionFind uf = new UnionFind();
  for(Node v : G.vertices.values())
   uf.add(v);
  for(Edge e : mst){
   List nodes = (List) e.getNodes();
   if(uf.find(nodes.get(0)) == uf.find(nodes.get(1))){
    System.err.println("Containing a circle!");
    return false;
   }
  }
  
  for(Edge e : mst){ 
   uf = new UnionFind();
   for(Edge f : mst){
    if(f.equals(e))
     continue;
    List nodes = (List)f.getNodes();
    uf.union(nodes.get(0), nodes.get(1));
   }
   for(Edge f : G.edges){
    if(f.equals(e))
     continue;
    List nodes = (List)f.getNodes();
    if(!uf.isUnion(nodes.get(0), nodes.get(1))){
     if(f.getWeight() < e.getWeight()){
      System.err.println("Edge " + f + "violates cur optimality conditions");
      return false;
     }
    }
   }
  }
  return true;
 }
 public double weight(){
  return weight;
 }
 /** 
  * return all edges in the mst
  * @return
  */
 public Iterable edges(){
  return mst;
 }
}

The graph used for the test can be found here.






1 comment:

  1. The development of artificial intelligence (AI) has propelled more programming architects, information scientists, and different experts to investigate the plausibility of a vocation in machine learning. Notwithstanding, a few newcomers will in general spotlight a lot on hypothesis and insufficient on commonsense application. IEEE final year projects on machine learning In case you will succeed, you have to begin building machine learning projects in the near future.

    Projects assist you with improving your applied ML skills rapidly while allowing you to investigate an intriguing point. Furthermore, you can include projects into your portfolio, making it simpler to get a vocation, discover cool profession openings, and Final Year Project Centers in Chennai even arrange a more significant compensation.


    Data analytics is the study of dissecting crude data so as to make decisions about that data. Data analytics advances and procedures are generally utilized in business ventures to empower associations to settle on progressively Python Training in Chennai educated business choices. In the present worldwide commercial center, it isn't sufficient to assemble data and do the math; you should realize how to apply that data to genuine situations such that will affect conduct. In the program you will initially gain proficiency with the specialized skills, including R and Python dialects most usually utilized in data analytics programming and usage; Python Training in Chennai at that point center around the commonsense application, in view of genuine business issues in a scope of industry segments, for example, wellbeing, promoting and account.

    ReplyDelete