Saturday, January 17, 2015

Disjoint Set Union/Find ADT

Definition


  • Set U = {a1, a2, ..., an}
  • Maintain a partition of U, a set of subsets of U {S1, S2, ..., Sk} such that:
    •    each pair of subsets Si and Sj are disjoint;
    •    together, the subsets cover U;
    •    each subset has a unique name. 
  • Union(a, b) creates a new subset which is the union of a and b;
  • Find(a) returns the representative member of a set: similar to find parent of a TreeNode in a tree; 
  • Thus Disjoint Set can be represented by an up-tree;
  • Disjoint set equivalence property: every element of a DS U/F structure belongs to exactly one set;
  • Dynamic equivalence property: the set of an element can change after execution of a union.



Up-Tree Union-Find Data Structure


Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld015.htm

  • Each subset is an up-tree with its root as its representative member;
  • All members of a given set are nodes in that set's up-tree;
  • Not necessarily binary;
  • Find(a): traverse from the leaf to the root;
  • Union(a, b): union the root with the other;


Path Compression

Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld034.htm
Points everything along the path of a find to the root;
Thus reduces the height of the entire access path to 1.


Union-by-rank 

Use an extra array to store the rank of a given node. Rank[a] is the depth of the tree rooted at a.
Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld029.htm

Union(a, b): pick the shallower tree and point it at the deeper one.

Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld031.htm
This can cut down on height of the new tree;
Find(a) complexity: a tree of height h must have at least 2 ^h nodes. Find(a) takes O(max height) time. Thus Find(a) takes O(log(n)) time.


Complexity of Union-by-rank path compression

Tarjan proved that the find operation on a set of n elements in a union with m union-by-rank subsets has worst complexity O(m * alpha(m,n)).

Implementation
This implementation uses an array.
The test data set: 16 integers range from 0 to 15.
After union, each set should contain 4 integers.


public class DisjointSets {
 //array[x] stores the root of the tree (representative of the set)
 //if array[x] is less than 0, that means x is the root of the tree (representative of the set)
 private int[] array;
 
 //initial number of elements;
 //also the initial number of disjoint sets
 //since every element is initially in its own set
 public DisjointSets(int numElements) {
  array = new int [numElements];
  //since every element is the root of its tree,
  //set each element in the array to negative
  Arrays.fill(array, -1);
 }


Every element is the root of the tree it belongs
Initial Array: every element is negative


public void union(int root1, int root2) {
  if (array[root2] < array[root1]) 
   array[root1] = root2; // root2 is taller; make root2 new root
  else {
   if (array[root1] == array[root2]) 
    array[root1]--;
   array[root2] = root1; // root1 equal or taller; make root1 new root
  }
 }



After first union
Array after first union
After second union

Array after second union


public int find(int x) {
  if (array[x] < 0)
   return x; //x is the root of the tree; return it
  else {
   //find the root of the tree
   array[x] = find(array[x]);
   return array[x];
  }
 }


Test code and result:

public static void main(String[] args) {
  int NumElements = 16;
  int NumInSameSet = 4;
  DisjointSets s = new DisjointSets(NumElements);
  int set1, set2;
  for (int k = 1; k < NumInSameSet; k *= 2) {
   for (int j = 0; j + k < NumElements; j += 2 * k) {
    set1 = s.find(j);
    System.out.println("set1: " + set1);
    set2 = s.find(j + k);
    System.out.println("set2: " + set2);
    s.union(set1, set2);
   }
  }
  for (int i = 0; i < NumElements; i++) {
   System.out.print(s.find(i) + "*");
   if (i % NumInSameSet == NumInSameSet - 1)
    System.out.println();
  }
  System.out.println(); 
 }





No comments:

Post a Comment