- Set U = {a1, a2, ..., an}
- Maintain a partition of U, a set of subsets of U {S1, S2, ..., Sk} such that:
- each pair of subsets Si and Sj are disjoint;
- together, the subsets cover U;
- each subset has a unique name.
- Union(a, b) creates a new subset which is the union of a and b;
- Find(a) returns the representative member of a set: similar to find parent of a TreeNode in a tree;
- Thus Disjoint Set can be represented by an up-tree;
- Disjoint set equivalence property: every element of a DS U/F structure belongs to exactly one set;
- Dynamic equivalence property: the set of an element can change after execution of a union.
Up-Tree Union-Find Data Structure
|  | 
| Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld015.htm | 
- Each subset is an up-tree with its root as its representative member;
- All members of a given set are nodes in that set's up-tree;
- Not necessarily binary;
- Find(a): traverse from the leaf to the root;
- Union(a, b): union the root with the other;
Path Compression
|  | 
| Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld034.htm | 
Thus reduces the height of the entire access path to 1.
Union-by-rank
Use an extra array to store the rank of a given node. Rank[a] is the depth of the tree rooted at a.
|  | 
| Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld029.htm | 
Union(a, b): pick the shallower tree and point it at the deeper one.
|  | 
| Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld031.htm | 
Find(a) complexity: a tree of height h must have at least 2 ^h nodes. Find(a) takes O(max height) time. Thus Find(a) takes O(log(n)) time.
Complexity of Union-by-rank path compression
Tarjan proved that the find operation on a set of n elements in a union with m union-by-rank subsets has worst complexity O(m * alpha(m,n)).
Implementation
This implementation uses an array.
The test data set: 16 integers range from 0 to 15.
After union, each set should contain 4 integers.
public class DisjointSets {
 //array[x] stores the root of the tree (representative of the set)
 //if array[x] is less than 0, that means x is the root of the tree (representative of the set)
 private int[] array;
 
 //initial number of elements;
 //also the initial number of disjoint sets
 //since every element is initially in its own set
 public DisjointSets(int numElements) {
  array = new int [numElements];
  //since every element is the root of its tree,
  //set each element in the array to negative
  Arrays.fill(array, -1);
 }
|  | 
| Every element is the root of the tree it belongs | 
|  | 
| Initial Array: every element is negative | 
public void union(int root1, int root2) {
  if (array[root2] < array[root1]) 
   array[root1] = root2; // root2 is taller; make root2 new root
  else {
   if (array[root1] == array[root2]) 
    array[root1]--;
   array[root2] = root1; // root1 equal or taller; make root1 new root
  }
 }
|  | 
| After first union | 
|  | 
| Array after first union | 
|  | 
| After second union | 
|  | 
| Array after second union | 
public int find(int x) {
  if (array[x] < 0)
   return x; //x is the root of the tree; return it
  else {
   //find the root of the tree
   array[x] = find(array[x]);
   return array[x];
  }
 }
Test code and result:
public static void main(String[] args) {
  int NumElements = 16;
  int NumInSameSet = 4;
  DisjointSets s = new DisjointSets(NumElements);
  int set1, set2;
  for (int k = 1; k < NumInSameSet; k *= 2) {
   for (int j = 0; j + k < NumElements; j += 2 * k) {
    set1 = s.find(j);
    System.out.println("set1: " + set1);
    set2 = s.find(j + k);
    System.out.println("set2: " + set2);
    s.union(set1, set2);
   }
  }
  for (int i = 0; i < NumElements; i++) {
   System.out.print(s.find(i) + "*");
   if (i % NumInSameSet == NumInSameSet - 1)
    System.out.println();
  }
  System.out.println(); 
 }

 
No comments:
Post a Comment