- Set U = {a1, a2, ..., an}
- Maintain a partition of U, a set of subsets of U {S1, S2, ..., Sk} such that:
- each pair of subsets Si and Sj are disjoint;
- together, the subsets cover U;
- each subset has a unique name.
- Union(a, b) creates a new subset which is the union of a and b;
- Find(a) returns the representative member of a set: similar to find parent of a TreeNode in a tree;
- Thus Disjoint Set can be represented by an up-tree;
- Disjoint set equivalence property: every element of a DS U/F structure belongs to exactly one set;
- Dynamic equivalence property: the set of an element can change after execution of a union.
Up-Tree Union-Find Data Structure
Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld015.htm |
- Each subset is an up-tree with its root as its representative member;
- All members of a given set are nodes in that set's up-tree;
- Not necessarily binary;
- Find(a): traverse from the leaf to the root;
- Union(a, b): union the root with the other;
Path Compression
Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld034.htm |
Thus reduces the height of the entire access path to 1.
Union-by-rank
Use an extra array to store the rank of a given node. Rank[a] is the depth of the tree rooted at a.
Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld029.htm |
Union(a, b): pick the shallower tree and point it at the deeper one.
Image source: http://courses.cs.washington.edu/courses/cse326/00wi/handouts/lecture18/sld031.htm |
Find(a) complexity: a tree of height h must have at least 2 ^h nodes. Find(a) takes O(max height) time. Thus Find(a) takes O(log(n)) time.
Complexity of Union-by-rank path compression
Tarjan proved that the find operation on a set of n elements in a union with m union-by-rank subsets has worst complexity O(m * alpha(m,n)).
Implementation
This implementation uses an array.
The test data set: 16 integers range from 0 to 15.
After union, each set should contain 4 integers.
public class DisjointSets { //array[x] stores the root of the tree (representative of the set) //if array[x] is less than 0, that means x is the root of the tree (representative of the set) private int[] array; //initial number of elements; //also the initial number of disjoint sets //since every element is initially in its own set public DisjointSets(int numElements) { array = new int [numElements]; //since every element is the root of its tree, //set each element in the array to negative Arrays.fill(array, -1); }
Every element is the root of the tree it belongs |
Initial Array: every element is negative |
public void union(int root1, int root2) { if (array[root2] < array[root1]) array[root1] = root2; // root2 is taller; make root2 new root else { if (array[root1] == array[root2]) array[root1]--; array[root2] = root1; // root1 equal or taller; make root1 new root } }
After first union |
Array after first union |
After second union |
Array after second union |
public int find(int x) { if (array[x] < 0) return x; //x is the root of the tree; return it else { //find the root of the tree array[x] = find(array[x]); return array[x]; } }
Test code and result:
public static void main(String[] args) { int NumElements = 16; int NumInSameSet = 4; DisjointSets s = new DisjointSets(NumElements); int set1, set2; for (int k = 1; k < NumInSameSet; k *= 2) { for (int j = 0; j + k < NumElements; j += 2 * k) { set1 = s.find(j); System.out.println("set1: " + set1); set2 = s.find(j + k); System.out.println("set2: " + set2); s.union(set1, set2); } } for (int i = 0; i < NumElements; i++) { System.out.print(s.find(i) + "*"); if (i % NumInSameSet == NumInSameSet - 1) System.out.println(); } System.out.println(); }
No comments:
Post a Comment