Disjoint Sets (Union-Find)

A Disjoint Set (also known as a Union-Find data structure) is a data structure that stores a collection of disjoint (non-overlapping) Sets [1].

For example:

In this case, we have a total of 3 Disjoint Sets:

{0, 1}.
{2, 3, 4}.
{5}.

From now on, let's only use Union Find.

Common Operations

A Union Find data structure allows two main operations:

union(a, b): merge the sets to which the nodes a and b belong.
find(a): determine the set for the node a.

There are two ways¹ to implement Union Find. The complexities below refer to the Time. For the Space, we need to build a list of n nodes to contain information about the nodes, hence it will always be $O (n)$ .

Quick Find:
- $O (1)$ for find operation.
- $O (n)$ for union operation.
Quick Union.
- $O (n)$ find operation.
- $O (n)$ union operation.

At a first look, Quick Find seems better, but there are several Quick Union optimization techniques we can use to drastically reduce the overall complexity.

We are going to use Find if Path Exists in Graph problem to apply such data structures.

References

[1]: https://en.wikipedia.org/wiki/Disjoint-set_data_structure

It is also common to have a third operation, which takes two nodes as input and return True if they belong to the same set, otherwise False.

algo-ds.com

Disjoint Sets (Union-Find)

Common Operations

Implementation

References