Union by Rank in Quick Union

The Union by Rank is an optimization technique that can be applied to the Quick Union implementation.

In the previous implementations of Quick Find and Quick Union, we always chose the first node as the dominant one.

However, this could lead to a skewed list of nodes, which looks like a Linked List. As we know, for this data structure the time complexity to find an element is . Therefore, the same applies here.

Instead, we could use the highest height of the set that belongs to either node a or b to choose the root node.

Now you might ask: why should we select the highest height?

Let's look at an example:

example4400110--1221--2332--3

If you want to union 1 and 4, you will end up having two possible scenarios.

After calculating the parent nodes of 1 and 4, which are 0 and 4, respectively, we could build a set with 0 (height is 3) as a root node, and another with 4 (height is 0).

We will end up having two different sets:

example00110--1440--4221--2332--3

example00110--1221--2332--3444--0

As you might notice, choosing the root of the node with the lowest height, has given us a skewed tree.

Implementation

Besides keeping a list of parents as we did in Quick Union, we also need a rank value for each node, which indicates its height. Then, we will use this value to pick our root node.

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.parents = [i for i in range(self.n)]
        self.ranks = [1] * self.n

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        parent_a = self.find(a)
        parent_b = self.find(b)
        if parent_a == parent_b:
            return

        rank_a = self.ranks[a]
        rank_b = self.ranks[b]
        same_rank = rank_a == rank_b

        if rank_a <= rank_b:
            self.parents[parent_a] = parent_b
            if same_rank:
                self.ranks[b] += 1
        else:
            self.parents[parent_b] = parent_a

    def find(self, a: int) -> int:
        while a != self.parents[a]:
            a = self.parents[a]

        return a

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

If we apply this approach to the Find if Path Exists in Graph problem, finally our code will be accepted.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Complexity

The space complexity is , since we have to store all parents and ranks in two lists of elements (, hence ).

Regarding the time complexity, the union function just depends on the find one.

This ranking approach ensures that the complexity is , where is the height of the set, which is always balanced thanks to this ranking properties. Hence, the final complexity will be .

Lastly, since we need to union edges, before finding out if two nodes are connected, we need time in total.