Path Compression

The Path Compression is an optimization technique, which can be applied to the Quick Union implementation.

Let's suppose to have a skewed scenario like:

A find applied on node 3 would traverse all nodes. Calling the function again would cause the same number of iterations.

However, what if we use the find function to group the nodes into the root one?

Basically, the example above would become the following after calling find(3). The next find(3) will be must faster.

Implementation

As explained above, only the find must be changed to accomplish such optimization:

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.parents = [i for i in range(self.n)]

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        parent_a = self.find(a)
        parent_b = self.find(b)
        if parent_a != parent_b:
            self.parents[parent_b] = parent_a

    def find(self, a: int) -> int:
        if a == self.parents[a]:
            return a

        self.parents[a] = self.find(self.parents[a])

        return self.parents[a]

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

If we apply such approach in the Find if Path Exists in Graph problem, finally our code will be accepted.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Complexity

The space complexity is $O (n)$ , since we have to store all parents in a list of $n$ elements.

Regarding the time complexity, the union function just depends on the find one.

This kind of balancing allow us to achieve a similar time complexity compared to Quick Union Find by Rank, hence, $O (l o g (n))$ .

Lastly, since we need to union $m$ edges, before finding out if two nodes are connected, we need $O (m l o g (n))$ time in total.

algo-ds.com

Path Compression

Implementation

Complexity