Find Median from Data Stream

The problem description can be found in LeetCode.

Approach

Suppose to have an input like:

lst = [1, 2, 3, 4, 5]

If we divide such inputs in small and big values, we would have:

small_values = [1, 2, 3?]
big_values = [3?, 4, 5]

The median can be either in the small_values list or in the big_values one. It will either be the last or first element of small_values and big_values, respectively.

So, if we maintain such list of values, we could easily find the median in constant time in one of the two lists.

The Heap looks like a perfect data structure we can use in this case. We just need to maintain the balance properties between the two lists, hence, abs(len(small_values) - len(big_values)) <= 1.

Solution

from heapq import heappop, heappush


class MedianFinder:
    def __init__(self):
        self.small = []  # max heap
        self.large = []  # min heap

    def addNum(self, num: int) -> None:
        # add the value in the `max Heap`
        heappush(self.small, -num)

        # rearrange the values to maintain the heaps property
        if self.small and self.large:
            largest = -self.small[0]
            smallest = self.large[0]

            if largest > smallest:
                _ = heappop(self.small)
                heappush(self.large, largest)

        self._balance_heaps()

    def findMedian(self) -> float:
        assert abs(len(self.small) - len(self.large)) <= 1

        if len(self.small) > len(self.large):
            return -self.small[0]
        if len(self.large) > len(self.small):
            return self.large[0]

        return (-self.small[0] + self.large[0]) / 2

    def _balance_heaps(self) -> None:
        if len(self.small) > len(self.large) + 1:
            largest = -heappop(self.small)
            heappush(self.large, largest)
        elif len(self.large) > len(self.small) + 1:
            smallest = heappop(self.large)
            heappush(self.small, -smallest)

Complexity

  • addNum:
    • Time:
    • Space:
  • findMedian:
    • Time:
    • Space:

Where is the number of function calls to the addNum method.