Introduction

Hello, world! 👋

After struggling for so much time to prepare for coding interviews, I have decided to help you (and myself) by creating this website.

My aims are to:

Explain the common algorithms and data structures used in the FAANG interviews.
Explain which Python components to use in order to solve such problems.
Collect all the free resources to study a certain topic more in depth.
Collect the solutions of some LeetCode problems. I will start with the famous Grind 75 list, and then add more problems based on your feedback.

If you are a novice, probably, this is not the best place to start with, but you can still use it later, once you are more familiar with coding interviews concepts.

Again, the main idea is to minimize the amount of time you would need to re-prepare for a new interview. In fact, for those of you who do not know, typically, if one interview does not go well, you would need to wait some time before re-applying for the same company. The amount of time varies from company to company, but it can go from 6 months to 1 or more years. If you do not practice enough during this period, you can lose all the progress you have made, and you might also lose some internal brain connections that are allowing you to solve problems in no-time (when you are well-trained).

At least, that's what I have experienced. You will surely not start from zero every time, but this website can help you to minimize the time you need to revise things you already know!

Contribute

Currently, the repository is still private, so, if you want to contribute or fix something, you can write me at mmicu.github00@gmail.com or join the public Discord channel.

Please, indicate if you want to be cited in the Contributors section. If so, you can also provide your name and either one link (e.g. GitHub, X, etc.) or your email.

Support

As you might imagine, writing the content of this website required a lot of my spare time, plus, endless hours of reviews. As well as some money for the domain and other maintenance costs.

If you want so support me or just thank me, you can donate via Patreon or buy me a coffee.

Official Channels

Apart from the public Discord channel and private Discord one, there is no social media I am using to advertise or communicate for algo-ds.com.

So, please, avoid using other communication sources and report them to me at mmicu.github00@gmail.com.

Additional Information

The website has been written by using the mdBook utility.

Moreover, all the Python code you see here is for version 3.8 and earlier, which will impact the type hits. This is because all the Python function signatures that you find in LeetCode have such notation. In fact, we are going to use:

from typing import List

x: List[int] = [1, 2, 3]

Instead of:

x: list[int] = [1, 2, 3]

The latter is only available from Python >= 3.9.

Check this mypy type hints cheat sheet for more information.

Hidden code

It looks like the following snippet contains a single line of code. However, if you go with your mouse over the snippets, a couple of icons will appear at the top right. The one with the eye symbol will allow you to show the hidden code:

s = "hello, world"

print(s)

Complexity Analysis

We can assess the efficiency of an algorithm in terms of space and time, without actually running it and yet understand how slow/fast it is.

The complexity of an algorithm is denoted with the Big O notation, which describes the limiting behavior of a certain function when the argument tends to a particular value or infinity [1].

Classes of Functions

Before explaining how to measure the efficiency of an algorithm, we have to understand how to compare functions bounded to an input $n$ .

In fact, since both time and space complexities are expressed through mathematical functions, we need a way to compare them.

This chart [2] will help us:

Big O complexity chart. *Chart by bigocheatsheet.com.*

The importance of $n$

You will likely found $n$ somewhere inside this $O$ notation (e.g. $O (n)$ ), but $n$ is just the name of a variable. You can use any name you want, but, of course, $n$ is the most common to indicate the size of a certain input.

However, nothing prevents you to use other names. Just keep in mind that you have to pay attention on how you map the name of a certain variable to the inputs of your algorithm.

For example, you can use $n$ to map the size of a list. However, you could also use $m$ , $x$ , $y$ , or your preferred variable's name.

Common complexities

Before looking at some examples, remember that constant values are not considered. So, for example, $O (2 n)$ and $O (10 + n)$ are both $O (n)$ .

Moreover, we are also going to ignore terms with the lowest growth rate: $O (n^{2} + n)$ is just $O (n^{2})$ .

Constant time

It is indicated with $O (1)$ . Some examples:

# mathematical operations
a = 1
b = 2
c = a + b

# sets/hash-maps lookup
1 in s  # where `s = {1, 2, 3}`
2 in map  # where `m = {1: "one", 2: "two", 3: "three"}`

# pushing/popping from a stack
s = []
s.append(1)
s.append(2)
s.pop()

# enqueuing/dequeuing from a double-ended queue
from collections import deque
q = deque()
q.append(1)
q.append(2)
q.popleft()

Logarithmic time

It is indicated with $O (l o g (n))$ . This is common for the Binary Search algorithm, and, more generally, whenever the search space is halved at each iteration.

Linear time

It is indicated with $O (n)$ . This is probably the most common one, especially if the algorithm uses loops bounded to the size of the input.

The following are all $O (n)$ algorithms. Even with nested loops, we can still have such linear complexity. Just be careful on the used variables.

nums: List[Any] = ...  # a list took as input

# traverse a list (simplest case)
for num in nums:
    print(num)


# two nested loop, but `m` is a constant
m = 2
for num in nums:
    for _ in range(m):
        print(num)


# two consecutive loops: `O(2 * n)`, hence `O(n)`
for num in nums:
    print(num)

for num in nums:
    print(num)
# ~


# find an element in a list. In the worst case, you need to traverse the entire list
target = 10
for num in nums:
    if num == target:
        print(f"Found the target `{target}`")
        break

Logarithmic time with constant operations

It is indicated with $O (c l o g (n))$ , where $c$ is a constant factor.

You will usually encounter this complexity after applying a logarithmic algorithm for $c$ times. Some examples are Binary Search and push/pop operations in a Heap.

Linearithmic time (Linear * Logarithmic)

It is indicated with $O (n l o g (n))$ .

You will usually encounter this complexity for sorting algorithms, and divide and conquer ones.

Quadratic time

It is indicated with $O (n^{2})$ . Some examples:

# traverse a quadratic matrix
matrix: List[List[Any]] = ...  # a quadratic matrix took as input
n = len(nums)
for i in range(n):
    for j in range(n):
        print(matrix[i][j])


# traverse a list twice
nums: List[Any] = ...  # a list took as input
n = len(nums)
for i in range(n):
    for j in range(i, n):
        pass

Exponential time

It is indicated with $O (2^{n})$ . This usually appears in combinatorial problems. The best example is the recursive (non-optimized) Fibonacci sequence.

Factorial time

It is indicated with $O (n!)$ , where $n!$ is equal to 1 * 2 * 3 * ... * n. This usually appears in combinatorial problems.

You use this when you need to calculate all the permutations of a certain list.

References

Analyze Recursive Functions

The time complexity for recursive functions is the product between:

The number of times the recursive function is called.
The time complexity per call.

For the former, we can get help from the State Space, whereas for the latter we can simply use the technique I explained before.

The Space Tree

Based on the function we are examining, we can derive the space tree to understand the total number of calls.

Let's look at this example:

from typing import List


def print_list(lst: List[int]) -> None:
    if not lst:
        return

    print(lst[0])
    print_list(lst[1:])


lst = [1, 2, 3]
print_list(lst)

Which would give us the following space tree:

As you can see, the number of nodes is proportional to the size of the input list, which is $n$ . Therefore, the function will be called $n$ times (point 1) and the time complexity per call is $1$ (point 2). In fact, we are simply printing a value, nothing else. So, the final time complexity will be the product of such value, which is $O (n)$ .

Regarding the space complexity, we can use the Space Tree again, which results in $n$ frames in the Call Stack, so the space complexity is $O (n)$ .

Other Examples

Fibonacci

A naive implementation of the Fibonacci sequence could be something like:

def fib(n: int) -> int:
    if n == 0:
        return 0
    if n == 1 or n == 2:
        return 1

    return fib(n - 1) + fib(n - 2)


print(fib(4))

This would give us the following space tree:

We start from level 0 with 1 node, then level 1 with 2 nodes, level 2 with 4 nodes - in general, $2^{k}$ nodes for the $k$ level. So the time complexity will be $O (2^{k})$ .

Regarding the space, the maximum number of frames we would have is proportional to the height of the tree, which is at most $n$ . Hence, $O (n)$ .

In fact, if we mark the nodes that would be in the call stack until we reach the first base case, we would have the following space tree:

Additional Resources

Time Complexity Analysis by Abdul Bari.
Chapter 2 of Competitive Programmer's Handbook by Antti Laaksonen.
The Complete Guide to Big O Notation & Complexity Analysis for Algorithms by Alvin.
- Part 1.
- Part 2.
Big-O Notation - For Coding Interviews by NeetCode.
Big O myths busted! by strager.

Conclusions

This is just an introduction to the Big O notation. All the LeetCode solutions showed here contain information about both space and time complexities.

However, if you still have some doubts and you want to enforce your theoretical understanding of this topic, you can take a look at the Additional Resources.

Arrays, Strings and Hashing

I have decided to group these three topics, because they are usually used together to solve a large range of problems.

Arrays

The static arrays used in other languages like C and C++ are not available in Python. Instead, you can use dynamic ones, known by the list type:

>>> l = [1, 2, 3]
>>> len(l)
3

>>> for num in l:
...     print(num)
...
1
2
3
>>> l.append(4)  # [1, 2, 3, 4]

>>> l = [1, 2, "three"]  # it can also contain mixed types

Operations

Let's check the complexity of the most common arrays' operations [1]:

Operation	Time Complexity	Space Complexity	Syntax
Size	$O (1)$	$O (1)$	`len(l)`
Get i-th	$O (1)$	$O (1)$	`l[i]`
Set i-th	$O (1)$	$O (1)$	`l[i] = val`
Copy	$O (n)$	$O (n)$	`[val for val in l]`
Append	$O (1)$	$O (1)$	`l.append(val)`
Pop last	$O (1)$	$O (1)$	`l.pop()`
Pop i-th	$O (n)$	$O (n)$	`l.pop(i)`
Delete value	$O (n)$	$O (n)$	`l.remove(val)`
Iterate	$O (n)$	$O (1)$	`for val in l: print(val)`
Search	$O (n)$	$O (1)$	`val in l`
Minimum	$O (n)$	$O (1)$	`min(l)`
Maximum	$O (n)$	$O (1)$	`max(l)`

Best Practices

Iterate over the list instead of using the index

# Good
for num in nums:
    print(num)

# Not-so-good
for i in range(len(nums)):
    print(nums[i])

Use `enumerate`

# Good
for i, num in enumerate(nums):
    print(i, num)

# Not-so-good
for i in range(len(nums)):
    print(i, nums[i])

Use `enumerate` and `start` argument

# Good
for i, num in enumerate(nums, start=1):
    print(i, num)

# Not-so-good
for i, num in enumerate(nums):
    print(i+1, num)

Use `zip` to iterate over multiple lists

# Good
for num_a, num_b, num_c in zip(lst_a, lst_b, lst_c):
    print(num_a, num_b, num_c)

# Not-so-good
assert len(lst_a) == len(lst_b) == len(lst_c)
for i in range(len(lst_a)):
    print(lst_a[i], lst_b[i], lst_c[i])

Use `reversed` for backward iteration

# Good
for num in reversed(nums):
    print(num)

# Not-so-good
for i in range(len(nums) - 1, -1, -1):
    print(nums[i])

References

[1]: https://wiki.python.org/moin/TimeComplexity

Strings

Strings are immutable in Python, which means that we cannot edit them. In fact, the following code will throw an exception:

>>> s = "hello, world!"
>>> s[0] = "H"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

The solution is to use a list together with the join function to build up the string. Notice that append takes $O (1)$ and join takes $O (n)$ , where $n$ is the number of elements in the list (considering that each element is a string of length 1).

>>> lst = []

>>> lst.append("h")
>>> lst.append("e")
>>> lst.append("l")
>>> lst.append("l")
>>> lst.append("o")
>>> lst.append("!")

>>> print("".join(lst))
hello!

Operations

Check the ones for the Arrays.

Best Practices

Check the ones for the Arrays.

Hashing

Python includes two data types that use hashing internally to provide fast access to elements: Sets and Dictionaries.

Sets

A set is an unordered collection with no duplicate elements [1].

>>> s = {1, 2, 3}  # NOTE: for an empty set we must use `set()`. `{}` would initialize a `dict`

>>> 1 in s
True
>>> 4 in s
False
>>>

>>> s.add(4)
>>> 4 in s
True

>>> s.add(4)  # `{1, 2, 3, 4}` no duplicates allowed as per definition

>>> s.remove(1)  # {2, 3, 4}
>>> s.remove(1)  # `KeyError: 1`
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 1
>>> s.discard(1)  # it does not raise even if the element does not exist

Operations

Let's check the complexity of the sets' common operations [2]:

Operation	Time Complexity	Space Complexity	Syntax
Add	$O (1)$	$O (1)$	`s.add(val)`
Size	$O (1)$	$O (1)$	`len(s)`
Search value	$O (1)$	$O (1)$	`val in s`
Delete value	$O (1)$	$O (1)$	`s.remove(val)`
Union	$O (l e n (s) + l e n (t))$	$O (l e n (s) + l e n (t))$	`s \| t`
Intersection	$O (min (l e n (s), l e n (t)))$	$O (min (l e n (s), l e n (t)))$	`s & t`

References

Dictionaries

Dictionaries are also known as associative memories or associative arrays. They are indexed by keys that can be any immutable type, so strings and numbers can always be keys as well as tuples (if they contain only strings, numbers, or tuples). In fact, if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key [1].

>>> d = {
...     "a": 1,
...     "b": 2,
... }

>>> "a" in d
True

>>> "c" in d
False

>>> d["e"] + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'e'
>>> d.get("e", 0) + 2  # use `get` to avoid `KeyError` and
                       # to use a default value for missing keys

>>> d["c"] = 3
>>> "c" in d
True

>>> d[[1, 2]] = 2  # cannot hash a `list` since it is *mutable*
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

>>> d[(1, 2)] = 2  # a `tuple` is *immutable*, hence hashable

Operations

Let's check the complexity of dictionaries' common operations [2]:

Operation	Time Complexity	Space Complexity	Syntax
Size	$O (1)$	$O (1)$	`len(d)`
Get item	$O (1)$	$O (1)$	`d[k]` or `d.get(k)`
Set item	$O (1)$	$O (1)$	`d[k] = v`
Delete item	$O (1)$	$O (1)$	`del d[k]`
Key in dict	$O (1)$	$O (1)$	`k in d`
Iterate	$O (n)$	$O (1)$	`for k, v in d.items(): print(k, v)`

Best Practices

Use `items()`

# Good
for k, v in d.items():
    print(f"{k} -> {v}")

# Not-so-good
for k in d:
    print(f"{k} -> {d[k]}")

Use `collections.defaultdict`

# Good
from collections import defaultdict

d = defaultdict(int)
d["a"] += 1  # `defaultdict(<class 'int'>, {'a': 1})`
# ~

# Slightly better
d = {}
d["a"] = d.get("a", 0) + 1

# Not-so-good
d = {}

if "a" not in d:
    d["a"] = 0

d["a"] += 1
# ~

Use `collections.Counter`

from collections import Counter

a = "abcabcz"
c = Counter(a)  # Counter({'a': 2, 'b': 2, 'c': 2, 'z': 1})

c["a"]  # 2
c["b"]  # 2
c["c"]  # 2
c["z"]  # 1
c["x"]  # 0

References

Two Pointers

The Two-pointers technique can be used to solve problems that involve arrays, linked lists, or other sequences.

These sequences are not necessarily ordered, but, if so, you might need additional data structures to keep track of some information.

A Classic Example

Let's take Two Sum II - Input Array Is Sorted as a classic example for a Two-pointers algorithm.

We surely need to take advantage of the fact that the array is ordered¹. If so, we can easily establish if the target cannot fit in the input. Suppose a target of 10, with a list like [1, 2, 3, 4, 5].

Since the list is already ordered, we know that the elements at the positions i-1 and i-2 (last and second to last, respectively) are the two highest values of the list (5 and 4 in our case). The sum of the two values will not reach the target (9 < 10), which means that no other sum between any two other values in the list can reach the target.

In fact, if the sum of the two highest values is less than the target, how can we find two other values big enough to reach the target? Well, we cannot since the list is ordered.

Then, how can we use such information?

Suppose that we start one pointer from the left and one from the right and we sum their values. We will end up by summing up the minimum (element pointed by the left pointer) and maximum (element pointed by the right pointer), having three possibilities:

nums[left] + nums[right] == target: we found the target, there's nothing else to do.
nums[left] + nums[right] < target: the number we found is too small. So, summing up the current minimum and maximum is not enough. What can we do? Logically, we can simply find a new minimum and try with it, since the previous one was too small. We still need the current maximum as it is fundamental to find a target.
nums[left] + nums[right] > target, which means that the number that we found is too big. In this case, the sum between the minimum and maximum is too big, so we can do the opposite: discard the current maximum by trying with a new one, and keep the current minimum.

The implementation would look something like:

from typing import List


class Solution:
    def twoSum(self, numbers: List[int], target: int) -> List[int]:
        left, right = 0, len(numbers) - 1

        while left < right:
            pointers_sum = numbers[left] + numbers[right]
            if pointers_sum == target:
                return [left + 1, right + 1]

            if pointers_sum < target:
                left += 1
            else:
                right -= 1

        assert False, "Unreachable. Expected a single solution for the input."

Which has the following complexity:

Time: $O (n)$
Space: $O (1)$

Where n is the size of the list numbers.

This is one of the numerous questions we should always ask to the interviewer, especially about the inputs of the problem. Having an already sorted list as input can change the way we think of our algorithm a lot.

Sliding Window

The Sliding Window Technique is a method used to efficiently solve problems that involve processing subarrays or substrings within a larger sequence.

It operates by maintaining a window with either a static or a dynamic size, depending on the requirements of the problem.

The advantage of keeping such window is that we can drop the time complexity from $O (n^{2})$ to $O (n)$ . In fact, a typical brute-force approach would require a double pass (hence $O (n^{2})$ ), whereas a Sliding Window one can reduce it to a single one (hence $O (n)$ ).

Let's now look at both the static and dynamic cases.

Sliding Window with a Static Size

Let's use Maximum Number of Vowels in a Substring of Given Length as our example.

A brute-force approach would consist of checking every substring of size $k$ starting at each character of the string, which is $O (nk)$ time complexity, where $n$ is the length of the string and $k$ is the length of the substring.

However, we can easily notice that we have a lot of repeated work. In fact, what if we imagine to have a window of size k?

In this way, once we compute the window for the first time, we can simply shrink it from the left and expand it to the right by one. This -1 (shrinking) and +1 (expansion) would not change the size of the window, which will remain k.

The implementation would look something like:

class Solution:
    VOWELS = {"a", "e", "i", "o", "u"}

    def maxVowels(self, s: str, k: int) -> int:
        def is_vowel(char: str) -> bool:
            assert isinstance(char, str)
            assert len(char) == 1

            return char in Solution.VOWELS

        n = len(s)
        assert k <= n

        ans = 0
        for right in range(k):
            ans += int(is_vowel(s[right]))

        left = 0
        win_res = ans
        for right in range(right + 1, n):
            win_res -= int(is_vowel(s[left]))
            win_res += int(is_vowel(s[right]))

            left += 1
            ans = max(ans, win_res)

        return ans

Which has the following complexity:

Time: $O (n)$
Space: $O (1)$

Where n is the size of the string s.

Sliding Window with a Dynamic Size

Let's use Longest Substring Without Repeating Characters as our example.

A brute-force approach would consist in checking every substring and stop as soon as we find a duplicated characters. We have a total of $O (n^{2})$ substrings and we would require $O (n)$ to look for duplicated characters. So our total time complexity would be $O (n^{3})$ , where $n$ is the length of the string.

In this case, we do not have an explicit window of a certain size as for the previous example. However, we can still apply a sliding window algorithm.

In fact, suppose that we do not have repeated characters in the input string: the window will be the entire string itself.

Now, let's suppose to have an example like abcdaefghi. Here, a is the duplicated character, which is at positions 0 and 4. We can build a window from the first a till the second one, then, we can shrink the window till the a is not repeated anymore, which means that the window will start from b to a (1 till 4). Lastly, we can continue increasing our window till the end, since we do not have other duplicated characters.

The implementation would look something like:

from typing import Set


class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        chars: Set[str] = set()
        ans = left = 0

        for right, char in enumerate(s):
            while char in chars:
                chars.remove(s[left])
                left += 1

            chars.add(char)

            win_size = right - left + 1
            ans = max(ans, win_size)

        return ans

Which has the following complexity:

Time: $O (n)$
Space: $O (m)$

Where $n$ is the size of the string $s$ and $m$ is the number of distinct characters in the string $s$ .

Binary Search

Binary Search is an efficient algorithm used to find a target value within a sorted array. In each iteration, it halves the search space, achieving logarithmic time complexity ( $O (l o g (n))$ ).

Implementation

We can implement this algorithm both recursively and iteratively.

Iterative Binary Search

from typing import List


class Solution:
    def search(self, nums: List[int], target: int) -> int:
        left, right = 0, len(nums) - 1

        while left <= right:
            mid = (left + right) // 2

            if nums[mid] == target:
                return mid

            if nums[mid] < target:
                left = mid + 1
            else:
                right = mid - 1

        return -1

Recursive Binary Search

from typing import List


class Solution:
    def search(self, nums: List[int], target: int) -> int:
        def helper(low: int, high: int) -> int:
            if low > high:
                return -1

            mid = (low + high) // 2
            if nums[mid] == target:
                return mid

            if nums[mid] < target:
                return helper(mid + 1, high)
            else:
                return helper(low, mid - 1)

        return helper(0, len(nums) - 1)

`bisect` module

In Python, we can use the bisect module to look for the insertion point (either left or right) in sorted lists.

These are the signatures of the two functions:

bisect.bisect_left(a, x, lo=0, hi=len(a), *, key=None)
bisect.bisect_right(a, x, lo=0, hi=len(a), *, key=None)

Let's have a look at the following examples:

>>> import bisect

>>> l = [0, 0, 1, 1, 1, 2, 3, 4, 4]
>>> len(l)
9

# no values less than `-100`. Hence, it would be the first element of the list
>>> bisect.bisect_left(l, -100)
0

# smaller value is `0`. Hence, `0` goes first for `bisect_left`
>>> bisect.bisect_left(l, 0)
0
# smaller value is `0`, which is already at position `0` and `1`.
# Hence, the next `0` would go at position `2` for `bisect_right`
>>> bisect.bisect_right(l, 0)
2

# same considerations for the `0` for another existing element in the list
>>> bisect.bisect_left(l, 4)
7
>>> bisect.bisect_right(l, 4)
9
>>>

# no element bigger than `100`. Hence, it will be the new last element of the list
>>> bisect.bisect_right(l, 100)
9

Notice that in the examples above we always used 0 and len(l) as lower and upper bounds, respectively. However, we can also specify any other range.

Additional Usage

Binary Search works quite well on any sorted sequence. So let's now think of an ordered list of booleans:

l = [False, False, False, False, True, True]

It is easy to determine the first False value of the list, since it is ordered. However, what about the first True value?

Again, the list is ordered, which means that we can apply Binary Search:

from typing import List


def search(self, nums: List[int]) -> int:
    left, right = 0, len(nums) - 1
    first_true_idx = -1

    while left <= right:
        mid = (left + right) // 2

        if nums[mid]:
            first_true_idx = mid
            right = mid - 1
        else:
            left = mid + 1

    return first_true_idx

first_true_idx will contain the index of the leftmost True value of the list.

How can we use such pattern?

Suppose that we have a monotonic increasing function:

def mon_inc_func(n):
    return n > 2

For the following input, the function will return an ordered output:

input = [0, 1, 2, 3, 4, 5, 6]
output = [mon_inc_func(i) for i in input]

print(output)
# [False, False, False, True, True, True, True]

Now, suppose that we do not know how this monotonic increasing function is implemented, so we do not know the source code of mon_inc_func. How can we use Binary Search to find the first value of the function that returns True?

We can use the previous snippet and simply replace nums[mid], which is looking for a True value, with the result of the function:

from typing import List


def unknown_function(n: int) -> bool:
    pass


def search(self, nums: List[int]) -> int:
    left, right = 0, len(nums) - 1
    first_true_idx = -1

    while left <= right:
        mid = (left + right) // 2

        if unknown_function(mid):  # instead of `nums[mid]`
            first_true_idx = mid
            right = mid - 1
        else:
            left = mid + 1

    return first_true_idx

Monotonic Function Example

Let's look at a real example, where we can apply this variation of Binary Search. We can use the First Bad Version problem.

Linked Lists

Linked Lists are a linear data structures consisting of nodes, each containing a value and a reference to the next node.

Unlike arrays, linked lists allow an efficient insertion and deletion. However, we cannot access elements by index, so we need to traverse the entire list to do so.

Operations

Let's check the complexity of linked lists' common operations:

Operation	Time Complexity	Space Complexity
Size	$O (1)$	$O (1)$
Get i-th	$O (n)$	$O (1)$
Set i-th	$O (n)$	$O (1)$
Copy	$O (n)$	$O (n)$
Append	$O (1)$	$O (1)$
Pop first	$O (1)$	$O (1)$
Pop last	$O (1)$	$O (1)$
Pop i-th	$O (n)$	$O (1)$
Delete value	$O (n)$	$O (1)$
Iterate	$O (n)$	$O (1)$
Search	$O (n)$	$O (1)$

Types

There are two kinds of Linked Lists:

`collections.deque`

The deque module can also simulate a Linked List. You can find more information in the Queue paragraph.

Singly Linked Lists

Let's use Design Linked List to design our Singly Linked List.

from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class ListNode(Generic[T]):
    def __init__(self, val: T, next_ptr: Optional["ListNode"] = None) -> None:
        self.val = val
        self.next = next_ptr


class MyLinkedList:
    def __init__(self):
        self.size = 0
        self.head = ListNode(-1)

    def get(self, index: int) -> int:
        if index < 0 or index >= self.size:
            return -1

        return self._getNodeAtIndex(index + 1).val

    def addAtHead(self, val: int) -> None:
        self.addAtIndex(0, val)

    def addAtTail(self, val: int) -> None:
        self.addAtIndex(self.size, val)

    def addAtIndex(self, index: int, val: int) -> None:
        if index < 0 or index > self.size:
            return

        p = self._getNodeAtIndex(index)
        node = ListNode(val)
        node.next = p.next
        p.next = node

        self.size += 1

    def deleteAtIndex(self, index: int) -> None:
        if index < 0 or index >= self.size:
            return

        p = self._getNodeAtIndex(index)
        p.next = p.next.next

        self.size -= 1

    def _getNodeAtIndex(self, index: int) -> ListNode[int]:
        p = self.head

        for _ in range(index):
            p = p.next

        return p

The key idea here is to use a dummy head node to avoid to write a lot of code, which would be used to check the validity of the head itself.

Except for addAtHead, which is $O (1)$ for the time complexity, all the other operations are $O (n)$ .

The space complexity is $O (1)$ for each operation, however, after $k$ operations we might end up having a list of $k$ elements (assuming that they are all valid consecutive insert operations).

Doubly Linked Lists

Let's use Design Linked List to design our Doubly Linked List.

from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class ListNode(Generic[T]):
    def __init__(
        self,
        val: T,
        prev_ptr: Optional["ListNode"] = None,
        next_ptr: Optional["ListNode"] = None,
    ) -> None:
        self.val = val
        self.prev = prev_ptr
        self.next = next_ptr


class MyLinkedList:
    def __init__(self):
        self.size = 0
        self.head = ListNode(-1)
        self.tail = ListNode(-1)

        self.head.next = self.tail
        self.tail.prev = self.head

    def get(self, index: int) -> int:
        if index < 0 or index >= self.size:
            return -1

        return self._getNodeAtIndex(index + 1).val

    def addAtHead(self, val: int) -> None:
        self.addAtIndex(0, val)

    def addAtTail(self, val: int) -> None:
        self.addAtIndex(self.size, val)

    def addAtIndex(self, index: int, val: int) -> None:
        if index < 0 or index > self.size:
            return

        p = self._getNodeAtIndex(index)
        n = p.next

        node = ListNode(val)
        node.prev = p
        node.next = n
        p.next = node
        n.prev = node

        self.size += 1

    def deleteAtIndex(self, index: int) -> None:
        if index < 0 or index >= self.size:
            return

        p = self._getNodeAtIndex(index)
        n = p.next.next

        p.next = n
        n.prev = p

        self.size -= 1

    def _getNodeAtIndex(self, index: int) -> ListNode[int]:
        p = self.head

        for _ in range(index):
            p = p.next

        return p

As for the Singly Linked List, besides the dummy head node we need the dummy tail as well. For the same reason, we can use such nodes to avoid to write code to handle edge cases.

Except for addAtHead, which is $O (1)$ for the time complexity, all the other operations are $O (n)$ .

The space complexity is $O (1)$ for each operation, however, after $k$ operations we might end up having a list of $k$ elements (assuming that they are all valid consecutive insert operations).

Stacks and Queues

Stacks and Queues can be used as auxiliary data structures to solve a large set of problems.

Usually, you can use them while iterating over a sequence to store intermediate values that you can use later to compute some results.

Although very similar, there are a few differences between the two:

Stacks follow the LIFO principle, which means Last In First Out. The last element added (push) to the stack will be the first one to be removed (pop). A classic example that I remember from high school (a long time ago 🥲) is the stack of plates: if you add one at the top, you have to remove it first, before reaching the ones at the bottom.
Queues follow the FIFO principle, which means First In First Out. The first element added (enqueue) to the queue will be the the first one to be removed (dequeue). Think of a queue at the post office: the first person who arrives will be served first.

The `deque` object

Before discussing about the Python usage of these two data structures, let me introduce the deque object from the collections container. Deques (pronounced deck) is a generalization of Stacks and Queues and it represents a Double-ended Queue [1].

This object can be used to for both Stacks and Queues.

References

[1]: https://docs.python.org/3/library/collections.html#deque-objects

Stacks in Python

A standard data type list can be used as a Stack in Python:

>>> s = []

>>> s.append(1)
>>> s.append(2)
>>> s.append(3)

>>> s.pop()
3
>>> s.pop()
2

>>> s
[1]
>>> len(s)
1

>>> s[-1]  # top of the stack
1

Both append and pop are $O (1)$ , whereas looking for an element is $O (n)$ , where $n$ is the size of the stack.

The deque can be used as well (even if I prefer the standard list). You can also notice that the methods are actually the same and only the initialization is different:

>>> from collections import deque

>>> s = deque()

>>> s.append("a")
>>> s.append("b")

>>> s.pop()
'b'

>>> s
deque(['a'])
>>> len(s)
1

>>> s[-1]
'a'

Queues in Python

The deque object we discussed about before can be also used to represent a normal Queue.

The usual enqueuing and dequeuing can be done by using the methods append and popleft, respectively.

>>> from collections import deque

>>> q = deque()

>>> q.append(1)
>>> q.append(2)
>>> q.append(3)
>>> q.append(4)

>>> q.popleft()
1
>>> q.popleft()
2

>>> q
deque([3, 4])
>>> len(q)
2

>>> q[0]
3
>>> q[-1]
4

Both append and popleft are $O (1)$ , whereas looking for an element is $O (n)$ , where $n$ is the size of the queue.

Monotonic Stacks and Queues

As described in the Introduction, we can use a stack as an auxiliary data structure to solve problems and improve overall time complexity.

Let's use Daily Temperatures as our example.

A brute force approach would consist of checking every possible temperature:

from typing import List


class Solution:
    def dailyTemperatures(self, temperatures: List[int]) -> List[int]:
        n = len(temperatures)
        res = [0] * len(temperatures)

        for i in range(n):
            for j in range(i + 1, n):
                t1 = temperatures[i]
                t2 = temperatures[j]
                if t1 < t2:
                    res[i] = j - i
                    break

        return res

Which has the following complexity:

Time: $O (n^{2})$
Space: $O (n)$

Where $n$ is the size of the temperatures list.

Can we do better? Of course, yes.

Suppose to have an increasing sequence like [1, 2, 3, 4, 5]. The result would be [1, 1, 1, 1, 0]. In fact, the i-th temperature is always less compared to the i+1 one, except for the last one, which has no higher value.

Another example could be [1, 1, 1, 1, 100]. The corresponding result would be [4, 3, 2, 1, 0]. This example suggests that the last value is the target value for all.

So we could start from the last position and keep a monotonic increasing stack to keep track of the temperatures. In this way, the stack will contain a list of possible candidates for our target temperature.

For example, for [100, 30, 40, 50, 60, 200] we can notice that once we process the first element, the stack is:

In fact, since we iterate backward, we will process the first element in the last iteration. At this point, we can pop from the stack until we find a higher temperature.

Lastly, to calculate the actual distance, we also need to store the index of the i-th temperature. So, the previous stack would look something like:

(30, 1)
(40, 2)
(50, 3)
(60, 4)
(200, 5)

The optimal solution is:

from typing import List, Tuple


class Solution:
    def dailyTemperatures(self, temperatures: List[int]) -> List[int]:
        n = len(temperatures)
        res = [0] * len(temperatures)
        stack: List[Tuple[int, int]] = []

        for i in range(n - 1, -1, -1):
            temperature = temperatures[i]

            while stack and stack[-1][0] <= temperature:
                stack.pop()

            if stack:
                res[i] = stack[-1][1] - i

            stack.append((temperature, i))

        return res

Which has the following complexity:

Time: $O (n)$
Space: $O (n)$

Where $n$ is the size of the temperatures list.

Trees

A Tree is a Connected Acyclic Graph. We will study the Graphs later, so don't worry if you do not get it immediately.

A n-ary tree is a Tree in which each node contains at most n children.

The most common one is the Binary Tree, where each node has at most 2 children ( $n = 2$ ).

As for the Linked Lists, there is no Python data type for trees. However, we can easily define a Binary one:

from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class TreeNode(Generic[T]):
    def __init__(
        self,
        val: T,
        left: Optional["TreeNode"] = None,
        right: Optional["TreeNode"] = None,
    ) -> None:
        self.val = val
        self.left = left
        self.right = right

Or a n-ary tree:

from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class TreeNode(Generic[T]):
    def __init__(self, val: T, children: Optional[List["TreeNode"]] = None) -> None:
        self.val = val
        self.children = [] if not children else children

Terminology

These are the most common terms associated to a Tree [1]:

Root: node with no incoming edges.
Children: nodes with some incoming edges.
Sibling: nodes having the same parent.
Leaf: nodes with no children.
Height: number of edges between node A and B.

Types of Binary Trees

Full Binary Tree

A Full Binary Tree has either zero or two children.

Complete Binary Tree

A Complete Binary Tree has all levels filled except the last one, which is filled from left to right.

Perfect Binary Tree

A Perfect Binary Tree has every internal node with two children and all the leaf nodes are at the same level.

This type of tree has exactly $2^{h + 1} -1$ nodes, where $h$ is the height of the tree.

In this case, $h = 2$ , hence $2^{2 + 1} - 1 = 2^{3} - 1 = 7$ nodes.

References

[1]: https://www.geeksforgeeks.org/complete-binary-tree/

Tree Traversal

Breadth-First Search (BFS) and Depth-First Search (DFS) can both be used to traverse a tree. The two algorithms run in $O (n)$ for both time and space, where $n$ is the number of nodes in the tree.

DFS for Trees

There are three ways of traversing a tree by using a DFS approach, and these can be done both iteratively and recursively.

Pre-order traversal

The traversal order is:

Node.
Left.
Right.

The iterative and recursive solutions can be tested in LeetCode.

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def preorderTraversal(self, root: Optional[TreeNode] = None) -> List[int]:
        def dfs(node: Optional[TreeNode] = None) -> None:
            if not node:
                return

            nonlocal res
            res.append(node.val)

            for child in [node.left, node.right]:
                dfs(child)

        res = []
        dfs(root)

        return res

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def preorderTraversal(self, root: Optional[TreeNode]) -> List[int]:
        if not root:
            return []

        res = []
        stack = [root]
        while stack:
            node = stack.pop()
            res.append(node.val)

            for child in [node.right, node.left]:
                if child:
                    stack.append(child)

        return res

In-order traversal

The traversal order is:

Left.
Node.
Right.

The iterative and recursive solutions can be tested in LeetCode.

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def inorderTraversal(self, root: Optional[TreeNode] = None) -> List[int]:
        def dfs(node: Optional[TreeNode] = None) -> None:
            if not node:
                return

            dfs(node.left)

            nonlocal res
            res.append(node.val)

            dfs(node.right)

        res = []
        dfs(root)

        return res

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def inorderTraversal(self, root: Optional[TreeNode] = None) -> List[int]:
        if not root:
            return []

        res = []
        stack = [root]

        while stack:
            node = stack.pop()
            while node:
                stack.append(node)
                node = node.left

            if stack:
                node = stack.pop()
                res.append(node.val)
                stack.append(node.right)

        return res

Post-order traversal

The traversal order is:

Left.
Right.
Node.

The iterative and recursive solutions can be tested in LeetCode.

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def postorderTraversal(self, root: Optional[TreeNode] = None) -> List[int]:
        def dfs(node: Optional[TreeNode] = None) -> None:
            if not node:
                return

            dfs(node.left)
            dfs(node.right)

            nonlocal res
            res.append(node.val)

        res = []
        dfs(root)

        return res

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def postorderTraversal(self, root: Optional[TreeNode]) -> List[int]:
        if not root:
            return []

        res = []
        # Each element of stack is a tuple: (node, visited_flag)
        stack = [(root, False)]

        while stack:
            node, visited = stack.pop()
            if visited:
                res.append(node.val)
            else:
                stack.append((node, True))

                for child in [node.right, node.left]:
                    if child:
                        stack.append((child, False))

        return res

BFS for Trees

The BFS algorithm uses a queue to process all the nodes level-by-level in a tree:

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from collections import deque
from typing import Deque, List, Optional


class Solution:
    def levelOrder(self, root: Optional[TreeNode] = None) -> List[List[int]]:
        res = []

        queue: Deque[int] = deque([])
        if root:
            queue.append(root)

        while queue:
            res_level = []
            for _ in range(len(queue)):
                node = queue.popleft()
                res_level.append(node.val)

                for child in [node.left, node.right]:
                    if child:
                        queue.append(child)

            res.append(res_level)

        return res

The DFS algorithm can also be used for a level-by-level traversal, even if it is less intuitive than using a queue:

from typing import List, Optional


class TreeNode:
    def __init__(self, val,
                 left: Optional['TreeNode'] = None,
                 right: Optional['TreeNode'] = None) -> None:
        self.val = val
        self.left = left
        self.right = right


from typing import List, Optional


class Solution:
    def levelOrder(self, root: Optional[TreeNode] = None) -> List[List[int]]:
        def dfs(level: int, node: Optional[TreeNode] = None) -> None:
            if not node:
                return

            nonlocal res
            if len(res) == level:
                res.append([])

            res[level].append(node.val)

            for child in [node.left, node.right]:
                if child:
                    dfs(level + 1, child)

        res = []
        dfs(0, root)

        return res

Both solutions can be tested in LeetCode.

Binary Search Tree (BST)

A Binary Search Tree is a type of Binary Tree, where each node has a value greater than all nodes in its left subtree and less than all nodes in its right subtree. This applies to all nodes in the tree.

Compared to a standard Binary Tree, a BST has a logarithmic time complexity for searching, inserting and deleting. This is true if the BST is balanced. In fact, for an unbalanced one, we are going to have a linear time complexity as for a normal Binary Tree.

Trie (Prefix Tree)

A Trie (pronounced as try), also known as Prefix Tree, is a Tree data structure that is used to efficiently store and locate strings with a predefined set.

As the same suggest, we can store prefixes to avoid to store common strings more than once.

For example, suppose that we have the following strings:

strings = [
    "tab",
    "table",
    "tank",
    "tart",
    "task",
]

We can notice that all of them share the same prefix ta. So, we could use a tree to store these single characters and avoid to duplicate them. Lastly, we can have a flag to mark a complete word (green node). In this way, we can distinguish between a normal character and a character that forms an actual string.

Implementation

You can implement this data structure by using a n-ary tree.

from typing import Dict, Optional


class TrieNode:
    def __init__(self, is_word: Optional[bool] = False) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = is_word


class Trie:
    def __init__(self) -> None:
        self.head = TrieNode()

    def insert(self, word: str) -> None:
        p = self.head

        for char in word:
            if char not in p.children:
                p.children[char] = TrieNode()

            p = p.children[char]

        p.is_word = True

    def search(self, word: str) -> bool:
        return self._contains(word=word, match_word=True)

    def startsWith(self, prefix: str) -> bool:
        return self._contains(word=prefix, match_word=False)

    def _contains(self, word: str, match_word: bool) -> bool:
        p = self.head

        for char in word:
            if char not in p.children:
                return False

            p = p.children[char]

        return p.is_word if match_word else True

Operations

Let's check the complexity for the three Trie operations:

insert:
- Time: $O (n)$ .
- Space: $O (n)$ .
search:
- Time: $O (n)$ .
- Space: $O (1)$ .
startsWith:
- Time: $O (p)$ .
- Space: $O (1)$ .

Where $n$ is the size of the word and $p$ is the size of the prefix.

Graphs

A Graph is a collection of Vertices/Nodes (denoted as $V$ ) and Edges (denoted as $E$ ) that connect pairs of vertices.

If the Edges have a direction, then we have a Directed Graph:

Otherwise, we have an Undirected Graph:

For both graphs, the nodes are a, b, c and d. However, the edges differ between the two.

In fact, if we group the edges for the single nodes, we would have the following ones for the Directed graph:

a -> {b, c}
b -> {d}
c -> {d}
d -> {}

And these for the Undirected one:

a -> {b, c}
b -> {a, d}
c -> {a, d}
d -> {b, c}

So, we simply mapped the nodes with a list of so called Neighbors.

Connected and Disconnected Graphs

The previous Graphs were all Connected, since there was a path between each Node. However, we can also have Disconnected ones:

In this case, there is no connection between the nodes a, b and c, d.

Representation

The most common way to represent a Graph is within an Adjacency List, which can be a Python dictionary that maps the Node to the set of Neighbors.

Usually, a list of Edges is provided, where each element is either a list or a tuple of two elements. For example:

edges = [
    ["a", "b"],
    ["a", "c"],
]

Adjacency List for a Directed Graph

Since the Graph is Directed, we just need to consider a single dependency between two nodes (a single add call).

from collections import defaultdict

edges = [
    ("a", "b"),
    ("a", "c"),
]

graph = defaultdict(list)
for a, b in edges:
    graph[a].append(b)

Adjacency List for an Undirected Graph

For the Undirected one, we need a double dependency between the two nodes. In fact, node a has b as Neighbor, and vice versa.

from collections import defaultdict

edges = [
    ("a", "b"),
    ("a", "c"),
]

graph = defaultdict(list)
for a, b in edges:
    graph[a].append(b)
    graph[b].append(a)  # the only difference compared to the other implementation

Graph Traversal

As for Trees, we can traverse a Graph by using DFS or BFS.

The complexity of the two algorithms is $O (∣ V ∣ + ∣ E ∣)$ for both time and space, where:

$∣ V ∣$ ¹ represents the number of the Vertices/Nodes.
$∣ E ∣$ ¹ represents the number of the Edges.

In fact, we need $∣ E ∣$ time to build the Adjacency List and $∣ V ∣$ time to push and pop elements to and from the stack, respectively.

Regarding the space, we need to store $∣ E ∣$ edges in the dictionary and $∣ V ∣$ Vertices/Nodes in the set.

$∣ S ∣$ means the cardinality of the set $S$ , which is the number of elements.

DFS for Graphs

Let's apply a DFS algorithm to an actual problem, like Find if Path Exists in Graph.

The first thing to do is to understand which kind of Graph we have to build. The problem clearly specifies that we have a bi-directional graph, which means that the Graph is Undirected.

Now, we can build the actual Graph. In fact, only a list of Edges was provided, which is not a good data structure to use for representing it. So, let's apply the same approach explained before to build the Adjacency List.

We can either use a Recursive or Iterative approach to perform a DFS traversal to the Graph. You can notice that the code is very similar to the DFS Tree Traversal one. The only difference is in the way you get the adjacent nodes.

For a Binary Tree you have the left and right children (or a list of children for a n-ary Tree), whereas for a Graph you have a sequence of Neighbors.

Moreover, especially for the Directed Graphs, we need to keep a set of visited Nodes to avoid to revisit a Node. For a Tree we do not have such problem, since we cannot go back to the parent node from its children.

This is the Recursive approach:

from collections import defaultdict
from typing import DefaultDict, List, Set


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        graph = self.buildAdjacencyList(edges)

        return self.dfs(graph, source, destination, set())

    def dfs(
        self, graph: DefaultDict[int, List], source: int, destination: int, visited: Set
    ) -> bool:
        if source in visited:
            return False

        if source == destination:
            return True

        visited.add(source)
        for neighbor in graph[source]:
            if self.dfs(graph, neighbor, destination, visited):
                return True

        return False

    def buildAdjacencyList(self, edges: List[List[int]]) -> DefaultDict[int, List]:
        graph = defaultdict(list)

        for a, b in edges:
            graph[a].append(b)
            graph[b].append(a)

        return graph

And this is the Iterative one. As usual, we can use a Stack to emulate the recursive calls.

from collections import defaultdict
from typing import DefaultDict, List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        graph = self.buildAdjacencyList(edges)

        return self.dfs(graph, source, destination)

    def dfs(self, graph: DefaultDict[int, List], source: int, destination: int) -> bool:
        visited = {source}
        nodes = [source]

        while nodes:
            node = nodes.pop()
            if node == destination:
                return True

            for neighbor in graph[node]:
                if neighbor not in visited:
                    nodes.append(neighbor)
                    visited.add(neighbor)

        return False

    def buildAdjacencyList(self, edges: List[List[int]]) -> DefaultDict[int, List]:
        graph = defaultdict(list)

        for a, b in edges:
            graph[a].append(b)
            graph[b].append(a)

        return graph

BFS for Graphs

The same assumptions made for the DFS approach apply here too. Here is a summary:

Identify the type of the Graph.
Build the Adjacency List.
For Directed Graphs, keep a visited set to avoid to revisit Nodes.

As for Trees, let's use a Queue to implement the BFS algorithm.

from collections import defaultdict, deque
from typing import DefaultDict, List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        graph = self.buildAdjacencyList(edges)

        return self.bfs(graph, source, destination)

    def bfs(self, graph: DefaultDict[int, List], source: int, destination: int) -> bool:
        visited = {source}
        nodes = deque([source])

        while nodes:
            node = nodes.popleft()
            if node == destination:
                return True

            for neighbor in graph[node]:
                if neighbor not in visited:
                    nodes.append(neighbor)
                    visited.add(neighbor)

        return False

    def buildAdjacencyList(self, edges: List[List[int]]) -> DefaultDict[int, List]:
        graph = defaultdict(list)

        for a, b in edges:
            graph[a].append(b)
            graph[b].append(a)

        return graph

Disjoint Sets (Union-Find)

A Disjoint Set (also known as a Union-Find data structure) is a data structure that stores a collection of disjoint (non-overlapping) Sets [1].

For example:

In this case, we have a total of 3 Disjoint Sets:

{0, 1}.
{2, 3, 4}.
{5}.

From now on, let's only use Union Find.

Common Operations

A Union Find data structure allows two main operations:

union(a, b): merge the sets to which the nodes a and b belong.
find(a): determine the set for the node a.

Implementation

There are two ways¹ to implement Union Find. The complexities below refer to the Time. For the Space, we need to build a list of n nodes to contain information about the nodes, hence it will always be $O (n)$ .

Quick Find:
- $O (1)$ for find operation.
- $O (n)$ for union operation.
Quick Union.
- $O (n)$ find operation.
- $O (n)$ union operation.

At a first look, Quick Find seems better, but there are several Quick Union optimization techniques we can use to drastically reduce the overall complexity.

We are going to use Find if Path Exists in Graph problem to apply such data structures.

References

[1]: https://en.wikipedia.org/wiki/Disjoint-set_data_structure

It is also common to have a third operation, which takes two nodes as input and return True if they belong to the same set, otherwise False.

Quick Find

For this kind of implementation, we store the root vertexes of each node.

For example:

Would have the following values:

index             :  0 1 2 3 4 5
value (root node) : [0 0 0 3 3 3]

Implementation

The implementation would look something like:

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.roots = [i for i in range(self.n)]

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        root_a = self.find(a)
        root_b = self.find(b)
        if root_a == root_b:
            return

        for i, root in enumerate(self.roots):
            if root == root_b:
                self.roots[i] = root_a

    def find(self, a: int) -> int:
        return self.roots[a]

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

Usage

As explained earlier, let's apply the following data structure to Find if Path Exists in Graph.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Although this implementation is valid, we will get a Time Limit Exceeded for this problem.

Quick Union

For this kind of implementation, we store the parent vertexes of each node.

For example:

Would have the following values:

index             :  0 1 2 3 4 5
value (root node) : [0 0 1 3 3 4]

Implementation

The implementation would look something like:

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.parents = [i for i in range(self.n)]

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        parent_a = self.find(a)
        parent_b = self.find(b)
        if parent_a != parent_b:
            self.parents[parent_b] = parent_a

    def find(self, a: int) -> int:
        while a != self.parents[a]:
            a = self.parents[a]

        return a

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

Usage

Compared to the Quick Find, we have passed more test cases in Find if Path Exists in Graph, but this is not enough. We are still getting a Time Limit Exceeded.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Union by Rank in Quick Union

The Union by Rank is an optimization technique that can be applied to the Quick Union implementation.

In the previous implementations of Quick Find and Quick Union, we always chose the first node as the dominant one.

However, this could lead to a skewed list of nodes, which looks like a Linked List. As we know, for this data structure the time complexity to find an element is $O (n)$ . Therefore, the same applies here.

Instead, we could use the highest height of the set that belongs to either node a or b to choose the root node.

Now you might ask: why should we select the highest height?

Let's look at an example:

If you want to union 1 and 4, you will end up having two possible scenarios.

After calculating the parent nodes of 1 and 4, which are 0 and 4, respectively, we could build a set with 0 (height is 3) as a root node, and another with 4 (height is 0).

We will end up having two different sets:

As you might notice, choosing the root of the node with the lowest height, has given us a skewed tree.

Implementation

Besides keeping a list of parents as we did in Quick Union, we also need a rank value for each node, which indicates its height. Then, we will use this value to pick our root node.

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.parents = [i for i in range(self.n)]
        self.ranks = [1] * self.n

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        parent_a = self.find(a)
        parent_b = self.find(b)
        if parent_a == parent_b:
            return

        rank_a = self.ranks[a]
        rank_b = self.ranks[b]
        same_rank = rank_a == rank_b

        if rank_a <= rank_b:
            self.parents[parent_a] = parent_b
            if same_rank:
                self.ranks[b] += 1
        else:
            self.parents[parent_b] = parent_a

    def find(self, a: int) -> int:
        while a != self.parents[a]:
            a = self.parents[a]

        return a

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

If we apply this approach to the Find if Path Exists in Graph problem, finally our code will be accepted.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Complexity

The space complexity is $O (n)$ , since we have to store all parents and ranks in two lists of $n$ elements ( $O (2 n))$ , hence $O (n)$ ).

Regarding the time complexity, the union function just depends on the find one.

This ranking approach ensures that the complexity is $O (h)$ , where $h$ is the height of the set, which is always balanced thanks to this ranking properties. Hence, the final complexity will be $O (l o g (n))$ .

Lastly, since we need to union $m$ edges, before finding out if two nodes are connected, we need $O (m l o g (n))$ time in total.

Path Compression

The Path Compression is an optimization technique, which can be applied to the Quick Union implementation.

Let's suppose to have a skewed scenario like:

A find applied on node 3 would traverse all nodes. Calling the function again would cause the same number of iterations.

However, what if we use the find function to group the nodes into the root one?

Basically, the example above would become the following after calling find(3). The next find(3) will be must faster.

Implementation

As explained above, only the find must be changed to accomplish such optimization:

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.parents = [i for i in range(self.n)]

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        parent_a = self.find(a)
        parent_b = self.find(b)
        if parent_a != parent_b:
            self.parents[parent_b] = parent_a

    def find(self, a: int) -> int:
        if a == self.parents[a]:
            return a

        self.parents[a] = self.find(self.parents[a])

        return self.parents[a]

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

If we apply such approach in the Find if Path Exists in Graph problem, finally our code will be accepted.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Complexity

The space complexity is $O (n)$ , since we have to store all parents in a list of $n$ elements.

Regarding the time complexity, the union function just depends on the find one.

This kind of balancing allow us to achieve a similar time complexity compared to Quick Union Find by Rank, hence, $O (l o g (n))$ .

Lastly, since we need to union $m$ edges, before finding out if two nodes are connected, we need $O (m l o g (n))$ time in total.

Union by Rank + Path Compression

These two optimizations can be combined to achieve a $O (α (n))$ , where $α$ is the inverse Ackermann function, which can be considered $O (1)$ on average.

Implementation

class UnionFind:
    def __init__(self, n: int) -> None:
        self.n = n
        self.parents = [i for i in range(self.n)]
        self.ranks = [1] * self.n

    def union(self, a: int, b: int) -> None:
        self._validate(a)
        self._validate(b)

        parent_a = self.find(a)
        parent_b = self.find(b)
        if parent_a == parent_b:
            return

        rank_a = self.ranks[a]
        rank_b = self.ranks[b]
        same_rank = rank_a == rank_b
        if rank_a <= rank_b:
            self.parents[parent_b] = parent_a
            if same_rank:
                self.ranks[b] += 1
        else:
            self.parents[parent_a] = parent_b

    def find(self, a: int) -> int:
        if a == self.parents[a]:
            return a

        self.parents[a] = self.find(self.parents[a])

        return self.parents[a]

    def connected(self, a: int, b: int) -> bool:
        return self.find(a) == self.find(b)

    def _validate(self, a: int) -> None:
        assert 0 <= a < self.n

Again, applied to Find if Path Exists in Graph problem, our solution will be accepted.

from typing import List


class Solution:
    def validPath(
        self, n: int, edges: List[List[int]], source: int, destination: int
    ) -> bool:
        union_find = self.build_disjoint_set(n, edges)

        return union_find.connected(source, destination)

    def build_disjoint_set(self, n: int, edges: List[List[int]]) -> UnionFind:
        union_find = UnionFind(n)

        for a, b in edges:
            union_find.union(a, b)

        return union_find

Topological Sorting

Topological Sorting is a linear ordering of Vertices in a Directed Acyclic Graph (also known as DAG), such that for every directed edge $(u, v)$ , vertex $u$ appears before $v$ [1]. There might be more than one orders. In fact, for the following graph:

We can have these three orders:

[a, b, c, d]
[a, c, b, d]
[a, c, d, b]

Usage

This sorting is widely used to schedule tasks, solve dependencies and plan course prerequisite.

The Kahn's algorithm can be used to implement such sorting. It uses in-degrees of vertices. The vertices with zero in-degrees are repeatedly removed and the corresponding neighbors' in-degrees are updated accordingly.

Example

The Course Schedule is a classic problem where we can apply Topological Sorting by using the Kahn's algorithm.

References

[1]: https://en.wikipedia.org/wiki/Topological_sorting

Weighted Graphs

The type of edges of a Graph allows us to defined either Directed or Undirected graphs. However, the edges could also have weights, which represent the cost of moving between the vertices connected by these edges.

Shortest Path

We cannot use the standard BFS traversal to calculate the shortest path between two vertices. In fact, this implementation simply assumes the same cost for each edge, which is not true for weighted ones.

To overcome this problem, we could use some well-known algorithms.

For all of them, we are using the Network Delay Time problem to apply such algorithms.

Bellman–Ford algorithm

The Bellman–Ford algorithm can be applied to Graphs that have both positive and negative edges. However, it is slower compared to other ones. In fact, the algorithms runs in $O (∣ V ∣∣ E ∣)$ time.

from collections import deque
from typing import Dict, List, Tuple


class Solution:
    def networkDelayTime(self, times: List[List[int]], n: int, k: int) -> int:
        graph = self.get_adjacency_list(times, n)

        return self.bfs(graph, k)

    def bfs(self, graph: Dict[int, List[Tuple[int, int]]], root: int) -> int:
        queue = deque([root])
        # the first node is `1`, not `0`
        # hence, for simplicity, we create a list of `n + 1` values
        distances = [float("inf")] * (len(graph) + 1)
        distances[root] = 0
        # we have to exclude it once we calculate the maximum below
        distances[0] = float("-inf")

        while queue:
            node = queue.popleft()
            for neighbor, weight in graph[node]:
                new_distance = distances[node] + weight
                if distances[neighbor] <= new_distance:
                    continue

                queue.append(neighbor)
                distances[neighbor] = new_distance

        ans = max(distances)
        return -1 if ans == float("inf") else ans

    def get_adjacency_list(
        self, times: List[List[int]], n: int
    ) -> Dict[int, List[Tuple[int, int]]]:
        graph: Dict[int, List[Tuple[int, int]]] = {i: [] for i in range(1, n + 1)}

        for a, b, weight in times:
            graph[a].append((b, weight))

        return graph

Dijkstra's algorithm

The Dijkstra's algorithm can be applied to Graphs that have only positive edges. However, it is faster compared to the Bellman–Ford algorithm. In fact, the algorithms runs in $O ((∣ V ∣ + ∣ E ∣) l o g (∣ V ∣))$ time.

import heapq
from typing import Dict, List, Tuple


class Solution:
    def networkDelayTime(self, times: List[List[int]], n: int, k: int) -> int:
        graph = self.get_adjacency_list(times, n)

        return self.bfs(graph, k)

    def bfs(self, graph: Dict[int, List[Tuple[int, int]]], root: int) -> int:
        queue = [(0, root)]
        # the first node is `1`, not `0`
        # hence, for simplicity, we create a list of `n + 1` values
        distances = [float("inf")] * (len(graph) + 1)
        distances[root] = 0
        # we have to exclude it once we calculate the maximum below
        distances[0] = float("-inf")

        while queue:
            _, node = heapq.heappop(queue)
            for neighbor, weight in graph[node]:
                new_distance = distances[node] + weight
                if distances[neighbor] <= new_distance:
                    continue

                heapq.heappush(queue, (new_distance, neighbor))
                distances[neighbor] = new_distance

        ans = max(distances)
        return -1 if ans == float("inf") else ans

    def get_adjacency_list(
        self, times: List[List[int]], n: int
    ) -> Dict[int, List[Tuple[int, int]]]:
        graph: Dict[int, List[Tuple[int, int]]] = {i: [] for i in range(1, n + 1)}

        for a, b, weight in times:
            graph[a].append((b, weight))

        return graph

Intervals

Interval problems are often provided as a List as input, where each element is either a List or a Tuple of two elements.

      «-----»
      a     b

            «--------»
            c        d

                          «--------»
                          e        f

The most common operations we can perform on a list of intervals are:

Checking if they overlap.
Merging overlapping intervals.

Overlapping intervals

Let's take a simple interval (a, b) as an example:

      «-----»
      a     b

For another interval (c, d) to overlap, the c point must be somewhere between a and b:

      «-----»
      a     b

        «---...
        c

Which forms the first condition $b \geq c$ . However, this not sufficient for the overlapping. In fact, let's have a look at this example:

      «-----»
      a     b

«-»
c d

So, also the d must be in a certain position compared to the other interval, which is behind a:

      «-----»
      a     b

«--------»
c        d

Then, our overlapping condition is: $b \geq c \land d \geq a$ ¹.

Merging intervals

Again, let's draw some intervals to get the idea:

    «-------»
    a       b

«--------»
c        d

    «-------»
    a       b

        «--------»
        c        d

It looks way simpler now:

x, y = [
    max(a, c),
    min(b, d),
]

You can use this simple trick to memorize the formula. The symbol is always $\geq$ , so nothing special here. For the actual condition, just memorize that you start from b and then go till a, thus bc and da. Yes, this is not rocket science. 🚀

Heaps

A Heap is a Tree-based data structure that satisfies the heap property [1]:

Min Heap: for any node $n$ , its parent $p$ is $\leq n$ .
Max Heap: for any node $n$ , its parent $p$ is $\geq n$ .

The Heap is an extremely efficient implementation of the Priority Queue abstract data type. In fact, priority queues are often simply called "heaps" regardless of their specific implementation [1].

A Priority Queue allows very similar operations compared to Queues, but with the difference that we can also assign a priority to the items, which could prioritize or de-prioritize the item based on its value.

Operations

Let's check the complexity of heaps' common operations:

Operation	Time Complexity	Space Complexity	Syntax
Size	$O (1)$	$O (1)$	`len(heap)`
Heapify¹	$O (n)$	$O (n)$	`heapq.heapify(lst)`
Minimum	$O (l o g (n))$	$O (1)$	`min_heap[0]`
Maximum	$O (l o g (n))$	$O (1)$	`max_heap[0]`

Usage

Check the heap section in the Python cheatsheet.

References

[1]: https://en.wikipedia.org/wiki/Heap_(data_structure)

Heapify is the process of converting a list into a heap.

Backtracking

Backtracking algorithms heavily use DFS¹ to enumerate all the possible solutions to a problem. These candidates are expressed as the nodes of a tree structure, which is the potential Search Tree [1].

Examples

The Subsets is a classic problem suitable for Backtracking.

At each call, we can decide whether or not to include the current element. This decision requires two different function calls that will derive the following Search Tree from the input [1, 2, 3].

If we collect the results of the 8 leaf nodes, we will get all our subsets.

Analyze the Complexity

The Time Complexity is proportional to the number of nodes in the Search Tree, multiplied by the amount of work we do in each call.

We can use the formula $∣ c hi l d re n ∣^{h e i g h t}$ to derive the number of nodes. If we apply this formula to the previous problem, we get $∣ c hi l d re n ∣ = 2$ and $h e i g h t = 3$ , where the height is actually the number of nodes in the input. Plus, we need to consider the work we do to append the solutions to the final result list, which is an additional $O (n)$ time. This would make the entire complexity to $O (n \cdot 2^{n})$ .

The Space Complexity is proportional to the height of the Search Tree. In fact, once we reach a leaf node, we simply terminate our search and go back to the parent nodes.

References

[1]: https://en.wikipedia.org/wiki/Backtracking

We already saw the DFS algorithm applied to both Trees and Graphs.

Dynamic Programming

Dynamic Programming problems can be solved by a Bottom-up approach and Top-down one.

In the former, we solve the sub-problems first and then build up the solution. In the latter, we use a very similar approach compared to Backtracking: we still build our Search Tree, but we also avoid to re-calculate overlapping problems. In order to do so, we can adopt a technique called Memoization, which is a simple container of our solutions.

I personally prefer the Top-down one, since it comes more naturally to implement once you have derived the Search Tree.

Example

A classic problem to apply Dynamic Programming is the Fibonacci one. The naive solution would be something like:

def fib(n: int) -> int:
    if n <= 1:
        return n

    return fib(n - 1) + fib(n - 2)

Which has a Time and Space complexity of $O (2^{n})$ and $O (n)$ , respectively.

To understand how to apply Dynamic Programming, let's derive the Search Tree of the previous snippet for fib(4).

As you can see, there are two calls for fib, but do we actually need both of them?

The answer is no, and we can use Memoization to store the result of fib(2). A Dictionary suits very well to perform such task.

from typing import Dict


def fib(n: int) -> int:
    def dfs(n: int, memo: Dict[int, int]) -> int:
        assert n >= 0

        if n <= 1:
            return n
        if n in memo:
            return memo[n]

        res = 0
        res += dfs(n - 1, memo)
        res += dfs(n - 2, memo)

        memo[n] = res
        return memo[n]

    return dfs(n, {})

Thanks to this kind of cache, we are able to avoid some function calls and this reduces the previous Space Tree to:

Which means that we also reduce our Time complexity to $O (n)$ .

`functools.lru_cache`

The same result can be achieved by using the lru_cache decorator from functools.

from functools import lru_cache


@lru_cache
def fib(n: int) -> int:
    if n <= 1:
        return n

    return fib(n - 1) + fib(n - 2)

Divide and Conquer

A Divide and Conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved. The solutions to the sub-problems are then combined to give a solution to the original problem [1].

Merge Sort

The Merge Sort is a classic Divide and Conquer algorithm [2]. This algorithm can be divided into two steps [2]:

Divide the input list into n sub-lists, each containing one element.
Repeatedly merge these sub-lists to produce new sorted sub-lists. This merge process is repeated until there is only one sub-list remaining.

from typing import List


class Solution:
    def sortArray(self, nums: List[int]) -> List[int]:
        return self.merge_sort(nums)

    def merge_sort(self, nums: List[int]) -> List[int]:
        """
        Step 1) described above
        """
        if len(nums) <= 1:
            return nums

        mid = len(nums) // 2
        left = self.merge_sort(nums[:mid])
        right = self.merge_sort(nums[mid:])

        return self.merge(left, right)

    def merge(self, a: List[int], b: List[int]) -> List[int]:
        """
        Step 2) described above
        """
        res = []
        n, m = len(a), len(b)
        i = j = 0

        while i < n and j < m:
            if a[i] < b[j]:
                res.append(a[i])
                i += 1
            else:
                res.append(b[j])
                j += 1

        # add remained elements from `a` (if any)
        res.extend(a[i:])
        # add remained elements from `b` (if any)
        res.extend(b[j:])

        return res

The solution can be verified here.

References

Python for Coding Interviews

This section summarizes all the Python constructs you need to solve any LeetCode problem. I would be quite surprised if you need something else to do so.

All the information that you see here comes from a repository I created a few years back, which is python-for-coding-interviews.

Whatever you find here, it is also in the repository, where you can find a PDF. Feel free to open PRs in the repository if you think that something is missing.

In any case, I will also include a Python section in every topic, where I will explain the actual things you need to know and likely use.

🐍 Enjoy! 🐍

Primitive Types

Booleans (bool).
Integers (int).
Floats (float).
Strings (str).

# Define variables
>>> b, i, f, s = True, 12, 8.31, 'Hello, world!'
>>> type(b)  # <class 'bool'>
>>> type(i)  # <class 'int'>   ~ Unbounded
>>> type(f)  # <class 'float'> ~ Bounded
>>> type(s)  # <class 'str'>

# Type Conversion
>>> str(i)
'12'
>>> float(i)
12.0
>>> str(b)
'True'
>>> int('10')
10
>>> int('10a')  # `ValueError: invalid literal for int() with base 10: '10a'`

# Operations
>>> 5 * 2
10
>>> 5 * 2.
10.0
>>> 5 / 2
2.5
>>> 5 // 2  # `//` is the integer division
2
>>> 5 % 2
1

# `min` and `max`
>>> min(4, 2)
2
>>> max(21, 29)
29

# Some useful math functions
>>> abs(-1.2)
1.2
>>> divmod(9, 4)
(2, 1)
>>> 2 ** 3  # Equivalent to `pow(2, 3)`
8

# Math functions from the `math` package
>>> import math
>>> math.ceil(7.2)
8
>>> math.floor(7.2)
7
>>> math.sqrt(4)
2.0

# Pseudo lower and upper bounds
>>> float('-inf')  # Pseudo min-int
-inf
>>> float('inf')  # Pseudo max-int
inf

# Pseudo lower and upper bounds (Python >= 3.5)
>>> import math

>>> math.inf
inf
>>> -math.inf
-inf

`range` and `enumerate`

# `range`
>>> list(range(3))  # Equivalent to `range(0, 3)`
[0, 1, 2]
>>> list(range(1, 10, 2))
[1, 3, 5, 7, 9]
>>> for i in range(3): print(i)
0
1
2
>>> for i in range(2, -1, -1): print(i)  # Equivalent to `reversed(range(3))`
2
1
0

# `enumerate`
>>> for i, v in enumerate(range(3)): print(i, v)
0 0
1 1
2 2
>>> for i, v in enumerate(range(3), start=10): print(i, v)
10 0
11 1
12 2

# Reversed `enumerate`
>>> for i, v in reversed(list(enumerate(['a', 'b', 'c']))): print(i, v)
2 c
1 b
0 a

Tuples

>>> t = (1, 2, 'str')
>>> type(t)
<class 'tuple'>
>>> t
(1, 2, 'str')
>>> len(t)
3

>>> t[0] = 10  # Tuples are immutable: `TypeError: 'tuple' object does not support item assignment`

>>> a, b, c = t  # Unpacking
>>> a
1
>>> b
2
>>> c
'str'
>>> a, _, _ = t  # Unpacking: ignore second and third elements
>>> a
1

Lists

Python uses Timsort algorithm in sort and sorted (https://en.wikipedia.org/wiki/Timsort).

# Define a list
>>> l = [1, 2, 'a']
>>> type(l)  # <class 'list'>
>>> len(l)
3
>>> l[0]  # First element of the list
1
>>> l[-1]  # Last element of the list (equivalent to `l[len(l) - 1]`)
'a'

# Slicing
>>> l[:]  # `l[start:end]` which means `[start, end)`
[1, 2, 'a']
>>> l[0:len(l)]  # `start` is 0 and `end` is `len(l)` if omitted
[1, 2, 'a']

# Some useful methods
>>> l.append('b')  # `O(1)`
>>> l.pop()  # `O(1)` just for the last element
'b'
>>> l.pop(0)  # `O(n)` since list must be shifted
1
>>> l
[2, 'a']
>>> l.remove('a')  # `O(n)`
>>> l.remove('b')  # `ValueError: list.remove(x): x not in list`
>>> l
[2]
>>> l.index(2)  # It returns first occurrence (`O(n)`)
0
>>> l.index(12)  # `ValueError: 12 is not in list`

# More compact way to define a list
>>> l = [0] * 5
>>> l
[0, 0, 0, 0, 0]
>>> len(l)
5
>>> [k for k in range(5)]
[0, 1, 2, 3, 4]
>>> [k for k in reversed(range(5))]
[4, 3, 2, 1, 0]

# Compact way to define 2D arrays
>>> rows, cols = 2, 3
>>> m = [[0] * cols for _ in range(rows)]
>>> len(m) == rows
True
>>> all(len(m[k]) == cols for k in range(rows))
True

# Built-in methods
>>> l = [3, 1, 2, 0]
>>> len(l)
4
>>> min(l)
0
>>> max(l)
3
>>> sum(l)
6
>>> any(v == 3 for v in l)
True
>>> any(v == 5 for v in l)
False
>>> all(v >= 0 for v in l)
True

# Sort list in-place (`sort`)
>>> l = [10, 2, 0, 1]
>>> l
[10, 2, 0, 1]
>>> l.sort()  # It changes the original list
>>> l
[0, 1, 2, 10]
>>> l.sort(reverse=True)  # It changes the original list
>>> l
[10, 2, 1, 0]

# Sort a list a return a new one (`sorted`)
>>> l = [10, 2, 0, 1]
>>> sorted(l)  # It returns a new list
[0, 1, 2, 10]
>>> l  # Original list is not sorted
[10, 2, 0, 1]

# Sort by a different key
>>> students = [
    ('Mark', 21),
    ('Luke', 20),
    ('Anna', 18),
]
>>> sorted(students, key=lambda s: s[1])  # It returns a new list
[('Anna', 18), ('Luke', 20), ('Mark', 21)]
>>> students.sort(key=lambda s: s[1])  # In-place
>>> students
[('Anna', 18), ('Luke', 20), ('Mark', 21)]

Strings

>>> s = 'Hello, world!'
>>> type(s)  # <class 'str'>
>>> len(s)
13

>>> s[0] = 'h'  # Strings are immutable: `TypeError: 'str' object does not support item assignment`
>>> s += ' Another string'  # A new string will be created, so concatenation is quite slow

>>> s = 'Hello'
>>> l = list(s)
>>> l
['H', 'e', 'l', 'l', 'o']
>>> l[0] = 'h'
>>> ''.join(l)
'hello'

>>> 'lo' in s
True
>>> ord('a')
97
>>> chr(97)
'a'

Stacks

>>> stack = []  # We can use a normal list to simulate a stack

>>> stack.append(0)  # `O(1)`
>>> stack.append(1)
>>> stack.append(2)

>>> len(stack)
3

>>> stack[0]  # Bottom of the stack
0
>>> stack[-1]  # Top of the stack
2

>>> stack.pop()  # `O(1)`
2
>>> stack.pop()
1

>>> len(stack)
1

>>> stack.pop()
0

>>> stack.pop()  # `IndexError: pop from empty list`
>>> stack[-1]    # `IndexError: pop from empty list`

Queues

>>> from collections import deque

>>> queue = deque()

# Enqueue -> append()
>>> queue.append(0)  # `O(1)`
>>> queue.append(1)
>>> queue.append(2)

>>> len(queue)
3

>>> queue[0]  # Head of the queue
0
>>> queue[-1]  # Tail of the queue
2

# Dequeue -> popleft()
>>> queue.popleft()  # `O(1)`
0
>>> queue.popleft()
1

>>> len(queue)
1

>>> queue.popleft()
2

>>> len(queue)
0

>>> queue.popleft()  # `IndexError: pop from an empty deque`
>>> queue[0]  # `IndexError: pop from an empty deque`

Sets

>>> s = set()
>>> s.add(1)
>>> s.add(2)
>>> s
{1, 2}
>>> len(s)
2
>>> s.add(1)  # Duplicate elements are not allowed per definition
>>> s
{1, 2}
>>> s.add('a')  # We can mix types
>>> s
{1, 2, 'a'}
>>> 1 in s  # `O(1)`
True
>>> s.remove(1)
>>> s
{2, 'a'}
>>> s.remove(1)  # `KeyError: 1`
>>> s.discard(1)  # It won't raise `KeyError` even if the element does not exist
>>> s.pop()  # Remove and return an arbitrary element from the set
2

>>> s0 = {1, 2, 'a'}
>>> s0
{1, 2, 'a'}
>>> s1 = set([1, 2, 'a'])
>>> s1
{1, 2, 'a'}

>>> s0 = {1, 2}
>>> s1 = {1, 3}
>>> s0 | s1
{1, 2, 3}
>>> s0.union(s1)  # New set will be returned
{1, 2, 3}

>>> s0 = {1, 2}
>>> s1 = {1, 3}
>>> s0 & s1
{1}
>>> s0.intersection(s1)  # New set will be returned
{1}

>>> s0 = {1, 2}
>>> s1 = {1, 3}
>>> s0 - s1
{2}
>>> s0.difference(s1)
{2}

Hash Tables

>>> d = {'a': 'hello, world', 'b': 11}  # Equivalent to `dict(a='hello, world', b=11)`
>>> type(d)
<class 'dict'>
>>> d
{'a': 'hello, world', 'b': 11}

>>> d.keys()
dict_keys(['a', 'b'])
>>> d.values()
dict_values(['hello, world', 11])
>>> for k, v in d.items(): print(k, v)
a hello, world
b 11

>>> 'a' in d  # `O(1)`
True
>>> 1 in d
False
>>> d['a'] += '!'
>>> d
{'a': 'hello, world!', 'b': 11}
>>> d[1] = 'a new element'
>>> d
{'a': 'hello, world!', 'b': 11, 1: 'a new element'}

>>> d[0] += 10  # `KeyError: 0`
>>> d.get(0, 1)  # Return `1` as default value since key `0` does not exist
1
>>> d.get(1, '?')  # Key `1` exists, so the actual value will be returned
'a new element'
>>> d.get(10) is None
True

Heaps

The following commands show how to work with a min heap. Currently, Python does not have public methods for the max heap. You can overcome this problem by applying one of the following strategies:

Invert the value of each number. So, for example, if you want to add 1, 2 and 3 in the min heap, you can heappush -3, -2 and -1. When you heappop you invert the number again to get the proper value. This solution clearly works if your domain is composed by numbers >= 0.
Invert your object comparison.

>>> import heapq

>>> min_heap = [3, 2, 1]
>>> heapq.heapify(min_heap)
>>> min_heap
[1, 2, 3]

>>> min_heap = []
>>> heapq.heappush(min_heap, 3)  # `O(log n)`
>>> heapq.heappush(min_heap, 2)
>>> heapq.heappush(min_heap, 1)

>>> min_heap
[1, 3, 2]
>>> len(min_heap)
>>> min_heap[0]
1
>>> heapq.heappop(min_heap)  # `O(log n)`
1
>>> min_heap
[2, 3]

>>> heapq.heappop(min_heap)
2
>>> heapq.heappop(min_heap)
3
>>> heapq.heappop(min_heap)  # `IndexError: index out of range`

collections

Container data types in the (collections package).

collections.namedtuple

>>> from collections import namedtuple

>>> Point = namedtuple('Point', 'x y')

>>> p0 = Point(1, 2)
>>> p0
Point(x=1, y=2)
>>> p0.x
1
>>> p0.y
2

>>> p1 = Point(x=1, y=2)
>>> p0 == p1
True

# Python >= 3.6.1
>>> from typing import NamedTuple

>>> class Point(NamedTuple):
        x: int
        y: int

>>> p0 = Point(1, 2)
>>> p1 = Point(x=1, y=2)
>>> p0 == p1
True

collections.defaultdict

>>> from collections import defaultdict

>>> d = defaultdict(int)
>>> d['x'] += 1
>>> d
defaultdict(<class 'int'>, {'x': 1})
>>> d['x'] += 2
>>> d
defaultdict(<class 'int'>, {'x': 3})
>>> d['y'] += 10
>>> d
defaultdict(<class 'int'>, {'x': 3, 'y': 10})

>>> d = defaultdict(list)
>>> d['x'].append(1)
>>> d['x'].append(2)
>>> d
defaultdict(<class 'list'>, {'x': [1, 2]})

collections.Counter

>>> from collections import Counter

>>> c = Counter('abcabcaa')
>>> c
Counter({'a': 4, 'b': 2, 'c': 2})
>>> c.keys()
dict_keys(['a', 'b', 'c'])
>>> c.items()
dict_items([('a', 4), ('b', 2), ('c', 2)])
>>> for k, v in c.items(): print(k, v)
a 4
b 2
c 2
>>> c['d']  # It acts as a `defaultdict` for missing keys
0

collections.OrderedDict

Please, notice that, since Python 3.6, the order of items in a dictionary is guaranteed to be preserved.

>>> from collections import OrderedDict

>>> d = OrderedDict()

>>> d['first'] = 1
>>> d['second'] = 2
>>> d['third'] = 3
>>> d
OrderedDict([('first', 1), ('second', 2), ('third', 3)])

>>> for k, v in d.items(): print(k, v)
first 1
second 2
third 3

Grind 75

Grind 75 is an updated list of problems from the author of Blind 75.

This list actually contains a total of 169 questions. However, I will start from the first 75 and then add the other ones gradually.

Lastly, I have just replaced String to Integer (atoi) with Two Sum II - Input Array Is Sorted, which is much more useful.

Approach a Problem

The Tech Interview Handbook provides a great list of steps that could be used to approach and solve coding interview questions.

Be sure to familiarize with such steps and try to apply them also when you solve LeetCode questions by yourself.

Two Sum

The problem description can be found in LeetCode.

Approach

A brute force approach would consist of generating all possible pairs in the input list and calculate the sum. This would require $O (n^{2})$ time and $O (1)$ space. The code would look something like this:

from typing import List


class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        n = len(nums)

        for i in range(n):
            for j in range(i + 1, n):
                curr_sum = nums[i] + nums[j]
                if curr_sum == target:
                    return [i, j]

        raise Exception("expected one solution")

Of course, we can do better in term of time. For sure, we have to iterate until the end of the list, but then we need an auxiliary data structure to use to memorize something.

We could store the complement of the i-th number and then look for it the next iteration. If the i-th number is already in our collection, then it means that we found the other addend.

For a quick look-up table, we can use a Python dictionary.

Let's take an example:

nums = [1, 2, 10]
target = 3

Initially, our dictionary will be empty. The first number to check is 1, which means that our complement is 2 (target - current num). So our dictionary will be the following after the first iteration:

d = {
    2: 0,
}

Where the key is the complement, and the value is the index (remember that we have to return the index of the two numbers).

Then, in the second iteration, we can see that number 2 is already in the dictionary, which means that we found the second addend. Hence, we can return the index in the dictionary entry and the current one.