Algorithms & Data Structures Crash Course

Master Essential Algorithms and Data Structures

Lesson 1: Big O Notation & Complexity Analysis

What is Big O Notation?

Big O notation describes how an algorithm's running time or space requirements grow with input size. By convention it is used to describe the worst-case scenario, and it can characterize either the execution time or the memory an algorithm uses.

Key Concept: Big O measures how runtime or space requirements grow as input size increases, not the actual runtime in seconds.
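To make the growth idea concrete, here is a small sketch (my own, not part of the lesson) that counts operations rather than measuring seconds: doubling the input doubles the work for a linear scan, but adds only one step for a strategy that halves the problem.

```python
def linear_comparisons(n):
    # A linear scan over n items performs up to n comparisons
    return n

def halving_comparisons(n):
    # A strategy that halves the problem takes about log2(n) steps
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

for n in [1_000, 2_000, 4_000]:
    print(n, linear_comparisons(n), halving_comparisons(n))
```

Each doubling of n doubles the linear count but adds just one halving step, which is exactly the O(n) vs O(log n) distinction in the table above.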

Common Time Complexities

Notation Name Example Description
O(1) Constant Array access Always takes the same time
O(log n) Logarithmic Binary search Divides problem in half each time
O(n) Linear Linear search Time grows linearly with input
O(n log n) Linearithmic Merge sort Efficient sorting algorithms
O(n²) Quadratic Bubble sort Nested loops over input
O(2ⁿ) Exponential Recursive Fibonacci Doubles with each addition to input

Examples of Time Complexity

# O(1) - Constant Time
def get_first_element(arr):
    return arr[0]  # Always one operation

# O(n) - Linear Time
def find_max(arr):
    max_val = arr[0]
    for num in arr:  # n iterations
        if num > max_val:
            max_val = num
    return max_val

# O(n²) - Quadratic Time
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):           # n iterations
        for j in range(n - 1):   # n iterations
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

# O(log n) - Logarithmic Time
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1   # Eliminate half
        else:
            right = mid - 1  # Eliminate half
    return -1

# O(n log n) - Linearithmic Time
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # log n levels of division
    right = merge_sort(arr[mid:])
    return merge(left, right)      # n work per level (merge is defined in Lesson 3)

Space Complexity

Space complexity measures the amount of memory an algorithm uses relative to the input size.

# O(1) Space - Constant space
def sum_array(arr):
    total = 0  # Only one variable
    for num in arr:
        total += num
    return total

# O(n) Space - Linear space
def create_copy(arr):
    new_arr = []
    for item in arr:
        new_arr.append(item)  # New array of size n
    return new_arr

# O(n) Space - Recursive call stack
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)  # n stack frames

Analyzing Complex Code

# What's the time complexity?
def example1(arr):
    for i in range(len(arr)):  # O(n)
        print(arr[i])          # O(1)
    # Total: O(n)

def example2(arr):
    for i in range(len(arr)):      # O(n)
        for j in range(len(arr)):  # O(n)
            print(arr[i], arr[j])  # O(1)
    # Total: O(n²)

def example3(arr):
    for i in range(len(arr)):  # O(n)
        print(arr[i])
    for i in range(len(arr)):      # O(n)
        for j in range(len(arr)):  # O(n)
            print(arr[i], arr[j])
    # Total: O(n) + O(n²) = O(n²) - we keep the dominant term
Important Rule: When combining complexities, drop constants and keep only the dominant (fastest-growing) term. O(2n + 100) becomes O(n), and O(n² + n) becomes O(n²).
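A quick sketch (my own, not from the lesson) of why the dominant term wins: count the actual operations of an O(n) pass followed by an O(n²) pass and compare against n² alone.

```python
def count_ops(n):
    # One linear pass (n ops) followed by a nested pass (n² ops)
    ops = 0
    for _ in range(n):
        ops += 1
    for _ in range(n):
        for _ in range(n):
            ops += 1
    return ops

for n in [10, 100, 1000]:
    ratio = count_ops(n) / (n * n)
    print(n, ratio)  # ratio = 1 + 1/n, approaching 1: n² dominates
```

As n grows, the linear term contributes a vanishing fraction of the total work, which is why O(n² + n) simplifies to O(n²).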

Best, Average, and Worst Case

  • Best Case: Minimum time required (often not useful)
  • Average Case: Expected time for typical input
  • Worst Case: Maximum time required (Big O focuses on this)
Rule of Thumb: For competitive programming, if n ≤ 10,000: O(n²) is acceptable. If n ≤ 1,000,000: need O(n log n) or better. If n ≤ 100,000,000: need O(n) or better.
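Linear search makes the three cases tangible (an illustrative sketch, not from the lesson): the comparison count depends on where, or whether, the target appears.

```python
def linear_search_count(arr, target):
    # Returns (index, number of comparisons made)
    comparisons = 0
    for i, value in enumerate(arr):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons

data = list(range(100))
print(linear_search_count(data, 0))    # Best case: found immediately, 1 comparison
print(linear_search_count(data, 50))   # Typical case: about n/2 comparisons
print(linear_search_count(data, 999))  # Worst case: n comparisons, not found
```

Big O reports the worst of these: linear search is O(n) even though it sometimes finishes after one comparison.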

Test Your Knowledge - Lesson 1

1. What is the time complexity of accessing an element in an array by index?

2. Which complexity is better (faster) for large inputs?

3. What is the time complexity of two nested loops that both iterate n times?

Lesson 2: Searching Algorithms

Linear Search

The simplest search algorithm: check every element until you find the target.

# Linear Search - O(n) time, O(1) space
def linear_search(arr, target):
    """
    Search for target in array sequentially.
    Returns index if found, -1 otherwise.
    """
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

# Example usage
numbers = [64, 34, 25, 12, 22, 11, 90]
result = linear_search(numbers, 22)
print(f"Found at index: {result}")  # Output: 4

# Use case: Unsorted arrays, small datasets
When to use Linear Search: Works on unsorted data, simple to implement, efficient for small arrays (n < 100).

Binary Search

Efficient search algorithm for sorted arrays. Divides search space in half each iteration.

# Binary Search - O(log n) time, O(1) space
def binary_search(arr, target):
    """
    Search for target in a SORTED array using divide and conquer.
    Returns index if found, -1 otherwise.
    """
    left = 0
    right = len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1   # Search right half
        else:
            right = mid - 1  # Search left half
    return -1

# Example usage
sorted_numbers = [11, 12, 22, 25, 34, 64, 90]
result = binary_search(sorted_numbers, 25)
print(f"Found at index: {result}")  # Output: 3

# Recursive version
def binary_search_recursive(arr, target, left, right):
    if left > right:
        return -1
    mid = (left + right) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)
Prerequisite: Binary search ONLY works on sorted arrays. If the array is not sorted, you must sort it first or use linear search.
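A minimal demonstration of this prerequisite (my own sketch): the same binary search that succeeds on sorted input can miss an element that is present in unsorted input, because halving only eliminates the correct half when the array is ordered.

```python
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

unsorted = [64, 34, 25, 12, 22, 11, 90]
print(binary_search(unsorted, 11))          # -1: misses 11 even though it is present
print(binary_search(sorted(unsorted), 11))  # 0: works after sorting
```

Note that sorting first costs O(n log n), so for a single lookup on unsorted data a plain linear search is cheaper.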

Binary Search Variations

# Find first occurrence
def find_first(arr, target):
    left, right = 0, len(arr) - 1
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            result = mid
            right = mid - 1  # Continue searching left
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return result

# Find last occurrence
def find_last(arr, target):
    left, right = 0, len(arr) - 1
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            result = mid
            left = mid + 1  # Continue searching right
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return result

# Find insert position
def search_insert(arr, target):
    """Find index where target should be inserted to maintain sorted order"""
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return left  # Insert position

Jump Search

Jump Search is a searching algorithm for sorted arrays that jumps ahead by fixed steps.

# Jump Search - O(√n) time
import math

def jump_search(arr, target):
    """
    Jump ahead by √n steps, then do linear search.
    Works on sorted arrays.
    """
    n = len(arr)
    step = int(math.sqrt(n))
    prev = 0

    # Find block where element may be present
    while arr[min(step, n) - 1] < target:
        prev = step
        step += int(math.sqrt(n))
        if prev >= n:
            return -1

    # Linear search in identified block
    while arr[prev] < target:
        prev += 1
        if prev == min(step, n):
            return -1

    # If element found
    if arr[prev] == target:
        return prev
    return -1

# Example
arr = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377]
print(jump_search(arr, 55))  # Output: 10

Interpolation Search

Can outperform binary search on sorted data whose values are uniformly distributed.

# Interpolation Search - O(log log n) for uniform data
def interpolation_search(arr, target):
    """
    Like binary search but estimates position based on value.
    Best for uniformly distributed data.
    """
    left = 0
    right = len(arr) - 1
    while left <= right and arr[left] <= target <= arr[right]:
        # Avoid division by zero
        if left == right:
            if arr[left] == target:
                return left
            return -1
        # Estimate position
        pos = left + int(((right - left) / (arr[right] - arr[left])) * (target - arr[left]))
        if arr[pos] == target:
            return pos
        if arr[pos] < target:
            left = pos + 1
        else:
            right = pos - 1
    return -1

# Works well on uniformly distributed data
arr = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(interpolation_search(arr, 70))  # Output: 6
Choosing the Right Search:
• Unsorted data: Linear Search
• Sorted data (general): Binary Search
• Sorted, uniformly distributed: Interpolation Search
• Very large sorted data: Jump Search
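The decision rules above can be sketched as a tiny dispatcher (my own illustrative helper, not part of the lesson; the flag names are assumptions):

```python
def choose_search(is_sorted, uniformly_distributed=False, very_large=False):
    # Encodes the chooser above; returns the name of the recommended algorithm
    if not is_sorted:
        return "linear"
    if uniformly_distributed:
        return "interpolation"
    if very_large:
        return "jump"
    return "binary"

print(choose_search(is_sorted=False))                             # linear
print(choose_search(is_sorted=True))                              # binary
print(choose_search(is_sorted=True, uniformly_distributed=True))  # interpolation
```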

Test Your Knowledge - Lesson 2

1. What is a prerequisite for binary search to work?

2. What is the time complexity of binary search?

3. Which search algorithm works on unsorted arrays?

Lesson 3: Sorting Algorithms

Bubble Sort

Simple sorting algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order.

# Bubble Sort - O(n²) time, O(1) space
def bubble_sort(arr):
    """
    Repeatedly swap adjacent elements if in wrong order.
    Largest element "bubbles" to the end each pass.
    """
    n = len(arr)
    for i in range(n):
        swapped = False
        # Last i elements are already sorted
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        # If no swaps, array is sorted
        if not swapped:
            break
    return arr

# Example
numbers = [64, 34, 25, 12, 22, 11, 90]
print(bubble_sort(numbers))  # Output: [11, 12, 22, 25, 34, 64, 90]

Selection Sort

Divides the array into sorted and unsorted regions. Repeatedly selects the smallest element from the unsorted region.

# Selection Sort - O(n²) time, O(1) space
def selection_sort(arr):
    """
    Find the minimum element and swap it with the first unsorted element.
    """
    n = len(arr)
    for i in range(n):
        # Find minimum in unsorted portion
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Swap minimum with first unsorted element
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr

# Example
numbers = [64, 25, 12, 22, 11]
print(selection_sort(numbers))  # Output: [11, 12, 22, 25, 64]

Insertion Sort

Builds final sorted array one item at a time. Efficient for small datasets and nearly sorted data.

# Insertion Sort - O(n²) time, O(1) space
# Best case O(n) for nearly sorted data
def insertion_sort(arr):
    """
    Take each element and insert it into its correct
    position in the sorted portion.
    """
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Move elements greater than key one position ahead
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr

# Example
numbers = [12, 11, 13, 5, 6]
print(insertion_sort(numbers))  # Output: [5, 6, 11, 12, 13]
Simple Sorts Summary: Bubble, Selection, and Insertion sorts are all O(n²) but easy to implement. Good for small arrays (n < 50) or nearly sorted data.

Merge Sort

Divide and conquer algorithm. Divides array into halves, recursively sorts them, and merges results.

# Merge Sort - O(n log n) time, O(n) space
def merge_sort(arr):
    """
    Divide array in half, recursively sort, then merge.
    Guaranteed O(n log n) performance.
    """
    if len(arr) <= 1:
        return arr
    # Divide
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Conquer (merge)
    return merge(left, right)

def merge(left, right):
    """Merge two sorted arrays into one sorted array"""
    result = []
    i = j = 0
    # Merge smaller elements first
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Add remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

# Example
numbers = [38, 27, 43, 3, 9, 82, 10]
print(merge_sort(numbers))  # Output: [3, 9, 10, 27, 38, 43, 82]

Quick Sort

Divide and conquer algorithm using a pivot. Average case O(n log n), worst case O(n²).

# Quick Sort - O(n log n) average, O(n²) worst, O(log n) space
def quick_sort(arr):
    """
    Pick a pivot, partition array so elements < pivot are left,
    elements > pivot are right. Recursively sort partitions.
    """
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

# In-place version (more efficient)
def quick_sort_inplace(arr, low, high):
    if low < high:
        # Partition and get pivot index
        pi = partition(arr, low, high)
        # Recursively sort before and after partition
        quick_sort_inplace(arr, low, pi - 1)
        quick_sort_inplace(arr, pi + 1, high)

def partition(arr, low, high):
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

# Example
numbers = [10, 7, 8, 9, 1, 5]
print(quick_sort(numbers))  # Output: [1, 5, 7, 8, 9, 10]

Heap Sort

Uses a binary heap data structure. O(n log n) time, O(1) space.

# Heap Sort - O(n log n) time, O(1) space
def heap_sort(arr):
    """
    Build max heap, then repeatedly extract the maximum.
    """
    n = len(arr)
    # Build max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Extract elements one by one
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]  # Swap
        heapify(arr, i, 0)
    return arr

def heapify(arr, n, i):
    """Maintain heap property"""
    largest = i
    left = 2 * i + 1
    right = 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

# Example
numbers = [12, 11, 13, 5, 6, 7]
print(heap_sort(numbers))  # Output: [5, 6, 7, 11, 12, 13]

Sorting Algorithm Comparison

Algorithm Best Average Worst Space Stable
Bubble Sort O(n) O(n²) O(n²) O(1) Yes
Selection Sort O(n²) O(n²) O(n²) O(1) No
Insertion Sort O(n) O(n²) O(n²) O(1) Yes
Merge Sort O(n log n) O(n log n) O(n log n) O(n) Yes
Quick Sort O(n log n) O(n log n) O(n²) O(log n) No
Heap Sort O(n log n) O(n log n) O(n log n) O(1) No
When to Use Which Sort:
• Small arrays: Insertion Sort
• Need guaranteed O(n log n): Merge Sort or Heap Sort
• Average case, in-place: Quick Sort
• Nearly sorted data: Insertion Sort
• Need stability: Merge Sort or Bubble Sort
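Stability matters when records share a sort key. Python's built-in sorted uses Timsort, which is stable, so it can illustrate the property (my own sketch, not part of the lesson):

```python
# Records: (name, grade). Sorting by grade should keep records with
# equal grades in their original relative order if the sort is stable.
students = [("alice", 2), ("bob", 1), ("carol", 2), ("dave", 1)]
by_grade = sorted(students, key=lambda s: s[1])
print(by_grade)
# [('bob', 1), ('dave', 1), ('alice', 2), ('carol', 2)]
# bob stays before dave and alice before carol: original order preserved
```

An unstable sort (like the Selection Sort or Quick Sort above) gives no such guarantee: equal-keyed records may be reordered.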

Test Your Knowledge - Lesson 3

1. Which sorting algorithm has guaranteed O(n log n) time complexity in all cases?

2. What is the space complexity of Merge Sort?

3. Which sort is best for nearly sorted data?

Lesson 4: Data Structures - Part 1

Arrays and Lists

Contiguous memory locations storing elements of the same type.

# Arrays in Python (using lists)
# Access: O(1), Search: O(n), Insert/Delete: O(n)

# Creating and accessing
arr = [1, 2, 3, 4, 5]
print(arr[0])   # O(1) - Direct access
print(arr[-1])  # O(1) - Last element

# Insertion
arr.append(6)     # O(1) - Add to end
arr.insert(0, 0)  # O(n) - Insert at beginning (shifts all)

# Deletion
arr.pop()   # O(1) - Remove from end
arr.pop(0)  # O(n) - Remove from beginning

# Searching
if 3 in arr:  # O(n) - Linear search
    print("Found")

# Common operations
arr.reverse()      # O(n)
arr.sort()         # O(n log n)
length = len(arr)  # O(1)
Arrays vs Linked Lists: Arrays provide O(1) access but O(n) insertion/deletion. Linked lists provide O(1) insertion/deletion but O(n) access.

Linked Lists

Linear data structure where elements are stored in nodes, each pointing to the next.

# Singly Linked List Implementation
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def insert_at_beginning(self, data):
        """O(1) - Insert at head"""
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def insert_at_end(self, data):
        """O(n) - Insert at tail"""
        new_node = Node(data)
        if not self.head:
            self.head = new_node
            return
        current = self.head
        while current.next:
            current = current.next
        current.next = new_node

    def delete_node(self, key):
        """O(n) - Delete node with given value"""
        current = self.head
        # If head needs to be deleted
        if current and current.data == key:
            self.head = current.next
            return
        # Search for node to delete
        prev = None
        while current and current.data != key:
            prev = current
            current = current.next
        # If key not found
        if not current:
            return
        # Unlink node
        prev.next = current.next

    def search(self, key):
        """O(n) - Search for value"""
        current = self.head
        while current:
            if current.data == key:
                return True
            current = current.next
        return False

    def print_list(self):
        """O(n) - Print all elements"""
        current = self.head
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")

# Usage
ll = LinkedList()
ll.insert_at_end(1)
ll.insert_at_end(2)
ll.insert_at_end(3)
ll.insert_at_beginning(0)
ll.print_list()  # 0 -> 1 -> 2 -> 3 -> None

Stacks

LIFO (Last In, First Out) data structure. Like a stack of plates.

# Stack Implementation - All operations O(1)
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        """Add item to top"""
        self.items.append(item)

    def pop(self):
        """Remove and return top item"""
        if not self.is_empty():
            return self.items.pop()
        return None

    def peek(self):
        """View top item without removing"""
        if not self.is_empty():
            return self.items[-1]
        return None

    def is_empty(self):
        """Check if stack is empty"""
        return len(self.items) == 0

    def size(self):
        """Return number of items"""
        return len(self.items)

# Usage
stack = Stack()
stack.push(1)
stack.push(2)
stack.push(3)
print(stack.pop())   # 3 (LIFO)
print(stack.peek())  # 2

# Practical applications
# - Function call stack
# - Undo/Redo operations
# - Expression evaluation
# - Backtracking algorithms

# Example: Check balanced parentheses
def is_balanced(expression):
    stack = []
    matching = {'(': ')', '[': ']', '{': '}'}
    for char in expression:
        if char in matching:
            stack.append(char)
        elif char in matching.values():
            if not stack or matching[stack.pop()] != char:
                return False
    return len(stack) == 0

print(is_balanced("({[]})"))  # True
print(is_balanced("({[})"))   # False

Queues

FIFO (First In, First Out) data structure. Like a line at a store.

# Queue Implementation
class Queue:
    def __init__(self):
        self.items = []

    def enqueue(self, item):
        """Add item to rear - O(1)"""
        self.items.append(item)

    def dequeue(self):
        """Remove and return front item - O(n) due to shift"""
        if not self.is_empty():
            return self.items.pop(0)
        return None

    def front(self):
        """View front item"""
        if not self.is_empty():
            return self.items[0]
        return None

    def is_empty(self):
        return len(self.items) == 0

    def size(self):
        return len(self.items)

# Better implementation using collections.deque (O(1) at both ends)
from collections import deque

class EfficientQueue:
    def __init__(self):
        self.items = deque()

    def enqueue(self, item):
        self.items.append(item)  # O(1)

    def dequeue(self):
        if not self.is_empty():
            return self.items.popleft()  # O(1)
        return None

    def is_empty(self):
        return len(self.items) == 0

# Usage
queue = EfficientQueue()
queue.enqueue(1)
queue.enqueue(2)
queue.enqueue(3)
print(queue.dequeue())  # 1 (FIFO)
print(queue.dequeue())  # 2

# Practical applications
# - Breadth-First Search (BFS)
# - Print queue
# - Process scheduling
# - Buffer for data streams

Hash Tables (Dictionaries)

Key-value pairs with O(1) average case access, insert, and delete.

# Hash Table in Python (dict)
# Average case: O(1) for access, insert, delete
# Worst case: O(n) if many collisions

# Basic operations
hash_table = {}

# Insert/Update - O(1) average
hash_table['key1'] = 'value1'
hash_table['key2'] = 'value2'

# Access - O(1) average
value = hash_table['key1']
value = hash_table.get('key3', 'default')  # Safer

# Delete - O(1) average
del hash_table['key1']

# Check existence - O(1)
if 'key2' in hash_table:
    print("Found")

# Practical example: Count word frequency
def word_frequency(text):
    freq = {}
    words = text.split()
    for word in words:
        freq[word] = freq.get(word, 0) + 1
    return freq

text = "hello world hello python world"
print(word_frequency(text))
# {'hello': 2, 'world': 2, 'python': 1}

# Example: Find first non-repeating character
def first_non_repeating(string):
    char_count = {}
    # Count occurrences
    for char in string:
        char_count[char] = char_count.get(char, 0) + 1
    # Find first with count 1
    for char in string:
        if char_count[char] == 1:
            return char
    return None

print(first_non_repeating("aabbcdeff"))  # 'c'
Choosing the Right Data Structure:
• Need order + fast access: Array/List
• Need fast insert/delete: Linked List
• Need LIFO: Stack
• Need FIFO: Queue
• Need fast lookup by key: Hash Table
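As a sketch of why "fast lookup by key" matters (my own example, not from the lesson): the classic two-sum problem drops from O(n²) nested loops to a single O(n) pass when a hash table remembers what has been seen.

```python
def two_sum(nums, target):
    # One pass: for each number, check whether its complement was seen before
    seen = {}  # value -> index, O(1) average lookup
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return None

print(two_sum([2, 7, 11, 15], 9))  # [0, 1]
print(two_sum([3, 2, 4], 6))       # [1, 2]
```

The nested-loop version does n² pair checks; here each element costs one dict lookup and one insert, both O(1) on average.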

Test Your Knowledge - Lesson 4

1. What is the time complexity of accessing an element in a linked list?

2. Which data structure follows LIFO (Last In, First Out)?

3. What is the average time complexity for hash table lookup?

Lesson 5: Data Structures - Part 2

Trees

Hierarchical data structure with a root node and child nodes forming a parent-child relationship.

# Binary Tree Node
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BinaryTree:
    def __init__(self, root_value):
        self.root = TreeNode(root_value)

    # Tree Traversals
    def inorder(self, node):
        """Left -> Root -> Right"""
        if node:
            self.inorder(node.left)
            print(node.value, end=" ")
            self.inorder(node.right)

    def preorder(self, node):
        """Root -> Left -> Right"""
        if node:
            print(node.value, end=" ")
            self.preorder(node.left)
            self.preorder(node.right)

    def postorder(self, node):
        """Left -> Right -> Root"""
        if node:
            self.postorder(node.left)
            self.postorder(node.right)
            print(node.value, end=" ")

    def level_order(self):
        """Level by level (BFS)"""
        if not self.root:
            return
        queue = [self.root]
        while queue:
            node = queue.pop(0)
            print(node.value, end=" ")
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)

# Example
#         1
#        / \
#       2   3
#      / \
#     4   5
tree = BinaryTree(1)
tree.root.left = TreeNode(2)
tree.root.right = TreeNode(3)
tree.root.left.left = TreeNode(4)
tree.root.left.right = TreeNode(5)

tree.inorder(tree.root)    # 4 2 5 1 3
tree.preorder(tree.root)   # 1 2 4 5 3
tree.postorder(tree.root)  # 4 5 2 3 1
tree.level_order()         # 1 2 3 4 5

Binary Search Tree (BST)

Binary tree where left child < parent < right child. Enables O(log n) search on average.

# Binary Search Tree
class BST:
    def __init__(self):
        self.root = None

    def insert(self, value):
        """Insert value - O(log n) average, O(n) worst"""
        if not self.root:
            self.root = TreeNode(value)
        else:
            self._insert_recursive(self.root, value)

    def _insert_recursive(self, node, value):
        if value < node.value:
            if node.left is None:
                node.left = TreeNode(value)
            else:
                self._insert_recursive(node.left, value)
        else:
            if node.right is None:
                node.right = TreeNode(value)
            else:
                self._insert_recursive(node.right, value)

    def search(self, value):
        """Search for value - O(log n) average"""
        return self._search_recursive(self.root, value)

    def _search_recursive(self, node, value):
        if node is None or node.value == value:
            return node
        if value < node.value:
            return self._search_recursive(node.left, value)
        return self._search_recursive(node.right, value)

    def find_min(self, node):
        """Find minimum value in tree"""
        current = node
        while current.left:
            current = current.left
        return current

    def delete(self, value):
        """Delete node with given value"""
        self.root = self._delete_recursive(self.root, value)

    def _delete_recursive(self, node, value):
        if not node:
            return node
        if value < node.value:
            node.left = self._delete_recursive(node.left, value)
        elif value > node.value:
            node.right = self._delete_recursive(node.right, value)
        else:
            # Node with only one child or no child
            if not node.left:
                return node.right
            elif not node.right:
                return node.left
            # Node with two children:
            # get inorder successor (smallest in right subtree)
            temp = self.find_min(node.right)
            node.value = temp.value
            node.right = self._delete_recursive(node.right, temp.value)
        return node

# Usage
bst = BST()
for value in [50, 30, 70, 20, 40, 60, 80]:
    bst.insert(value)

print(bst.search(40))  # Found (returns the node)
print(bst.search(25))  # None
BST Properties: Left subtree values < node value < right subtree values. Inorder traversal gives sorted order.
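That last property can be checked directly. Here is a minimal standalone sketch (separate from the BST class above): inserting values into a BST and walking it inorder yields them in ascending order.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    # Standard BST insert: smaller values go left, others go right
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def inorder(node, out):
    # Left -> Root -> Right visits values in ascending order
    if node:
        inorder(node.left, out)
        out.append(node.value)
        inorder(node.right, out)
    return out

root = None
for v in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, v)
print(inorder(root, []))  # [20, 30, 40, 50, 60, 70, 80]
```

This is also why an inorder walk is the standard way to validate that a tree really is a BST.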

Heaps

Complete binary tree where each parent is always greater than (max heap) or smaller than (min heap) its children.

# Min Heap Implementation using array
class MinHeap:
    def __init__(self):
        self.heap = []

    def parent(self, i):
        return (i - 1) // 2

    def left_child(self, i):
        return 2 * i + 1

    def right_child(self, i):
        return 2 * i + 2

    def insert(self, value):
        """Insert and maintain heap property - O(log n)"""
        self.heap.append(value)
        self._heapify_up(len(self.heap) - 1)

    def _heapify_up(self, i):
        parent = self.parent(i)
        if i > 0 and self.heap[i] < self.heap[parent]:
            self.heap[i], self.heap[parent] = self.heap[parent], self.heap[i]
            self._heapify_up(parent)

    def extract_min(self):
        """Remove and return minimum - O(log n)"""
        if len(self.heap) == 0:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        min_val = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return min_val

    def _heapify_down(self, i):
        min_index = i
        left = self.left_child(i)
        right = self.right_child(i)
        if left < len(self.heap) and self.heap[left] < self.heap[min_index]:
            min_index = left
        if right < len(self.heap) and self.heap[right] < self.heap[min_index]:
            min_index = right
        if min_index != i:
            self.heap[i], self.heap[min_index] = self.heap[min_index], self.heap[i]
            self._heapify_down(min_index)

    def peek(self):
        """View minimum without removing - O(1)"""
        return self.heap[0] if self.heap else None

# Usage
heap = MinHeap()
for num in [5, 3, 7, 1, 9, 2]:
    heap.insert(num)

print(heap.extract_min())  # 1
print(heap.extract_min())  # 2
print(heap.peek())         # 3

# Python's built-in heap (heapq)
import heapq

# Min heap
heap = [5, 3, 7, 1, 9, 2]
heapq.heapify(heap)
print(heapq.heappop(heap))  # 1
heapq.heappush(heap, 0)
print(heap[0])  # 0 (minimum)

Graphs

Collection of nodes (vertices) connected by edges. Can be directed or undirected, weighted or unweighted.

# Graph using Adjacency List
class Graph:
    def __init__(self):
        self.graph = {}

    def add_vertex(self, vertex):
        """Add a vertex"""
        if vertex not in self.graph:
            self.graph[vertex] = []

    def add_edge(self, v1, v2):
        """Add undirected edge"""
        if v1 not in self.graph:
            self.graph[v1] = []
        if v2 not in self.graph:
            self.graph[v2] = []
        self.graph[v1].append(v2)
        self.graph[v2].append(v1)

    def bfs(self, start):
        """Breadth-First Search - O(V + E)"""
        visited = set()
        queue = [start]
        visited.add(start)
        result = []
        while queue:
            vertex = queue.pop(0)
            result.append(vertex)
            for neighbor in self.graph.get(vertex, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        return result

    def dfs(self, start):
        """Depth-First Search - O(V + E)"""
        visited = set()
        result = []

        def dfs_recursive(vertex):
            visited.add(vertex)
            result.append(vertex)
            for neighbor in self.graph.get(vertex, []):
                if neighbor not in visited:
                    dfs_recursive(neighbor)

        dfs_recursive(start)
        return result

    def has_path(self, start, end):
        """Check if path exists between two vertices"""
        if start == end:
            return True
        visited = set()
        queue = [start]
        visited.add(start)
        while queue:
            vertex = queue.pop(0)
            if vertex == end:
                return True
            for neighbor in self.graph.get(vertex, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        return False

# Usage
g = Graph()
g.add_edge('A', 'B')
g.add_edge('A', 'C')
g.add_edge('B', 'D')
g.add_edge('C', 'D')
g.add_edge('D', 'E')

print(g.bfs('A'))            # ['A', 'B', 'C', 'D', 'E']
print(g.dfs('A'))            # ['A', 'B', 'D', 'C', 'E']
print(g.has_path('A', 'E'))  # True

Tries (Prefix Trees)

Tree structure for storing strings, efficient for prefix searches.

# Trie Implementation
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end_of_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Insert word - O(m) where m is word length"""
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_end_of_word = True

    def search(self, word):
        """Search for exact word - O(m)"""
        node = self.root
        for char in word:
            if char not in node.children:
                return False
            node = node.children[char]
        return node.is_end_of_word

    def starts_with(self, prefix):
        """Check if any word starts with prefix - O(m)"""
        node = self.root
        for char in prefix:
            if char not in node.children:
                return False
            node = node.children[char]
        return True

    def get_all_words_with_prefix(self, prefix):
        """Get all words starting with prefix"""
        node = self.root
        for char in prefix:
            if char not in node.children:
                return []
            node = node.children[char]
        words = []
        self._collect_words(node, prefix, words)
        return words

    def _collect_words(self, node, current_word, words):
        if node.is_end_of_word:
            words.append(current_word)
        for char, child in node.children.items():
            self._collect_words(child, current_word + char, words)

# Usage
trie = Trie()
words = ["apple", "app", "application", "apply", "banana"]
for word in words:
    trie.insert(word)

print(trie.search("app"))       # True
print(trie.search("appl"))      # False
print(trie.starts_with("app"))  # True
print(trie.get_all_words_with_prefix("app"))
# ['app', 'apple', 'application', 'apply']
Advanced Data Structures Summary:
• Trees: Hierarchical data, O(log n) operations in BST
• Heaps: Priority queue, O(1) peek, O(log n) insert/delete
• Graphs: Relationships, BFS/DFS for traversal
• Tries: String prefix matching, autocomplete

Test Your Knowledge - Lesson 5

1. In a Binary Search Tree, where are values less than the root stored?

2. What is the time complexity of inserting into a heap?

3. Which graph traversal uses a queue?

Lesson 6: Common Algorithm Patterns

Two Pointers Technique

Use two pointers to iterate through a data structure, often from opposite ends or at different speeds.

# Two Pointers - Find pair that sums to target
def two_sum_sorted(arr, target):
    """
    Find two numbers that add up to target in a sorted array.
    O(n) time, O(1) space
    """
    left = 0
    right = len(arr) - 1
    while left < right:
        current_sum = arr[left] + arr[right]
        if current_sum == target:
            return [left, right]
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return None

# Example: Reverse array in place
def reverse_array(arr):
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1
    return arr

# Example: Remove duplicates from sorted array
def remove_duplicates(arr):
    if not arr:
        return 0
    write = 1  # Position to write next unique element
    for read in range(1, len(arr)):
        if arr[read] != arr[read - 1]:
            arr[write] = arr[read]
            write += 1
    return write  # Length of array with unique elements

# Example: Check palindrome
def is_palindrome(s):
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True

print(two_sum_sorted([1, 2, 3, 4, 6], 6))  # [1, 3]
print(is_palindrome("racecar"))            # True

Sliding Window

Maintain a window of elements and slide it through the array.

# Sliding Window - Maximum sum of k consecutive elements
def max_sum_subarray(arr, k):
    """
    Find maximum sum of k consecutive elements.
    O(n) time, O(1) space
    """
    if len(arr) < k:
        return None
    # Calculate sum of first window
    window_sum = sum(arr[:k])
    max_sum = window_sum
    # Slide window: add next, remove first
    for i in range(k, len(arr)):
        window_sum = window_sum - arr[i - k] + arr[i]
        max_sum = max(max_sum, window_sum)
    return max_sum

# Variable size window - Smallest subarray with sum >= target
def min_subarray_len(arr, target):
    """
    Find length of smallest subarray with sum >= target.
    O(n) time
    """
    min_length = float('inf')
    window_sum = 0
    left = 0
    for right in range(len(arr)):
        window_sum += arr[right]
        while window_sum >= target:
            min_length = min(min_length, right - left + 1)
            window_sum -= arr[left]
            left += 1
    return min_length if min_length != float('inf') else 0

# Longest substring with k distinct characters
def longest_substring_k_distinct(s, k):
    char_count = {}
    left = 0
    max_length = 0
    for right in range(len(s)):
        # Expand window
        char_count[s[right]] = char_count.get(s[right], 0) + 1
        # Contract window if too many distinct characters
        while len(char_count) > k:
            char_count[s[left]] -= 1
            if char_count[s[left]] == 0:
                del char_count[s[left]]
            left += 1
        max_length = max(max_length, right - left + 1)
    return max_length

print(max_sum_subarray([2, 1, 5, 1, 3, 2], 3))  # 9
print(min_subarray_len([2, 3, 1, 2, 4, 3], 7))  # 2

Divide and Conquer

Break problem into smaller subproblems, solve recursively, and combine results.

# Classic example: Merge Sort (already covered)

# Finding maximum subarray sum (divide-and-conquer version)
def max_subarray_divide_conquer(arr, left, right):
    """
    Find maximum subarray sum using divide and conquer.
    O(n log n) time
    """
    if left == right:
        return arr[left]
    mid = (left + right) // 2
    # Maximum in left half
    left_max = max_subarray_divide_conquer(arr, left, mid)
    # Maximum in right half
    right_max = max_subarray_divide_conquer(arr, mid + 1, right)
    # Maximum crossing the middle
    cross_max = max_crossing_sum(arr, left, mid, right)
    return max(left_max, right_max, cross_max)

def max_crossing_sum(arr, left, mid, right):
    # Left side from mid
    left_sum = float('-inf')
    temp_sum = 0
    for i in range(mid, left - 1, -1):
        temp_sum += arr[i]
        left_sum = max(left_sum, temp_sum)
    # Right side from mid+1
    right_sum = float('-inf')
    temp_sum = 0
    for i in range(mid + 1, right + 1):
        temp_sum += arr[i]
        right_sum = max(right_sum, temp_sum)
    return left_sum + right_sum

# Better approach: Kadane's Algorithm - O(n)
def max_subarray_kadane(arr):
    """
    Find maximum subarray sum using dynamic programming.
    O(n) time, O(1) space
    """
    max_ending_here = max_so_far = arr[0]
    for i in range(1, len(arr)):
        max_ending_here = max(arr[i], max_ending_here + arr[i])
        max_so_far = max(max_so_far, max_ending_here)
    return max_so_far

arr = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray_kadane(arr))  # 6 ([4, -1, 2, 1])

Dynamic Programming

Break the problem into overlapping subproblems and store their results to avoid recomputation.

# Fibonacci - Classic DP example

# Naive recursive - O(2^n) - VERY SLOW
def fib_recursive(n):
    if n <= 1:
        return n
    return fib_recursive(n-1) + fib_recursive(n-2)

# Memoization (Top-Down DP) - O(n)
# Note: the mutable default dict persists across calls - fine for this
# demo, but a known Python pitfall in general-purpose code.
def fib_memo(n, memo={}):
    if n in memo:
        return memo[n]
    if n <= 1:
        return n
    memo[n] = fib_memo(n-1, memo) + fib_memo(n-2, memo)
    return memo[n]

# Tabulation (Bottom-Up DP) - O(n) time, O(n) space
def fib_tabulation(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]

# Space optimized - O(n) time, O(1) space
def fib_optimized(n):
    if n <= 1:
        return n
    prev2, prev1 = 0, 1
    for i in range(2, n + 1):
        current = prev1 + prev2
        prev2 = prev1
        prev1 = current
    return prev1

# Climbing Stairs Problem
def climb_stairs(n):
    """
    Can climb 1 or 2 steps. How many ways to reach the top?
    dp[i] = dp[i-1] + dp[i-2]
    """
    if n <= 2:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    dp[2] = 2
    for i in range(3, n + 1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]

# Coin Change Problem
def coin_change(coins, amount):
    """
    Minimum coins needed to make amount.
    O(amount * len(coins)) time
    """
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0

    for i in range(1, amount + 1):
        for coin in coins:
            if coin <= i:
                dp[i] = min(dp[i], dp[i - coin] + 1)

    return dp[amount] if dp[amount] != float('inf') else -1

print(fib_optimized(10))           # 55
print(climb_stairs(5))             # 8
print(coin_change([1, 2, 5], 11))  # 3 (5+5+1)
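As an alternative to the hand-managed memo dict above, Python's standard library offers `functools.lru_cache`, which gives the same top-down memoization with no shared default argument to worry about:

```python
from functools import lru_cache

# Top-down memoization via the standard library: the decorator caches
# results per argument, so each subproblem is computed only once.
@lru_cache(maxsize=None)
def fib_cached(n):
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

print(fib_cached(50))  # 12586269025
```

The decorator turns the naive O(2^n) recursion into O(n) without changing a line of its body.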

Greedy Algorithms

Make the locally optimal choice at each step, hoping it leads to a global optimum.

# Activity Selection Problem
def activity_selection(start, finish):
    """
    Select maximum number of non-overlapping activities.
    Sort by finish time, always pick next compatible activity.
    """
    n = len(finish)
    # Sort by finish time
    activities = sorted(zip(start, finish), key=lambda x: x[1])

    selected = [activities[0]]
    last_finish = activities[0][1]

    for i in range(1, n):
        if activities[i][0] >= last_finish:
            selected.append(activities[i])
            last_finish = activities[i][1]

    return selected

# Fractional Knapsack
def fractional_knapsack(values, weights, capacity):
    """
    Can take fractions of items.
    Greedy: sort by value/weight ratio, take highest first.
    """
    n = len(values)
    # Calculate value/weight ratio
    items = []
    for i in range(n):
        items.append((values[i] / weights[i], values[i], weights[i]))

    # Sort by ratio (descending)
    items.sort(reverse=True)

    total_value = 0
    for ratio, value, weight in items:
        if capacity >= weight:
            # Take whole item
            capacity -= weight
            total_value += value
        else:
            # Take fraction
            total_value += ratio * capacity
            break

    return total_value

# Huffman Coding (greedy compression)
import heapq

def huffman_encoding(freq):
    """
    Build optimal prefix-free binary code.
    Greedy: always combine two least frequent.
    """
    heap = [[weight, [char, ""]] for char, weight in freq.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = '0' + pair[1]
        for pair in hi[1:]:
            pair[1] = '1' + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])

    return sorted(heapq.heappop(heap)[1:], key=lambda p: (len(p[-1]), p))

start = [1, 3, 0, 5, 8, 5]
finish = [2, 4, 6, 7, 9, 9]
print(activity_selection(start, finish))  # [(1, 2), (3, 4), (5, 7), (8, 9)]

values = [60, 100, 120]
weights = [10, 20, 30]
print(fractional_knapsack(values, weights, 50))  # 240.0
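The "hoping" in the definition above matters: greedy choices are not always globally optimal. A small sketch using the coin change problem shows a largest-coin-first greedy failing where the DP solution succeeds (the coin values [1, 3, 4] are chosen purely to illustrate the gap):

```python
def coin_change_greedy(coins, amount):
    # Always take as many of the largest coin as fits.
    # Locally optimal, but not always globally optimal.
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else -1

print(coin_change_greedy([1, 3, 4], 6))  # 3 (greedy picks 4+1+1)
# The DP coin_change above finds the true minimum: 2 (3+3)
```

Greedy is only safe when the problem has the greedy-choice property, as activity selection and fractional knapsack do; otherwise fall back to DP.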
Pattern Recognition:
• Two pointers: Sorted array, palindrome, pair sum
• Sliding window: Subarray/substring problems
• Divide & Conquer: Sorting, searching
• DP: Overlapping subproblems, optimal substructure
• Greedy: Local optimum leads to global optimum
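The two-pointer pattern in the list above has no worked example in this lesson, so here is a minimal sketch on the pair-sum problem it mentions (function and variable names are illustrative):

```python
def pair_with_sum(arr, target):
    """
    Find a pair summing to target in a SORTED array.
    O(n) time, O(1) space - pointers move inward from both ends.
    """
    left, right = 0, len(arr) - 1
    while left < right:
        current = arr[left] + arr[right]
        if current == target:
            return (arr[left], arr[right])
        elif current < target:
            left += 1   # sum too small: advance left pointer
        else:
            right -= 1  # sum too large: retreat right pointer
    return None

print(pair_with_sum([1, 2, 4, 7, 11, 15], 15))  # (4, 11)
```

The same inward-moving pointer idea handles palindrome checks: compare characters from both ends and move inward.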

Test Your Knowledge - Lesson 6

1. Which technique is best for finding a pair in a sorted array that sums to a target?

2. What is the key characteristic of Dynamic Programming?

3. What is the time complexity of the optimized Fibonacci solution?