Graph Pattern Recognition

Medium 25 min read

The Decision Tree: BFS vs DFS

🎯

Pattern Recognition is Key

Most graph interview problems boil down to: "Which traversal do I use?" This decision tree will help you pick the right approach in seconds.

When to Use Which Algorithm

What does the problem ask? Shortest path? Unweighted: BFS Weighted: Dijkstra Explore all / cycles? DFS Union-Find Ordering / dependencies? Topo Sort (DFS) Kahn's (BFS) Level-by-level Priority queue Recursion + visited Connected comps Post-order stack In-degree queue

Adjacency List (preferred)

Space: O(V + E). Best for sparse graphs. Most interview problems use this.

graph = {0: [1,2], 1: [2], 2: [3], 3: []}

Adjacency Matrix

Space: O(V^2). Best for dense graphs or when checking edge existence in O(1).

matrix = [[0,1,1,0], [0,0,1,0], [0,0,0,1], [0,0,0,0]]

BFS Template

Template
BFS Template (Python)
from collections import deque def bfs(graph, start): visited = set([start]) queue = deque([start]) while queue: node = queue.popleft() print(node) # process node for neighbor in graph[node]: if neighbor not in visited: visited.add(neighbor) # mark BEFORE pushing! queue.append(neighbor)

DFS Template

Template
DFS Template (Python)
def dfs(graph, node, visited): visited.add(node) print(node) # process node for neighbor in graph[node]: if neighbor not in visited: dfs(graph, neighbor, visited) # Usage visited = set() dfs(graph, start_node, visited)

BFS vs DFS Traversal Comparison

BFS (Level-by-level) 1 2 3 4 5 6 Visit: 1 -> 2 -> 3 -> 4 -> 5 -> 6 Queue-based, layer by layer DFS (Depth-first) 1 2 5 3 4 6 Visit: 1 -> 2 -> 3 -> 4 -> 5 -> 6 Stack/recursion, goes deep first

Classic Graph Problems

Level 2

Problem 1: Number of Islands (LeetCode 200) -- DFS Flood Fill

Number of Islands
def num_islands(grid): if not grid: return 0 rows, cols = len(grid), len(grid[0]) count = 0 def dfs(r, c): if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] != "1": return grid[r][c] = "0" # mark visited (sink the island) dfs(r + 1, c) dfs(r - 1, c) dfs(r, c + 1) dfs(r, c - 1) for r in range(rows): for c in range(cols): if grid[r][c] == "1": count += 1 dfs(r, c) return count

Problem 2: Course Schedule (LeetCode 207) -- Topological Sort

Course Schedule -- cycle detection via topological sort
from collections import deque, defaultdict def can_finish(num_courses, prerequisites): graph = defaultdict(list) in_degree = [0] * num_courses for course, prereq in prerequisites: graph[prereq].append(course) in_degree[course] += 1 # Start with courses that have no prerequisites queue = deque(i for i in range(num_courses) if in_degree[i] == 0) taken = 0 while queue: course = queue.popleft() taken += 1 for next_course in graph[course]: in_degree[next_course] -= 1 if in_degree[next_course] == 0: queue.append(next_course) return taken == num_courses # True if no cycle

Problem 3: Clone Graph (LeetCode 133) -- BFS/DFS with Hash Map

Clone Graph
def clone_graph(node): if not node: return None cloned = {} # original -> clone mapping def dfs(n): if n in cloned: return cloned[n] copy = Node(n.val) cloned[n] = copy for neighbor in n.neighbors: copy.neighbors.append(dfs(neighbor)) return copy return dfs(node)

Problem 4: Word Ladder (LeetCode 127) -- BFS Shortest Transformation

Word Ladder -- BFS for shortest path
from collections import deque def ladder_length(begin_word, end_word, word_list): word_set = set(word_list) if end_word not in word_set: return 0 queue = deque([(begin_word, 1)]) visited = set([begin_word]) while queue: word, steps = queue.popleft() for i in range(len(word)): for c in "abcdefghijklmnopqrstuvwxyz": next_word = word[:i] + c + word[i+1:] if next_word == end_word: return steps + 1 if next_word in word_set and next_word not in visited: visited.add(next_word) queue.append((next_word, steps + 1)) return 0

Problem 5: Pacific Atlantic Water Flow (LeetCode 417) -- Multi-source DFS

Pacific Atlantic Water Flow
def pacific_atlantic(heights): if not heights: return [] rows, cols = len(heights), len(heights[0]) pacific = set() atlantic = set() def dfs(r, c, visited, prev_height): if (r, c) in visited or r < 0 or r >= rows or c < 0 or c >= cols: return if heights[r][c] < prev_height: return visited.add((r, c)) for dr, dc in [(1,0),(-1,0),(0,1),(0,-1)]: dfs(r + dr, c + dc, visited, heights[r][c]) # Start DFS from ocean borders (reverse flow) for c in range(cols): dfs(0, c, pacific, 0) # top row = Pacific dfs(rows - 1, c, atlantic, 0) # bottom row = Atlantic for r in range(rows): dfs(r, 0, pacific, 0) # left col = Pacific dfs(r, cols - 1, atlantic, 0) # right col = Atlantic return list(pacific & atlantic) # cells reaching both

BFS Word Ladder Step-by-Step

begin="hit", end="cog", words=["hot","dot","dog","lot","log","cog"] Step 1: queue=["hit"] -> try all 1-char changes Step 2: queue=["hot"] -> found! distance=2 Step 3: queue=["dot","lot"] -> 1-char from "hot" Step 4: queue=["dog","log"] -> 1-char from "dot"/"lot" Step 5: queue=["cog"] -> found! distance=5 Answer: hit -> hot -> dot -> dog -> cog (5 steps)

DFS Islands Trace

Grid: After DFS from (0,0): 1 1 0 0 0 0 0 0 0 0 (island 1 sunk) 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 --> 0 0 1 0 0 (island 2 still here) 0 0 0 1 1 0 0 0 1 1 (island 3 still here) Total islands: 3

Deep Dives

Advanced

Deep Dive 1: Bidirectional BFS

For shortest-path problems where you know both start and end (like Word Ladder), bidirectional BFS searches from both ends simultaneously. This reduces time from O(b^d) to O(b^(d/2)) where b is the branching factor and d is the depth.

  • Maintain two frontiers (sets), alternating expansion from the smaller one.
  • When frontiers intersect, you have found the shortest path.

Deep Dive 2: A* Algorithm

A* extends Dijkstra with a heuristic function h(n) estimating distance to goal. It explores nodes with smallest f(n) = g(n) + h(n), where g(n) is actual cost from start. With an admissible heuristic (never overestimates), A* is optimal and often faster than plain Dijkstra.

Common Mistakes

Mistake #1: Not Marking Visited Before Pushing to Queue

# WRONG: mark visited when popping -- same node pushed multiple times! while queue: node = queue.popleft() visited.add(node) # Too late! Already in queue multiple times # CORRECT: mark visited when adding to queue if neighbor not in visited: visited.add(neighbor) # Mark NOW, before push queue.append(neighbor)

Mistake #2: Infinite Loop in Undirected Graphs

# WRONG: undirected edge A-B means A->B and B->A # Without visited set, DFS bounces back and forth forever! # CORRECT: always maintain a visited set for undirected graphs visited = set() def dfs(node): visited.add(node) for neighbor in graph[node]: if neighbor not in visited: dfs(neighbor)

Practice Problems

Practice Time!
E

Easy: Flood Fill (LeetCode 733)

Given an image (2D array), a starting pixel, and a new color, flood-fill the connected region.

Hint

DFS from the starting pixel. Only visit pixels with the same original color.

M

Medium: Rotting Oranges (LeetCode 994) -- Multi-source BFS

In a grid, rotten oranges spread to adjacent fresh oranges each minute. How many minutes until all oranges are rotten?

Hint

Add all initially rotten oranges to the BFS queue at once (multi-source). Track levels for time.

H

Hard: Alien Dictionary (LeetCode 269)

Given a sorted dictionary of an alien language, derive the character ordering.

Hint

Compare adjacent words to build a directed graph of character ordering, then topological sort.

Quick Reference Guide

Key Takeaways

  • BFS = shortest path in unweighted graphs, level-order traversal. Uses a queue.
  • DFS = explore all paths, cycle detection, connected components. Uses recursion/stack.
  • Always mark nodes as visited before adding to the queue (BFS) to avoid duplicates.

Time & Space Complexities

  • BFS/DFS: O(V + E) time, O(V) space
  • Dijkstra: O((V + E) log V)
  • Topological Sort: O(V + E)
  • Grid BFS/DFS: O(rows * cols)