DSA Blog Series - Part 4: Algorithms That Shape Thinking
⚡ Algorithms That Shape Thinking
31. Binary Search and Its Variants
Binary search is the art of finding a target in a sorted array by repeatedly halving the search space. It's one of the most elegant algorithms in computer science.
Classic binary search:
- Start with pointers at array's start and end
- Check the middle element
- If target equals middle: Found it!
- If target is smaller: Search left half
- If target is larger: Search right half
- Repeat until found or search space is empty
Time complexity: O(log n)—each step eliminates half the remaining elements. For 1 million elements, you need at most 20 comparisons.
The key insight: You can only binary search on sorted (or monotonic) data. The algorithm depends on the property: if the target isn't at position mid, you know exactly which half to discard.
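To make this concrete, here is a minimal Python sketch of the classic loop (the function name and the sorted-input assumption are illustrative, not from the original post):

def binary_search(arr, target):
    # Assumes arr is sorted in ascending order; returns an index or -1.
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = left + (right - left) // 2  # overflow-safe midpoint
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1   # discard the left half
        else:
            right = mid - 1  # discard the right half
    return -1

For example, binary_search([1, 3, 5, 7, 9], 7) returns index 3.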
Variants and applications:
1. Find first/last occurrence: In [1, 2, 2, 2, 3], find the first 2. Modify binary search to continue searching left even after finding a match.
2. Find insertion position: Where would you insert target to keep array sorted? This is used in many sorting algorithms.
3. Search in rotated sorted array: [4, 5, 6, 7, 0, 1, 2]. Determine which half is properly sorted, then decide where to search.
4. Binary search on answer: When the answer isn't in an array but lies in a range. "What's the minimum capacity needed?" Try different capacities and check if they work—binary search the solution space.
5. Square root calculation: Find sqrt(x) by binary searching between 0 and x.
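As a small illustration of variants 4 and 5 together, here is a hedged sketch of an integer square root found by binary searching the answer space rather than an array (names are illustrative):

def int_sqrt(x):
    # Largest integer r with r * r <= x, assuming x >= 0.
    left, right = 0, x
    while left <= right:
        mid = left + (right - left) // 2
        if mid * mid <= x:
            left = mid + 1   # mid is feasible, try something larger
        else:
            right = mid - 1  # mid overshoots, try something smaller
    return right

int_sqrt(10) returns 3; the same feasible/infeasible pattern underlies "minimum capacity" style problems.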
Real-world applications:
- Dictionary lookups
- Database indexing (B-trees use binary search concepts)
- Git bisect (find which commit introduced a bug)
- Finding thresholds in machine learning
- Resource allocation optimisation
Common mistakes:
- Integer overflow: mid = (left + right) / 2 can overflow. Use mid = left + (right - left) / 2
- Off-by-one errors: Be careful with inclusive/exclusive bounds
- Infinite loops: Ensure search space shrinks each iteration
Binary search transforms O(n) problems into O(log n) solutions. Master it, and you'll see opportunities to apply it everywhere.
32. Sorting Algorithms (Bubble to Quick Sort)
Sorting is fundamental—organising data for efficient searching, processing, and analysis. Different algorithms offer different trade-offs.
Bubble Sort: Repeatedly swap adjacent elements if they're in the wrong order. Simple but inefficient—O(n²).
Why learn it? Educational value. Shows how simple comparisons can eventually sort data. Useful for nearly-sorted data where it can be O(n) with optimisation.
Selection Sort: Find the minimum element and move it to the beginning. Repeat for the remaining array—O(n²).
Key property: Minimises number of swaps (only n swaps). Useful when writing to memory is expensive.
Insertion Sort: Build sorted array one element at a time by inserting each element into its correct position—O(n²) worst case, O(n) best case.
Strength: Excellent for small arrays (< 10 elements) and nearly-sorted data. Used in hybrid algorithms like Timsort. Stable and in-place.
Merge Sort: Divide array in half, recursively sort both halves, then merge them—O(n log n) in all cases.
Advantages: Stable, predictable performance, excellent for linked lists and external sorting (data on disk). Disadvantage: O(n) extra space for merging.
Quick Sort: Choose a pivot, partition array so elements smaller than pivot are left, larger are right. Recursively sort both sides—O(n log n) average, O(n²) worst case.
Why popular? Fast in practice, in-place (O(log n) stack space), cache-friendly. Used in most standard libraries with optimizations.
Heap Sort: Build a max-heap, repeatedly extract maximum—O(n log n) guaranteed, in-place.
Trade-off: Not stable, slower in practice than quicksort due to poor cache locality.
Counting Sort / Radix Sort: Non-comparison sorts for integers with limited range—O(n + k) where k is range.
When to use: Small integer ranges. Extremely fast for specific cases like sorting millions of integers between 0-1000.
Practical advice: Don't implement sorting from scratch in production—use language built-ins. But understanding sorting algorithms teaches algorithm design principles: divide-and-conquer, space-time trade-offs, stability considerations.
33. Merge Sort vs Quick Sort
Merge Sort and Quick Sort are the two titans of comparison-based sorting. Both are O(n log n) on average, but their philosophies differ fundamentally.
Merge Sort (Divide and Conquer):
Philosophy: Split first, merge later. Always divides perfectly in half.
Process:
- Divide array into two halves
- Recursively sort each half
- Merge the sorted halves
Characteristics:
- Stable: Equal elements maintain relative order
- Predictable: Always O(n log n)—no worst case degradation
- Space: Requires O(n) auxiliary space for merging
- Use case: External sorting (data on disk), linked lists, when stability matters
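A compact Python sketch of this process (a teaching version, not the optimised variant a standard library would ship):

def merge_sort(arr):
    # Divide
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # Conquer left half
    right = merge_sort(arr[mid:])  # Conquer right half
    # Combine: merge two sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:    # <= keeps equal elements in order (stability)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged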
Quick Sort (Partition and Conquer):
Philosophy: Solve now, split later. Partition around a pivot.
Process:
- Choose a pivot element
- Partition: move smaller elements left, larger right
- Recursively sort both partitions
Characteristics:
- Unstable: Equal elements may not maintain order
- Fast in practice: Excellent cache performance, in-place
- Variable: O(n log n) average, O(n²) worst case (bad pivot choices)
- Space: O(log n) recursion stack
- Use case: General-purpose sorting, when average case matters more than worst case
The pivot problem:
Bad pivot choices (always smallest/largest) create unbalanced partitions, degrading to O(n²). Solutions:
- Random pivot selection
- Median-of-three (first, middle, last elements)
- Introsort (switch to heapsort if recursion depth exceeds threshold)
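For illustration, a short sketch using random pivot selection; it copies into new lists for clarity, whereas library implementations partition in place:

import random

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)  # random pivot defends against adversarial inputs
    smaller = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    larger = [x for x in arr if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)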
Performance comparison:
| Aspect | Merge Sort | Quick Sort |
| --- | --- | --- |
| Time (average) | O(n log n) | O(n log n) |
| Time (worst) | O(n log n) | O(n²) |
| Extra space | O(n) | O(log n) recursion stack |
| Stable | Yes | No |
Why quicksort is often faster in practice:
- In-place operation (better cache locality)
- Fewer data movements
- Partitioning is faster than merging
- Modern CPUs optimise sequential access patterns
When merge sort wins:
- Guaranteed O(n log n) needed
- Stability required
- External sorting (limited RAM)
- Linked lists (no random access needed)
Modern reality: Most standard library sorts use hybrid approaches:
- Timsort (Python): Merge sort + insertion sort for small subarrays
- Introsort (C++): Quick sort + heap sort fallback + insertion sort for small ranges
- Java: Dual-pivot quicksort for primitives, Timsort for objects
Understanding both algorithms teaches fundamental computer science concepts: recursion, divide-and-conquer, trade-offs between time and space, and why theoretical complexity doesn't always match practical performance.
34. Greedy Algorithms: When Local Choices Work
Greedy algorithms make the locally optimal choice at each step, hoping to find a global optimum. They're fast, elegant, and surprisingly effective—when they work.
The greedy principle: At each decision point, choose what looks best right now without worrying about future consequences.
Classic examples:
1. Coin change (specific denominations): Make change for 67 cents using fewest coins with denominations [25, 10, 5, 1].
Greedy approach: Always take the largest coin possible.
- 67 → take 25 → 42 left
- 42 → take 25 → 17 left
- 17 → take 10 → 7 left
- 7 → take 5 → 2 left
- 2 → take 1 → 1 left
- 1 → take 1 → done
Result: 6 coins. This works for standard denominations but fails for arbitrary ones (try [25, 10, 1] for 30 cents—greedy gives 6 coins, optimal is 3).
2. Activity selection: Schedule maximum non-overlapping activities.
Greedy approach: Always pick the activity that finishes earliest, then repeat for remaining time.
Why it works: Finishing early leaves maximum time for other activities.
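A minimal sketch of the earliest-finish-time rule, assuming each activity is a (start, finish) pair (the representation is illustrative):

def max_activities(activities):
    # Sort by finish time, then take every activity that starts
    # at or after the previously chosen activity's finish.
    chosen = []
    last_finish = float('-inf')
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen

max_activities([(1, 4), (3, 5), (0, 6), (5, 7), (8, 9)]) selects (1, 4), (5, 7), (8, 9).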
3. Huffman coding: Create optimal prefix-free binary codes for data compression.
Greedy approach: Repeatedly combine two lowest-frequency symbols. Provably optimal.
4. Fractional knapsack: Maximise value with weight limit when you can take fractions of items.
Greedy approach: Sort by value-per-weight ratio, take items in order (taking fractions of last item if needed). Optimal.
When greedy works:
Greedy algorithms succeed when the problem has two properties:
- Greedy choice property: Local optimum leads to global optimum
- Optimal substructure: Optimal solution contains optimal solutions to subproblems
When greedy fails:
0/1 Knapsack: Can't take fractions. Greedy by value-per-weight fails. Need dynamic programming.
Shortest path with negative weights: Greedy (Dijkstra's) fails. Need Bellman-Ford.
Coin change with arbitrary denominations: Greedy may give suboptimal results.
Why learn greedy algorithms?
- Fast—usually O(n log n) or better
- Simple to implement
- Sometimes provably optimal
- Good starting point—if greedy doesn't work, try DP
Approach to greedy problems:
- Identify the choice that seems locally best
- Prove (or test) that local choices lead to global optimum
- If proof fails, consider dynamic programming
Greedy thinking is powerful intuition-building. Even when greedy isn't optimal, it often provides good approximations quickly.
35. Divide and Conquer Strategy
Divide and Conquer is a fundamental algorithm design paradigm: break a problem into smaller subproblems, solve them independently, and combine their solutions.
Three steps:
- Divide: Break problem into smaller instances of the same problem
- Conquer: Solve subproblems recursively (or directly if small enough)
- Combine: Merge subproblem solutions into the original problem's solution
Classic examples:
Merge Sort:
- Divide: Split array in half
- Conquer: Sort each half recursively
- Combine: Merge sorted halves
Binary Search:
- Divide: Check middle element, choose half
- Conquer: Search chosen half
- Combine: No combining needed—answer is direct
Quick Sort:
- Divide: Partition around pivot
- Conquer: Sort partitions recursively
- Combine: No combining—partition already positioned elements
Maximum Subarray: Find the contiguous subarray with the largest sum (the divide-and-conquer alternative to Kadane's linear-time scan).
- Divide: Split array at middle
- Conquer: Find max subarray in left, right, and crossing middle
- Combine: Return maximum of the three
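A hedged sketch of that decomposition (Kadane's linear scan is simpler in practice; this version exists to show the three-way split):

def max_subarray(arr, lo=0, hi=None):
    # Largest sum of a contiguous, non-empty subarray of arr[lo:hi].
    if hi is None:
        hi = len(arr)
    if hi - lo == 1:
        return arr[lo]
    mid = (lo + hi) // 2
    # Crossing sum: best suffix of the left half plus best prefix of the right half.
    best_left, total = float('-inf'), 0
    for i in range(mid - 1, lo - 1, -1):
        total += arr[i]
        best_left = max(best_left, total)
    best_right, total = float('-inf'), 0
    for i in range(mid, hi):
        total += arr[i]
        best_right = max(best_right, total)
    crossing = best_left + best_right
    return max(max_subarray(arr, lo, mid), max_subarray(arr, mid, hi), crossing)

max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]) returns 6.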
Strassen's Matrix Multiplication: Multiply matrices faster than O(n³) by cleverly dividing matrices and reducing multiplications needed.
Master Theorem for complexity:
For recurrences of form T(n) = aT(n/b) + f(n):
- If f(n) = O(n^c) where c < log_b(a): T(n) = Θ(n^(log_b(a)))
- If f(n) = Θ(n^c log^k(n)) where c = log_b(a): T(n) = Θ(n^c log^(k+1)(n))
- If f(n) = Ω(n^c) where c > log_b(a): T(n) = Θ(f(n))
Example: Merge sort: T(n) = 2T(n/2) + O(n)
- a=2, b=2, c=1
- log_2(2) = 1 = c
- Result: T(n) = Θ(n log n)
Why it's powerful:
- Reduces complex problems to simpler ones
- Often yields optimal algorithms
- Parallelizable—subproblems are independent
- Natural for recursive thinking
When to use:
- Problem can be broken into similar smaller problems
- Subproblems are independent
- Combining solutions is efficient
Limitations:
- Requires stack space (recursion depth)
- Overhead for very small subproblems
- Not all problems decompose nicely
Divide and Conquer is one of the most important algorithmic paradigms. Master it, and you'll recognise opportunities to break down seemingly intractable problems into manageable pieces.
36. Backtracking (N-Queens, Sudoku)
Backtracking explores all possible solutions by incrementally building candidates and abandoning them ("backtracking") when they can't lead to a valid solution.
The core idea: Try a choice. If it leads nowhere, undo it and try another. It's exhaustive search with pruning.
Generic algorithm:
solve(state):
    if state is solution:
        return state
    for each possible choice:
        if choice is valid:
            make choice
            result = solve(new_state)
            if result is not failure:
                return result
            undo choice  // backtrack
    return failure
Classic problems:
N-Queens: Place N queens on N×N chessboard so no two attack each other.
Approach:
- Place queens row by row
- For each row, try each column
- Check if placement is safe (no conflicts with existing queens)
- Recursively place next queen
- If no valid placement in a row, backtrack to previous row
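A compact sketch of this approach that tracks attacked columns and diagonals in sets (the representation, one column index per row, is illustrative):

def solve_n_queens(n):
    # Returns one solution as a list where solution[row] = column, or None.
    cols, diag1, diag2 = set(), set(), set()
    placement = []

    def place(row):
        if row == n:  # all queens placed
            return True
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue  # square is attacked, prune this choice
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            placement.append(col)
            if place(row + 1):
                return True
            # Backtrack: undo the choice and try the next column
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)
            placement.pop()
        return False

    return placement if place(0) else None

solve_n_queens(4) returns [1, 3, 0, 2], one of the two valid 4-queens boards.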
Sudoku: Fill 9×9 grid so each row, column, and 3×3 box contains digits 1-9.
Approach:
- Find empty cell
- Try digits 1-9
- Check if digit is valid
- Recursively fill next cell
- If no valid digit works, backtrack
Other applications:
- Maze solving: Try each path, backtrack at dead ends
- Subset sum: Find subset summing to target
- Graph coloring: Assign colors to vertices
- Permutations/combinations: Generate all arrangements
- Crossword puzzles: Fill words that fit constraints
- Constraint satisfaction problems: Job scheduling, course timetabling
Optimisation techniques:
1. Constraint propagation: When you make a choice, immediately eliminate invalid future choices. In Sudoku, placing a 5 eliminates 5 from that row, column, and box.
2. Heuristics: Choose which variable to assign next strategically. "Most constrained variable first" often reduces search space.
3. Pruning: Detect failure early. If current partial solution can't possibly lead to success, backtrack immediately.
Complexity: Often exponential—O(b^d) where b is branching factor and d is depth. Pruning significantly reduces practical runtime.
When to use:
- Exhaustive search is required
- Problem has constraints that can eliminate many possibilities
- Solution space is structured (tree-like)
Real-world note: For large problem instances, backtracking may be impractical. Consider heuristics, approximation algorithms, or constraint programming systems.
Backtracking is brute force made smarter. It won't always be efficient, but it's often the most straightforward way to explore possibility spaces systematically.
🧠 Dynamic Programming (The Art of Remembering)
37. Introduction to Dynamic Programming
Dynamic Programming (DP) is an optimisation technique that solves complex problems by breaking them into overlapping subproblems, solving each once, and storing results for reuse.
The key insight: Many problems recompute the same subproblems repeatedly. DP avoids this by remembering solutions.
Classic example—Fibonacci:
Naive recursion:
fib(n):
    if n ≤ 1: return n
    return fib(n-1) + fib(n-2)
This recomputes fib(5) multiple times when calculating fib(7). Complexity: O(2^n)—exponential!
DP solution: Store computed values.
memo = {}
fib(n):
    if n ≤ 1: return n
    if n in memo: return memo[n]
    memo[n] = fib(n-1) + fib(n-2)
    return memo[n]
Now complexity: O(n)—linear! Each fib(k) computed once.
When to use DP:
Problems need two properties:
- Optimal substructure: Optimal solution contains optimal solutions to subproblems
- Overlapping subproblems: Same subproblems are solved multiple times
DP vs Divide and Conquer:
- Divide and Conquer: Subproblems are independent (merge sort)
- DP: Subproblems overlap (fibonacci, shortest paths)
DP vs Greedy:
- Greedy: Make best local choice, never reconsider
- DP: Consider all possibilities, store results
Common DP patterns:
1. Linear DP: Problems on sequences (arrays, strings). Example: Longest Increasing Subsequence
2. 2D DP: Problems on two sequences or grids. Example: Edit Distance, Longest Common Subsequence
3. Interval DP: Problems on ranges. Example: Matrix Chain Multiplication
4. Subset DP: Problems on subsets. Example: Subset Sum, Knapsack
5. Tree DP: Problems on tree structures. Example: Diameter of tree, maximum path sum
Identifying DP problems:
Look for these keywords and signs:
- "Maximum/minimum"
- "Count number of ways"
- "Is it possible to..."
- Problem can be broken into similar smaller problems
- Brute force has repeated computation
The DP mindset:
- Define the state (what do you need to remember?)
- Identify the transition (how do states relate?)
- Determine base cases (smallest subproblems)
- Decide computation order (ensure dependencies are met)
DP is challenging because it requires seeing the recursive structure and managing state carefully. But once you internalise it, a whole class of problems becomes tractable.
38. Memoization vs Tabulation
DP has two implementation approaches: top-down (memoization) and bottom-up (tabulation). Understanding both gives you flexibility.
Memoization (Top-Down):
Start with the original problem and recursively solve subproblems, storing results in a cache.
Characteristics:
- Natural recursive code
- Only computes needed subproblems
- Uses recursion stack (space overhead, stack overflow risk)
- Easier to write initially
Example—Fibonacci:
memo = {}
def fib(n):
    if n <= 1:
        return n
    if n in memo:
        return memo[n]
    memo[n] = fib(n-1) + fib(n-2)
    return memo[n]
Tabulation (Bottom-Up):
Start with smallest subproblems and iteratively build up to the solution.
Characteristics:
- Iterative code
- Computes all subproblems (sometimes wasteful)
- No recursion overhead
- Often more space-optimised
- Can be harder to formulate initially
Example—Fibonacci:
def fib(n):
    if n <= 1:
        return n
    dp = [0] * (n+1)
    dp[1] = 1
    for i in range(2, n+1):
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]
Comparison:
| Aspect | Memoization | Tabulation |
| --- | --- | --- |
| Direction | Top-down | Bottom-up |
| Style | Recursive | Iterative |
| Subproblems | Only needed ones | All subproblems |
| Space | Recursion stack + cache | DP table only |
| Intuition | Often more natural | Requires planning |
| Debugging | Can be harder | Usually easier |
When to use which:
Choose memoization when:
- Recursive formulation is more natural
- Only a subset of subproblems are needed
- Rapid prototyping (quicker to write initially)
Choose tabulation when:
- Recursion depth could be problematic
- All subproblems will be needed anyway
- Space optimisation is critical
- Production code (more predictable performance)
Space optimisation:
Often you only need a few previous states, not the entire table.
Fibonacci space-optimised:
def fib(n):
    if n <= 1:
        return n
    prev2, prev1 = 0, 1
    for i in range(2, n+1):
        current = prev1 + prev2
        prev2, prev1 = prev1, current
    return prev1
Now O(1) space instead of O(n)!
Practical advice:
- Start with memoization if the recursive structure is clear
- Convert to tabulation for better performance
- Optimise space by identifying which states you actually need
- In interviews, either approach is fine—choose what you're comfortable with
Both techniques are tools in your DP toolkit. Master both, and you'll handle DP problems with confidence.
39. Classic DP Problems (Knapsack, LCS, LIS)
Certain DP problems appear repeatedly and teach fundamental patterns. Master these, and you'll recognise similar structures everywhere.
1. 0/1 Knapsack:
Problem: Given items with weights and values, and a knapsack with weight capacity W, maximise value without exceeding capacity. Each item can be taken once (0 or 1 times).
State: dp[i][w] = maximum value using first i items with capacity w
Transition:
- Don't take item i: dp[i][w] = dp[i-1][w]
- Take item i (if weight[i] ≤ w): dp[i][w] = dp[i-1][w-weight[i]] + value[i]
- Choose maximum: dp[i][w] = max(don't take, take)
Complexity: O(n × W) time and space
Applications: Resource allocation, budget planning, cutting stock problems
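A tabulation sketch of this recurrence (variable names are illustrative):

def knapsack(weights, values, capacity):
    n = len(weights)
    # dp[i][w] = maximum value using the first i items with capacity w
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            dp[i][w] = dp[i-1][w]  # option 1: skip item i
            if weights[i-1] <= w:  # option 2: take item i if it fits
                dp[i][w] = max(dp[i][w], dp[i-1][w - weights[i-1]] + values[i-1])
    return dp[n][capacity]

knapsack([1, 3, 4, 5], [1, 4, 5, 7], capacity=7) returns 9 (take the items weighing 3 and 4).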
2. Longest Common Subsequence (LCS):
Problem: Find longest subsequence common to two strings. "ABCDGH" and "AEDFHR" → "ADH" (length 3).
State: dp[i][j] = LCS length of first i characters of string1 and first j characters of string2
Transition:
- If string1[i] == string2[j]: dp[i][j] = dp[i-1][j-1] + 1
- Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
Complexity: O(n × m) time and space
Applications: Diff tools, DNA sequence alignment, version control, plagiarism detection
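The transition above as a short tabulation sketch:

def lcs_length(s1, s2):
    n, m = len(s1), len(s2)
    # dp[i][j] = LCS length of s1[:i] and s2[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[n][m]

lcs_length("ABCDGH", "AEDFHR") returns 3.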
3. Longest Increasing Subsequence (LIS):
Problem: Find longest strictly increasing subsequence in an array. [10, 9, 2, 5, 3, 7, 101, 18] → [2, 3, 7, 101] (length 4).
State: dp[i] = length of LIS ending at index i
Transition: For each j < i, if arr[j] < arr[i]: dp[i] = max(dp[i], dp[j] + 1)
Complexity: O(n²) time, O(n) space. Can be optimised to O(n log n) with binary search.
Applications: Patience sorting, box stacking problems, scheduling
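The O(n²) transition, sketched directly; the O(n log n) version would replace the inner loop with a binary search over the smallest tail of each subsequence length:

def lis_length(arr):
    if not arr:
        return 0
    # dp[i] = length of the longest increasing subsequence ending at index i
    dp = [1] * len(arr)
    for i in range(1, len(arr)):
        for j in range(i):
            if arr[j] < arr[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)

lis_length([10, 9, 2, 5, 3, 7, 101, 18]) returns 4.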
Pattern recognition:
These three problems teach key DP patterns:
- Knapsack: Decision problems (take or skip)
- LCS: Two-sequence problems with character matching
- LIS: Single-sequence problems with element comparison
Other important classics:
Edit Distance (Levenshtein): Minimum operations to transform one string to another. Used in spell checkers.
Coin Change: Minimum coins to make amount. Both "minimum count" and "count ways" variants exist.
Matrix Chain Multiplication: Optimal parenthesization to minimize scalar multiplications.
House Robber: Rob houses for maximum profit without robbing adjacent houses. Teaches state optimisation.
Understanding these problems:
- Draw examples to see overlapping subproblems
- Define states carefully (what information must you track?)
- Write transition equations before coding
- Test on small examples
- Look for space optimisation opportunities
These classics form the foundation of DP problem-solving. Once you understand their patterns, you'll recognise variations and tackle new problems more confidently.
40. DP on Strings
String DP problems are incredibly common—appearing in interviews, competitive programming, and real-world applications like text editors and bioinformatics.
Core patterns:
1. Single string problems:
Palindrome problems:
- Longest Palindromic Substring: dp[i][j] = whether substring from i to j is palindrome
- Count palindromic substrings: Use expand-around-center or DP
- Palindrome partitioning: Minimum cuts to partition into palindromes
Pattern: Often use 2D DP where dp[i][j] represents substring from index i to j.
2. Two string problems:
Longest Common Subsequence (LCS): Already discussed—the foundation of string DP.
Edit Distance: Transform string1 to string2 using insertions, deletions, substitutions.
State: dp[i][j] = minimum operations to transform first i characters of string1 to first j of string2
Transitions:
- If characters match: dp[i][j] = dp[i-1][j-1] (no operation needed)
- Else: minimum of:
  - Insert: dp[i][j-1] + 1
  - Delete: dp[i-1][j] + 1
  - Replace: dp[i-1][j-1] + 1
Applications: Spell checkers, DNA alignment, version control diffs
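A sketch of these transitions (names illustrative):

def edit_distance(s1, s2):
    n, m = len(s1), len(s2)
    # dp[i][j] = minimum operations to turn s1[:i] into s2[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i  # delete all i characters
    for j in range(m + 1):
        dp[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1]  # characters match, no operation
            else:
                dp[i][j] = 1 + min(dp[i][j-1],    # insert
                                   dp[i-1][j],    # delete
                                   dp[i-1][j-1])  # replace
    return dp[n][m]

edit_distance("horse", "ros") returns 3.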
Distinct Subsequences: Count distinct subsequences of string2 in string1.
Example: "rabbbit" contains "rabbit" 3 times.
3. String with other constraints:
Word Break: Can string be segmented into dictionary words?
State: dp[i] = can substring [0...i) be segmented?
Transition: dp[i] = true if there exists j where dp[j] = true AND substring[j...i) is in dictionary
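A short sketch of this transition, assuming the dictionary is a set of words:

def word_break(s, dictionary):
    n = len(s)
    # dp[i] = True if s[:i] can be segmented into dictionary words
    dp = [False] * (n + 1)
    dp[0] = True  # the empty prefix is trivially segmentable
    for i in range(1, n + 1):
        for j in range(i):
            if dp[j] and s[j:i] in dictionary:
                dp[i] = True
                break
    return dp[n]

word_break("applepenapple", {"apple", "pen"}) returns True.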
Regular Expression Matching: Does string match pattern with '.' (any char) and '*' (zero or more of previous)?
Complex state: dp[i][j] = whether first i chars of string match first j chars of pattern
Common techniques:
1. Define states clearly:
- Single pointer: dp[i] for substring ending/starting at i
- Two pointers: dp[i][j] for substring from i to j
- Two strings: dp[i][j] for prefixes of length i and j
2. Consider empty strings:
- Base cases often involve empty strings
- dp[0][0] typically represents both strings empty
3. Space optimisation:
- Many string DP problems only need previous row/column
- Can reduce O(n²) space to O(n)
4. Reconstruct solution:
- DP often finds optimal value
- Store choices to reconstruct actual string/sequence
Interview tips:
- Draw the DP table: Visualise for small examples
- Start with recursive solution: Then add memoization
- Check boundary conditions: Empty strings, single characters
- Verify transitions: Do they cover all cases?
- Test with examples: Walk through manually
Why string DP is important:
- Appears in 30%+ of DP interview questions
- Teaches pattern matching and sequence analysis
- Direct applications in text processing, bioinformatics, and data analysis
String DP may seem intimidating initially, but the patterns repeat. Practice a few problems, and you'll start recognising the structures instinctively.
What's Next?
You've completed Part 4, covering fundamental algorithms that shape how programmers think: Binary Search, Sorting, Greedy Methods, Divide and Conquer, Backtracking, and Dynamic Programming fundamentals.
Practice Tips:
- Implement binary search variants (first occurrence, last occurrence, search in rotated array)
- Code merge sort and quicksort from scratch
- Solve 10 DP problems—start with Fibonacci, climb stairs, house robber
- Recognise when problems fit greedy vs DP
- Practice backtracking with N-Queens and Sudoku solver