Welcome, Maxime! Your 4-Day Exam Plan
You just added DSC 40B and exam 1 is in 4 days. That sounds scary, but here is the good news: this guide covers EVERYTHING you need. The material is about how fast code runs — and you can learn it!
What is DSC 40B About? (30-Second Version)
In DSC 40A, you learned the MATH (like: "the answer is the mean"). In DSC 40B, you learn: How does a computer actually CALCULATE that answer? And how FAST can it do it?
Your 4-Day Study Schedule
| Day | What to Study | Time |
|---|---|---|
| Day 1 (Today) | Sections 1-4: Algorithms, Loop Invariants, Counting Operations | ~2 hours |
| Day 2 | Sections 5-7: Big-Theta proofs, Best/Worst case, Expected time | ~2.5 hours |
| Day 3 | Sections 8-10: Sorting, Log properties, Homework 2 walkthrough | ~2 hours |
| Day 4 | Section 11: Full practice exam, then review mistakes | ~2 hours |
How to Use This Guide (ADHD-Friendly Tips)
Read out loud. When you see a "SAY IT OUT LOUD" box, actually say the words. Saying things aloud (psychologists call this the "production effect") genuinely helps you remember them.
Take breaks. When you see a break box, stand up, stretch, get water. Your brain needs 5-minute breaks every 25 minutes.
Do every quiz. Click the answers, don't just read. Active recall beats passive reading.
Use the points system. Try to get as many points as possible. The sidebar tracks your score!
Lecture 1: What is an Algorithm?
The Big Picture
In DSC 40A, you learned to find answers using math. For example, "the number that minimizes the total absolute error is the median." But 40A stopped there. DSC 40B asks the NEXT question:
To find the median, you need to: (1) sort all the numbers, (2) pick the middle one. Sorting is the hard part — it is an algorithm. And different sorting algorithms have different speeds!
What is an Algorithm?
An algorithm is a step-by-step set of instructions to solve a problem. Like a recipe for cooking.
The 4 Parts of a Data Science Problem
Example: Computing the Mean
This algorithm is correct (it always gives the right answer) and efficient (it only looks at each number once).
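The example code itself isn't reproduced here, so here is a minimal sketch of what such a `mean()` might look like (the course's exact version may differ):

```python
def mean(data):
    """Compute the average of a non-empty list of numbers."""
    total = 0
    for x in data:        # one pass: each element is touched exactly once
        total += x
    return total / len(data)
```

One loop over n elements, each doing constant work, is exactly the "looks at each number once" efficiency claim above.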
Two Big Questions in DSC 40B
Question 1: CORRECTNESS — Does the algorithm always give the right answer? For EVERY possible input?
Question 2: EFFICIENCY — How fast is it? If you have 1,000 numbers, how many steps? What about 1,000,000 numbers?
How Fast is Fast? A Taste of What's Coming
Imagine 3 algorithms processing n = 1,000,000 data points:
| Algorithm | Speed Type | Operations | Time |
|---|---|---|---|
| A (Linear) | n | 1,000,000 | 1 second |
| B (Quadratic) | n² | 1,000,000,000,000 | 11.57 days! |
| C (Cubic) | n³ | 10¹⁸ | 31,709 years! |
See the difference? A slow algorithm on big data is USELESS. This is why we study efficiency!
Quiz: Lecture 1 (10 points each)
Q1: What does DSC 40B focus on that 40A did not?
Q2: Which is NOT part of a data science problem?
Lecture 2: Loop Invariants (Proving Algorithms Work)
The Problem: How Do We PROVE an Algorithm is Correct?
Testing with a few examples is not enough. Maybe your algorithm works on [1,2,3] but fails on [999, -5, 0]. We need PROOF that it works on ALL inputs.
What is a Loop Invariant?
A loop invariant is a statement (a fact) that is TRUE:
• Before the loop starts (we call this "after iteration 0")
• After every single iteration of the loop
• After the loop ends (and this tells us the algorithm is correct!)
The 3-Step Proof Method
Worked Example: Proving mean() is Correct
Invariant: "After the kth iteration, total contains the sum of the first k elements of data."
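The real proof is done on paper, but as an illustration you can check the invariant mechanically with `assert` statements. A sketch, assuming the index-based version of `mean()`:

```python
def mean(data):
    total = 0
    # Invariant: after the kth iteration, total == sum of the first k elements.
    for k in range(len(data)):
        total += data[k]
        # Illustration only -- a real proof argues this for ALL inputs,
        # not just the ones we happen to run.
        assert total == sum(data[:k + 1])
    return total / len(data)
```

If the invariant holds after every iteration, then after the last one total is the sum of ALL n elements, which is exactly what the correctness proof needs.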
Quiz: Loop Invariants (10 points each)
Q1: What is a loop invariant?
Q2: In the mean() proof, what is the initialization step?
Lecture 3: How We Measure Speed (Time Complexity)
Why Can't We Just Use a Stopwatch?
You might think: run the code and time it! But this doesn't work because:
• A fast computer gives different times than a slow computer
• Other apps running in the background change the time
• Different inputs give different times
What is a "Basic Operation"?
A basic operation takes the same amount of time no matter what. We call this constant time or Θ(1).
| Basic Operations Θ(1) | NOT Basic (takes more time) |
|---|---|
arr[i] — access one element | sum(arr) — adds ALL n elements = Θ(n) |
x + y — add two numbers | max(arr) — checks ALL n elements = Θ(n) |
x == y — compare two numbers | sorted(arr) — sorts = Θ(n log n) |
x = 5 — assign a value | arr.copy() — copies ALL n elements = Θ(n) |
len(arr) — Python stores this* | "hello" in arr — searches ALL = Θ(n) |
*In Python, len() is constant time because the length is stored. But in this class, follow what the professor says.
T(n): The Total Operation Count
T(n) = the total number of basic operations when input size is n.
The Table Method (Step by Step)
For each line of code, ask: (1) How much work does ONE execution cost? (2) How many times does this line execute? (3) Multiply them.
Simple Loop = Θ(n)
Nested Independent Loops = Θ(n²)
Nested Dependent Loops = Still Θ(n²)!
| When i = | Inner loop runs |
|---|---|
| 0 | 0 times |
| 1 | 1 time |
| 2 | 2 times |
| ... | ... |
| n-1 | n-1 times |
The formula 1 + 2 + 3 + ... + k = k(k+1)/2 is called the arithmetic series formula. MEMORIZE IT for the exam!
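You can convince yourself of the formula by literally counting the inner-loop executions. A quick sketch (the function name is mine):

```python
def count_inner_iterations(n):
    """Count how many times the body of a dependent nested loop runs."""
    count = 0
    for i in range(n):
        for j in range(i):   # inner loop runs i times on iteration i
            count += 1
    return count

# Matches the arithmetic series: 0 + 1 + ... + (n-1) = n(n-1)/2
# e.g. count_inner_iterations(10) == 10*9/2 == 45
```

Counting like this is a great way to double-check your table-method answer on practice problems.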
for j in range(i) inside for i in range(n): the total is always n(n-1)/2 = Θ(n²). This pattern appears A LOT on exams.
Quiz: Time Complexity (10 points each)
Q1: What is T(n) for this code?
Q2: What is 1+2+3+...+100?
Lectures 4-5: Big-Θ, Big-O, Big-Ω
Why Do We Need Special Notation?
When we count operations, we get things like T(n) = 5n² + 3n + 42. But the "5" and the "+3n+42" don't really matter when n is HUGE. We want to say "this grows like n²" without worrying about details.
The Three Notations: A Simple Comparison
Formal Definitions (You MUST Know These for the Exam)
Θ (Theta) — The Sandwich
for all n ≥ N: c₁ · g(n) ≤ f(n) ≤ c₂ · g(n)
Think of it as: when n is big enough, f(n) is always SANDWICHED between two scaled copies of g(n).
O (Big-O) — The Ceiling
for all n ≥ N: f(n) ≤ c · g(n)
Ω (Omega) — The Floor
for all n ≥ N: c · g(n) ≤ f(n)
Critical Rules for the Exam
Rule 1: Drop Constants
5n² = Θ(n²) — the "5" disappears because for big n, constants don't change the growth rate.
Rule 2: Keep Only the Biggest Term
n² + n + 100 = Θ(n²) — when n = 1,000,000: n² = 10¹² but n = 10⁶. The n² term completely dominates!
Rule 3: Θ Implies Both O and Ω
If f = Θ(n²) then f = O(n²) AND f = Ω(n²).
Rule 4: O Can Be Looser Upward
If f = Θ(n²), then f = O(n²), f = O(n³), f = O(n¹⁰⁰) — all TRUE! (Saying "at most n³" is true even if it's really n².)
Rule 5: Ω Can Be Looser Downward
If f = Θ(n²), then f = Ω(n²), f = Ω(n), f = Ω(1) — all TRUE!
• Θ(n²) YES • O(n²) YES • Ω(n²) YES • O(n³) YES (looser upper) • Ω(n) YES (looser lower)
• Θ(n) NO • O(n) NO • Θ(n³) NO • Ω(n³) NO
The Growth Rate Ladder (Memorize This!)
constant → logarithmic → linear → linearithmic → quadratic → cubic
(FAST on the left, SLOW on the right)
Tricky True/False (Exam Favorites!)
Trap 1: Can You Divide Θ by O?
Q: If f₁ = Θ(g₁) and f₂ = O(g₂), is f₁/f₂ = Θ(g₁/g₂)?
A: FALSE! O only gives an upper bound. f₂ could be MUCH smaller than g₂, making the division MUCH bigger.
Trap 2: Does Θ(n²) Always Beat Θ(n³) at n=100?
A: FALSE! Θ hides constants. Algorithm A could be 1,000,000·n² and Algorithm B could be 0.001·n³. At n=100, B is faster!
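Trap 2 in actual numbers. These constants are illustrative (made up for this demo), not from the course:

```python
def ops_A(n):
    return 1_000_000 * n ** 2   # Θ(n²), but with a HUGE hidden constant

def ops_B(n):
    return 0.001 * n ** 3       # Θ(n³), but with a tiny hidden constant

# At n = 100: A does 10^10 operations, B does only 1,000.
# B (the "slower" class) wins! A only pulls ahead once n passes 10^9,
# where 1_000_000 * n² = 0.001 * n³.
```

Moral: Θ tells you what happens eventually, for big enough n. It says nothing about any one specific n.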
Trap 3: If f = Ω(n⁵) and f = O(n⁷), must f be Θ(n⁵), Θ(n⁶), or Θ(n⁷)?
A: FALSE! f could be Θ(n^5.3) or Θ(n^6.7) or any power between 5 and 7. Or f might not even HAVE a Θ bound!
Quiz: Big-Theta, Big-O, Big-Omega (10 points each)
Q1: If f(n) = 3n³ + 2n + 1, what is f(n) in Θ notation?
Q2: If f = Θ(n²), which is TRUE?
Q3: TRUE or FALSE: If f₁=Θ(g₁) and f₂=O(g₂), then f₁/f₂=Θ(g₁/g₂)
Proving Θ with Constants (Chain of Inequalities)
What the Exam Asks You to Do
The exam gives you a function like f(n) = 3n² - 8n + 4 and asks you to prove it is Θ(n²) by finding actual numbers for c₁, c₂, and N.
c₁ · n² ≤ 3n² - 8n + 4 ≤ c₂ · n² for all n ≥ N
Step 1: Prove the Upper Bound (f(n) ≤ c&sub2; · g(n))
The trick: make f(n) BIGGER until it looks like c·n².
Two tools to make something bigger:
• "Promote" lower terms: replace smaller powers with bigger powers. Example: replace +4 with +4n² (since n² ≥ 1 when n ≥ 1, so 4n² ≥ 4)
• "Drop" negative terms: removing a subtraction makes things bigger. Example: 3n² - 8n ≤ 3n² (because we removed -8n)
Example: Upper bound for 3n² - 8n + 4
Step 2: Prove the Lower Bound (c&sub1; · g(n) ≤ f(n))
The trick: make f(n) SMALLER until it looks like c·n².
Two tools to make something smaller:
• "Drop" positive terms: removing an addition makes things smaller.
• "Absorb" negative terms: split the leading term to "eat" the negative parts.
Example: Lower bound for 3n² - 8n + 4
Putting It Together
Another Full Example: 4n³ - 5n² + 50 = Θ(n³)
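The proofs are done on paper, but before the exam you can sanity-check your chosen constants numerically. A sketch (the constants below are one valid choice, not necessarily the ones from lecture):

```python
def check_theta(f, g, c1, c2, N, upto=2000):
    """Numerically verify c1*g(n) <= f(n) <= c2*g(n) for N <= n < upto.
    NOT a proof -- just a quick check that your constants are plausible."""
    return all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(N, upto))

# 3n² - 8n + 4 = Θ(n²): one valid choice is c1 = 1, c2 = 7, N = 4
assert check_theta(lambda n: 3 * n**2 - 8 * n + 4, lambda n: n**2, 1, 7, 4)

# 4n³ - 5n² + 50 = Θ(n³): one valid choice is c1 = 3, c2 = 4, N = 5
assert check_theta(lambda n: 4 * n**3 - 5 * n**2 + 50, lambda n: n**3, 3, 4, 5)
```

If an assert fails, your constants are wrong and your paper proof would be too, so this is a cheap way to catch mistakes while practicing.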
Quiz: Proving Theta (10 points each)
Q1: To prove f(n) ≤ c₂·n² (upper bound), what can you do?
Q2: For f(n)=2n²+3n, which is a valid upper bound proof?
Lecture 6: Best Case, Worst Case & Expected Time
Why Do Some Algorithms Have Different Speeds?
Some code has if statements or early exits that can make it stop early. Depending on the input, the SAME code can be fast or slow!
Insertion Sort — The Classic Example
Best Case: Array Already Sorted [1, 2, 3, 4, 5]
Each new element is already bigger than everything before it. The while loop condition x < arr[j] is FALSE immediately every time. While loop runs 0 times!
Worst Case: Array in Reverse [5, 4, 3, 2, 1]
Each new element is SMALLER than everything before it. The while loop runs the MAXIMUM number of times:
i=1: 1 time, i=2: 2 times, ..., i=n-1: n-1 times
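For reference, a standard insertion sort matching the description above (same `x < arr[j]` comparison):

```python
def insertion_sort(arr):
    """Sort arr in place and return it."""
    for i in range(1, len(arr)):
        x = arr[i]
        j = i - 1
        # Shift larger elements right until x's spot is found.
        # Already sorted input: condition is False immediately -> best case Θ(n).
        # Reverse-sorted input: shifts all i elements -> worst case Θ(n²).
        while j >= 0 and x < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = x
    return arr
```

The worst-case while-loop counts 1 + 2 + ... + (n-1) = n(n-1)/2, which is the same arithmetic series from the nested-loops section.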
The foo() Function — Another Exam Favorite
CRITICAL: sum([x,y]) adds only 2 numbers = Θ(1). But sum(arr) adds ALL n numbers = Θ(n)!
Best case: The if-condition is true on the FIRST check. We do 1 outer + 1 inner + return sum(arr) which is Θ(n). Answer: Θ(n)
Worst case: The if-condition is NEVER true. We check all n×n pairs, each taking Θ(1), then return False. Answer: Θ(n²)
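The exact body of foo() isn't reproduced here, so this is a hypothetical reconstruction matching the description (all n×n pairs checked, sum(arr) returned on a hit); the real condition from class may differ:

```python
def foo(arr):
    n = len(arr)
    for i in range(n):
        for j in range(n):
            # HYPOTHETICAL condition: a duplicate pair. The point is only
            # that the check itself is Θ(1).
            if i != j and arr[i] == arr[j]:
                return sum(arr)   # sums ALL n elements -> Θ(n)
    return False
```

Best case: the condition fires on an early pair, but the return still pays for sum(arr), so Θ(n). Worst case: no pair ever matches, so all n² checks run, Θ(n²).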
Expected Time (Probability-Weighted Average)
When code uses RANDOMNESS (like coin flips or random numbers), we compute the expected time:
Example: Coin Toss Code
Example: Random Integer Code
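The original examples aren't reproduced here, so here is a sketch of the coin-toss shape (the branch bodies are illustrative):

```python
import random

def coin_toss_work(arr):
    n = len(arr)
    if random.random() < 0.5:   # "heads" with probability 1/2
        return sum(arr)          # Θ(n) branch
    else:                        # "tails" with probability 1/2
        total = 0
        for x in arr:            # Θ(n²) branch
            for y in arr:
                total += x * y
        return total

# Expected time = (1/2)·Θ(n) + (1/2)·Θ(n²) = Θ(n²).
# The slow branch happens with CONSTANT probability, so it dominates.
```

Template: multiply each branch's cost by its probability, add them up, then keep the biggest surviving term.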
Quiz: Best/Worst Case (10 points each)
Q1: Best case of insertion_sort?
Q2: What is the best case of foo(arr)?
Lecture 7: Brute Force & Selection Sort
Brute Force = Try Everything
A brute force algorithm tries ALL possible answers and picks the best one. It's always correct but often SLOW.
Linear Search
Best case: target is first element → Θ(1)
Worst case: target is not in array → Θ(n)
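A minimal sketch of linear search matching the two cases above (the course's version may differ in details):

```python
def linear_search(arr, target):
    """Return the index of target in arr, or None if absent."""
    for i in range(len(arr)):
        if arr[i] == target:
            return i        # best case: found at index 0 -> Θ(1)
    return None             # worst case: scanned all n elements -> Θ(n)
```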
Selection Sort
Idea: find the smallest element, put it first. Then find the second smallest, put it second. Repeat.
Time complexity: The inner loop runs (n-1) + (n-2) + ... + 1 + 0 = n(n-1)/2 times. ALWAYS Θ(n²), no matter what the input is! (No early exit possible.)
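A standard selection sort for reference. Notice there is no `return` or `break` anywhere in the loops, which is exactly why there is no early exit:

```python
def selection_sort(arr):
    """Sort arr in place and return it."""
    n = len(arr)
    for i in range(n):
        min_idx = i
        for j in range(i + 1, n):     # ALWAYS scans the rest of the array
            if arr[j] < arr[min_idx]:
                min_idx = j
        arr[i], arr[min_idx] = arr[min_idx], arr[i]   # put the min in place
    return arr
```

Compare with insertion sort: insertion sort's while loop can stop early (best case Θ(n)), but selection sort's inner for loop always runs to the end, so best = worst = Θ(n²).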
Theoretical Lower Bounds
A lower bound answers: "What is the MINIMUM work ANY algorithm must do for this problem?"
Quiz: Sorting (10 points each)
Q1: Selection sort time complexity?
Log Properties Cheat Sheet
What is a Logarithm? (Simple Explanation)
log₂(n) asks: "2 to WHAT power gives me n?"
In computer science, "log" usually means log&sub2; unless stated otherwise.
The 5 Properties You Need
| # | Property | Example |
|---|---|---|
| 1 | log(a·b) = log(a) + log(b) | log₂(8·4) = log₂(8)+log₂(4) = 3+2 = 5 |
| 2 | log(aᵏ) = k·log(a) VERY IMPORTANT! | log₂(8³) = 3·log₂(8) = 3·3 = 9 |
| 3 | a^(log_a(n)) = n | 2^(log₂(8)) = 2³ = 8 |
| 4 | log_b(c) = 1/log_c(b) (given as hint on exam!) | 1/log₂(n) = log_n(2) |
| 5 | All logs grow at the same rate: log₂(n) = Θ(log₁₀(n)) = Θ(ln(n)) | For Θ, the base doesn't matter! |
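All five properties can be spot-checked numerically with Python's math module (this illustrates them, it doesn't prove them):

```python
import math

# 1: log of a product = sum of logs  (5 = 3 + 2)
assert math.isclose(math.log2(8 * 4), math.log2(8) + math.log2(4))
# 2: log of a power = power times log  (9 = 3·3)
assert math.isclose(math.log2(8 ** 3), 3 * math.log2(8))
# 3: base raised to its own log gives back n
assert math.isclose(2 ** math.log2(8), 8)
# 4: log_b(c) = 1/log_c(b); here 1/log₂(10) = log₁₀(2)
assert math.isclose(1 / math.log2(10), math.log(2, 10))
# 5: any two logs differ only by a CONSTANT factor (here log₂(10) ≈ 3.32)
assert math.isclose(math.log2(1000) / math.log10(1000), math.log2(10))
```

Property 5 is why the base never matters inside Θ: a constant factor is exactly what Θ ignores.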
Exam Trick: Simplify n^(1/log₂(n)) × n
Exam Trick: Simplify 12·log₂(3n² - 2n)
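The worked solutions aren't shown here, but here is one way to check both simplifications numerically (the reasoning is in the comments):

```python
import math

# Trick 1: take log₂ of n^(1/log₂(n)): you get (1/log₂(n))·log₂(n) = 1,
# so n^(1/log₂(n)) is ALWAYS exactly 2, and the whole expression is 2n.
n = 1000
assert math.isclose(n ** (1 / math.log2(n)), 2)

# Trick 2: log₂(3n² - 2n) = log₂(n²·(3 - 2/n)) = 2·log₂(n) + log₂(3 - 2/n),
# so 12·log₂(3n² - 2n) = 24·log₂(n) + Θ(1) = Θ(log n).
big = 10 ** 9
ratio = 12 * math.log2(3 * big**2 - 2 * big) / math.log2(big)
assert abs(ratio - 24) < 1   # the ratio approaches 24 as n grows
```

Both tricks are the same move: use the power rule (property 2) to pull the exponent out, then let Θ eat the constants.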
Quick Quiz (10 points)
Q1: What is log₂(2n) simplified?
Homework 2 Walkthrough
Due: Wednesday April 15 (TODAY!) These are the types of problems and how to solve them.
Problem Type 1: Prove Θ-notation
Given f(n) = 3n² - 8n + 4, prove f(n) = Θ(n²).
Practice: f(n) = (n² + 2n - 5)/(n - 10)
Problem Type 2: Combining Runtime Bounds
Given T₁(n) = Θ(n^2.5), T₂(n) = O(n log n), etc.
Problem Type 3: Best/Worst Case of Code
Problem Type 4: Expected Time
Problem Type 5: Theoretical Lower Bounds
Practice Exam (20 Questions)
Time yourself! Try to finish in 50 minutes. Check answers AFTER attempting each one.
PE-1: What is an algorithm? (5 pts)
PE-2: What are the 3 steps to prove a loop invariant? (5 pts)
PE-3: sum([x, y]) where x and y are single numbers is: (5 pts)
PE-4: sum(arr) where arr has n elements is: (5 pts)
PE-5: 1 + 2 + 3 + ... + n = ? (5 pts)
PE-6: 7n³ + 2n² - 5n + 100 = Θ(?) (5 pts)
PE-7: If f = Θ(n²), is f = O(n³) true? (5 pts)
PE-8: If f = Θ(n²), is f = Ω(n³) true? (5 pts)
PE-9: Best case of insertion sort? (5 pts)
PE-10: Worst case of insertion sort? (5 pts)
PE-11: Selection sort is always: (5 pts)
PE-12: log₂(aᵏ) = ? (5 pts)
PE-13: Θ(n²) + Θ(n) = ? (5 pts)
PE-14: Θ(n) × Θ(n) = ? (5 pts)
PE-15: TRUE or FALSE: Θ(n²) always beats Θ(n³) at n=100 (5 pts)
PE-16: Coin flip: heads → Θ(n), tails → Θ(n²). Expected time? (5 pts)
PE-17: Best case of linear_search? (5 pts)
PE-18: Lower bound for deciding whether True or False appears more often among n items? (5 pts)
PE-19: What does the "N" in the Θ definition represent? (5 pts)
PE-20: If f=Ω(n⁵) and f=O(n⁷), must f be Θ(n⁵), Θ(n⁶), or Θ(n⁷)? (5 pts)
Quick Reference & Cheat Sheet
Print this page or screenshot it for last-minute review!
Growth Rate Ladder (Fast to Slow)
| Notation | Name | Example Code Pattern | n=1M time |
|---|---|---|---|
| Θ(1) | Constant | arr[0], x+y | instant |
| Θ(log n) | Logarithmic | binary search | ~20 steps |
| Θ(n) | Linear | single for loop | 1 second |
| Θ(n log n) | Linearithmic | merge sort | ~20 seconds |
| Θ(n²) | Quadratic | nested for loops | ~12 days |
| Θ(n³) | Cubic | triple nested loops | ~31,709 years |
Key Formulas
Definitions
Θ: c₁·g(n) ≤ f(n) ≤ c₂·g(n) | O: f(n) ≤ c·g(n) | Ω: c·g(n) ≤ f(n)
Algorithm Complexities
| Algorithm | Best Case | Worst Case |
|---|---|---|
| Linear Search | Θ(1) | Θ(n) |
| Insertion Sort | Θ(n) | Θ(n²) |
| Selection Sort | Θ(n²) | Θ(n²) |
Proving Θ Cheat Sheet
| For Upper Bound (f ≤ c₂·g) | For Lower Bound (c₁·g ≤ f) |
|---|---|
| Drop negative terms (makes bigger) | Drop positive lower-order terms (makes smaller) |
| Promote small positives: replace 5 with 5n² | Split leading term: 4n² = 3n² + n² |
| Add together: 3n² + 5n² = 8n² | Show leftover ≥ 0: n²-8n ≥ 0 when n ≥ 8 |
Expected Time Template
Common Traps
• Cannot divide Θ by O — only do arithmetic when BOTH are Θ
• Θ(n²) does NOT always beat Θ(n³) at specific n — constants matter!
• sum([x,y]) = Θ(1) but sum(arr) = Θ(n)
• Ω(n⁵) and O(n⁷) does NOT force Θ to be n⁵, n⁶, or n⁷