
Big O Notation: Why My AWS Bill Made Me Finally Understand Algorithm Efficiency

How a $30,000 AWS bill taught a developer the real-world importance of Big O notation, with practical examples from Amazon, Microsoft, and a fintech startup in Seattle.
Jul 05, 2025
9 min read

Last year at Amazon, I wrote what I thought was a clever piece of code to match user preferences. It worked perfectly in testing. Then we deployed it to production with 10 million users, and my manager called me into a meeting with a printout of our AWS bill. That’s when Big O notation stopped being “that theoretical stuff from school” and became very, very real.

The $30,000 Learning Experience

Here’s the code that nearly gave our finance team a heart attack:

# My "clever" code
def find_matching_users(user, all_users):
    matches = []
    for potential_match in all_users:  # 10 million users
        for interest in user.interests:  # ~50 interests
            for other_interest in potential_match.interests:  # ~50 interests
                if interest == other_interest:
                    matches.append(potential_match)  # bug: appends once per shared interest
                    break  # only exits the innermost loop
    return matches

See the problem? That’s O(n × m × k): with n = 10 million users and m = k ≈ 50 interests, that’s on the order of 25 billion comparisons per call. My laptop handled 100 test users fine. AWS handled 10 million users… expensively.
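
For contrast, here’s a sketch of the kind of fix that brings this back to earth (illustrative, not our actual production code): build the user’s interests into a set once, so each candidate costs one average-O(k) disjointness check instead of an m × k scan.

# Sketch of the fix: one set, one pass over all users
def find_matching_users_fast(user, all_users):
    user_interests = set(user.interests)  # O(m), built once
    return [
        other for other in all_users  # single O(n) pass
        if not user_interests.isdisjoint(other.interests)  # O(k) average, short-circuits
    ]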

Big O: Your Code’s Appetite

I finally understood Big O when my dad explained it using his computer repair shop analogy. Different repair strategies have different time costs:

  • O(1) - Looking up a part by its exact shelf location
  • O(log n) - Finding a part using organized categories
  • O(n) - Checking every shelf one by one
  • O(n²) - Comparing every part with every other part

In coding terms:

# O(1) - Constant time (instant bubble tea)
def get_user_by_id(user_id, users_dict):
    return users_dict[user_id]  # Direct lookup

# O(log n) - Logarithmic time (finding a coffee shop in Seattle)
def binary_search(sorted_array, target):
    left, right = 0, len(sorted_array) - 1
    while left <= right:
        mid = (left + right) // 2
        if sorted_array[mid] == target:
            return mid
        elif sorted_array[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# O(n) - Linear time (reading all my Slack messages)
def find_user_by_email(email, users_list):
    for user in users_list:
        if user.email == email:
            return user
    return None

# O(n²) - Quadratic time (comparing everyone's lunch orders)
def find_duplicates(items):
    duplicates = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                duplicates.append(items[i])
    return duplicates

The Seattle Traffic Analogy

Living in Seattle taught me to think about Big O in terms of traffic:

  • O(1): Taking the light rail - always 20 minutes
  • O(log n): Smart routing with traffic apps - gets worse slowly
  • O(n): Driving through downtown - time proportional to distance
  • O(n²): Comparing every possible route against every other - gets quadratically worse

Real-World Big O: Tales from Production

The Microsoft Search Incident

At Microsoft, I optimized a search feature:

// Before: O(n × m × t) - every search term against every tag of every post
function searchPosts(posts, searchTerms) {
    return posts.filter(post => {
        for (let term of searchTerms) {
            for (let tag of post.tags) {
                if (tag.includes(term)) return true;
            }
        }
        return false;
    });
}

// After: roughly O(n + m) - build a Set of terms once, O(1) average lookup per tag
// (note: this matches whole tags exactly, unlike the substring check above)
function searchPostsOptimized(posts, searchTerms) {
    const termSet = new Set(searchTerms);
    return posts.filter(post => 
        post.tags.some(tag => termSet.has(tag))
    );
}

Result: Search time went from 3 seconds to 50ms. My team lead bought me bubble tea.

The Payment Processing Optimization

At RainCity FinTech, we had a reconciliation nightmare:

# The problem: O(n²) - comparing every transaction
def reconcile_payments_slow(bank_transactions, our_records):
    matched = []
    for bank_txn in bank_transactions:  # 100,000 daily
        for our_txn in our_records:     # 100,000 daily
            if (bank_txn.amount == our_txn.amount and 
                bank_txn.date == our_txn.date):
                matched.append((bank_txn, our_txn))
    return matched

# The solution: O(n + m) - using a hash map
def reconcile_payments_fast(bank_transactions, our_records):
    # Create lookup map: (amount, date) -> our transactions with that signature
    our_map = {}
    for txn in our_records:
        key = (txn.amount, txn.date)
        our_map.setdefault(key, []).append(txn)

    matched = []
    for bank_txn in bank_transactions:
        key = (bank_txn.amount, bank_txn.date)
        candidates = our_map.get(key)
        if candidates:  # guard against an already-exhausted bucket
            matched.append((bank_txn, candidates.pop(0)))
    return matched

Processing time: 6 hours → 3 minutes. I still have the thank-you card from our ops team.

Common Big O Patterns in the Wild

Arrays and Lists

# Different operations, different complexities
class ArrayOperations:
    def __init__(self):
        self.data = []
    
    def append_item(self, item):  # O(1) amortized
        self.data.append(item)
    
    def insert_at_beginning(self, item):  # O(n)
        self.data.insert(0, item)
    
    def find_item(self, item):  # O(n)
        return item in self.data
    
    def get_by_index(self, index):  # O(1)
        return self.data[index]
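
If insert_at_beginning is on a hot path, the standard library already has the right structure: collections.deque makes both ends O(1). A quick sketch:

from collections import deque

recent_events = deque()
recent_events.appendleft("newest")  # O(1), vs list.insert(0, ...) which shifts all n items
recent_events.append("oldest")      # O(1) at the other end too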

Nested Loops: The Performance Killer

Last week’s code review:

// Junior dev's code - O(n³) 😱
function findTriples(numbers, target) {
    const results = [];
    for (let i = 0; i < numbers.length; i++) {
        for (let j = 0; j < numbers.length; j++) {
            for (let k = 0; k < numbers.length; k++) {
                if (numbers[i] + numbers[j] + numbers[k] === target) {
                    results.push([numbers[i], numbers[j], numbers[k]]);
                }
            }
        }
    }
    return results;
}

// My refactored version - O(n²)
function findTriplesOptimized(numbers, target) {
    const results = [];
    numbers.sort((a, b) => a - b);  // O(n log n) - note: sorts the caller's array in place
    
    for (let i = 0; i < numbers.length - 2; i++) {
        let left = i + 1;
        let right = numbers.length - 1;
        
        while (left < right) {
            const sum = numbers[i] + numbers[left] + numbers[right];
            if (sum === target) {
                results.push([numbers[i], numbers[left], numbers[right]]);
                left++;
                right--;
            } else if (sum < target) {
                left++;
            } else {
                right--;
            }
        }
    }
    return results;
}

Space Complexity: The Forgotten Sibling

My bubble tea addiction taught me about space complexity. Sure, I can buy 50 cups at once (time efficient), but where do I store them? My tiny Seattle apartment says no.

# High space complexity - O(n²) substrings stored (up to O(n³) characters)
def get_all_substrings(s):
    substrings = []
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            substrings.append(s[i:j])  # Storing everything
    return substrings

# Low space complexity - O(1) extra storage (each slice is printed, then discarded)
def print_all_substrings(s):
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            print(s[i:j])  # Not storing, just printing
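
When you need the substrings but not all at once, a generator is the middle ground: O(1) held in memory at a time, while the caller can still filter or aggregate the stream.

# Middle ground: yield one substring at a time
def iter_substrings(s):
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            yield s[i:j]  # produced on demand, never all in memory at once

# Example: count substrings containing "bb" without materializing the full list
count = sum(1 for sub in iter_substrings("bubble tea") if "bb" in sub)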

The Startup Reality Check

At my fintech startup, the design that looks better on paper often loses to practical constraints like memory:

# Theoretically better: O(1) lookup
class TheoreticallyBetter:
    def __init__(self):
        self.cache = {}  # Could grow to millions
    
    def get_user(self, user_id):
        if user_id not in self.cache:
            self.cache[user_id] = database.fetch(user_id)
        return self.cache[user_id]

# Practically better: still O(1) average, but memory is bounded (LRU eviction)
from collections import OrderedDict

class PracticallyBetter:
    def __init__(self):
        self.cache = OrderedDict()  # insertion-ordered: O(1) move_to_end / popitem
        self.max_size = 10000
    
    def get_user(self, user_id):
        if user_id in self.cache:
            self.cache.move_to_end(user_id)
            return self.cache[user_id]
        
        user = database.fetch(user_id)
        self.cache[user_id] = user
        
        if len(self.cache) > self.max_size:
            self.cache.popitem(last=False)
        
        return user
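
Worth knowing: Python ships this exact pattern in the standard library. functools.lru_cache gives the same bounded LRU behavior as a one-line decorator (database.fetch is the same placeholder as in the classes above):

from functools import lru_cache

@lru_cache(maxsize=10_000)  # bounded cache, evicts the least-recently-used entry
def get_user(user_id):
    return database.fetch(user_id)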

Common Misconceptions (I Had Them Too)

“Big O Is Always About Worst Case”

Not true! Big O describes a growth rate, and you can apply it to best, average, and worst cases alike (see the sketch after this list):

  • Best case: When your crush texts back immediately - O(1)
  • Average case: Normal Seattle traffic - O(n)
  • Worst case: I-5 during rain after a Seahawks game - O(n²)
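
A minimal sketch of all three cases, using plain linear search:

def linear_search(items, target):
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

data = list(range(1_000_000))
linear_search(data, 0)        # best case: first element, O(1)
linear_search(data, 500_000)  # average case: ~n/2 comparisons, O(n)
linear_search(data, -1)       # worst case: absent, all n checked, O(n)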

“Lower Big O Is Always Better”

My hash table disaster:

# O(1) average... but
def store_user_preferences(user_id, preferences):
    # Using user_id as array index
    preferences_array[user_id] = preferences  
    # User IDs are 10-digit numbers... 10GB array for 5 users 😅
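
The fix is to let a dict hash the key instead of using it as a position, so memory tracks the users you actually have (sketched with an illustrative module-level dict):

# A dict only allocates for keys that exist - O(1) average lookups
preferences_by_user = {}

def store_user_preferences(user_id, preferences):
    preferences_by_user[user_id] = preferences  # memory ∝ number of real users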

Practical Big O Tips

The 80/20 Rule

Focus on the hot paths. At RainCity FinTech:

# This runs once a day - O(n²) is fine
def daily_report(transactions):
    # Complex nested logic
    pass

# This runs on every API call - must be O(1) or O(log n)
def validate_payment(payment_id):
    # Optimized lookup
    pass

Know Your Data Structures

My cheat sheet taped to my monitor:

Array:
  - Access: O(1)
  - Search: O(n)
  - Insert: O(n)
  - Delete: O(n)

Hash Table:
  - Access: N/A
  - Search: O(1) average
  - Insert: O(1) average
  - Delete: O(1) average

Binary Search Tree (balanced):
  - Access: O(log n)
  - Search: O(log n)
  - Insert: O(log n)
  - Delete: O(log n)
  (unbalanced worst case: O(n) for all four)
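
You can feel the Array-vs-Hash-Table rows directly with timeit - here a worst-case membership test, since the element is absent:

import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)

print(timeit.timeit(lambda: -1 in haystack_list, number=100))  # O(n) per call
print(timeit.timeit(lambda: -1 in haystack_set, number=100))   # O(1) average per call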

The Cultural Perspective

In Taiwan, we have a saying: “慢工出細活” (slow work produces fine goods). But in tech, it’s more like “smart work produces scalable goods.” Big O helps us work smart, not just hard.

My American colleagues often optimize first, ask questions later. My Taiwanese upbringing taught me to understand the problem deeply first. The combination? Measure, understand, then optimize.

Debugging Performance

My performance debugging toolkit:

import time
import matplotlib.pyplot as plt

def measure_performance(func, inputs):
    """My go-to performance measurement"""
    sizes = []
    times = []
    
    for input_data in inputs:
        start = time.perf_counter()  # monotonic and higher-resolution than time.time()
        func(input_data)
        end = time.perf_counter()
        
        sizes.append(len(input_data))
        times.append(end - start)
    
    # Plot it - visual learning!
    plt.plot(sizes, times)
    plt.xlabel('Input Size')
    plt.ylabel('Time (seconds)')
    plt.title(f'Performance of {func.__name__}')
    plt.show()
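
Point it at the find_duplicates function from earlier and the quadratic curve is unmistakable: doubling the input size roughly quadruples the runtime.

# Doubling n should ~4x the runtime for an O(n²) function
inputs = [list(range(n)) for n in (200, 400, 800, 1600, 3200)]
measure_performance(find_duplicates, inputs)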

Final Thoughts

That $30,000 AWS bill? Best tuition I ever paid. It taught me that Big O isn’t academic - it’s the difference between a service that scales and one that fails.

Now, whether I’m optimizing payment processing at work or deciding the fastest way to get bubble tea during lunch, I think in Big O. It’s not about premature optimization - it’s about understanding the implications of your choices.

To new developers: Don’t fear Big O. Start simple:

  • Count your loops
  • Nested loops multiply
  • Hash tables are your friend
  • When in doubt, measure

And remember - the goal isn’t always the lowest Big O. It’s the right Big O for your situation. Sometimes O(n) with simple code beats O(log n) with complexity that no one understands.


Writing this from Gas Works Park, watching the sunset over Lake Union, realizing that finding the perfect sunset spot is probably O(n) where n is the number of Seattle parks. Got a Big O horror story or optimization win? Hit me up @maya_codes_pnw. May your algorithms be efficient and your AWS bills be reasonable! 🌅
