Table Of Contents
- Conditional Logic at Warp Speed
- Basic Conditional Selection
- Advanced Selection Patterns
- Getting Indices Instead of Values
- Performance Comparison
- Real-World Applications
- Pro Tips for where()
- Explore Advanced Techniques
Conditional Logic at Warp Speed
Element-by-element if-else loops are slow on large arrays because every value passes through the Python interpreter. NumPy's where() function evaluates the condition and picks values in compiled code, so it can process millions of elements in a few milliseconds.
Basic Conditional Selection
import numpy as np
# Simple condition: positive vs negative
data = np.array([-3, -1, 0, 2, 5, -2])
result = np.where(data > 0, data, 0) # Keep positives, replace everything else (negatives and zero) with 0
print(result) # [0 0 0 2 5 0]
# Boolean mask alternative
positive_mask = data > 0
result_mask = np.where(positive_mask, data, 0)
print(result_mask) # Same result
# Three-way condition using nested where
temp_data = np.array([15, 25, 35, 5, 45])
comfort = np.where(temp_data < 20, 'Cold',
np.where(temp_data > 30, 'Hot', 'Perfect'))
print(comfort) # ['Cold' 'Perfect' 'Hot' 'Cold' 'Hot']
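Nested where() calls become hard to read once there are more than two or three branches. As a hedged alternative, the same three-way labelling can be sketched with np.select, which takes a list of conditions and a matching list of choices. The variable names `conditions` and `choices` below are illustrative and not part of the original example.

```python
import numpy as np

temp_data = np.array([15, 25, 35, 5, 45])

# Conditions are evaluated in order; the first one that matches wins.
conditions = [temp_data < 20, temp_data > 30]
choices = ['Cold', 'Hot']

# Elements that match no condition receive the default label.
comfort = np.select(conditions, choices, default='Perfect')
print(comfort)  # ['Cold' 'Perfect' 'Hot' 'Cold' 'Hot']
```

For two branches a single where() is usually clearer; np.select tends to pay off as the number of categories grows.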
Advanced Selection Patterns
# Matrix conditional replacement
matrix = np.array([[1, -2, 3],
[-4, 5, -6],
[7, -8, 9]])
# Replace negatives with their absolute value
abs_matrix = np.where(matrix < 0, -matrix, matrix)
print(abs_matrix)
# [[1 2 3]
# [4 5 6]
# [7 8 9]]
# Complex conditions with multiple arrays
scores = np.array([85, 92, 78, 95, 88])
attempts = np.array([1, 2, 3, 1, 2])
# Bonus points for first attempt high scores
final_scores = np.where((scores > 90) & (attempts == 1),
scores + 5, scores)
print(final_scores) # [ 85  92  78 100  88]
Getting Indices Instead of Values
# Find indices where condition is true
data = np.array([10, 25, 30, 15, 35, 20])
high_indices = np.where(data > 20)
print(high_indices[0]) # [1 2 4] - indices where data > 20
# 2-D arrays: where() returns one index array per dimension
matrix = np.random.randint(1, 10, (4, 4))
row_indices, col_indices = np.where(matrix > 5)
print(f"Elements > 5 at positions: {list(zip(row_indices.tolist(), col_indices.tolist()))}")
# Get the actual values at those positions
high_values = matrix[row_indices, col_indices]
print(f"Values > 5: {high_values}")
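To make the index form easier to inspect, here is a small sketch on a fixed matrix (the values are made up for illustration, so the output is reproducible). It also shows two related calls: np.nonzero, which gives the same tuple of index arrays as single-argument where(), and np.argwhere, which returns the coordinates as rows.

```python
import numpy as np

# Fixed example data (an assumption, chosen so the result is predictable).
matrix = np.array([[1, 7, 3],
                   [8, 2, 9]])
mask = matrix > 5

# Single-argument where(): a tuple with one index array per dimension.
rows, cols = np.where(mask)
print(rows, cols)  # [0 1 1] [1 0 2]

# np.nonzero on the mask yields the same index arrays.
rows_nz, cols_nz = np.nonzero(mask)

# np.argwhere returns one (row, col) pair per matching element.
coords = np.argwhere(mask)
print(coords.tolist())  # [[0, 1], [1, 0], [1, 2]]

# Boolean indexing fetches the values directly when positions are not needed.
print(matrix[mask])  # [7 8 9]
```

If only the values matter, boolean indexing skips the index arrays entirely; the where() form is mainly useful when the positions themselves are needed.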
Performance Comparison
import time
# Large array performance test
large_array = np.random.randint(-100, 100, 1000000)
# NumPy where() approach (fast)
start = time.perf_counter() # perf_counter() is a high-resolution timer suited to benchmarks
np_result = np.where(large_array > 0, large_array, 0)
np_time = time.perf_counter() - start
# Pure Python approach (slow)
start = time.perf_counter()
py_result = [x if x > 0 else 0 for x in large_array]
py_time = time.perf_counter() - start
print(f"NumPy where(): {np_time:.4f}s")
print(f"Python loop: {py_time:.4f}s")
print(f"Speedup: {py_time/np_time:.1f}x faster")
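A single timed run can be noisy, so the measured speedup will vary from machine to machine and run to run. As a rough sketch, the standard-library timeit module can repeat each approach and keep the best time. The seeded generator, the smaller array size, and the helper names `with_where` and `with_loop` are assumptions made here to keep the run short and repeatable.

```python
import timeit
import numpy as np

# Assumption: a seeded generator and a smaller array keep the run quick.
rng = np.random.default_rng(0)
large_array = rng.integers(-100, 100, size=200_000)

def with_where():
    # Vectorized selection in compiled code.
    return np.where(large_array > 0, large_array, 0)

def with_loop():
    # Element-by-element selection in the Python interpreter.
    return [x if x > 0 else 0 for x in large_array]

# Sanity check: both approaches must agree before timing means anything.
results_match = np.array_equal(with_where(), np.array(with_loop()))

# Best of several runs reduces the influence of background activity.
np_time = min(timeit.repeat(with_where, number=1, repeat=5))
py_time = min(timeit.repeat(with_loop, number=1, repeat=3))

print(f"Results match: {results_match}")
print(f"NumPy where(): {np_time:.4f}s")
print(f"Python loop:   {py_time:.4f}s")
print(f"Speedup: {py_time / np_time:.1f}x")
```

No expected timings are shown because they depend on hardware and NumPy version.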
Real-World Applications
# Data cleaning: replace outliers
sensor_data = np.array([22.1, 23.5, 150.0, 21.8, 22.9, -50.0, 23.1])
valid = (sensor_data >= 0) & (sensor_data <= 100) # Readings inside the plausible range
cleaned = np.where(valid, sensor_data, sensor_data[valid].mean()) # Outliers become the mean of the valid readings
# Financial data: profit/loss categorization
returns = np.array([0.05, -0.02, 0.08, -0.01, 0.12])
categories = np.where(returns > 0.05, 'High Gain',
np.where(returns > 0, 'Small Gain', 'Loss'))
# Image processing: threshold application
image_data = np.random.rand(100, 100) # Simulated grayscale image
binary_image = np.where(image_data > 0.5, 255, 0) # Black and white
Pro Tips for where()
- Use parentheses for complex conditions: (cond1) & (cond2)
- where() without replacement values returns indices
- Combine with boolean indexing for powerful selections
- Works efficiently with broadcasting
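The tips above can be illustrated with a short sketch. It reuses the scores and attempts arrays from earlier; the `row_active` and `values` arrays are made-up inputs used only to show broadcasting.

```python
import numpy as np

scores = np.array([85, 92, 78, 95, 88])
attempts = np.array([1, 2, 3, 1, 2])

# Parentheses matter: & binds more tightly than > and ==, so leaving
# them out changes the meaning and usually ends in a ValueError.
mask = (scores > 90) & (attempts == 1)

# Boolean indexing returns just the matching values.
print(scores[mask])  # [95]

# Broadcasting: a (3, 1) condition combined with a (4,) array gives (3, 4).
row_active = np.array([[True], [False], [True]])
values = np.array([1, 2, 3, 4])
grid = np.where(row_active, values, 0)
print(grid.shape)  # (3, 4)
print(grid.tolist())  # [[1, 2, 3, 4], [0, 0, 0, 0], [1, 2, 3, 4]]
```

The condition, the "true" values, and the "false" values only need to be broadcast-compatible, which is why a scalar such as 0 works as a replacement.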
Explore Advanced Techniques
Dive into advanced NumPy indexing, master array manipulation techniques, and explore data analysis workflows.