Table Of Contents
- Data Persistence Made Simple
- NumPy's Native Formats
- Text-Based Formats for Human Readability
- Memory Mapping for Large Arrays
- Binary Data with Pickle
- Cross-Language Compatibility
- Performance and Format Comparison
- Best Practices
- Explore More
Data Persistence Made Simple
Your carefully crafted NumPy arrays shouldn't vanish when your program ends. Learn to save them efficiently and load them back exactly as they were.
NumPy's Native Formats
import numpy as np

# Create sample data
data = np.random.rand(1000, 100)
labels = np.array(['cat', 'dog', 'bird'] * 100)

# Save a single array (.npy format)
np.save('my_data.npy', data)
loaded_data = np.load('my_data.npy')
print(np.array_equal(data, loaded_data))  # True

# Save multiple arrays (.npz format)
np.savez('dataset.npz', features=data, labels=labels)
loaded = np.load('dataset.npz')
print(loaded['features'].shape)  # (1000, 100)
print(loaded['labels'][:3])      # ['cat' 'dog' 'bird']

# Compressed format (saves space)
np.savez_compressed('compressed_data.npz',
                    large_array=np.random.rand(10000, 1000))

# With a context manager (automatically closes the file);
# named 'archive' so it doesn't shadow the 'data' array above
with np.load('dataset.npz') as archive:
    features = archive['features']
    labels = archive['labels']
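One detail worth knowing: an .npz archive is read lazily, so nothing is pulled into memory until you index it. A quick sketch of inspecting an archive before loading anything:

# List what the archive contains without reading the arrays
archive = np.load('dataset.npz')
print(archive.files)  # ['features', 'labels']

# The actual disk I/O happens only on access
features = archive['features']
archive.close()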
Text-Based Formats for Human Readability
# Save as text (CSV-like)
small_array = np.random.rand(5, 3)
np.savetxt('data.txt', small_array, delimiter=',', fmt='%.4f')

# Load text data
loaded_text = np.loadtxt('data.txt', delimiter=',')

# Custom formatting
np.savetxt('formatted.txt', small_array,
           fmt='%.2e',                 # Scientific notation
           delimiter='\t',             # Tab separated
           header='col1\tcol2\tcol3',  # Header row
           comments='# ')              # Comment prefix

# Handling mixed data types with a structured array
mixed_data = np.array([('Alice', 25, 1.75), ('Bob', 30, 1.80)],
                      dtype=[('name', 'U10'), ('age', 'i4'), ('height', 'f4')])
np.savetxt('mixed.txt', mixed_data, fmt='%s %d %.2f')
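Reading that mixed file back takes one extra step, because plain np.loadtxt assumes floats. A short sketch using np.genfromtxt with the same dtype the file was written with:

# genfromtxt splits on whitespace by default, which matches
# the '%s %d %.2f' format used when writing 'mixed.txt'
restored = np.genfromtxt('mixed.txt',
                         dtype=[('name', 'U10'), ('age', 'i4'), ('height', 'f4')],
                         encoding='utf-8')
print(restored['name'])  # ['Alice' 'Bob']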
Memory Mapping for Large Arrays
# Memory-mapped arrays for huge datasets
huge_array = np.random.rand(100000, 1000)

# Save as a memory-mapped file
mmap_array = np.memmap('huge_data.dat', dtype='float64', mode='w+',
                       shape=(100000, 1000))
mmap_array[:] = huge_array[:]  # Copy data into the mapped file
mmap_array.flush()             # Write pending changes to disk
del mmap_array                 # Release the mapping

# Load as memory-mapped (doesn't read everything into RAM)
loaded_mmap = np.memmap('huge_data.dat', dtype='float64', mode='r',
                        shape=(100000, 1000))
print(loaded_mmap[0, :5])  # Access specific parts without loading the whole file
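Memory mapping also lets you edit a file in place without reading it first: open in 'r+' mode, write to a slice, and flush. A minimal sketch against the 'huge_data.dat' file created above:

# Update one row on disk; only the touched pages are written back
editable = np.memmap('huge_data.dat', dtype='float64', mode='r+',
                     shape=(100000, 1000))
editable[0, :] = 0.0
editable.flush()
del editable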
Binary Data with Pickle
import pickle

# Complex objects with metadata
class DataContainer:
    def __init__(self, data, metadata):
        self.data = data
        self.metadata = metadata

container = DataContainer(np.random.rand(100, 50),
                          {'created': '2024-01-01', 'version': 1.0})

# Save with pickle
with open('container.pkl', 'wb') as f:
    pickle.dump(container, f)

# Load with pickle (only unpickle files you trust: pickle can execute code)
with open('container.pkl', 'rb') as f:
    loaded_container = pickle.load(f)
print(loaded_container.metadata)
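NumPy's own formats fall back to pickle for object arrays, so the same trust caveat applies there. A short sketch, assuming you'd rather store arbitrary Python objects through np.save than through raw pickle:

# Object arrays are pickled inside the .npy container
obj_array = np.array([{'created': '2024-01-01', 'version': 1.0}], dtype=object)
np.save('obj_data.npy', obj_array, allow_pickle=True)

# Loading pickled content must be opted into explicitly
restored_obj = np.load('obj_data.npy', allow_pickle=True)
print(restored_obj[0]['version'])  # 1.0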
Cross-Language Compatibility
# HDF5 format (requires h5py)
# Great for large datasets and cross-language compatibility
try:
    import h5py

    with h5py.File('data.h5', 'w') as f:
        f.create_dataset('array1', data=np.random.rand(1000, 100))
        f.create_dataset('array2', data=np.random.rand(500, 200))
        f.attrs['description'] = 'My dataset'

    with h5py.File('data.h5', 'r') as f:
        loaded_array = f['array1'][:]
        description = f.attrs['description']
except ImportError:
    print("Install h5py for HDF5 support: pip install h5py")
Performance and Format Comparison
| Format | Speed | Size | Cross-platform | Human Readable |
|---|---|---|---|---|
| .npy | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ❌ |
| .npz | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ❌ |
| .txt | ⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| .h5 | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ❌ |
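The stars are rough guidance; actual speed and size depend on your array's shape and dtype, so it's worth measuring on your own data. A minimal timing sketch (the bench.* file names are arbitrary):

import os
import time

arr = np.random.rand(2000, 1000)

for name, saver in [('bench.npy', np.save),
                    ('bench.npz', lambda p, a: np.savez_compressed(p, a=a)),
                    ('bench.txt', np.savetxt)]:
    start = time.perf_counter()
    saver(name, arr)
    elapsed = time.perf_counter() - start
    size_mb = os.path.getsize(name) / 1e6
    print(f'{name}: {elapsed:.2f}s, {size_mb:.1f} MB')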
Best Practices
- Use .npy for single arrays, .npz for multiple arrays
- Choose compressed formats for storage efficiency (see the round-trip sketch after this list)
- Use memory mapping for arrays larger than RAM
- Consider HDF5 for complex, structured datasets
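Putting a couple of these together, here is a hypothetical save_dataset helper; the name and the round-trip check are illustrative, not a NumPy API:

def save_dataset(path, **arrays):
    """Save named arrays compressed, then verify the round trip."""
    np.savez_compressed(path, **arrays)
    with np.load(path) as archive:
        for name, original in arrays.items():
            assert np.array_equal(archive[name], original), name

save_dataset('checked.npz', features=np.random.rand(100, 10),
             labels=np.arange(100))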
Explore More
Dive into large-scale data processing, master data serialization techniques, and explore scientific data workflows.