Introduction
When working with large datasets, performance matters.
With NumPy, you can write high-speed, memory-efficient code, provided you know the right techniques.
Why Optimization Matters
Without optimization:
- Code runs slowly
- Memory usage balloons
- Performance degrades as datasets grow
1. Use Vectorization (Avoid Loops)
❌ Slow way:

```python
result = []
for i in range(1000):
    result.append(i * 2)
```
✅ Fast way:

```python
import numpy as np

arr = np.arange(1000)
result = arr * 2
```
✅ Uses NumPy's optimized internal operations
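To see the gap concretely, the two approaches above can be timed side by side (a minimal sketch; exact numbers depend on your machine):

```python
import time
import numpy as np

arr = np.arange(1_000_000)

t0 = time.perf_counter()
vectorized = arr * 2                      # one call into compiled C code
t1 = time.perf_counter()

t2 = time.perf_counter()
looped = np.array([x * 2 for x in arr])   # element-by-element in Python
t3 = time.perf_counter()

print(np.array_equal(vectorized, looped))  # True
print((t3 - t2) > (t1 - t0))               # the Python loop takes longer
```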
2. Choose the Correct Data Type

```python
arr = np.array([1, 2, 3], dtype=np.int32)
```

✅ Saves memory compared to int64 (the default on most 64-bit platforms)
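One way to verify the saving is to compare `nbytes` for the same values at different widths (a small sketch):

```python
import numpy as np

a64 = np.array([1, 2, 3], dtype=np.int64)  # 8 bytes per element
a32 = np.array([1, 2, 3], dtype=np.int32)  # 4 bytes per element

print(a64.nbytes)  # 24
print(a32.nbytes)  # 12
```

Pick the narrowest dtype that still covers your data's full range; narrower integer types can overflow.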
3. Use Views Instead of Copies

```python
view = arr[1:3]
```

✅ Avoids extra memory allocation: basic slices share the parent array's data
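Because a view shares memory with its parent, mutating one changes the other; a minimal sketch:

```python
import numpy as np

arr = np.arange(5)            # [0 1 2 3 4]
view = arr[1:3]               # basic slicing returns a view, not a copy

view[0] = 99                  # writing through the view changes the original
print(arr)                    # [ 0 99  2  3  4]
print(np.shares_memory(arr, view))  # True
```

Use `arr[1:3].copy()` when you need an independent array.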
4. Use In-Place Operations

```python
arr += 5
```

✅ Faster and more memory-efficient than allocating a new array with `arr = arr + 5`
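The difference between in-place and out-of-place updates shows up in object identity (a small sketch):

```python
import numpy as np

arr = np.arange(5, dtype=np.float64)
original = id(arr)

arr += 5                       # in-place: reuses the same object and buffer
print(id(arr) == original)     # True

arr = arr + 5                  # out-of-place: allocates a brand-new array
print(id(arr) == original)     # False
```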
5. Avoid Python Loops
Prefer NumPy's built-in functions over manual iteration:

```python
np.sum(arr)
```

✅ Runs in compiled code, far faster than a manual loop
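For instance, summing the first hundred integers with `np.sum` versus a Python-level loop (a quick sketch):

```python
import numpy as np

arr = np.arange(1, 101)

total = np.sum(arr)            # single call into compiled code
manual = sum(x for x in arr)   # Python-level loop, element by element

print(total)            # 5050
print(total == manual)  # True
```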
6. Use Efficient Functions
Examples:
- `np.dot()`: matrix multiplication
- `np.mean()`: average
- `np.sum()`: aggregation
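The functions listed above in action (a minimal sketch):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(np.dot(a, b))   # matrix product: [[19 22]
                      #                  [43 50]]
print(np.mean(a))     # 2.5
print(np.sum(a))      # 10
```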
7. Preallocate Arrays
❌ Avoid:

```python
arr = []
for i in range(1000):
    arr.append(i)
```

✅ Use:

```python
arr = np.zeros(1000)
```

✅ Allocates all the memory once, up front
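Preallocation in practice: create the array once at full size and fill it, or better still, replace the loop entirely (a small sketch):

```python
import numpy as np

n = 1000

arr = np.zeros(n)        # one allocation up front
for i in range(n):
    arr[i] = i * i       # fill in place, no resizing

# Even better: a fully vectorized equivalent
arr2 = np.arange(n, dtype=np.float64) ** 2

print(np.array_equal(arr, arr2))  # True
```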
8. Memory Comparison Example

```python
import numpy as np

list_data = [i for i in range(1000)]
numpy_data = np.arange(1000)
```

✅ The NumPy array uses far less memory than the equivalent Python list
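To put numbers on that claim, one rough way to measure both sides is `sys.getsizeof` for the list plus its boxed int elements, and `nbytes` for the array (a sketch; exact figures vary by platform and NumPy version):

```python
import sys
import numpy as np

n = 1000
list_data = [i for i in range(n)]
numpy_data = np.arange(n)

# The list holds pointers to separate Python int objects;
# the array holds raw machine integers in one contiguous block.
list_bytes = sys.getsizeof(list_data) + sum(sys.getsizeof(x) for x in list_data)
array_bytes = numpy_data.nbytes

print(array_bytes < list_bytes)  # True
```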
Why Is NumPy Fast?
NumPy is fast because it:
- Is written in C
- Stores data in contiguous memory blocks
- Supports vectorized operations
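The contiguity claim is easy to inspect: every array exposes its memory layout through `flags` and `strides` (a small sketch, using an explicit int64 dtype so the stride values are predictable):

```python
import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)

print(arr.flags['C_CONTIGUOUS'])  # True: rows sit back-to-back in one block
print(arr.strides)                # (24, 8): bytes to step to the next row / next element
```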
Real-World Use Case

```python
prices = np.array([100, 200, 300])
discounted = prices * 0.9
print(discounted)   # [ 90. 180. 270.]
```

✅ One vectorized operation applies the discount to every price
Works with Other Libraries
The same optimization habits pay off in libraries built on top of NumPy:
- TensorFlow
- Scikit-learn
- Pandas
Summary Table
| Tip | Benefit |
|---|---|
| Vectorization | Faster execution |
| Correct dtype | Less memory |
| Views | Avoid duplication |
| In-place ops | Save memory |
| Preallocation | Speed boost |
Pro Tips
- Avoid Python-level loops whenever possible
- Reach for built-in NumPy functions first
- Monitor memory usage when working with large datasets
Conclusion
Optimizing performance in NumPy helps you handle large datasets efficiently and write faster Python code.