📌 Introduction
When working with large datasets, performance matters.
With NumPy, you can write high-speed and memory-efficient code—if you know the right techniques.
🔍 Why Is Optimization Important?
Without optimization:
- Code runs slowly 🐢
- Memory usage balloons
- Large datasets become impractical to process
⚡ 1. Use Vectorization (Avoid Loops)
❌ Slow way:
result = []
for i in range(1000):
    result.append(i * 2)
✅ Fast way:
import numpy as np

arr = np.arange(1000)
result = arr * 2
✔ Uses NumPy's optimized, compiled C-level operations
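To see the difference yourself, here is a rough timing sketch of the two approaches above (absolute numbers depend on your machine; the gap grows with array size):

```python
import time
import numpy as np

n = 1_000_000

# Pure-Python loop: one interpreted multiply per element
start = time.perf_counter()
result_loop = [i * 2 for i in range(n)]
loop_time = time.perf_counter() - start

# Vectorized: a single call into NumPy's compiled loop
arr = np.arange(n)
start = time.perf_counter()
result_vec = arr * 2
vec_time = time.perf_counter() - start

print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```

Both produce the same values; the vectorized version is typically an order of magnitude (or more) faster.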
🧠 2. Choose Correct Data Types
arr = np.array([1, 2, 3], dtype=np.int32)
✔ Uses half the memory of the default 64-bit integers
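You can verify the savings with the `nbytes` attribute. (Note that the default integer dtype is platform-dependent; it is `int64` on most 64-bit Linux/macOS systems but `int32` on Windows.)

```python
import numpy as np

a_default = np.arange(1000)                 # platform default (usually int64)
a32 = np.arange(1000, dtype=np.int32)       # explicit 4-byte integers

print(a_default.nbytes)  # 8000 on most 64-bit platforms
print(a32.nbytes)        # 4000
```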
📦 3. Use Views Instead of Copies
view = arr[1:3]
✔ Avoids extra memory usage
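One caveat worth demonstrating: because a slice is a view, writing to it modifies the original array. Here is a small sketch using `np.shares_memory` to confirm the difference between a view and an explicit copy:

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
view = arr[1:3]         # slicing returns a view; no data is copied
copy = arr[1:3].copy()  # an explicit copy allocates new memory

view[0] = 99            # writes through to the original array
print(arr)                          # [10 99 30 40]
print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False
```

Use a view when you want to save memory, and `.copy()` when you need the slice to be independent.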
🔄 4. Use In-Place Operations
arr += 5
✔ Faster and memory-efficient
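In-place operators like `+=` reuse the existing buffer instead of allocating a new array. Most NumPy ufuncs also accept an `out=` argument for the same effect. A quick way to check that no new buffer was allocated is to compare the data pointer before and after:

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0])
addr_before = arr.__array_interface__['data'][0]

arr += 5                      # modifies the existing buffer in place
np.multiply(arr, 2, out=arr)  # ufunc with out= also writes in place

addr_after = arr.__array_interface__['data'][0]
print(addr_before == addr_after)  # True: same buffer throughout
print(arr)                        # [12. 14. 16.]
```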
⚡ 5. Avoid Python Loops
Always prefer NumPy functions:
np.sum(arr)
✔ Faster than manual loops
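A minimal comparison of the two styles: a manual accumulation loop versus a single `np.sum` call. Both give the same answer, but the loop runs one interpreted iteration per element while `np.sum` does all the work in one compiled call:

```python
import numpy as np

arr = np.arange(100_000)

# Manual loop: 100,000 interpreted iterations
total_loop = 0
for x in arr:
    total_loop += x

# One compiled reduction
total_np = np.sum(arr)

print(total_np)  # 4999950000
```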
📊 6. Use Efficient Functions
Examples:
- np.dot() → matrix multiplication
- np.mean() → average
- np.sum() → aggregation
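A short sketch exercising all three on a small matrix:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

product = np.dot(a, b)  # matrix multiplication (equivalently: a @ b)
avg = np.mean(a)        # 2.5
total = np.sum(a)       # 10

print(product)
# [[19 22]
#  [43 50]]
```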
🔢 7. Preallocate Arrays
❌ Avoid:
arr = []
for i in range(1000):
    arr.append(i)
✅ Use:
arr = np.zeros(1000)
✔ Faster memory allocation
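Preallocating means one allocation up front, then filling values in place. And when the fill pattern is simple, you can often replace the fill loop entirely with a vectorized expression, as this sketch shows:

```python
import numpy as np

n = 1000
arr = np.zeros(n)     # single allocation up front

for i in range(n):
    arr[i] = i * i    # fill in place; no resizing, no appends

# For simple patterns, skip the loop entirely:
arr_vec = np.arange(n) ** 2
print(np.array_equal(arr, arr_vec))  # True
```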
📉 8. Memory Comparison Example
import numpy as np

list_data = [i for i in range(1000)]
numpy_data = np.arange(1000)
✔ NumPy uses less memory than lists
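To put numbers on that claim, you can measure both with the standard library. A Python list stores pointers to individually boxed `int` objects, while the array stores raw values in one buffer. (Exact byte counts are implementation-dependent; this is a rough measurement sketch.)

```python
import sys
import numpy as np

n = 1000
list_data = list(range(n))
numpy_data = np.arange(n)

# List cost = the list object itself + every boxed int it points to
list_bytes = sys.getsizeof(list_data) + sum(sys.getsizeof(x) for x in list_data)
array_bytes = numpy_data.nbytes  # raw data buffer only

print(list_bytes, array_bytes)
```

On a typical 64-bit CPython, the list side comes out several times larger than the array's buffer.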
⚡ Why Is NumPy Fast?
NumPy is fast because:
- Written in C
- Uses contiguous memory
- Supports vectorization
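The contiguous-memory point is easy to inspect directly: every array carries flags describing its memory layout, and every element has the same fixed size.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
print(arr.flags['C_CONTIGUOUS'])  # True: rows sit side by side in one buffer
print(arr.itemsize)               # fixed bytes per element
```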
📦 Real-World Use Case
prices = np.array([100, 200, 300])
discounted = prices * 0.9
print(discounted)
✔ Efficient bulk operations
🔗 Works with Other Libraries
Performance optimization is crucial in:
- TensorFlow
- Scikit-learn
- Pandas
📊 Summary Table
| Tip | Benefit |
|---|---|
| Vectorization | Faster execution |
| Correct dtype | Less memory |
| Views | Avoid duplication |
| In-place ops | Save memory |
| Preallocation | Speed boost |
🧠 Pro Tips
- Avoid Python loops whenever possible
- Use built-in NumPy functions
- Monitor memory usage for large data
🔚 Conclusion
Optimizing performance in NumPy helps you handle large datasets efficiently and write faster Python code.