Generators in Python are iterators that produce values on the fly instead of storing them all in memory. They are defined using functions that contain the yield keyword.
Generators are ideal for large datasets, streaming data, or infinite sequences because they are memory-efficient and lazy: each value is produced only when it is requested.
Why Generators Are Important
- Save memory when working with large data
- Produce values on demand (lazy evaluation)
- Simplify iterator creation
- Useful in pipelines and data streaming
Example:
def simple_generator():
    yield 1
    yield 2
    yield 3

for value in simple_generator():
    print(value)
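To make the memory claim above concrete, here is a minimal sketch that compares a fully built list with a generator producing the same values. The squares function and the limit of 100,000 are illustrative choices, not part of the original example, and the exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

def squares(limit):
    """Hypothetical helper: yield n*n for n in 0..limit-1, one at a time."""
    for n in range(limit):
        yield n * n

# The list holds references to all 100,000 results at once.
squares_list = list(squares(100_000))

# The generator holds only its paused state, not the values.
squares_gen = squares(100_000)

list_size = sys.getsizeof(squares_list)  # size of the list object itself
gen_size = sys.getsizeof(squares_gen)    # size of the generator object

print(f"list: {list_size} bytes")
print(f"generator: {gen_size} bytes")
```

The generator's size stays the same no matter how large the limit is, while the list grows with the number of items.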
Creating a Generator Function
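A generator function looks like a regular function but uses yield instead of return. Calling it does not run the body right away; it returns a generator object, and the body runs step by step as values are requested.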
def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

for number in count_up_to(5):
    print(number)
Output:
1
2
3
4
5
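The following sketch shows when the generator body actually runs. It reuses count_up_to from above and adds an events list (an illustrative addition, not part of the original function) to record the moment the body starts.

```python
events = []  # illustrative log of what the generator body has done

def count_up_to(n):
    events.append("started")  # runs only once the first value is requested
    count = 1
    while count <= n:
        yield count           # pause here and hand back the value
        count += 1

gen = count_up_to(2)
events_before = list(events)  # still empty: calling the function ran nothing

first = next(gen)             # runs the body up to the first yield
events_after = list(events)   # now contains "started"

second = next(gen)            # resumes right after the yield

print(events_before, events_after, first, second)
```

Each call to next() resumes the function where it last paused, which is why local variables such as count keep their values between calls.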
Example 1: Reading Large Files
Generators are perfect for processing large files line by line without loading the entire file.
def read_file(file_name):
    with open(file_name) as f:
        for line in f:
            yield line.strip()

for line in read_file("large_file.txt"):
    print(line)
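Generators like this can be chained so that each line flows through several steps without an intermediate list. The sketch below is one possible arrangement: read_lines, only_errors, and the "ERROR" marker are hypothetical names chosen for illustration, and io.StringIO stands in for a real file so the example runs anywhere.

```python
import io

def read_lines(stream):
    """Yield stripped lines from any file-like object."""
    for line in stream:
        yield line.strip()

def only_errors(lines):
    """Yield only the lines containing the (hypothetical) marker 'ERROR'."""
    for line in lines:
        if "ERROR" in line:
            yield line

# In-memory stand-in for a large log file.
fake_file = io.StringIO(
    "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n"
)

# Each line is read, stripped, and filtered one at a time.
errors = list(only_errors(read_lines(fake_file)))
print(errors)  # ['ERROR disk full', 'ERROR timeout']
```

Because both stages are lazy, only one line is held in memory at a time until list() collects the filtered results.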
Example 2: Infinite Sequence Generator
Generators can create infinite sequences like Fibonacci numbers or prime numbers.
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
for _ in range(10):
    print(next(fib))
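As an alternative to calling next() in a loop, the standard library's itertools.islice can take a fixed number of values from an infinite generator. This is a small sketch using the same fibonacci function.

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 items, so the infinite loop never runs away.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Passing an infinite generator straight to list() would never finish, so some form of limit like this is needed.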
Generator Expressions
Python also supports generator expressions, similar to list comprehensions but using parentheses.
squares = (x**2 for x in range(10))

for square in squares:
    print(square)
Advantages:
- Lazy evaluation
- Memory-efficient
- Cleaner syntax
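Two details about generator expressions are worth seeing in code: they can be passed directly to functions such as sum(), and they can be consumed only once. The variable names below are illustrative.

```python
# When a generator expression is the only argument, the extra
# parentheses can be dropped.
total = sum(x**2 for x in range(10))
print(total)  # 285

squares = (x**2 for x in range(10))
first_pass = list(squares)   # consumes every value
second_pass = list(squares)  # the generator is now exhausted

print(first_pass)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(second_pass)  # []
```

If the values are needed more than once, either build a list or create a fresh generator expression each time.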
Real-World Scenarios
- Data streaming from APIs – process data in chunks without storing everything in memory
- Log file monitoring – read logs line by line using a generator
- Large dataset processing – iterate through millions of rows without memory overflow
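The chunked processing mentioned in the scenarios above could look roughly like the sketch below. The in_chunks helper and the row-N sample data are hypothetical; a real source would be an API response stream or a database cursor.

```python
from itertools import islice

def in_chunks(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # nothing left in the source
            return
        yield chunk

# Stand-in for a large stream of rows.
rows = (f"row-{i}" for i in range(7))

chunks = list(in_chunks(rows, 3))
print(chunks)
# [['row-0', 'row-1', 'row-2'], ['row-3', 'row-4', 'row-5'], ['row-6']]
```

Only one chunk is held in memory at a time while iterating, so the chunk size sets an upper bound on memory use regardless of how long the stream is.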
Best Practices
✔ Use generators for large or infinite datasets
✔ Avoid unnecessary generator nesting
✔ Combine with tools from the itertools module for complex pipelines
✔ When calling next() manually, handle StopIteration or pass a default value to next()
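For the last point, here is a short sketch of the two usual ways to deal with an exhausted generator when calling next() by hand. The two_values function is a made-up example.

```python
def two_values():
    yield 1
    yield 2

gen = two_values()
a = next(gen)
b = next(gen)

# Option 1: catch StopIteration explicitly.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# Option 2: give next() a default, which is returned instead of raising.
fallback = next(gen, None)

print(a, b, exhausted, fallback)  # 1 2 True None
```

A for loop handles StopIteration automatically, so this only matters when stepping through a generator manually.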
Conclusion
Generators are a powerful feature in Python for memory-efficient, lazy data processing. Mastering generators allows you to handle large datasets, streaming data, and infinite sequences efficiently.