My github.io
Many times I see entirely new arrays/slices/lists/etc. defined just to contain calculated values that aren’t reused. This takes up more memory than necessary. There’s a better way!
// JavaScript is a good example of creating new values that are NOT lazy
// iterators. filter, map, etc. create new arrays!
const numbers = [1, 2, 3, 4, 5];
const oddsDoubled = numbers
.filter((n) => n % 2 !== 0)
.map((n) => n * 2);
One way, obviously, is to simply not declare a new value, and instead do the calculations in a loop. It’s so obvious it probably doesn’t need an example, but here is one anyway.
// This is a really easy and obvious way.
const oddsDoubled = [];
for (let i = 0; i <= numbers.length; i++) {
const n = numbers[i];
if (n % 2 !== 0) {
oddsDoubled.push(n * 2);
}
}
// Another easy way.
const oddsDoubled = [];
numbers.forEach((n) => {
if (n % 2 !== 0) {
oddsDoubled.push(n * 2);
}
});
And, of course, you could use reduce
.
But, the more complex your filtering, mapping or whatever other calculation you make gets, the bigger the loop gets, and this can make it hard to read.
Here’s an example of lazy iteration in Python (they’re called generators in
Python). Python makes lazy iteration very easy. Even range
itself creates a
lazy iterator!
numbers = range(1, 6) # lazy iterator
odds = filter(lambda n: n % 2 != 0, numbers) # Still lazy
odds_doubled = map(lambda n: n * 2, odds) # Yes, it's lazy
print(list(odds_doubled)) # [2, 6, 10]
You don’t need to memorize these function names, though, because Python has an easy syntax for making any lazy iterator.
numbers = range(1, 6)
# It's just like any other comprehension, only using () instead of [], {}, etc.
odds_doubled = (n * 2 for n in numbers if n % 2 != 0)
print(list(odds_doubled)) # [2, 6, 10]
You can make more complex generators with the yield
keyword.
# There are plenty of other ways to do a fizzbuzz, but this is good for
# explaining generators.
def fizzbuzz(start, stop):
for n in range(start, stop + 1):
if n % 15 == 0:
yield 'fizzbuzz'
elif n % 5 == 0:
yield 'buzz'
elif n % 3 == 0:
yield 'fizz'
else:
yield n
for x in fizzbuzz(1, 15):
print(x)
# 1
# 2
# fizz
# 4
# buzz
# fizz
# 7
# 8
# fizz
# buzz
# 11
# fizz
# 13
# 14
# fizzbuzz
Lazy iterators are also easy to do in Rust.
// The range {start..end}, filter, and map are all lazy iterators
let numbers_doubled: Vec<u8> = {1..=5}
.filter(|n| n % 2 != 0)
.map(|n| n * 2)
.collect();
println!("{:?}", numbers_doubled); // [2, 6, 10]
So, how would you use lazy iterators in JavaScript? JavaScript now provides the ability to define generators, too!
// The * makes it a generator.
// Lazy generator of numbers
function* range(start, stop) {
for (let i = start; i <= stop; i++) {
yield i;
}
}
function* filter(f, iter) {
for (const v of iter) {
if (f(v)) {
yield v;
}
}
}
function* map(f, iter) {
for (const v of iter) {
yield f(v);
}
}
const numbers = range(1, 5);
const odds = filter((n) => n % 2 !== 0, numbers);
const oddsDoubled = map((n) => n * 2, odds);
console.log(Array.from(oddsDoubled)); // [2, 6, 10]
So why is lazy iteration useful? If you look back at the fizzbuzz
Python example above, what if, instead of yielding values, we
returned a list, like so?
def fizzbuzz(start, stop):
fizzbuzzes = []
for n in range(start, stop + 1):
fizz = 'fizz' if n % 3 == 0 else ''
buzz = 'buzz' if n % 5 == 0 else ''
fizzbuzzes.append(f'{fizz}{buzz}' or n)
return fizzbuzzes
This may be fine for small ranges, but what if we wanted
a lot of numbers (or even infinite numbers)?
For example, fizzbuzz(1, 1_000_000_000)
.
Now we have to create a massive list
, which may be so large
that it takes up too much memory and crashes.
But a lazy iterator will generate the values as needed instead
of all at once, saving memory (although the loop may take a long
time to complete). This is why they are “lazy.” They don’t do
anything until they need to. Because of this behavior, they
can even be infinite! itertools.count
is a great example
of this.
r = fizzbuzz(1, 15)
for n in r:
# do nothing
pass
for n in r:
print(n)
What do you think would be print
ed if the above code was
run? The answer: nothing. The generator r
is consumed
in the first loop, so there are no values yielded in the
second loop. This is something to keep in mind when using
generators. There are workarounds, of course, depending on
what you need. Simply re-declaring the generator may be enough,
or you could use one of the tools in itertools
.