Ruby enumerators are slow... and wonderful
Ruby Enumerators are a really cool abstraction. They can be used in many ways. I particularly love that, in the standard library, most methods that take a block to iterate over something will return an enumerator when invoked without one. For instance, CSV.foreach follows this convention.
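As a small illustration of that convention, here is a minimal sketch that calls a few iterators without a block. The sample data and the throwaway Tempfile are assumptions made only to keep the example self-contained.

```ruby
require 'csv'
require 'tempfile'

# Core iterators return an Enumerator when called without a block.
times_enum = 3.times
each_enum  = %w[a b c].each_with_index

puts times_enum.class   # Enumerator
p each_enum.to_a        # [["a", 0], ["b", 1], ["c", 2]]

# CSV.foreach follows the same convention. The file is a Tempfile with
# made-up contents, used only so the sketch can run on its own.
rows = Tempfile.create(['sample', '.csv']) do |file|
  file.write("id,name\n1,Ada\n2,Grace\n")
  file.flush
  csv_enum = CSV.foreach(file.path, headers: true) # no block: an Enumerator
  csv_enum.map { |row| row['name'] }
end
p rows
```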
I hadn’t found a practical use case for them until I needed to process several CSV files while advancing through them at the same time. I could grab an enumerator for each CSV file and advance each one whenever I wanted. It worked like a charm, but I noticed it was very slow, so I decided to benchmark external enumeration against its internal-iterator counterpart:
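    require 'benchmark/ips'

    COUNT = 500000

    Benchmark.ips do |x|
      # Internal iteration: the block is called once per element.
      x.report('Enumerable') do
        total = 0
        COUNT.times do |i|
          total += i
        end
      end

      # External iteration: elements are pulled one at a time with #next.
      x.report('Enumerator') do
        total = 0
        enumerator = COUNT.times
        while true
          begin
            total += enumerator.next
          rescue StopIteration
            # #next raises StopIteration once the enumerator is exhausted.
            break
          end
        end
      end
    end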
Results (Ruby 2.4.0):
    Enumerable     37.073  (± 5.4%) i/s -    186.000  in   5.030412s
    Enumerator      1.588  (± 0.0%) i/s -      8.000  in   5.040230s
As you can see, Enumerators are much slower than the corresponding internal iterator: roughly 23 times slower in this benchmark (37.073 vs. 1.588 iterations per second). I couldn’t use them in my case, since CSV processing speed was key to the overall performance of the system, but I still love them.
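For reference, the kind of lockstep traversal described earlier might look something like the sketch below. It is not the code from my project: the two inline CSV strings, the "id" column they are sorted by, and the helper names peek_or_nil and join_by_id are all hypothetical, chosen only to show how each enumerator can be advanced independently with next and peek.

```ruby
require 'csv'

# Hypothetical inputs: two CSV sources, each sorted by an integer "id" column.
left_csv  = "id,name\n1,Ada\n2,Grace\n4,Linus\n"
right_csv = "id,city\n2,New York\n3,Oslo\n4,Helsinki\n"

# Returns the next row without consuming it, or nil once the
# enumerator is exhausted (Enumerator#peek raises StopIteration then).
def peek_or_nil(enumerator)
  enumerator.peek
rescue StopIteration
  nil
end

# Walks both enumerators in lockstep and collects rows whose ids match.
def join_by_id(left, right)
  matches = []
  while (l = peek_or_nil(left)) && (r = peek_or_nil(right))
    case l['id'].to_i <=> r['id'].to_i
    when -1 then left.next    # left is behind: advance only left
    when 1  then right.next   # right is behind: advance only right
    else
      matches << [l['id'], l['name'], r['city']]
      left.next
      right.next
    end
  end
  matches
end

# CSV#each without a block returns an Enumerator over the rows.
left_rows  = CSV.new(left_csv,  headers: true).each
right_rows = CSV.new(right_csv, headers: true).each

joined = join_by_id(left_rows, right_rows)
p joined
```

Each call to next or peek here goes through the external enumerator, which is exactly the code path that the benchmark above shows to be slow.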