generators vs list comprehension performance in python


What is a list comprehension? It is an elegant way of defining and creating a list. A list comprehension lets us build a list from a for loop with less code: what normally takes 3-4 lines can be compressed into a single line.

So what is the difference between generator expressions and list comprehensions? A generator yields one item at a time and produces an item only on demand, whereas a list comprehension makes Python reserve memory for the whole list. Generator expressions are therefore more memory efficient than lists.

Execution time is a different matter. Creating a generator expression is almost instantaneous because no items are computed yet, which can look like a remarkable difference in execution time. Once every item is actually consumed, however, a generator expression is usually no faster than a list comprehension (compare the timings under Suggestion 3).

To print the output of a generator expression, we can simply iterate over the generator object.

Output:

 0 2 4 6 8 10
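
The example that produced this output did not survive in the text. A minimal sketch that prints the same line, assuming the original filtered the even numbers from 0 through 10 (the names `evens` and `collected` are hypothetical):

```python
# Assumption: the lost example yielded the even numbers from 0 to 10.
evens = (n for n in range(11) if n % 2 == 0)  # nothing is computed yet

# Iterating over the generator object produces the items one at a time.
collected = []
for n in evens:
    collected.append(n)

print(*collected)  # 0 2 4 6 8 10
```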

Suggestion : 2

John's answer is good (that list comprehensions are better when you want to iterate over something multiple times). However, it's also worth noting that you should use a list if you want to use any of the list methods. For example, the following code won't work:

def gen():
    return (something for something in get_some_stuff())

print(gen()[:2])       # TypeError: generators don't support indexing or slicing
print([5, 6] + gen())  # TypeError: generators can't be added to lists
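
When list behaviour is needed from a generator, two common workarounds are to materialise it with `list()` or to take a lazy slice with `itertools.islice`. A small sketch, with a hypothetical `get_some_stuff()` standing in for the real data source:

```python
from itertools import islice

def get_some_stuff():
    # Hypothetical stand-in for the real data source.
    return range(10)

def gen():
    return (something for something in get_some_stuff())

first_two = list(islice(gen(), 2))  # lazy equivalent of [:2]
combined = [5, 6] + list(gen())     # materialise before concatenating

print(first_two)  # [0, 1]
print(combined)   # [5, 6, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```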

So you try starting out by writing a list comprehension:

logfile = open("hugefile.txt", "r")
entry_lines = [(line, len(line)) for line in logfile if line.startswith("ENTRY")]

That list comprehension reads the whole file and holds every matching line in memory at once. So instead we can use a generator to apply a "filter" to our content. No data is actually read until we start iterating over the result.

logfile = open("hugefile.txt", "r")
entry_lines = ((line, len(line)) for line in logfile if line.startswith("ENTRY"))

Not even a single line has been read from our file yet. In fact, say we want to filter our result even further:

long_entries = ((line, length) for (line, length) in entry_lines if length > 80)
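
To see the laziness in action, here is a small runnable sketch of the same two-stage pipeline. It assumes an in-memory `io.StringIO` object in place of "hugefile.txt" so that it is self-contained; the sample lines are made up for illustration:

```python
import io

# Assumption: an in-memory file stands in for "hugefile.txt".
logfile = io.StringIO(
    "ENTRY short\n"
    "noise line\n"
    "ENTRY " + "x" * 100 + "\n"
)

entry_lines = ((line, len(line)) for line in logfile if line.startswith("ENTRY"))
long_entries = ((line, length) for (line, length) in entry_lines if length > 80)

# Nothing has been read so far: the file position is still 0.
position_before = logfile.tell()

results = list(long_entries)  # iterating pulls lines through both filters
print(position_before, len(results), results[0][1])  # 0 1 107
```

Only the final `list()` call drives the pipeline; each line passes through both filters before the next one is read.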

For example:

sum(x * 2 for x in range(256))  # use xrange on Python 2

dict((k, some_func(k)) for k in some_list_of_keys)

When creating a generator from a mutable object (like a list), be aware that the generator is evaluated against the state of the list at the time the generator is used, not at the time it is created:

>>> mylist = ["a", "b", "c"]
>>> gen = (elem + "1" for elem in mylist)
>>> mylist.clear()
>>> for x in gen: print(x)
# nothing
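
If the generator should work on the list as it was at creation time, one option is to iterate over a copy. A minimal sketch (the names `lazy` and `snapshot` are only illustrative):

```python
mylist = ["a", "b", "c"]

lazy = (elem + "1" for elem in mylist)            # reads mylist during iteration
snapshot = (elem + "1" for elem in list(mylist))  # iterates over a copy made now

mylist.clear()

lazy_result = list(lazy)
snapshot_result = list(snapshot)
print(lazy_result)      # []
print(snapshot_result)  # ['a1', 'b1', 'c1']
```

The copy costs memory proportional to the list, so it gives up part of the generator's advantage in exchange for a stable view.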

I'm using the Hadoop Mincemeat module. I think this is a great example to take a note of:

import mincemeat

def mapfn(k, v):
    for w in v:
        yield 'sum', w
        #yield 'count', 1

def reducefn(k, v):
    # v must be a list here: len() and indexing are used below
    r1 = sum(v)
    r2 = len(v)
    print(r2)
    m = r1 / r2
    std = 0
    for i in range(r2):
        std += pow(abs(v[i] - m), 2)
    res = pow((std / r2), 0.5)
    return r1, r2, res

For functional programming, we want to use as little indexing as possible. For this reason, if we want to keep using the elements after taking the first slice, islice() is a better choice because the iterator's state is preserved.

from itertools import islice

def slice_and_continue(sequence):
    seq_i = iter(sequence)  # create an iterator from the list

    seq_slice = islice(seq_i, 3)  # take the first 3 elements and print them
    for x in seq_slice:
        print(x)

    for x in seq_i:  # square the rest of the numbers
        print(x ** 2)

slice_and_continue([1, 2, 3, 4, 5])

Suggestion : 3

14 Sep 2020 | 5 minute read

import timeit
import sys

TIMES = 10000

def clock(label, cmd):
    res = timeit.repeat(cmd, number=TIMES)
    print(label, *("{:.3f} seconds".format(x) for x in res))

def size(label, obj):
    print(label, "%d bytes" % sys.getsizeof(obj))

range_size = 10000

listcomp_label = "listcomp        :"
genexp_label = "genexp          :"

clock(listcomp_label, "sum([num for num in range(%d)])" % range_size)
clock(genexp_label, "sum(num for num in range(%d))" % range_size)
size(listcomp_label, [num for num in range(range_size)])
size(genexp_label, (num for num in range(range_size)))

Output:

listcomp: 2.749 seconds 2.913 seconds 2.742 seconds
genexp: 4.163 seconds 4.152 seconds 4.161 seconds
listcomp: 87624 bytes
genexp: 88 bytes
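
Note that `sys.getsizeof` reports only the size of the container object itself, not the integers it refers to. As a rough complement, the sketch below uses the standard `tracemalloc` module to compare peak allocations while summing. The helper `peak_bytes` is hypothetical, and the exact figures depend on the Python version, so none are shown:

```python
import tracemalloc

def peak_bytes(func):
    """Return the peak traced allocation, in bytes, while func() runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

range_size = 100000

listcomp_peak = peak_bytes(lambda: sum([num for num in range(range_size)]))
genexp_peak = peak_bytes(lambda: sum(num for num in range(range_size)))

print("listcomp peak:", listcomp_peak, "bytes")
print("genexp peak  :", genexp_peak, "bytes")
```

The list comprehension should show a much higher peak because every element exists at once, while the generator expression keeps only one element alive at a time.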

Suggestion : 4

In this article we will discuss the differences between list comprehensions and generator expressions.

A generator expression's syntax is just like a list comprehension's except for the brackets. The main difference is that the latter returns a generator object instead of a list. We should use generators when we are only interested in looping over the items one at a time and want to avoid keeping unnecessary elements in memory.

In Python, a generator expression is used to create generators. It looks like a list comprehension, but () is used instead of []. Let's get the sum of the numbers divisible by both 3 and 5 in the range 0 to 999 using a generator expression.

With a generator we avoid keeping unnecessary numbers in memory. But do we always need to write a function to create a generator? The answer is no, and this is where the generator expression comes into the picture.

Now let's see how to do that using a list comprehension:

# Create a list of numbers which are divisible by 3 & 5 and are in the range 0 to 999
listOfNums = [n for n in range(1000) if n % 3 == 0 and n % 5 == 0]

# get the sum of all numbers in list
total = 0
for num in listOfNums:
   total += num

print('Total = ', total)

Total = 33165

Let's create a generator that yields the numbers divisible by both 3 and 5 one by one:

def selectedNumbers():
    ''' A Generator that yields numbers divisible by both 3 & 5 in the range 0 to 999 '''
    for num in range(1000):
        if num % 3 == 0 and num % 5 == 0:
            yield num

The complete example is as follows:

def selectedNumbers():
    ''' A Generator that yields numbers divisible by both 3 & 5 in the range 0 to 999 '''
    for num in range(1000):
        if num % 3 == 0 and num % 5 == 0:
            yield num

def main():

    print('*** Getting the Sum of selected numbers using List Comprehension ***')

    # Create a list of numbers which are divisible by 3 & 5 and are in the range 0 to 999
    listOfNums = [n for n in range(1000) if n % 3 == 0 and n % 5 == 0]

    # get the sum of all numbers in list
    total = 0
    for num in listOfNums:
        total += num

    print('Total = ', total)

    print('*** Getting the Sum of selected numbers using Generators ***')

    # Get a Generator Object
    generatorObj = selectedNumbers()

    # Iterate over yielded values one by one and calculate the sum
    total = 0
    for num in generatorObj:
        total += num

    print('Total = ', total)

    print('*** Getting the Sum of selected numbers using Generator Expression ***')

    # Get a Generator object using Generator Expression
    generatorObj = (n for n in range(1000) if n % 3 == 0 and n % 5 == 0)

    # Iterate over yielded values one by one and calculate the sum
    total = 0
    for num in generatorObj:
        total += num

    print('Total = ', total)

    print('*** Getting the Sum of selected numbers using Generator Expression & sum() ***')

    # Pass the Generator object returned by Generator Expression to sum()
    total = sum(n for n in range(1000) if n % 3 == 0 and n % 5 == 0)

    print('Total = ', total)

if __name__ == '__main__':
    main()

Suggestion : 5

Published on 2015-06-26

l = [n * 2 for n in range(1000)]  # List comprehension
g = (n * 2 for n in range(1000))  # Generator expression

examples/python/generator_expression.py

#!/usr/bin/env python
from __future__ import print_function
import sys

# The outputs in the comments are from Python 2; the exact text and sizes differ on Python 3.

l = [n*2 for n in range(1000)] # List comprehension
g = (n*2 for n in range(1000)) # Generator expression

print(type(l)) # <type 'list'>
print(type(g)) # <type 'generator'>

print(sys.getsizeof(l)) # 9032
print(sys.getsizeof(g)) # 80

print(l[4]) # 8
#print(g[4]) # TypeError: 'generator' object has no attribute '__getitem__'

for v in l:
    pass
for v in g:
    pass
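
One more difference that the script does not show: a generator can be consumed only once, while a list can be iterated as often as needed. A short sketch with smaller ranges for readability (the names `first_pass` and `second_pass` are illustrative):

```python
g = (n * 2 for n in range(5))  # Generator expression
l = [n * 2 for n in range(5)]  # List comprehension

first_pass = list(g)
second_pass = list(g)  # the generator is already exhausted

print(first_pass)   # [0, 2, 4, 6, 8]
print(second_pass)  # []
print(list(l) == list(l))  # True: a list can be iterated any number of times
```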

Suggestion : 6

A generator comprehension can be specified directly as an argument to a function, wherever a single iterable is expected as an input to that function.

Using generator comprehensions to initialize lists is so useful that Python actually reserves a specialized syntax for it, known as the list comprehension. A list comprehension is a syntax for constructing a list, which exactly mirrors the generator comprehension syntax but uses square brackets.

This produces the exact same result as feeding the list function a generator comprehension. However, using a list comprehension is slightly more efficient than feeding the list function a generator comprehension.

Tuples can be created using comprehension expressions too, but we must explicitly invoke the tuple constructor since parentheses are already reserved for defining a generator comprehension.
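
A minimal sketch of these points, using squares of small integers as an arbitrary example (the variable names are illustrative):

```python
squares_from_gen = list(i ** 2 for i in range(5))  # list() fed a generator comprehension
squares_listcomp = [i ** 2 for i in range(5)]      # the equivalent list comprehension
squares_tuple = tuple(i ** 2 for i in range(5))    # tuples need the explicit constructor
total = sum(i ** 2 for i in range(5))              # generator passed directly as the argument

print(squares_from_gen == squares_listcomp)  # True
print(squares_tuple)                         # (0, 1, 4, 9, 16)
print(total)                                 # 30
```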

# start: 2 (included)
# stop: 7 (excluded)
# step: 1 (default)
for i in range(2, 7):
    print(i)
# prints: 2..3..4..5..6

# start: 1 (included)
# stop: 10 (excluded)
# step: 2
for i in range(1, 10, 2):
    print(i)
# prints: 1..3..5..7..9

# A very common use case!
# start: 0 (default, included)
# stop: 5 (excluded)
# step: 1 (default)
for i in range(5):
    print(i)
# prints: 0..1..2..3..4

(<expression> for <var> in <iterable> if <condition>)

for <var> in <iterable>:
    if bool(<condition>):
        yield <expression>

# when iterated over, `even_gen` will generate 0..2..4.....98
even_gen = (i for i in range(100) if i % 2 == 0)
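
To make the correspondence concrete, the loop form of the template can be written out as a generator function and compared with the expression. This is only a sketch; the function name `even_numbers` is hypothetical:

```python
def even_numbers():
    # Hypothetical generator function mirroring the loop form of the template.
    for i in range(100):
        if bool(i % 2 == 0):
            yield i

even_gen = (i for i in range(100) if i % 2 == 0)

same = list(even_gen) == list(even_numbers())
print(same)  # True
```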