We know that we can use a for loop to traverse the elements of a sequence. For such traversals, the for loop depends upon what are known as iterators, which these iterable objects provide. An iterator is an object that walks over a sequence of data, handing back one element at a time.
For an object to be iterable, two methods come into play. The iter() method returns an iterator, which is a handle for the object being traversed. The next() method provides the next value in the sequence. The iterator acts as the handle for all subsequent next() calls. Once next() reaches the end, it raises a StopIteration exception. This exception communicates that we have reached the end of the sequence. Please note that the next() method usually belongs to the iterator handle rather than to the iterable object itself.
Under the hood, these two methods are spelled with leading and trailing double underscores. Thus, the built-in iter() function calls the object's __iter__() method. Likewise, the built-in next() function calls the iterator's __next__() method. The double underscores do not make these methods private; they mark them as special methods that Python invokes on our behalf, so we normally call iter() and next() rather than the double-underscore versions directly.
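As a quick illustration of the protocol described above, here is a minimal sketch that walks a small list by hand. The names objNumbers and objIter, and the values in the list, are ours and chosen only for illustration.

```python
# A small list to iterate over (illustrative data).
objNumbers = [10, 20, 30]

# iter() calls objNumbers.__iter__() behind the scenes and returns an iterator.
objIter = iter(objNumbers)

# next() calls objIter.__next__() and hands back one element per call.
first = next(objIter)
second = objIter.__next__()   # the same call, spelled out explicitly
third = next(objIter)
print(first, second, third)

# The iterator is now exhausted, so one more call raises StopIteration.
try:
    next(objIter)
    exhausted = False
except StopIteration:
    exhausted = True
print("Exhausted:", exhausted)
```

Calling objIter.__next__() directly works, but it is shown here only to make the equivalence visible; in everyday code the built-in next() is the usual spelling.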
Let us get cracking and investigate the iterator support in some of the common built-in types: lists, dictionaries, and tuples.
objList = []
print("Printing attributes of lists:")
print(type(objList))
print(dir(objList))

objDictionary = {}
print("\nPrinting attributes of dictionaries:")
print(type(objDictionary))
print(dir(objDictionary))

objTuple = (0,)
print("\nPrinting attributes of tuples:")
print(type(objTuple))
print(dir(objTuple))
When we run the above example, the output (provided below) shows that these types do indeed have the __iter__() method. So far, so good. However, if you are like me, your eyes might be aching after searching (unsuccessfully!) for the __next__() or next() methods! These methods are missing from the output because it is the object returned from the iter() method, and not the list, dictionary, or tuple, that provides __next__(). Thus, when we call next() using the iterator handle, the iterator calls its own __next__() method to get the next element.
[user@codingbison]$ python3 iterators_comparison.py
Printing attributes of lists:
<class 'list'>
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

Printing attributes of dictionaries:
<class 'dict'>
['__class__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']

Printing attributes of tuples:
<class 'tuple'>
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
[user@codingbison]$
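Scanning dir() output by eye is tiring, so here is a small sketch that asks the same question with hasattr(). It checks which of the two methods live on the list and which live on the iterator returned by iter(). The variable names are ours, for illustration.

```python
objList = [1, 2, 3]
objIter = iter(objList)

# The list is iterable: it has __iter__() but no __next__().
list_has_iter = hasattr(objList, "__iter__")
list_has_next = hasattr(objList, "__next__")

# The iterator returned by iter() is the one that carries __next__().
iter_has_next = hasattr(objIter, "__next__")

# An iterator is also iterable: calling iter() on it returns the iterator itself.
iter_returns_self = iter(objIter) is objIter

print(list_has_iter, list_has_next, iter_has_next, iter_returns_self)
```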
When we run a for loop over these objects, it is these two methods that allow us to navigate over their elements. To be more precise, the for loop calls the __iter__() method once, at the beginning, to get an iterator handle for the object, and then calls the iterator's __next__() method repeatedly to get the next element in the sequence.
Armed with these two methods, let us write another example. This example uses both a for-loop and the iter()/next() combination to retrieve the elements of a list that contains the names of various tiger subspecies. BTW, the last three subspecies in the list are now extinct due to relentless hunting, poaching, and encroachment. Something to feel nostalgic about! Moving on, the example catches the StopIteration exception using try/except. Since StopIteration marks the end of the list, the example breaks out of the loop after catching this exception.
objList = ["Siberian Tigers", "Bengal Tigers", "Indochinese Tigers",
           "Malayan Tigers", "South China Tigers", "Bali Tigers",
           "Caspian Tigers", "Javan Tigers"]

print("Let us print using a for-loop:")
for elem in objList:
    print("\t" + elem)

print("\nLet us print the iterator object:")
objIter = iter(objList)
print(type(objIter))
print(dir(objIter))

print("\nLet us print manually:")
while True:
    try:
        print("\t" + next(objIter))
    except StopIteration:
        print("\tEnd of List")
        break
The output shows identical results for both approaches.
Let us print using a for-loop:
        Siberian Tigers
        Bengal Tigers
        Indochinese Tigers
        Malayan Tigers
        South China Tigers
        Bali Tigers
        Caspian Tigers
        Javan Tigers

Let us print the iterator object:
<class 'list_iterator'>
['__class__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__length_hint__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']

Let us print manually:
        Siberian Tigers
        Bengal Tigers
        Indochinese Tigers
        Malayan Tigers
        South China Tigers
        Bali Tigers
        Caspian Tigers
        Javan Tigers
        End of List
Before we go any further, we should mention that with Python 3.0, some of the dictionary methods no longer return lists. These methods are keys(), values(), and items(). Method keys() returns all the keys present in the dictionary. Method values() returns all the values present in the dictionary. Lastly, if we want both, as key/value pairs, then we can use the method items(). With Python 2.x, these methods return a list. With Python 3.x, they return view objects, which are iterable (we can loop over them or pass them to iter()) but are not lists. The following example prints the type of the returned values from these methods. We run it for both Python 2.7 and Python 3.2.
objDictionary = {}

iterKeys = objDictionary.keys()
print(type(iterKeys))

iterValues = objDictionary.values()
print(type(iterValues))

iterKeysValues = objDictionary.items()
print(type(iterKeysValues))
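Here is the output when we run it with Python 2.7 (command python2) and Python 3.2 (command python3). With Python 3.2, the returned values are new view types (dict_keys, dict_values, and dict_items), instead of list types. These views are iterable, though strictly speaking they are not iterators themselves: we get an iterator by calling iter() on them.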
[user@codingbison]$ python2 iterator_dictionary_methods.py
<type 'list'>
<type 'list'>
<type 'list'>
[user@codingbison]$
[user@codingbison]$ python3 iterator_dictionary_methods.py
<class 'dict_keys'>
<class 'dict_values'>
<class 'dict_items'>
[user@codingbison]$
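To see how the Python 3 return values behave in practice, here is a small sketch. It assumes Python 3.7 or later (where dictionaries preserve insertion order); the dictionary contents and variable names are ours, for illustration only.

```python
objDictionary = {"Bengal": "India", "Siberian": "Russia"}
viewKeys = objDictionary.keys()

# The view is iterable, but it is not an iterator: next() on it fails.
try:
    next(viewKeys)
    is_iterator = True
except TypeError:
    is_iterator = False
print("Is the view an iterator?", is_iterator)

# Calling iter() on the view gives us a real iterator.
iterKeys = iter(viewKeys)
first_key = next(iterKeys)
print("First key:", first_key)

# Unlike a Python 2.x list, the view reflects later changes to the dictionary.
objDictionary["Malayan"] = "Malaysia"
keys_after = list(viewKeys)
print("Keys after the update:", keys_after)
```

If we need an actual list, as in Python 2.x, wrapping the call in list(), as in list(objDictionary.keys()), does the job.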
In the earlier examples, we saw iterators for lists, dictionaries, and tuples. However, Python supports iteration over additional types as well, like generators, the lists built by list comprehensions, and strings.
Let us explore these additional types and see how Python implements iterators for them. We provide a simple example that prints a generator, a comprehension-style generator (a generator expression), a list comprehension, and a string.
objList = ["Siberian Tigers", "Bengal Tigers", "Malayan Tigers", "Caspian Tigers"]

def gen_func(li):
    for elements in li:
        yield elements

objGenerator = gen_func(objList)
print("Printing attributes of a generator function:")
print(type(objGenerator))
print(dir(objGenerator))

print("\nPrinting attributes of a comprehension style generator function:")
objComprGenerator = (x.upper() for x in objList)
print(type(objComprGenerator))

print("\nPrinting attributes of a list comprehension:")
objComprehension = [x.upper() for x in objList]
print(type(objComprehension))

print("\nPrinting attributes of a string:")
varStr = "Tiger"
print(dir(varStr))
As expected, the output shows that each of these objects has an __iter__() method. Since we know that a list comprehension returns a list and we already saw the methods supported by lists, we omit printing list methods. Also note that with Python 2.x, strings do not explicitly provide the __iter__() method. Python, instead, relies on the __getitem__() method provided by strings to construct a simple iterator.
[user@codingbison]$ python3 iterators_additional.py
Printing attributes of a generator function:
<class 'generator'>
['__class__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__name__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'gi_code', 'gi_frame', 'gi_running', 'send', 'throw']

Printing attributes of a comprehension style generator function:
<class 'generator'>

Printing attributes of a list comprehension:
<class 'list'>

Printing attributes of a string:
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
[user@codingbison]$
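The __getitem__() fallback mentioned for Python 2.x strings is still honored by Python 3 for any object. Here is a minimal sketch of it, using a hypothetical class, TigerShelf, that we made up for this illustration. It defines __getitem__() but no __iter__(), and a for-loop (or comprehension) can still walk it.

```python
class TigerShelf:
    """Hypothetical class: provides __getitem__() but not __iter__()."""

    def __init__(self, names):
        self._names = list(names)

    def __getitem__(self, index):
        # Python asks for index 0, 1, 2, ... in turn. The IndexError raised
        # by the underlying list tells it that the elements have run out.
        return self._names[index]


shelf = TigerShelf(["Siberian Tigers", "Bengal Tigers"])

# No __iter__() here, yet iteration still works through __getitem__().
has_iter = hasattr(shelf, "__iter__")
collected = [name for name in shelf]

print("Has __iter__():", has_iter)
print(collected)
```

Defining __iter__() remains the clearer choice for new classes; the fallback is shown here only to explain the behavior described above.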
Most of the time, we do not have to deal with iterators directly; the for-loop takes care of them for us. However, it is important to know how they work, since one day you might have to write a class that needs its own iterator.
With that goal in mind, let us build an object that supports iterators. For that, it needs to do three things. First, it should define the __iter__() method so that callers can get an iterator handle to this object. Second, it should define the __next__() method so that the iterator handle can get the elements of the object, one by one. Third, once the __next__() method is done traversing all of the elements, it should raise a StopIteration exception to communicate that there are no more elements left.
Here is a trivial example that defines a class, tigers, and makes it iterable by providing __iter__() and __next__() methods. We should pay close attention to the self.counter data attribute of the class. When returning from the __iter__() method, we initialize it to 0, and every time the iterator handle calls the __next__() method, we increment it by 1. Once it has reached the length of the list, we know that there are no more elements left and hence, we raise the StopIteration exception.
class tigers():
    def __init__(self):
        self.seq = []
        self.counter = 0

    def add(self, val):
        self.seq.append(val)

    def __iter__(self):
        # Reset the position so that each new for-loop starts from the top.
        self.counter = 0
        return self

    def __next__(self):
        print("[Within the object] counter: " + str(self.counter))
        if self.counter < len(self.seq):
            x = self.seq[self.counter]
            self.counter += 1
            return x
        # No more elements left: signal the end of the iteration.
        raise StopIteration

objTiger = tigers()
print(type(objTiger))
print(dir(objTiger))

objTiger.add("Siberian Tigers")
objTiger.add("Bengal Tigers")
objTiger.add("Indochinese Tigers")
objTiger.add("Malayan Tigers")
objTiger.add("South China Tigers")
objTiger.add("Bali Tigers")
objTiger.add("Caspian Tigers")
objTiger.add("Javan Tigers")

print("\nLet us print elements of the object")
for elem in objTiger:
    print(elem)
If we did not raise StopIteration, then the __next__() method would fall off its end and keep returning None. Without StopIteration, the for-loop would have no way of knowing that it has reached the last element of the list. Accordingly, it would keep printing None, forever!
The output of the tigers example shows that the for loop implicitly calls the __next__() method and that, with each call, the value of the counter attribute is increased by 1. The final call, with the counter at 8, finds no more elements and raises StopIteration, which ends the loop.
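To watch the missing-StopIteration behavior without hanging the terminal, here is a small sketch. It uses a hypothetical class, EndlessTigers (our name, not part of the example above), that deliberately omits the raise, and it uses itertools.islice() to stop after a fixed number of items.

```python
from itertools import islice


class EndlessTigers:
    """Hypothetical iterator that forgets to raise StopIteration."""

    def __init__(self, seq):
        self.seq = list(seq)
        self.counter = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.counter < len(self.seq):
            x = self.seq[self.counter]
            self.counter += 1
            return x
        # Deliberately no "raise StopIteration" here: the method falls
        # through and implicitly returns None on every further call.


# islice() cuts the otherwise endless iteration off after four items.
first_four = list(islice(EndlessTigers(["Bali Tigers", "Javan Tigers"]), 4))
print(first_four)
```

The two real elements come out first, and every call after that yields None, which is what an unbounded for-loop would keep printing.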
<class '__main__.tigers'>
['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'add', 'counter', 'seq']

Let us print elements of the object
[Within the object] counter: 0
Siberian Tigers
[Within the object] counter: 1
Bengal Tigers
[Within the object] counter: 2
Indochinese Tigers
[Within the object] counter: 3
Malayan Tigers
[Within the object] counter: 4
South China Tigers
[Within the object] counter: 5
Bali Tigers
[Within the object] counter: 6
Caspian Tigers
[Within the object] counter: 7
Javan Tigers
[Within the object] counter: 8
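One caveat with returning self from __iter__(): the object and its iterator share a single counter, so two loops running over the same object at the same time (nested loops, for example) would interfere with each other. A possible alternative, sketched below on the assumption that we are free to restructure the class, is to write __iter__() as a generator function, so that every call hands out a fresh, independent iterator. The class name TigerList is ours, for illustration.

```python
class TigerList:
    """Hypothetical variant of the class above with a generator-based __iter__()."""

    def __init__(self):
        self.seq = []

    def add(self, val):
        self.seq.append(val)

    def __iter__(self):
        # Because of the yield, each call to __iter__() returns a new
        # generator, which supplies __next__() and StopIteration for us.
        for val in self.seq:
            yield val


objTiger = TigerList()
objTiger.add("Siberian Tigers")
objTiger.add("Bengal Tigers")

# Nested loops over the same object now work, since each loop has its own iterator.
pairs = [(a, b) for a in objTiger for b in objTiger]
print(len(pairs))
```

The trade-off is that we no longer see (or control) the counter ourselves; the generator keeps track of the position internally.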
One last note. With Python 2.x, the iterator needs to have the next() method, instead of the __next__() method. To make the above example work for both Python 2.x and Python 3.x, we can add a next() method that calls the __next__() method and simply returns whatever __next__() returns. In other words, the body of the next() method should contain "return self.__next__()" and we should be good to go!
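The compatibility trick described above can be sketched as follows. The class Countdown is a hypothetical stand-in that we use to keep the example short; the same next() method could be added to the tigers class.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3.x calls this method.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    def next(self):
        # Python 2.x calls this method; we simply delegate to __next__().
        return self.__next__()


values = list(Countdown(3))
print(values)
```

Python 3.x ignores the extra next() method, so carrying it along costs nothing there.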