I'm writing a Python generator which looks like "cat". My specific use case is for a "grep like" operation. I want it to be able to break out of the generator if a condition is met:
summary={}
for fn in cat("filelist.dat"):
    for line in cat(fn):
        if line.startswith("FOO"):
            summary[fn] = line
            break
So when break happens, I need the cat() generator to finish and close the file handle to fn.
I have to read 100k files with 30 GB of total data, and the FOO keyword happens in the header region, so it is important in this case that the cat() function stops reading the file ASAP.
There are other ways I can solve this problem, but I'm still interested to know how to get an early exit from a generator which has open file handles. Perhaps Python cleans them up right away and closes them when the generator is garbage collected?
Thanks,
Ian
Generators have a close method that raises GeneratorExit at the yield statement. If you specifically catch this exception, you can run some tear-down code:
import contextlib
with contextlib.closing( cat( fn ) ):
    ...
and then in cat:
try:
    ...
except GeneratorExit:
    f.close()  # f: the file handle that cat opened (hypothetical name)
    raise      # re-raise so close() completes normally
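To make the two fragments above concrete, here is a minimal sketch that puts them together. It is written to run on Python 3 as well. The `handles` list, the temporary file and its contents are assumptions added only so the result can be checked; they are not part of the original answer.

```python
import contextlib
import os
import tempfile

handles = []  # hypothetical bookkeeping so the demo can inspect the file handle


def cat(path):
    """Yield the lines of `path`, closing the file if the consumer stops early."""
    handle = open(path)
    handles.append(handle)
    try:
        for line in handle:
            yield line
    except GeneratorExit:
        # close() was called while the generator was paused at the yield above
        handle.close()
        raise
    handle.close()  # normal exhaustion: the loop reached the end of the file


# Throwaway input file standing in for one of the real data files.
fd, demo_path = tempfile.mkstemp(suffix=".dat")
with os.fdopen(fd, "w") as out:
    out.write("header\nFOO found\nbody\n")

summary = {}
with contextlib.closing(cat(demo_path)) as lines:
    for line in lines:
        if line.startswith("FOO"):
            summary[demo_path] = line
            break

# closing() has already called lines.close() by this point
handle_closed = handles[0].closed
os.remove(demo_path)
```

Note that the `except GeneratorExit` branch only covers the early exit, which is why the sketch also closes the handle after the loop.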
If you'd like a simpler way to do this (without using the arcane close method on generators), just make cat take a file-like object instead of a string to open, and handle the file IO yourself:
for filename in filenames:
    with open( filename ) as theFile:
        for line in cat( theFile ):
            ...
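A small runnable sketch of that second approach, assuming a `cat` that only iterates over whatever file-like object it is given. `io.StringIO` stands in for a real file here, and `first_match` is a hypothetical helper, not something from the question.

```python
import io


def cat(file_obj):
    """Yield lines from an already-open file-like object; the caller owns it."""
    for line in file_obj:
        yield line


def first_match(file_obj, prefix):
    """Return the first line starting with `prefix`, or None if there is none."""
    for line in cat(file_obj):
        if line.startswith(prefix):
            return line  # leaving early is fine: the caller's with block closes the file
    return None


# io.StringIO is a stand-in for open(filename) in this demo
with io.StringIO(u"header\nFOO found\nbody\n") as fake_file:
    hit = first_match(fake_file, u"FOO")

closed_after_with = fake_file.closed
```

Because the `with` block owns the handle, it is closed at the end of the block whether or not the loop finished.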
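However, in CPython you rarely need to worry about any of this: the generator is closed as soon as its last reference goes away, and the file it holds is closed with it. Other implementations (PyPy, Jython) may delay that cleanup until a later garbage collection, so with 100k files it is safer not to rely on it. Still,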
explicit is better than implicit
By implementing the context protocol and the iterator protocol in the same object, you can write pretty sweet code like this:
with cat("/etc/passwd") as lines:
    for line in lines:
        if "mail" in line:
            print line.strip()
            break
This is a sample implementation, tested with Python 2.5 on a Linux box. It reads the lines of /etc/passwd until it finds one containing "mail", and then stops:
from __future__ import with_statement
class cat(object):
    def __init__(self, fname):
        self.fname = fname
    def __enter__(self):
        print "[Opening file %s]" % (self.fname,)
        self.file_obj = open(self.fname, "rt")
        return self
    def __exit__(self, *exc_info):
        print "[Closing file %s]" % (self.fname,)
        self.file_obj.close()
    def __iter__(self):
        return self
    def next(self):
        line = self.file_obj.next().strip()
        print "[Read: %s]" % (line,)
        return line
def main():
    with cat("/etc/passwd") as lines:
        for line in lines:
            if "mail" in line:
                print line.strip()
                break
if __name__ == "__main__":
    import sys
    sys.exit(main())
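The class above is Python 2 only (print statements, a `next` method). As a rough sketch, the same idea on Python 3 needs `__next__` instead; the temporary file and its contents below are assumptions standing in for /etc/passwd, and the debug prints are left out.

```python
import os
import tempfile


class cat(object):
    """Context manager and iterator over the stripped lines of a file."""

    def __init__(self, fname):
        self.fname = fname
        self.file_obj = None

    def __enter__(self):
        self.file_obj = open(self.fname, "rt")
        return self

    def __exit__(self, *exc_info):
        self.file_obj.close()
        return False  # do not swallow exceptions raised inside the with block

    def __iter__(self):
        return self

    def __next__(self):
        # next() on a file raises StopIteration at end of file, which ends the for loop
        return next(self.file_obj).strip()

    next = __next__  # keeps the class usable from Python 2 as well


# Throwaway file standing in for /etc/passwd
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "w") as out:
    out.write("root:x:0:0\nmail:x:8:8\naudio:x:29:\n")

seen = []
with cat(demo_path) as lines:
    for line in lines:
        seen.append(line)
        if "mail" in line:
            break

file_closed = lines.file_obj.closed
os.remove(demo_path)
```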
Or even simpler:
with open("/etc/passwd", "rt") as f:
    for line in f:
        if "mail" in line:
            break
File objects implement the iterator protocol (see http://docs.python.org/library/stdtypes.html#file-objects)
Please also consider this example:
def itertest():
    try:
        for i in xrange(1000):
            print i
            yield i
    finally:
        print 'finally'
x = itertest()
for i in x:
    if i > 2:
        break
print 'del x'
del x
print 'exit'
Output:
0
1
2
3
del x
finally
exit
It shows that the finally block runs when the generator is deleted (del x), not at the break, because x still references the suspended generator. I think the generator's __del__(self) is calling self.close(), see also here: https://docs.python.org/2.7/reference/expressions.html#generator.close
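If you would rather not depend on when the object is deleted, you can call close() yourself. A short sketch of that, written for Python 3 (`range` instead of `xrange`); the `events` list is only an assumption to record the order in which things happen.

```python
events = []  # records the order of events instead of printing them


def numbers():
    try:
        for i in range(1000):
            yield i
    finally:
        events.append("finally")


gen = numbers()
for i in gen:
    if i > 2:
        break

events.append("after loop")   # gen is still suspended at its yield here
gen.close()                   # raises GeneratorExit at that yield, so finally runs now
events.append("after close")
gen.close()                   # closing an already finished generator does nothing
```

After this runs, `events` is `["after loop", "finally", "after close"]`: the cleanup happens exactly at the first close() call.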
There seems to be another possibility using try..finally (tested on Python 2.7.6):
def gen():
    i = 0
    try:
        while True:
            print 'yield %i' % i
            yield i
            i += 1
        print 'will never get here'
    finally:
        print 'done'
for i in gen():
    if i > 1:
        print 'break'
        break
    print i
Gives me the following printout:
yield 0
0
yield 1
1
yield 2
break
done
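Since a `with` block inside a generator behaves like try..finally around the yield, the same effect can be had without writing the finally by hand. Below is a sketch applied to the use case from the question, under a few assumptions: the temporary directory, the two sample files and the `open_handles` list are made up for the demo, and `summarize` is a hypothetical wrapper around the original loop. It uses contextlib.closing so the cleanup does not depend on when the generator is garbage collected.

```python
import contextlib
import os
import shutil
import tempfile

open_handles = []  # hypothetical bookkeeping to verify the handles afterwards


def cat(path):
    # The with block is exited (and the file closed) when the generator is
    # exhausted or when close() raises GeneratorExit at the yield.
    with open(path) as handle:
        open_handles.append(handle)
        for line in handle:
            yield line.rstrip("\n")


def summarize(list_path, prefix="FOO"):
    """Map each file named in `list_path` to its first line starting with `prefix`."""
    summary = {}
    with contextlib.closing(cat(list_path)) as filenames:
        for fn in filenames:
            with contextlib.closing(cat(fn)) as lines:
                for line in lines:
                    if line.startswith(prefix):
                        summary[fn] = line
                        break  # stop reading this file right away
    return summary


# Demo data: two small files plus a list file naming them.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.dat", "FOO alpha\nrest\n"), ("b.dat", "header\nFOO beta\nrest\n")]:
    p = os.path.join(tmpdir, name)
    with open(p, "w") as out:
        out.write(text)
    paths.append(p)
list_path = os.path.join(tmpdir, "filelist.dat")
with open(list_path, "w") as out:
    out.write("\n".join(paths) + "\n")

summary = summarize(list_path)
all_closed = all(h.closed for h in open_handles)
shutil.rmtree(tmpdir)
```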