Is there a better way to read an element from a file in Python?

Developer https://www.devze.com 2023-04-11 12:58 Source: Internet
I have written a crude Python program to pull phrases from an index in a CSV file and write these rows to another file.

import csv

total = 0

ifile = open('data.csv', "rb")
reader = csv.reader(ifile)

ofile = open('newdata_write.csv', "wb")
writer = csv.writer(ofile, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

for row in reader:
    if ("some text") in row[x]:
        total = total + 1
        writer.writerow(row)
    elif ("some more text") in row[x]:
        total = total + 1   
        writer.writerow(row) 
    elif ("even more text I'm looking for") in row[x]:  
        total = total + 1   
        writer.writerow(row)

   < many, many more lines >

print "\nTotal = %d." % total

ifile.close()
ofile.close()

My question is this: Isn't there a better (more elegant/less verbose) Pythonic way to do this? I feel this is a case of not knowing what I don't know. The CSV file I'm searching is not large (3863 lines, 669 KB) so I don't think it is necessary to use SQL to solve this, although I am certainly open to that.

I am a Python newbie, in love with the language and teaching myself through the normal channels (books, tutorials, Project Euler, Stack Overflow).

Any suggestions are greatly appreciated.


You're looking for any with a generator expression:

matches = "some text", "some more text", "even more text I'm looking for"
for row in reader:
    # Search within the field row[x], as in the question, not the row list itself
    if any(match in row[x] for match in matches):
        total += 1
        writer.writerow(row)

Alternatively, you could just write all the rows at once:

writer.writerows(row for row in reader if any(match in row[x] for match in matches))

but as written that doesn't get you a total.
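If the total is still wanted alongside writerows, one option is to collect the matching rows in a list first and take its length. A minimal sketch, written for Python 3 (the code in this thread is Python 2) and using in-memory files with made-up sample data; the column index x and the phrases are placeholders, as in the question:

```python
import csv
import io

# Hypothetical sample data standing in for data.csv.
source = io.StringIO(
    "1,some text here\n"
    "2,nothing relevant\n"
    "3,some more text there\n"
)
output = io.StringIO()

matches = ("some text", "some more text")
x = 1  # assumed index of the column being searched

reader = csv.reader(source)
writer = csv.writer(output, delimiter="\t", quotechar='"', quoting=csv.QUOTE_ALL)

# Collect matching rows in a list so they can be both counted and written.
rows = [row for row in reader if any(match in row[x] for match in matches)]
writer.writerows(rows)
total = len(rows)

print("Total = %d." % total)
```

The trade-off is that the matching rows are held in memory, which should be fine for a file of a few thousand lines.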


It's not a huge improvement, but you could do something like

keyphraseList = (
     "some text",
     "some more text",
     "even more text I'm looking for")

...
for row in reader:
   for phrase in keyphraseList:
       if phrase in row[x]:
           total = total + 1
           writer.writerow(row)
           break

(not tested)


You can make this more Pythonic by using a list comprehension instead of a for loop. For example, if you are looking for the strings 'aa' or 'bb' in the first column, you could do:

matches = [row for row in reader if 'aa' in row[0] or 'bb' in row[0]]


I'm not sure this version is better, just shorter. Anyway, I hope it helps:

import csv

keys = ['a', 'b', 'c']
with open('infile', 'rb') as infile, open('outfile', 'wb') as outfile:
    # Keep the rows whose first column contains any of the keys
    rows = [x for x in csv.reader(infile) if any(k in x[0] for k in keys)]
    csv.writer(outfile, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL).writerows(rows)

print 'Total: %d' % len(rows)
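The snippet above targets Python 2. On Python 3 the csv module expects files opened in text mode with newline='', and print is a function. A rough equivalent follows; the file names and sample rows are hypothetical and are created in a temporary directory only so that the sketch runs on its own:

```python
import csv
import os
import tempfile

keys = ["a", "b", "c"]

with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "infile.csv")    # hypothetical file names
    out_path = os.path.join(tmp, "outfile.csv")

    # Create a small sample input so the example is self-contained.
    with open(in_path, "w", newline="") as f:
        csv.writer(f).writerows([["apple", "1"], ["kiwi", "2"], ["cherry", "3"]])

    # Text mode with newline="" replaces the Python 2 'rb'/'wb' modes.
    with open(in_path, newline="") as infile, open(out_path, "w", newline="") as outfile:
        rows = [r for r in csv.reader(infile) if any(k in r[0] for k in keys)]
        csv.writer(outfile, delimiter="\t", quotechar='"',
                   quoting=csv.QUOTE_ALL).writerows(rows)

    # Read the result back before the temporary directory is removed.
    with open(out_path, newline="") as f:
        written = f.read()

print("Total: %d" % len(rows))
```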


Not necessarily 'better', but I would compare the item to a set and clean up total a bit. It may not be better, but it is more succinct.

This

for row in reader:
    if ("some text") in row[x]:
        total = total + 1
        writer.writerow(row)
    elif ("some more text") in row[x]:
        total = total + 1   
        writer.writerow(row) 
    elif ("even more text I'm looking for") in row[x]:  
        total = total + 1   
        writer.writerow(row)

becomes

myWords = set(('some text','some more text','even more'))
for row in reader:
    if row[x] in myWords:
        total += 1
        writer.writerow(row)

You could just use a simple list, but membership tests on a set are quicker when there are many items to check against.

In response to the comment by agf:

>>> x = set(('something','something else'))
>>> True if 'some' in x else False
False
>>> True if 'something' in x else False
True

Is this what you're saying would not work?
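The session above tests whether a whole string is a member of the set, whereas the loop in the question tests whether a phrase occurs inside the field. A small sketch of the difference, using a made-up field value in place of row[x]:

```python
phrases = {"some text", "some more text"}
field = "a line with some text in it"  # hypothetical value of row[x]

# Set membership: True only if the whole field equals one of the phrases.
exact_match = field in phrases

# Substring search: True if any phrase occurs anywhere inside the field.
substring_match = any(phrase in field for phrase in phrases)

print(exact_match, substring_match)
```

So the set version selects only rows whose field is exactly one of the phrases. For the substring test used in the question, any over a tuple of phrases is the closer fit.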
