"reduce" function in Python does not work on "namedtuple"?

Developer https://www.devze.com 2023-04-10 09:11 Source: Internet
I have a log file that is formatted in the following way:

datetimestring \t username \t transactionName \r\n

I am attempting to run some stats over this dataset. I have the following code:

import time
import collections
file = open('Log.txt', 'r')

TransactionData = collections.namedtuple('TransactionData', ['transactionDate', 'user', 'transactionName'])
transactions = list()

for line in file:
    fields = line.split('\t')

    transactionDate = time.strptime(fields[0], '%Y-%m-%d %H:%M:%S')
    user = fields[1]
    transactionName = fields[2]

    transdata = TransactionData(transactionDate, user, transactionName)
    transactions.append(transdata)

file.close()

minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
print minDate

I did not want to define a class for such a simple dataset, so I used a namedtuple. When I run the script, I get this error:

Traceback (most recent call last):
  File "inquiriesStat.py", line 20, in <module>
    minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
  File "inquiriesStat.py", line 20, in <lambda>
    minDate = reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)
AttributeError: 'time.struct_time' object has no attribute 'transactionDate'

It appears that the lambda function is operating on the 'transactionDate' property directly instead of passing in the full tuple. If I change the lambda to:

lambda x,y: min(x, y)

It works as I would expect. Any ideas why this would be the case?


Simply use:

minDate = min(t.transactionDate for t in transactions)

Below is an explanation of why your code isn't working.

Let's say transactions = [t1, t2, t3] where t1...t3 are three named tuples.

By the definition of reduce, your code:

reduce(lambda x,y: min(x.transactionDate, y.transactionDate), transactions)

is equivalent to

min(min(t1.transactionDate, t2.transactionDate).transactionDate, t3.transactionDate)

Clearly, the inner min() returns a time.struct_time rather than a named tuple. On the next step reduce passes that struct_time to the lambda as x, so x.transactionDate fails.
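The failing step can be reproduced by hand. The following is a minimal sketch with made-up sample records (the users, the dates, and the parse helper are assumptions for illustration, not taken from the original log). It applies the lambda from the question twice, the way reduce would for a three-element list.

```python
import collections
import time

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(text):
    # Hypothetical helper: same date format as in the question.
    return time.strptime(text, '%Y-%m-%d %H:%M:%S')

# Hypothetical sample data standing in for the parsed log file.
transactions = [
    TransactionData(parse('2011-03-02 10:00:00'), 'alice', 'login'),
    TransactionData(parse('2011-03-01 09:30:00'), 'bob', 'inquiry'),
    TransactionData(parse('2011-03-03 12:15:00'), 'carol', 'logout'),
]

step = lambda x, y: min(x.transactionDate, y.transactionDate)

# First call: both arguments are TransactionData, so this works,
# but the result is a bare time.struct_time.
first = step(transactions[0], transactions[1])
first_type = type(first).__name__

# Second call: reduce feeds the previous result back in as x.
# x is now a struct_time, which has no transactionDate attribute.
try:
    step(first, transactions[2])
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

After this runs, first_type is 'struct_time' and error_message holds the same AttributeError text as the traceback in the question.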

There are ways to fix this, and to make use of reduce for this problem. However, there seems to be little point given that a direct application of min does the job and to my eye is a lot clearer than anything involving reduce.
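For completeness, here is a rough sketch of two ways reduce could be made to work, next to the direct min call. The sample records and the parse helper are invented for illustration. The idea in both reduce variants is that the value returned by the function has the same type as its inputs.

```python
import collections
import time
from functools import reduce  # a builtin on Python 2; must be imported on Python 3

TransactionData = collections.namedtuple(
    'TransactionData', ['transactionDate', 'user', 'transactionName'])

def parse(text):
    # Hypothetical helper: same date format as in the question.
    return time.strptime(text, '%Y-%m-%d %H:%M:%S')

# Hypothetical sample data standing in for the parsed log file.
transactions = [
    TransactionData(parse('2011-03-02 10:00:00'), 'alice', 'login'),
    TransactionData(parse('2011-03-01 09:30:00'), 'bob', 'inquiry'),
    TransactionData(parse('2011-03-03 12:15:00'), 'carol', 'logout'),
]

# Variant 1: the lambda returns a whole TransactionData, so the
# accumulator keeps its type; the date is extracted at the end.
earliest = reduce(
    lambda x, y: x if x.transactionDate <= y.transactionDate else y,
    transactions)
min_date_1 = earliest.transactionDate

# Variant 2: reduce over the dates only, so every argument and every
# result is a time.struct_time.
min_date_2 = reduce(min, (t.transactionDate for t in transactions))

# The direct approach from the answer, for comparison.
min_date_3 = min(t.transactionDate for t in transactions)

# If the whole record is wanted rather than just the date,
# min accepts a key function.
earliest_record = min(transactions, key=lambda t: t.transactionDate)
```

All three min_date values are equal here, and earliest and earliest_record are the same record. The min versions remain the shortest and the easiest to read.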
