
Sorting a csv file in Python with sorted() returns values in programmer DESC order, not time DESC order

Developer https://www.devze.com 2023-03-09 08:38 Source: Internet

I'm not doing anything overly complex I believe. I'm presorting a large csv data file because it is full of data that arrives in random time order. The index is correct, but the return formatting is off.

    sortedList=sorted(reader,key=operator.itemgetter(1))

So instead of sorting like [-100 -10 -1 0 10 100 5000 6000], I get [-1 -10 -100 0 100 5000 60].

I tried both the lambda function examples and itemgetter, but I don't really know where to go from there.

Thanks for the help.

The answer to my question is in the comments. The numerical value was being sorted as a string and not a number. I didn't know that I could specify the data type of the key in sorted(). This code works as I intended:

    sortedList=sorted(reader,key=lambda x:float(x[1]))
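A minimal, self-contained sketch of the difference between the two calls. The in-memory sample data and its two-column layout are made up for illustration; only the two sorted() calls come from the question.

```python
import csv
import io
import operator

# Hypothetical sample data: a label column and a numeric column.
data = "a,10\nb,-100\nc,6000\nd,-1\ne,0\nf,100\ng,-10\nh,5000\n"

rows = list(csv.reader(io.StringIO(data)))

# csv.reader yields every field as a string, so this compares text.
as_text = sorted(rows, key=operator.itemgetter(1))

# Converting the key to float makes the comparison numeric.
as_numbers = sorted(rows, key=lambda row: float(row[1]))

text_order = [row[1] for row in as_text]
numeric_order = [row[1] for row in as_numbers]
print(text_order)     # ['-1', '-10', '-100', '0', '10', '100', '5000', '6000']
print(numeric_order)  # ['-100', '-10', '-1', '0', '10', '100', '5000', '6000']
```

The rows themselves still hold strings after sorting; only the key used for comparison is converted.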


Just from the output you see there, it looks like these are being sorted as strings rather than as numbers.

So you could do:

    sortedList = sorted(reader, key=lambda t: int(t[1]))

or

    sortedList = sorted(reader, key=lambda t: float(t[1]))

Or better, make sure that reader yields numbers rather than strings in the first place, perhaps by passing quoting=csv.QUOTE_NONNUMERIC as a fmtparam when creating the reader (see http://docs.python.org/library/csv.html#csv.QUOTE_NONNUMERIC). Note that this converts every unquoted field to a float, so any text fields in the file must be quoted.
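A rough sketch of the QUOTE_NONNUMERIC approach, assuming a file whose text fields are quoted and whose numeric fields are not. The sample data is hypothetical.

```python
import csv
import io

# Hypothetical sample data: text fields are quoted, numeric fields are not.
data = '"a",10\n"b",-100\n"c",6000\n"d",-1\n'

# With QUOTE_NONNUMERIC the reader converts every unquoted field to float.
reader = csv.reader(io.StringIO(data), quoting=csv.QUOTE_NONNUMERIC)

# The second field is already a float, so no conversion is needed in the key.
rows = sorted(reader, key=lambda row: row[1])

values = [row[1] for row in rows]
print(values)  # [-100.0, -1.0, 10.0, 6000.0]
```

If the file contains an unquoted field that is not a number, the reader raises ValueError, so this only suits files that follow the quoting convention.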


It looks like "reader" yields strings, and what you want is numbers. You could try something like:

    sorted(reader, key=lambda x: float(x[1]))


It looks like your numbers are getting sorted alphabetically (as strings) rather than numerically:

    >>> sorted([10, 2000, 30])
    [10, 30, 2000]
    >>> sorted(['10', '2000', '30'])
    ['10', '2000', '30']

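To fix this, you can pass a numeric comparison. In Python 2, sorted() accepted one directly through cmp=; Python 3 removed that argument, so wrap the function with functools.cmp_to_key (or simply pass key=int):

    from functools import cmp_to_key

    def numeric_compare(x, y):
        # Negative, zero, or positive, as a comparison function must return
        return int(x) - int(y)

    >>> sorted(['10', '2000', '30'], key=cmp_to_key(numeric_compare))
    ['10', '30', '2000']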


It looks like your list is being sorted as strings rather than numbers. When you read in your CSV file, it is still text and must be converted to integers first.
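One way to do that conversion up front, sketched under the assumption of a two-column file with whole-number values (the sample data and the name/value column names are hypothetical):

```python
import csv
import io

# Hypothetical sample data: a label column and an integer column.
data = "a,10\nb,-100\nc,6000\nd,-1\n"

# Convert the second field once, while reading, so the rows hold real integers.
rows = [(name, int(value)) for name, value in csv.reader(io.StringIO(data))]

# The sort key is now a plain integer and needs no conversion.
rows.sort(key=lambda row: row[1])
print(rows)  # [('b', -100), ('d', -1), ('a', 10), ('c', 6000)]
```

int() raises ValueError on values such as "1.5", so float() is the safer choice when the column may contain decimals.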


I like compose:

    from operator import itemgetter

    def compose(f, g):
        return lambda *a, **k: g(f(*a, **k))

    sortedList = sorted(reader, key=compose(itemgetter(1), float))