开发者

Fastest way to extract elements from a list that not matched condition in python

开发者 https://www.devze.com 2023-04-13 08:25 出处:网络
I\'m seeking the fastest way to extract all tuple members from a list under condition(s). Example: From a list of tuple (e.g. [(0,0,4),(1,0,3),(1,2,1),(4,0,0)]) I need to extract all members that ha

I'm seeking the fastest way to extract all tuple members from a list under condition(s).

Example: From a list of tuple (e.g. [(0,0,4),(1,0,3),(1,2,1),(4,0,0)]) I need to extract all members that have more than 3 in first tuple position, then more than 2 in second tuple position, and then more than 1 in last tuple position. Which should extract in this example (4,0,0) (->first condition), nothing (->second condition) and (0,0,4),(1,0,3) (->last condition). This example is very small, I need to perform that on list of thousands of tuples.

From the code I produced from your answers, here are the results in sec:

my_naive1, like proposed by Emil Vikström? 13.0360000134

my_naive2 110.727999926

Tim Pietzcker 9.8329999446

Don 12.5640001297

import itertools, operator, time, copy
from operator import itemgetter


def combinations_with_replacement_counts(n, r):  #(A, N) in our example.N individuals/balls in A genotypes/boxes
   size = n + r - 1
   for indices in itertools.combinations(range(size), n-1):
       #print indices
       starts = [0] + [index+1 for index in indices]
       stops = indices + (size,)
       yield tuple(map(operator.sub, stops, starts))


xp = list(combinations_with_replacem开发者_如何转开发ent_counts(3,20))  # a very small case

a1=time.time()
temp=[]
for n in xp:
    for n1 in xp:

        for i in xp:
            if i[0] <= min(n1[0],n[0]) or i[1] <= min(n1[1],n[1]) or i[2] <= min(n1[2],n[2]):
                temp.append(i)


a2=time.time()
for n in xp:
    for n1 in xp:
        xp_copy = copy.deepcopy(xp)
        for i in xp:
            if i[0] > min(n[0],n[0]) or i[1] > min(n[1],n[1]) or i[2] > min(n[2],n[2]):
                xp_copy.remove(i)

a3=time.time()
for n in xp:
    for n1 in xp:
        output = [t for t in xp if t[0]<=min(n[0],n[0]) or t[1]<=min(n[1],n[1]) or t[2]<=min(n[2],n[2])]
a4=time.time()

for n in xp:
    for n1 in xp:
        l1 = sorted(xp, key=itemgetter(0), reverse=True)
        l1_fitered = []
        for item in l1:
            if item[0] <= min(n[0],n[0]):
                break
            l1_fitered.append(item)

        l2 = sorted(l1_fitered, key=itemgetter(1), reverse=True)
        l2_fitered = []
        for item in l2:
            if item[1] <= min(n[1],n[1]):
                break
            l2_fitered.append(item)

        l3 = sorted(l2_fitered, key=itemgetter(2), reverse=True)
        l3_fitered = []
        for item in l3:
            if item[2] <= min(n[2],n[2]):
                break
            l3_fitered.append(item)
a5=time.time()            



print "soluce my_naive1, like proposed by Emil Vikström?",a2-a1
print "soluce my_naive2",a3-a2
print "soluce Tim Pietzcker",a4-a3
print "soluce Don",a5-a4


>>> l = [(0,0,4), (1,0,3), (1,2,1), (4,0,0)]
>>> output = [t for t in l if t[0]>3 or t[1]>2 or t[2]>1]
>>> output
[(0, 0, 4), (1, 0, 3), (4, 0, 0)]

This is fast because t[1]>2 is only evaluated if t[0]>3 is False (same for the third condition). So in your example list, only 8 comparisons are necessary.

You might save time and memory (depending on what you're doing with the filtered data) if you use a generator expression instead:

>>> l = [(0,0,4), (1,0,3), (1,2,1), (4,0,0)]
>>> for item in (t for t in l if t[0]>3 or t[1]>2 or t[2]>1):
>>>     # do something with that item


Have three lists, one for each condition, and just loop through the input set with a for loop, sorting each tuple into the correct target list. This will perform in O(n) (linear) time, which is the fastest possible asymptotic runtime for this problem. It will also only loop over the list once.


If you do not care the order of resulting items, I suggest a lookup in sorted list, with break condition on first non-matching item: this would skip list tails.

from operator import itemgetter
l = [(..., ..., ...), (...)]
l1_source = sorted(l, key=itemgetter(0), reverse=True)
l1_fitered = []
for item in l1:
    if item[0] <= 3:
        break
    l1_fitered .append(item)

l2 = sorted(l, key=itemgetter(1), reverse=True)
...
0

精彩评论

暂无评论...
验证码 换一张
取 消