I have a table with the fields id, user_id, condition1, condition2, condition3, score
. And each user can have several rows in the table. What I want to do now is to create several toplists. It could be for example a toplist where condition1 = foo
and I only want to count each user once but I want the complete best row from each user.
So SELECT user_id, MAX(score) AS s FROM table WHERE condition1 = foo ORDER BY s DESC LIMIT 50
don't work.
There are so many different toplists I want so creating a second table where I store the users best result isn't really an option. Because each user would probably have more then 10开发者_运维百科0 different best results.
The two things that are important is that getting the top 50 i quick. But also getting which place a specific user is placed on (which is pretty easy by just checking how many unique users have a bigger score than the specific user).
Update: I tested both Thomas and Quassnoi ideas and Thomas idea took 11 seconds and Quassnoi took 4.5 seconds.
Then I figured out another way to do it which is:
SELECT (
SELECT id
FROM table AS ti
WHERE ti.user_id = t.user_id
AND condition1 = foo
ORDER BY score DESC
LIMIT 1
)
FROM table as t
WHERE condition1 = foo
GROUP BY user_id
ORDER BY MAX(score) DESC
LIMIT 50
And then I just make another query where a pick out all rows WHERE id IN(all ids returned from the first query)
and this way takes 0.4 seconds.
Is this a good way of doing it or am I just lucky in my testdata?
Not tested, but I think you are missing a GROUP BY clause:
SELECT user_id, MAX(score) AS s FROM table WHERE condition1 = foo GROUP BY user_id ORDER BY s DESC LIMIT 50
SELECT *
FROM mytable m
WHERE m.id =
(
SELECT id
FROM mytable mi
WHERE mi.user_id = m.user_id
AND mi.condition1 = 'foo'
ORDER BY
score DESC, id DESC
LIMIT 1
)
ORDER BY
score DESC
LIMIT 50
or
SELECT m.*
FROM users u
JOIN mytable m
ON mi.id =
(
SELECT id
FROM mytable mi
WHERE mi.user_id = u.id
ORDER BY
score DESC, id DESC
LIMIT 1
)
ORDER BY
score DESC
LIMIT 50
Which one is faster depends on the distribution of the scores, but the first one is generally faster unless you have really few users with really many scores which tie really often.
For this to work fast, you should a composite index:
mytable (score, id)
plus an index for each filtering combination, like this:
mytable (user_id, score, id) -- for no filtering
mytable (user_id, condition1, score, id) -- for filtering on condition1
etc.
Select T.Id, T.user_id, T.condition1, T.condition2, T.condition3, T.score
From Table As T
Where Exists (
Select 1
From Table As T2
Where T2.user_id = T.user_id
And T2.condition1 = 'foo'
Having Max(T2.score) = T.score
)
Limit 50
精彩评论