I have a very simple table structure here: just a list of words, each tied to a user_id.
Word Table:
word - varchar(50)
user_id - integer
I need to find words used by one user that are not used by any other user. My current query works fine on PostgreSQL (9.0.3) with 200k words (~0.3-0.5 seconds) but completely falls over on MySQL (5.1.54) with the same data (5+ minutes and still running). All columns used are indexed.
SELECT
word, count(word) as count
FROM
words
WHERE
word not in (select word from words where user_id <> 99 group by word)
and user_id = 99
GROUP BY word
ORDER BY count desc LIMIT 20
1) Does anyone know of a better way to do this?
2) Does anyone know why it fails completely on MySQL?
EDIT: This fixes the issue on MySQL, taking it from 5+ minutes down to 10-20 ms. Thanks, Borealid.
SELECT
word, count(word) as count
FROM
words
WHERE
word not in (select distinct word from words where user_id <> 99)
and user_id = 99
GROUP BY word
ORDER BY count desc LIMIT 20
Thanks.
Try NOT EXISTS():
SELECT
w1.word,
COUNT(w1.word) as count
FROM
words w1
WHERE
NOT EXISTS (
SELECT 1
FROM
words w2
WHERE
w2.user_id <> 99
AND
w1.word = w2.word
)
AND
w1.user_id = 99
GROUP BY
w1.word
ORDER BY
count DESC
LIMIT 20;
Make sure you have an index on user_id and on word (or a composite index covering both), and use EXPLAIN to look at the query plan and see what works best for you.
Edit: Also try the LEFT JOIN solution using IS NULL:
SELECT
w1.word,
COUNT(w1.word) AS count
FROM
words w1
LEFT JOIN words w2 ON (w1.word = w2.word AND w1.user_id <> w2.user_id)
WHERE
w1.user_id = 99
AND
w2.word IS NULL
GROUP BY
w1.word
ORDER BY
count DESC
LIMIT 20;
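To check that the three forms discussed in this thread (NOT IN, NOT EXISTS, and LEFT JOIN ... IS NULL) return the same rows, here is a minimal sketch that runs them against a toy table. It assumes SQLite through Python's standard sqlite3 module as a stand-in for MySQL or PostgreSQL, and the sample rows are made up for illustration. It says nothing about relative speed on a real server; it only compares results.

```python
import sqlite3

# Hypothetical sample data: user 99 shares "apple" with user 1,
# while "kiwi" and "plum" are used only by user 99.
ROWS = [
    ("apple", 99), ("apple", 1),
    ("kiwi", 99), ("kiwi", 99), ("kiwi", 99),
    ("plum", 99),
    ("pear", 2),
]

QUERIES = {
    "not_in": """
        SELECT word, COUNT(word) AS count
        FROM words
        WHERE word NOT IN (SELECT DISTINCT word FROM words WHERE user_id <> 99)
          AND user_id = 99
        GROUP BY word
        ORDER BY count DESC
        LIMIT 20
    """,
    "not_exists": """
        SELECT w1.word, COUNT(w1.word) AS count
        FROM words w1
        WHERE NOT EXISTS (
                SELECT 1 FROM words w2
                WHERE w2.user_id <> 99 AND w1.word = w2.word
              )
          AND w1.user_id = 99
        GROUP BY w1.word
        ORDER BY count DESC
        LIMIT 20
    """,
    "left_join": """
        SELECT w1.word, COUNT(w1.word) AS count
        FROM words w1
        LEFT JOIN words w2
               ON (w1.word = w2.word AND w1.user_id <> w2.user_id)
        WHERE w1.user_id = 99
          AND w2.word IS NULL
        GROUP BY w1.word
        ORDER BY count DESC
        LIMIT 20
    """,
}


def make_db():
    """Create an in-memory table shaped like the one in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word VARCHAR(50), user_id INTEGER)")
    conn.executemany("INSERT INTO words VALUES (?, ?)", ROWS)
    return conn


def run_all():
    """Run every query variant and return its rows keyed by variant name."""
    conn = make_db()
    results = {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in run_all().items():
        print(name, rows)
```

One caveat worth keeping in mind: if word can be NULL, NOT IN behaves differently from the other two forms, because a NULL in the subquery result makes the NOT IN test never evaluate to true.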
Try an index on both columns:
CREATE INDEX idx_word_user ON words ( word, user_id);
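To see whether the planner actually picks up such an index, compare the query plan with and without it. The sketch below assumes SQLite via Python's sqlite3 module, where the command is EXPLAIN QUERY PLAN; on MySQL or PostgreSQL you would prefix the query with EXPLAIN instead, and the output format differs. The helper name plan_for is made up for this example.

```python
import sqlite3

NOT_EXISTS = """
    SELECT w1.word, COUNT(w1.word) AS count
    FROM words w1
    WHERE NOT EXISTS (
            SELECT 1 FROM words w2
            WHERE w2.user_id <> 99 AND w1.word = w2.word
          )
      AND w1.user_id = 99
    GROUP BY w1.word
    ORDER BY count DESC
    LIMIT 20
"""


def plan_for(sql, with_index):
    """Return the plan detail strings SQLite reports for the given query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word VARCHAR(50), user_id INTEGER)")
    if with_index:
        # Same composite index as suggested above.
        conn.execute("CREATE INDEX idx_word_user ON words (word, user_id)")
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    conn.close()
    # The human-readable detail text is the last column of each plan row.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    for with_index in (False, True):
        print("with index:" if with_index else "without index:")
        for detail in plan_for(NOT_EXISTS, with_index):
            print("   ", detail)
```

If the index name shows up in the plan for the correlated subquery, the lookup on word is being served by the index rather than by a full scan of the table for every outer row.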