开发者

Slow MySQL joins with inner select

开发者 https://www.devze.com 2023-03-27 20:38 出处:网络
The following query works, but becomes slower as the size of the sendlog table increases over time. The goal is to select a list of all subscribers from the newsletter_subscribers table which do not h

The following query works, but becomes slower as the size of the sendlog table increases over time. The goal is to select a list of all subscribers from the newsletter_subscribers table which do not ha开发者_Go百科ve an email entry for a given newsletter id in the newsletter_sendlog table. Currently, it's taking about 2.2 seconds on my mysql server with only a couple of thousand entries in the sendlog.

SELECT `newsletter_subscribers`.* 
FROM `newsletter_subscribers`
    INNER JOIN `newsletter_to_subscriber` 
        ON newsletter_to_subscriber.subscriber_id = newsletter_subscribers.id
    LEFT JOIN (
        SELECT `newsletter_sendlog`.`subscriber_email` 
        FROM `newsletter_sendlog` 
        WHERE (newsletter_id='7')
      ) AS `sendlog` 
        ON newsletter_subscribers.email = sendlog.subscriber_email 
WHERE (sendlog.subscriber_email IS NULL) 
AND (newsletter_to_subscriber.newsletter_id = '7')

EXPLAIN(query) outputs the following:

Slow MySQL joins with inner select

I'm not too familiar with the output of EXPLAIN, but if I read it correctly it would suggest that it's not using the index which I've defined on newsletter_sendlog.subscriber_email. I've tried using USE INDEX(email) on that table, but it does not seem to take effect.

Any suggestions on how to optimize this? Or possibly suggest another query that does the same?


The create table for newsletter_sendlog:

CREATE TABLE `newsletter_sendlog` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `subscriber_email` varchar(100) NOT NULL default '',
  `newsletter_id` int(11) default NULL,
  `sendstatus` int(11) default NULL,
  `senddate` timestamp NOT NULL default CURRENT_TIMESTAMP,
  PRIMARY KEY  (`id`),
  KEY `newsletter_id` (`newsletter_id`),
  KEY `email` (`subscriber_email`)
) ENGINE=MyISAM AUTO_INCREMENT=2933 DEFAULT CHARSET=latin1;

create table for newsletter_subscribers:

CREATE TABLE `newsletter_subscribers` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `email` varchar(100) NOT NULL default '',
  `name` tinytext,
  PRIMARY KEY  (`id`),
  KEY `email` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=2964 DEFAULT CHARSET=utf8;

create table for newsletter_to_subscriber:

CREATE TABLE `newsletter_to_subscriber` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `newsletter_id` int(11) NOT NULL,
  `subscriber_id` int(11) NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `newsletter_subscriber` (`newsletter_id`,`subscriber_id`)
) ENGINE=MyISAM AUTO_INCREMENT=2964 DEFAULT CHARSET=latin1;

Update:

Create table for newsletter_to_subscriber now looks like this after adding index on subscriber_id:

CREATE TABLE `newsletter_to_subscriber` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `newsletter_id` int(11) NOT NULL,
  `subscriber_id` int(11) NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `newsletter_subscriber` (`newsletter_id`,`subscriber_id`),
  KEY `subscriber` (`subscriber_id`)
) ENGINE=MyISAM AUTO_INCREMENT=2964 DEFAULT CHARSET=latin1;

The explain for the query suggested by @nobody:

Slow MySQL joins with inner select


It is always better to select the specific fields, in stead of an asterisk (*), and avoid backticks (`). Try to see if the following (rewritten) query works better:

SELECT 
    newsletter_subscribers.id,
    newsletter_subscribers.email,
    newsletter_subscribers.name
FROM
    newsletter_subscribers
    LEFT JOIN
        newsletter_to_subscriber
        ON
            newsletter_to_subscriber.subscriber_id = newsletter_subscribers.id
    LEFT JOIN
        newsletter_sendlog
        ON
            newsletter_subscribers.email = newsletter_sendlog.subscriber_email
WHERE
    newsletter_to_subscriber.newsletter_id = 7
    AND
        newsletter_sendlog.newsletter_id = 7
    AND
        newsletter_sendlog.subscriber_email IS NULL


SELECT `newsletter_subscribers`.* FROM `newsletter_subscribers`
  INNER JOIN `newsletter_to_subscriber` 
    ON newsletter_to_subscriber.subscriber_id = newsletter_subscribers.id
  LEFT JOIN (
      SELECT `newsletter_sendlog`.`subscriber_email` FROM `newsletter_sendlog` 
        WHERE (newsletter_id='7')) AS `sendlog` 
    ON newsletter_subscribers.email=sendlog.subscriber_email 
  WHERE (sendlog.subscriber_email IS NULL) 
    AND (newsletter_to_subscriber.newsletter_id = '7')

Could you try implementing index key on single column newsletter_to_subscriber.subscriber_id

and see if it helps?

Try using table structure as below:

CREATE TABLE `newsletter_to_subscriber` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `newsletter_id` int(11) NOT NULL,
  `subscriber_id` int(11) NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `newsletter_subscriber` (`newsletter_id`,`subscriber_id`)
  KEY `subscriber_id_key` (`subscriber_id`)
  KEY `newsletter_id_key` (`newsletter_id`)
) ENGINE=MyISAM AUTO_INCREMENT=2964 DEFAULT CHARSET=latin1;


Not totally sure, but I think the index is being ignored because you're looking for NULL values.

There's a different, hopefully more efficient way to run this query though:

select *
from newsletter_subscribers
where email not in 
(select subscriber_email
from newsletter_sendlog
where newsletter_id='7')


Firstly you don't need that sub-query:

SELECT `newsletter_subscribers`.* 
FROM `newsletter_subscribers`
    INNER JOIN `newsletter_to_subscriber` 
        ON( newsletter_to_subscriber.subscriber_id = newsletter_subscribers.id )
    LEFT JOIN `newsletter_sendlog`
        ON( newsletter_subscribers.email = newsletter_sendlog.subscriber_email AND
            newsletter_sendlog.newsletter_id = '7' )
WHERE newsletter_sendlog.subscriber_email IS NULL

the query above will do the job.

Secondly in newsletter_to_subscriber you have a multi-part index on newsletter_id and subscriber_id which can't be used by your query because it will be searching for subscriber_id and that comes second in the index, you need to have a separate index on subscriber_id:

INDEX( subscriber_id )
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号