Query against two integer columns take an absurd amount of time_问答_开发者

I have a query that gets generated (by Django) like this:

SELECT `geo_ip`.`id`, `geo_ip`.`start_ip`,
       `geo_ip`.`end_ip`, `geo_ip`.`start`,
       `geo_ip`.`end`, `geo_ip`.`cc`, `geo_ip`.`cn`
FROM `geo_ip`
WHERE (`geo_ip`.`start` <= 2084738290 AND `geo_ip`.`end` >= 2084738290 )
LIMIT 1

It queries a GeoLocating table with 134189 entries in it. Each query takes >100ms to perform when indexes are added, which makes it unusable for more than one-off things. I'm going to cache the response so I only have to do the IP lookup once, but I'm curious if I'm missing some obvious way of making it a magnitude faster. My table:

CREATE TABLE `geo_ip` (
  `start_ip` char(15) NOT NULL,
  `end_ip` char(15) NOT NULL,
  `start` bigint(20) NOT NULL,
  `end` bigint(20) NOT NULL,
  `cc` varchar(6) NOT NULL,
  `cn` varchar(150) NOT NULL,
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
) ENGINE=InnoDB AUTO_INCREMENT=134190 DEFAULT CHARSET=latin1

Creating an index on both columns like so:

ALTER TABLE geo_ip ADD INDEX (start, end);

Gives the following explain:

EXPLAIN SELECT geo_ip.id, geo_ip.start_ip, geo_ip.end_ip,
               geo_ip.start, geo_ip.end, geo_ip.cc, geo_ip.cn
FROM geo_ip
WHERE (geo_ip.end >= 2084738290 AND geo_ip.start < 2084738290)
LIMIT 1;
+----+-------------+--------+-------+---------------+-------+---------+------+-------+----------+-------------+
| id | select_type | table  | type  | possible_keys | key   | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+--------+-------+---------------+-------+---------+------+-------+----------+-------------+
|  1 | SIMPLE      | geo_ip | range | start         | start | 8       | NULL | 67005 |   100.00 | Using where |
+----+-------------+--------+-------+---------------+-------+---------+------+-------+----------+-------------+

It takes well over 100ms to complete selects:

SELECT geo_ip.id, geo_ip.start_ip, geo_ip.end_ip,
       geo_ip.start, geo_ip.end, geo_ip.cc,
       geo_ip.cn
FROM geo_ip
WHERE (geo_ip.end >= 2084738290 and geo_ip.start < 2084738290)
LIMIT 1;
+-------+--------------+----------------+------------+------------+----+-----------+
| id    | start_ip     | end_ip         | start      | end        | cc | cn        |
+-------+--------------+----------------+------------+------------+----+-----------+
| 51725 | 124.66.128.0 | 124.66.159.255 | 2084732928 | 2084741119 | SG | Singapore |
+-------+--------------+----------------+------------+------------+----+-----------+
1 row in set (0.18 sec)

Is more expensive than having a single individual index:

ALTER TABLE geo_ip ADD INDEX (`start`);
ALTER TABLE geo_ip ADD INDEX (`end`);
+----+-------------+--------+-------+---------------+-------+---------+------+-------+-------------+
| id | select_type | table  | type  | possible_keys | key   | key_len | ref  | rows  | Extra       |
+----+------开发者_如何学运维-------+--------+-------+---------------+-------+---------+------+-------+-------------+
|  1 | SIMPLE      | geo_ip | range | start,end     | start | 8       | NULL | 68017 | Using where |
+----+-------------+--------+-------+---------------+-------+---------+------+-------+-------------+

It takes around 100ms to complete these requests:

SELECT geo_ip.id, geo_ip.start_ip, geo_ip.end_ip, geo_ip.start, geo_ip.end, geo_ip.cc, geo_ip.cn FROM geo_ip
WHERE (geo_ip.end >= 2084738290 AND geo_ip.start < 2084738290) limit 1;
+-------+--------------+----------------+------------+------------+----+-----------+
| id    | start_ip     | end_ip         | start      | end        | cc | cn        |
+-------+--------------+----------------+------------+------------+----+-----------+
| 51725 | 124.66.128.0 | 124.66.159.255 | 2084732928 | 2084741119 | SG | Singapore |
+-------+--------------+----------------+------------+------------+----+-----------+
1 row in set (0.11 sec)

But both of these methods take way too long, is it possible to do anything about this?

Time is always consumed in the "where" clause.

And because you are working on two different fields with "lower than" or "greater than", it has to read a lot of indexes to find out which record is the one you want.

I should have done my table this way :

+-------+-------+----------------+------------+----+-----------+
| id    | type  | ip             | geo        | cc | cn        |
+-------+-------+----------------+------------+----+-----------+
| 51725 | start | 124.66.159.255 | 2084732928 | SG | Singapore |
+-------+-------+----------------+------------+----+-----------+
| 51726 | end   | 124.66.159.255 | 2084732928 | SG | Singapore |
+-------+-------+----------------+------------+----+-----------+

so that I can select this :

select * from table where geo between '2084732927' and '2084732928'

with an index on geo. Should be much, much faster. But sorry, I have no time to try.

Query against two integer columns take an absurd amount of time

精彩评论

关注公众号

热门标签

图文推荐

Query against two integer columns take an absurd amount of time

更多 问答 相关资讯：

精彩评论

关注公众号

热门标签

图文推荐

更多问答相关资讯：