
mysql update is so slow - is there any faster way to update data?


I have to update 2M*2 rows in a MySQL database.

All the information is in a file that I process with PHP. I read the information into an array, and then push it into the database using:

UPDATE processed 
SET number1=$row[1], number2=$row[2], timestamp=unix_timestamp()
where match (id) against ('\"$id\"' IN BOOLEAN MODE) limit 1

That's working - but it takes so long...

I have an index (primary) on (id).

I have tried using a column other than (id) that has a fulltext index (I'm using MyISAM) - it's even slower.

As my database is pretty big, and MySQL has to go through everything to find the right row to update, it takes a few seconds per update... which means a few days to process my whole update!

Is there any faster way to do that? If I switch to InnoDB, will that be faster? (Even if it's not, I guess it would at least be nice that my whole table won't be locked during the update.)

As number1 & number2 are numbers, I thought about grouping all the (id)s that have to be updated to the same number - would that be faster?
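
Something like this (the ids and numbers are made up just to illustrate the idea):

-- One statement per distinct (number1, number2) pair instead of one UPDATE per row.
UPDATE processed
SET number1 = 42, number2 = 7, timestamp = unix_timestamp()
WHERE id IN (17, 93, 1024);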

Is there a way to tune mysqld so that the number1, number2 & id columns would stay in RAM, making them faster to access / update?

Any idea is welcome, as I'm totally lost... :)

edit: adding some example code so that you can understand my situation:

foreach ($data_rows as $rows) {
  $row = explode(":", $rows);   // $row[0] info
                                // $row[1] new number1
                                // $row[2] new number2
  $info = $row[0];

  // Look up the existing row through the fulltext index.
  $query = $db->query("select * from processed where match (info) against ('\"$info\"' IN BOOLEAN MODE) limit 1");

  while ($line = $query->fetch_object()) {
    $data[$line->hash]['number1'] = $line->number1;
    $data[$line->hash]['number2'] = $line->number2;
    $id = $line->id;
  }

  if (is_array($data[$info])) {   // Check if we have this one in the database.
    // If both numbers are already correct, no need to update.
    if (($data[$info]['number1'] != $row[1]) || ($data[$info]['number2'] != $row[2])) {
      $db->query("UPDATE processed SET number1=$row[1], number2=$row[2], timestamp=unix_timestamp() where id=$id");
      print "updated - $info - $row[1] - $row[2]\n";
    }
  } else {
    print "$info not in database\n";
  }
}

schema:

CREATE TABLE `processed` (
  `id` int(30) NOT NULL AUTO_INCREMENT,
  `timestamp` int(14) DEFAULT NULL,
  `name` text,
  `category` int(2) DEFAULT '0',
  `subcat` int(2) DEFAULT '0',
  `number1` int(20) NOT NULL,
  `number2` int(20) NOT NULL,
  `comment` text,
  `hash` text,
  `url` text,
  PRIMARY KEY (`id`),
  FULLTEXT KEY `name` (`name`),
  FULLTEXT KEY `hash` (`hash`)
) ENGINE=MyISAM AUTO_INCREMENT=1328365 DEFAULT CHARSET=utf8;

edit again:

ANALYZE TABLE processed; did help a lot in improving the time of my UPDATEs (fresh index statistics!).

I will add my data to another table & do a join update anyway :)


You are performing 2M*2 UPDATE commands. That does take a while...

I would advise you to dump the file contents into a temp table and then run a single UPDATE command.

Update

Here is how you'd run a single joined UPDATE:

UPDATE processed
INNER JOIN DumpTable ON processed.id = DumpTable.id
SET number1 = DumpTable.value1, number2 = DumpTable.value2, timestamp = unix_timestamp()
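
For the loading step, here is a rough sketch of how the temp table could be created and filled in one pass - the table and column names follow the UPDATE above, while the file path and the colon-separated layout (with the id in the first field) are just assumptions based on the question:

-- Hypothetical staging table matching the joined UPDATE above.
CREATE TABLE DumpTable (
  id     int NOT NULL,
  value1 int NOT NULL,
  value2 int NOT NULL,
  PRIMARY KEY (id)
) ENGINE=MyISAM;

-- Bulk-load the whole file in one statement instead of issuing millions of queries from PHP.
LOAD DATA LOCAL INFILE '/path/to/data.txt'
INTO TABLE DumpTable
FIELDS TERMINATED BY ':'
LINES TERMINATED BY '\n'
(id, value1, value2);

Once the data is staged like this, the single joined UPDATE above replaces the millions of individual statements.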


Well, a) you should always be sanitizing your data:

sprintf("UPDATE processed 
         SET number1=%d, number2=%d, timestamp=unix_timestamp() 
         WHERE match (id) 
         AGAINST ('%d' $id IN BOOLEAN MODE) limit 1",
         mysql_real_escape_string($row[1]),
         mysql_real_escape_string($row[2]),
         mysql_real_escape_string($id)
);

Also, if you switch to InnoDB it may be slightly faster. More importantly, it is a better option for a lot of people because you do not lock the whole table you are working on for every UPDATE you do; you only lock the row which you are updating.

So it is most definitely something to think about; please read the following link: http://www.kavoir.com/2009/09/mysql-engines-innodb-vs-myisam-a-comparison-of-pros-and-cons.html
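
If you do try the switch, a minimal sketch of the conversion (keep in mind this rebuilds the whole table):

-- Converts the table in place; rebuilding ~1.3M rows can take a while.
-- Note: before MySQL 5.6, InnoDB had no FULLTEXT support, so the `name` and `hash`
-- keys (and the MATCH ... AGAINST lookups) would have to go before converting.
ALTER TABLE processed ENGINE=InnoDB;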


Have a look at:

http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html

750,000 qps might speed things up a bit.
