开发者

Remove duplicates and update IDs linked to nonduplicates

开发者 https://www.devze.com 2022-12-27 19:23 出处:网络
I have two tables, tableA and tableB, linked through a locationID. TableA has descri开发者_开发技巧ptive survey info with each record from a different time (ie unique), while TableB has purely locat

I have two tables, tableA and tableB, linked through a locationID.

TableA has descri开发者_开发技巧ptive survey info with each record from a different time (ie unique), while TableB has purely locational information. However, there are lots of duplicates in TableB, yet each has a unique locationID, which has an entry in TableA. I've found plenty posts about removing duplicates from TableB, but how can I update the locationIDs in TableA so they are linked to the unique locations in TableB once duplicates are removed...

Help much appreciated!


First of all you have to update TableA before removing the duplicates in TableB. If not, you are gonna lose information.

After that, make all the duplicates in table B equal in one field (let's say LocationDescription)

Then, select one LocationID of all them. Let´s say the MIN Location ID.

After that, update tableA with that MinID and then remove from TableB the duplicates in a really easy way (with a NOT in).

Here´s an example:

Asumming

TableA:                                 Table B

Location ID                             LocationId  LocationDescription
   1                                       1        Neuquen
   2                                       2        Cipolletti
   3                                       3        Neuquen
   4                                       4        Cipolletti
   5                                       5        Neuquen

UPDATE TableA
SET locationID=TABLEBAUX.UniqueID
FROM TableA 
INNER JOIN 
(
SELECT UniqueID, LocationID
FROM 
(SELECT MIN(LocationID) as UniqueID,LocationDescription
FROM TableB
GROUP BY LocationDescription) TEMP
INNER JOIN TableB
ON TEMP.LocationDescription=TABLEB.LocationDescription) TABLEBAUX
ON TableA.LocationID=TABLEBAUX.LocationID

DELETE FROM TableB 
WHERE LocationID NOT in (SELECT LocationID FROM TABLEA)

After this you get this:

TableA:                                 Table B

Location ID                             LocationId  LocationDescription
   1                                       1        Neuquen
   2                                       2        Cipolletti
   1                                       
   2                                       
   1                                       


When you find a duplicate, you know at least two locationIDs. Therefore, before you delete from TableB, you can update TableA:

UPDATE TableA
SET locationID = foundLocationID1
WHERE locationID = foundLocationID2

Alternatively, if you have a whole group you're about to delete at once:

UPDATE TableA
SET locationID = foundLocationID1
WHERE locationID IN (foundLocationID2, foundLocationID3, foundLocationID4)
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号