开发者

An efficient way to get the difference between two arrays of objects?

开发者 https://www.devze.com 2023-03-20 22:03 出处:网络
I have two arrays of objec开发者_如何学编程ts: var a = [{\'id\': 20},{\'id\': 15},{\'id\': 10},{\'id\': 17},{\'id\': 23}];

I have two arrays of objec开发者_如何学编程ts:

var a = [  {'id': 20},   {'id': 15},   {'id': 10},   {'id': 17},   {'id': 23}  ];

var b = [ {'id': 90},   {'id': 15},    {'id': 17},   {'id': 23}  ];  

I'd like to get objects which are in a, but not in b. Results from this example would be:

{'id': 20} and {'id': 10}.

Because the arrays could be large, I need an efficient way to do this.


// Make hashtable of ids in B
var bIds = {}
b.forEach(function(obj){
    bIds[obj.id] = obj;
});

// Return all elements in A, unless in B
return a.filter(function(obj){
    return !(obj.id in bIds);
});

very minor addendum: If the lists are very large and you wish to avoid the factor of 2 extra memory, you could store the objects in a hashmap in the first place instead of using lists, assuming the ids are unique: a = {20:{etc:...}, 15:{etc:...}, 10:{etc:...}, 17:{etc:...}, 23:{etc:...}}. I'd personally do this. Alternatively: Secondly, javascript sorts lists in-place so it doesn't use more memory. e.g. a.sort((x,y)=>x.id-y.id) Sorting would be worse than the above because it's O(N log(N)). But if you had to sort it anyway, there is an O(N) algorithm that involves two sorted lists: namely, you consider both lists together, and repeatedly take the leftmost (smallest) element from the lists (that is examine, then increment a pointer/bookmark from the list you took). This is just like merge sort but with a little bit more care to find identical items... and maybe pesky to code. Thirdly, if the lists are legacy code and you want to convert it to a hashmap without memory overhead, you can also do so element-by-element by repeatedly popping the elements off of the lists and into hashmaps.


With lodash 4.12.0 you can use _.differenceBy.

_.differenceBy(a, b, 'id');


A general way to do this would be:

  1. put all objects from b into a hashtable
  2. iterate over a, for each item check if it is in the hashtable

A lot of programming environments have set and/or HashSet implementations these days, which make it very simple to do this.

In special cases, other ways might be more efficient. If, for example, your elements were byte-sized values, and a and b both fairly big, then I would use a boolean array "flags" with 256 elements, initialize all to false. Then, for each element x of b, set flags[x] to true. Then iterate over a, and for each y in a, check if flags[y] is set.


If you not adverse to including a library use underscore.js it has a good intersection function http://documentcloud.github.com/underscore/

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号