Proximity matrix in Python

Developer (开发者) https://www.devze.com 2023-02-18 04:33 Source: web
What is the best way to compute the distance/proximity matrix for very large sparse vectors? For example, suppose you are given the following design matrix, where each row is a 68771-dimensional sparse vector.


designMatrix <5830x68771 sparse matrix of type '' with 1229041 stored elements in Compressed Sparse Row format>


Have you tried the routines in scipy.spatial.distance?

http://docs.scipy.org/doc/scipy/reference/spatial.distance.html
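
As a rough illustration of that suggestion, the sketch below runs `pdist` and `squareform` from scipy.spatial.distance on a small random sparse matrix. The matrix size, the Euclidean metric, and the variable names are assumptions chosen for the example, not details from the question. `pdist` works on dense arrays, so the matrix is densified first; for the 5830x68771 matrix in the question, that dense array would take about 3.2 GB as float64.

```python
import numpy as np
from scipy import sparse
from scipy.spatial.distance import pdist, squareform

# Small random stand-in for designMatrix (sizes are made up for the example).
X = sparse.random(6, 50, density=0.1, format="csr", random_state=0)

# pdist expects a dense 2-D array, so convert the sparse matrix first.
condensed = pdist(X.toarray(), metric="euclidean")

# squareform expands the condensed vector into a symmetric n x n matrix.
D = squareform(condensed)
print(D.shape)
```

`pdist` returns only the upper triangle as a flat vector of n*(n-1)/2 entries, which halves the memory needed when the full square matrix is not required.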

If these routines force you to convert to a dense representation, then you may be better off rolling your own, depending on the density of nonzero elements. You could squeeze out the zeros while retaining a map between the new and original indices, calculate the pairwise distances on the remaining nonzero elements, and then use the map to translate the results back to the original indices.
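
One possible reading of "squeezing out the zeros" is to drop the columns that are zero in every row and keep an index array that maps the reduced columns back to the original ones. The sketch below does that under some assumptions: the helper `squeeze_zero_columns` is hypothetical, the test matrix is random, and the metric is Euclidean, for which all-zero columns contribute nothing to the distance. Other metrics (for example ones normalized by the number of dimensions) may not be invariant to this step.

```python
import numpy as np
from scipy import sparse
from scipy.spatial.distance import pdist, squareform


def squeeze_zero_columns(X):
    """Drop columns with no nonzero entry (hypothetical helper).

    Returns the reduced CSR matrix and `kept`, where kept[j] is the
    original column index of reduced column j.
    """
    X = sparse.csr_matrix(X, copy=True)
    X.eliminate_zeros()  # ignore explicitly stored zeros
    kept = np.flatnonzero(X.getnnz(axis=0))
    return X[:, kept], kept


# Small random stand-in for designMatrix (sizes are made up for the example).
X = sparse.random(6, 50, density=0.05, format="csr", random_state=0)

X_small, kept = squeeze_zero_columns(X)

# Densify only the reduced matrix, then compute pairwise distances.
D = squareform(pdist(X_small.toarray(), metric="euclidean"))

# Map a reduced column index back to the original column index.
original_index_of_first_kept_column = kept[0]
```

How much this saves depends on how many columns are entirely empty; if nearly every column has at least one nonzero entry, the reduced matrix is almost as wide as the original.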
