Filtering the results of a sorted query in Lucene.NET

Developer https://www.devze.com 2023-03-14 03:08 Source: Internet
I'm using Lucene.NET, which is currently up to date with Lucene 2.9. I'm trying to implement a kind of select distinct, but without the need to drill down into any groups. I know that Lucene 3.2 has a faceted search that may solve this, but I don't have the time to port it to 2.9 yet.

I figure that, in any event, when you perform a paged query with a sort operator, Lucene has to find all the documents that match the query, sort them, then take the top N results, where N is the page size. I'd like to build something that is also applied after the sorted query has completed, but that takes the top N unique results and returns them. I'm thinking of using a HashSet and one of the indexed fields to determine uniqueness. For performance reasons, I'd rather find a way to extend something in Lucene than try to do this once the results have already been returned.

Custom filters seem to run before the main query is even applied, and custom collectors run before sorting is applied, unless you are sorting by Lucene's document id. So what is the best approach to this problem? A pointer to the right component to extend will get you the accepted answer on this one; an example implementation will most definitely get it. Thanks in advance.


I'd run the search without sorting and, in a custom collector, collect the results into a sorted list of size N based on "uniqueness".
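
A minimal sketch of that idea follows, written in plain Java with no Lucene dependency so it stays self-contained. `TopNUniqueHits`, `Hit`, `collect` and `topDocIds` are hypothetical names, not Lucene or Lucene.NET API, and the sketch assumes the uniqueness key and the sort value are both available as strings for each hit. In Lucene 2.9 this logic would presumably sit inside a `Collector` subclass whose `collect(int doc)` looks up those two values for the document (for instance through `FieldCache`); that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical helper (not Lucene API): keeps the best hit per unique key, bounded to n keys. */
final class TopNUniqueHits {

    /** One collected hit: document id, de-duplication key, and the value to sort by. */
    static final class Hit implements Comparable<Hit> {
        final int docId;
        final String key;
        final String sortValue;

        Hit(int docId, String key, String sortValue) {
            this.docId = docId;
            this.key = key;
            this.sortValue = sortValue;
        }

        @Override
        public int compareTo(Hit other) {
            // Ascending by sort value; ties are broken by document id.
            int bySort = sortValue.compareTo(other.sortValue);
            return bySort != 0 ? bySort : Integer.compare(docId, other.docId);
        }
    }

    private final int n;
    private final Map<String, Hit> bestByKey = new HashMap<>();
    private final TreeSet<Hit> sorted = new TreeSet<>();

    TopNUniqueHits(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    /** Called once per matching document, in any order. */
    void collect(int docId, String key, String sortValue) {
        Hit candidate = new Hit(docId, key, sortValue);
        Hit existing = bestByKey.get(key);
        if (existing != null) {
            // Key already held: keep only the better of the two hits.
            if (candidate.compareTo(existing) < 0) {
                sorted.remove(existing);
                sorted.add(candidate);
                bestByKey.put(key, candidate);
            }
            return;
        }
        if (sorted.size() < n) {
            sorted.add(candidate);
            bestByKey.put(key, candidate);
        } else if (candidate.compareTo(sorted.last()) < 0) {
            // Full: evict the current worst hit to make room for a better new key.
            Hit evicted = sorted.pollLast();
            bestByKey.remove(evicted.key);
            sorted.add(candidate);
            bestByKey.put(key, candidate);
        }
    }

    /** Document ids of the retained hits, in sort order. */
    List<Integer> topDocIds() {
        List<Integer> ids = new ArrayList<>();
        for (Hit hit : sorted) {
            ids.add(hit.docId);
        }
        return ids;
    }
}
```

Because the structure never holds more than N hits, memory stays bounded by the page size rather than by the number of matching documents. Note that this returns only the first page; deeper pages would need N set to the end offset of the requested page.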
