Apache Lucene and text meaning

开发者 https://www.devze.com 2023-04-12 04:28 Source: web
I have a question about the search process in Lucene. I use this code to search:

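    // Open the on-disk index.
    Directory directory = FSDirectory.GetDirectory(@"c:\index");
    Analyzer analyzer = new StandardAnalyzer();

    // Parse queries against the "content" field, requiring all terms (AND).
    QueryParser qp = new QueryParser("content", analyzer);
    qp.SetDefaultOperator(QueryParser.Operator.AND);

    // searchString holds the user's query text.
    Query query = qp.Parse(searchString);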

In one document I've set a field to "I want to go shopping", and in another document I've set it to "I wanna go shopping".

Both sentences mean the same thing.

Is there a good way for Lucene to understand the meaning of sentences, or to normalize them in some way? For example, I could store the field as "I wanna /want to/ go shopping" and strip the comment from the results with a regular expression.


Lucene provides filters that normalize words and can even map similar words to one another.

PorterStemFilter -
Stemming reduces words to their roots.
For example, "wanted" and "wants" are both reduced to the root "want", so a search for any of those words matches the document.
However, "wanna" is not reduced to the root "want", so stemming alone may not work in this case.

SynonymFilter -
This filter maps similar words to one another, using mappings listed in a configuration file.
For example, "wanna" can be mapped to "want", so a search for either word should match the document.

You would need to add these filters to your analysis chain.
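To illustrate the idea behind synonym mapping, here is a minimal, library-independent sketch in C#. It does not use any Lucene classes. The `SynonymSketch` class, its `Normalize` method, and the synonym table are hypothetical names made up for this example, and the table maps "wanna" to the two tokens "want to" (an assumption chosen so that the two sentences from the question come out identical). In Lucene the analyzer would do this work, and the same analysis should be applied both when indexing and when parsing the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SynonymSketch
{
    // Hypothetical synonym table: each key is rewritten to the token
    // sequence on the right. This is illustration only, not a Lucene API.
    private static readonly Dictionary<string, string[]> Synonyms =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "wanna", new[] { "want", "to" } },
        };

    // Lowercase the text, split on whitespace and basic punctuation,
    // then replace any token found in the synonym table.
    public static List<string> Normalize(string text)
    {
        var tokens = new List<string>();
        char[] separators = { ' ', '\t', '.', ',', '!', '?' };
        foreach (string raw in text.ToLowerInvariant()
                     .Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] replacement;
            if (Synonyms.TryGetValue(raw, out replacement))
                tokens.AddRange(replacement);
            else
                tokens.Add(raw);
        }
        return tokens;
    }

    public static void Main()
    {
        List<string> a = Normalize("I want to go shopping");
        List<string> b = Normalize("I wanna go shopping");

        Console.WriteLine(string.Join(" ", a)); // i want to go shopping
        Console.WriteLine(string.Join(" ", b)); // i want to go shopping
        Console.WriteLine(a.SequenceEqual(b));  // True
    }
}
```

Because both sentences normalize to the same token sequence, a query built from either one would match a document indexed from the other. This also means the stored text can stay untouched, with no inline "/want to/" markers to strip out afterwards.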
