lucene-java-user mailing list archives

From Yossi Vainshtein <yossi.vainsht...@gmail.com>
Subject Lucene fuzzy and wildcard search, and scoring in AutomatonQuery
Date Wed, 18 Feb 2015 12:33:29 GMT
Hi all,

I'm using Apache Lucene and am currently trying to combine a fuzzy query with
a prefix (or wildcard) query to implement a kind of suggestion mechanism.

For example, if the query is "levy", a document containing "Levinshtein" should
also be returned.

As there seems to be no built-in query of this sort in Lucene, I searched for
solutions and found that this has been asked about before. I used the approach
suggested here
http://stackoverflow.com/questions/28565090/scoring-results-of-automatonquery
<http://stackoverflow.com/questions/2631206/lucene-query-bla-match-words-that-start-with-something-fuzzy-how>
by Robert Muir, which builds the query as a concatenation of two automata
(Levenshtein and wildcard).
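
To make the matching rule concrete, here is a minimal plain-Java sketch (not Lucene code, and not the linked answer's implementation). It accepts a term when some prefix of that term is within a given edit distance of the query, which is the set of strings a Levenshtein automaton followed by "any string" would accept. The class and method names are hypothetical, and the distance counts only insertions, deletions, and substitutions.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class FuzzyPrefixMatch {

    private FuzzyPrefixMatch() {}

    /**
     * Smallest Levenshtein distance between the query and any prefix of the
     * term (insertions, deletions, substitutions; no transpositions).
     */
    static int prefixEditDistance(String query, String term) {
        int n = query.length();
        int m = term.length();
        // prev[j] = distance between the first i-1 chars of query
        // and the first j chars of term.
        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= n; i++) {
            curr[0] = i;
            for (int j = 1; j <= m; j++) {
                int cost = query.charAt(i - 1) == term.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        // prev now holds the row for the whole query; the minimum over all
        // term prefixes is the cost of the best "fuzzy prefix" alignment.
        int best = prev[0];
        for (int j = 1; j <= m; j++) {
            best = Math.min(best, prev[j]);
        }
        return best;
    }

    /** Case-insensitive check: is some prefix of term within maxEdits of query? */
    static boolean matches(String query, String term, int maxEdits) {
        String q = query.toLowerCase(Locale.ROOT);
        String t = term.toLowerCase(Locale.ROOT);
        return prefixEditDistance(q, t) <= maxEdits;
    }
}
```

With one allowed edit, `matches("levy", "Levinshtein", 1)` holds because the prefix "lev" (or "levi") is one edit away from "levy".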

That works great indeed, but the problem now is that there is no scoring: all
results get a score of *1.0*. I really want "Levy" to be ranked higher than
"Levinshtein" in the previous example.
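
A minimal sketch of the kind of ranking described above, written in plain Java and independent of Lucene. It assumes the number of edits for each matched term is already known (for instance, computed while post-processing the hits), and it combines that with a length ratio so that shorter, closer matches come first. The `Candidate` holder, the `SuggestionRanker` class, and the scoring formula are all hypothetical choices for illustration, not something Lucene provides.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical re-ranking helper, for illustration only.
final class SuggestionRanker {

    /** A matched term plus the number of edits that were needed to match it. */
    static final class Candidate {
        final String term;
        final int edits;

        Candidate(String term, int edits) {
            this.term = term;
            this.edits = edits;
        }
    }

    private SuggestionRanker() {}

    /**
     * Assumed scoring formula: fewer edits and a term length closer to the
     * query length both raise the score. An exact match scores 1.0.
     */
    static double score(String query, Candidate c) {
        double editFactor = 1.0 / (1 + c.edits);
        int longer = Math.max(query.length(), c.term.length());
        if (longer == 0) {
            return editFactor; // both strings empty: avoid dividing by zero
        }
        int shorter = Math.min(query.length(), c.term.length());
        return editFactor * ((double) shorter / longer);
    }

    /** Returns the candidate terms ordered from highest to lowest score. */
    static List<String> rank(String query, List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((Candidate c) -> score(query, c))
                .reversed());
        List<String> terms = new ArrayList<>();
        for (Candidate c : sorted) {
            terms.add(c.term);
        }
        return terms;
    }
}
```

For the query "levy", a candidate "Levy" with zero edits scores 1.0, while "Levinshtein" with one edit scores well below that, so "Levy" is listed first.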

By the way, I tried using Lucene auto-suggestion in the form of
FuzzySuggester, but it's not feasible with large inputs: it holds all
suggestions in RAM and bloats the memory usage.

Is there another way of doing this? Or should I implement my own *Scorer*
or *Similarity*?


Thanks

Yossi
