lucene-java-user mailing list archives

From Chris.Hill <ch...@tooled-up.com>
Subject Synonyms with multiple alternatives
Date Wed, 15 Nov 2017 13:43:31 GMT
I am using Lucene.NET 4.8 and cannot find a decent working example that addresses my
problem.

In our source data we have lots of similar items that can be described in different ways, for
example "lawnmower", "lawn mower" and "grass cutter".

We have no control over how people choose to search for such items; they will just enter
the term most familiar to them.

What we need is to return all items that contain any of those strings or phrases whenever any
one of them is used as the search term. So searching for "lawnmower" could return:

XYZ Electric Lawnmower
ABC Rotary Lawn mower
123 Hover Grass Cutter

Likewise, searching for either of the other terms ("lawn mower" or "grass cutter") should
return the same matches as above.

I am looking to implement the SynonymFilter, but I can't grasp how to use it to achieve
what we want. I have had some success mapping one term to another, but I can't work out how
to extend this to three or more terms in a "group" of similar terms.

Will I always have to add every one of the following combinations to my SynonymMap:
a > b, b > a, a > c, c > a, b > c and c > b?
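
As a rough illustration of the combinations listed above, the following plain-Java sketch enumerates every ordered pair for one group of equivalent terms. It does not call any Lucene or Lucene.NET API; the class name `SynonymPairs` and the method `pairs` are hypothetical, and the sketch only shows how the number of explicit one-way mappings grows (n * (n - 1) for a group of n terms) if each direction has to be added by hand.

```java
import java.util.ArrayList;
import java.util.List;

class SynonymPairs {
    // Hypothetical helper (not part of Lucene): returns every ordered
    // pair "x > y" with x != y for one group of equivalent terms.
    static List<String> pairs(List<String> group) {
        List<String> out = new ArrayList<>();
        for (String from : group) {
            for (String to : group) {
                if (!from.equals(to)) {
                    out.add(from + " > " + to);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The three example terms from this message.
        List<String> group = List.of("lawnmower", "lawn mower", "grass cutter");
        for (String pair : pairs(group)) {
            System.out.println(pair);
        }
    }
}
```

For the three-term group this produces the six mappings written out above; a four-term group would need twelve.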

Do I need to do this both when building the index and on the incoming query? My source data
could contain different variations of the term, and I cannot predict how people will
search for it. Or is it good enough to process only the query so that it looks for all the
alternate terms?
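
To make the query-only option concrete, here is a minimal plain-Java sketch of what "process only the query" could mean: each term is looked up in its group, and the query is expanded to every alternative in that group. It is an assumption-laden illustration, not Lucene code; `QueryExpansion`, `indexGroups` and `expand` are hypothetical names, and the sketch says nothing about how the indexed text is tokenized, which is the part still in question.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class QueryExpansion {
    // Hypothetical helper: maps each lower-cased term to its whole group.
    static Map<String, List<String>> indexGroups(List<List<String>> groups) {
        Map<String, List<String>> byTerm = new HashMap<>();
        for (List<String> group : groups) {
            for (String term : group) {
                byTerm.put(term.toLowerCase(Locale.ROOT), group);
            }
        }
        return byTerm;
    }

    // Returns all alternatives for the query, or just the query itself
    // when it does not belong to any known group.
    static List<String> expand(String query, Map<String, List<String>> byTerm) {
        return byTerm.getOrDefault(query.toLowerCase(Locale.ROOT), List.of(query));
    }

    public static void main(String[] args) {
        Map<String, List<String>> byTerm = indexGroups(
            List.of(List.of("lawnmower", "lawn mower", "grass cutter")));
        // Each returned alternative would then be searched for as its own term or phrase.
        System.out.println(expand("Grass Cutter", byTerm));
    }
}
```

With this shape, each group is stored once rather than as a set of pairwise mappings; whether the expanded phrases then match the indexed items depends on the analyzer used at index time.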

Do I retain the original value in the map when adding the synonym? I can't "see" what is being
created, so I don't know what is going on under the hood or how to work out the best approach.

Thanks
