lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: synonyms vs replacements
Date Mon, 29 Aug 2011 12:00:24 GMT
See here about the "multi word" problem:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

As for the rest, it's a tradeoff (surprise, surprise, surprise <G>).

You're right, expanding at index time leads to a somewhat
larger index, but less complex queries. And if you change
your synonyms file, you need to re-index from scratch.

Expanding at query time lets you keep your synonyms up to
date. But the queries are more complex and somewhat
slower...
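
To make the tradeoff concrete, here is a rough sketch in plain Python
(not Solr or Lucene code). It uses a toy inverted index and the nunchuck
example from the question below; the names SYNONYMS, build_index and
search are made up for illustration only.

```python
from collections import defaultdict

# Hypothetical synonym table, based on the example in the quoted question.
SYNONYMS = {
    "nunchuck": {"nunchuck", "nunchuk"},
    "nunchuk": {"nunchuck", "nunchuk"},
}


def expand(tokens, synonyms):
    """Return every token together with all of its synonyms."""
    out = set()
    for tok in tokens:
        out |= synonyms.get(tok, {tok})
    return out


def build_index(docs, synonyms=None):
    """Build a toy inverted index (term -> doc ids), expanding if asked to."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        terms = expand(tokens, synonyms) if synonyms else set(tokens)
        for term in terms:
            index[term].add(doc_id)
    return index


def search(index, query, synonyms=None):
    """OR-search; returns (matching doc ids, number of terms looked up)."""
    tokens = query.lower().split()
    terms = expand(tokens, synonyms) if synonyms else set(tokens)
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits, len(terms)


docs = {1: "wii nunchuk controller", 2: "wooden nunchuck pair"}

index_time = build_index(docs, SYNONYMS)  # synonyms stored in the index
query_time = build_index(docs)            # plain index, expand when querying

# Index-time expansion: more postings, but a one-term query finds both docs.
print(sum(len(v) for v in index_time.values()), search(index_time, "nunchuk"))
# Query-time expansion: fewer postings, but the query grows to two terms.
print(sum(len(v) for v in query_time.values()),
      search(query_time, "nunchuk", SYNONYMS))
```

Both approaches return the same two documents here. What differs is
where the extra work lands: in the index (8 postings instead of 6) or
in the query (two terms looked up instead of one).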

Which is "better" depends (tm), so pick your poison. One
strategy is to expand at index time, and *also* expand
at query time, but with a different synonym file. The idea
is that your query-time synonym file is the set of terms that
you want to add to your index-time expansion next
time you can re-index from scratch. Then periodically you
merge your query-time syns into your index-time syns, re-index
from scratch and empty your query-time syns. Rinse, repeat.
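
The periodic merge step could be scripted along these lines. This is
only a sketch: it assumes both files contain nothing but comma-separated
equivalence lines (no "=>" mappings), and the function names and sample
entries (tv, television, nunchaku) are hypothetical.

```python
def parse_groups(lines):
    """Parse comma-separated synonym lines, skipping blanks and # comments."""
    groups = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        terms = {t.strip().lower() for t in line.split(",") if t.strip()}
        groups.append(terms)
    return groups


def merge_groups(groups):
    """Union groups that share a term, so each term ends up in one group."""
    merged = []
    for group in groups:
        group = set(group)
        rest = []
        for existing in merged:
            if existing & group:
                group |= existing  # overlapping groups collapse into one
            else:
                rest.append(existing)
        merged = rest + [group]
    return merged


def roll_over(index_lines, query_lines):
    """Fold query-time synonyms into the index-time list.

    Returns (new index-time lines, new and now empty query-time lines).
    A full re-index is still required after writing the new index-time file.
    """
    merged = merge_groups(parse_groups(index_lines) + parse_groups(query_lines))
    new_index_lines = sorted(", ".join(sorted(g)) for g in merged)
    return new_index_lines, []


index_syns = ["nunchuck, nunchuk"]
query_syns = ["# pending additions", "tv, television", "nunchuk, nunchaku"]

new_index_syns, new_query_syns = roll_over(index_syns, query_syns)
print(new_index_syns)
print(new_query_syns)
```

Overlapping groups are unioned rather than kept as separate lines, so a
term added at query time joins its existing index-time group. Whether
that is what you want depends on your synonym list.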

So, there isn't really a "right" answer. Personally I prefer to
expand at index time, but that's largely a preference.

Best
Erick

On Fri, Aug 26, 2011 at 4:52 PM, Robert Petersen <robertpe@buy.com> wrote:
> Hello all,
>
> Which is better?   Say you add an index time synonym between nunchuck
> and nunchuk and then both words will be in the document and both will be
> searchable.   I can get the same exact behavior by putting an index time
> replacement of nunchuck => nunchuk and a search time replacement of the
> same.
>
> I figured the replacement strategy keeps the index size slightly
> smaller by only having the one term in the index, but the synonym
> strategy only requires you update the master, not the slave farm, and
> requires slightly less work for the searchers during a user query.  Are
> there any other considerations I should be aware of?
>
> Thanks
>
> BTW nunchuk is the correct spelling.  :)
