lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: Changing Similarity without re-indexing (for example from default to BM25)
Date Wed, 19 Aug 2015 23:28:03 GMT
Hi Tom,

The computeNorm(FieldInvertState) method is the only place where a similarity is tied to the
indexing process.
If you want to switch between different similarities, they must share the same implementation
of that method. For example, subclasses of SimilarityBase can be swapped without re-indexing.
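
To make the point concrete, here is a toy model of that split. It is only a sketch: the class, the method names and the one-byte clamp encoding are all made up for illustration and are not Lucene's API or its real norm format. The idea it shows is that the norm is written once at index time, and any scorer that decodes it the same way can be swapped in at query time.

```java
// Hypothetical toy model, NOT Lucene's API: one index-time norm, two query-time scorers.
class SharedNormSketch {
    // Index-time step: the only similarity-dependent value written to the index.
    // Toy encoding (an assumption): clamp the field length into one unsigned byte.
    static byte computeNorm(int fieldLength) {
        return (byte) Math.min(fieldLength, 255);
    }

    // Both scorers below rely on this same decoding of the stored byte.
    static int decodeLength(byte norm) {
        return norm & 0xFF;
    }

    // Query-time scorer A: sqrt(tf) scaled by 1/sqrt(length).
    static double tfLengthNormScore(int tf, byte norm) {
        return Math.sqrt(tf) / Math.sqrt(decodeLength(norm));
    }

    // Query-time scorer B: the BM25 term-frequency component (k1 = 1.2, b = 0.75).
    static double bm25TfScore(int tf, byte norm, double avgLength) {
        double k1 = 1.2;
        double b = 0.75;
        double len = decodeLength(norm);
        return tf * (k1 + 1) / (tf + k1 * (1 - b + b * len / avgLength));
    }

    public static void main(String[] args) {
        byte stored = computeNorm(40); // written once, at index time
        // Two different ranking formulas read the same stored byte; no re-indexing needed.
        System.out.println("scorer A: " + tfLengthNormScore(3, stored));
        System.out.println("scorer B: " + bm25TfScore(3, stored, 50.0));
    }
}
```

If scorer B needed a norm computed differently from scorer A's, the bytes already in the index would be wrong for it, and that is the case where re-indexing is required.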

By the way, DefaultSimilarity and BM25 look compatible.

To save memory, the field length is not stored exactly: it is encoded into a single byte in
the norms (and decoded again at search time), at the expense of some precision.
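
A rough sketch of what that precision loss looks like follows. The assumptions: the norm is 1/sqrt(fieldLength) with no boost, packed into a byte with 3 mantissa bits and 5 exponent bits. The packing routine is written from memory to resemble Lucene's SmallFloat helper, so treat it as an illustration and check the Lucene source for the real thing. The class and method names here are mine.

```java
// Sketch of a lossy one-byte norm; modeled on (not copied from) Lucene 5.x behaviour.
class NormPrecisionSketch {
    // Pack a float into one byte: 3 mantissa bits, 5 exponent bits.
    static byte floatToByte(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);            // keep sign, exponent and top 3 mantissa bits
        if (small <= ((63 - 15) << 3)) {
            return (bits <= 0) ? (byte) 0 : (byte) 1; // zero/negative -> 0, underflow -> smallest
        }
        if (small >= ((63 - 15) << 3) + 0x100) {
            return -1;                            // overflow -> largest representable value
        }
        return (byte) (small - ((63 - 15) << 3));
    }

    // Inverse of floatToByte; the dropped mantissa bits come back as zeros.
    static float byteToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Index time: field length -> 1/sqrt(length) -> one byte.
    static byte encode(int fieldLength) {
        return floatToByte(1.0f / (float) Math.sqrt(fieldLength));
    }

    // Search time: one byte -> approximate field length.
    static float decodeLength(byte norm) {
        float f = byteToFloat(norm);
        return 1.0f / (f * f);
    }

    public static void main(String[] args) {
        for (int len : new int[] {1, 4, 100, 110, 1000}) {
            byte norm = encode(len);
            System.out.println(len + " -> byte " + (norm & 0xff)
                    + " -> decoded length " + decodeLength(norm));
        }
    }
}
```

Under this encoding, lengths 100 and 110 end up in the same byte, so at search time no similarity can tell those two documents apart by length. Short lengths such as 1 and 4 survive the round trip exactly.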

Ahmet


On Wednesday, August 19, 2015 7:40 PM, Tom Burton-West <tburtonw@umich.edu> wrote:
Hello all,

The last time I worked with changing Similarities was with Solr 4.1, and at
that time, it was possible to simply change the schema to specify the use
of a different Similarity without re-indexing.   This allowed me to
experiment with several different ranking algorithms without having to
re-index.

Currently the documentation states that doing this is theoretically
possible but not well defined:

"To change Similarity
<http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/similarities/Similarity.html>,
one must do so for both indexing and searching, and the changes must happen
before either of these actions take place. Although in theory there is
nothing stopping you from changing mid-stream, it just isn't well-defined
what is going to happen."

http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/similarities/package-summary.html#changingSimilarity

Has something changed between 4.1 and 5.2 that actually will prevent
changing Similarity without re-indexing from working, or is this just a
warning in case at some future point someone contributes code so that a
particular similarity takes advantage of a different index format?

Tom
