lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <>
Subject Re: short documents = help me tweak Similarity??
Date Thu, 05 Apr 2007 21:53:50 GMT

: The problem comes when your float value is encoded into that 8 bit
: field norm, the 3 length and 4 length both become the same 8 bit
: value.  Call Similarity.encodeNorm on the values you calculate for the
: different numbers of terms and make sure they return different byte
: values.

bingo.  You can't change encodeNorm and decodeNorm (because they are
static -- they need to work at query time regardless of what Similarity
was used at index time) but you can change the function of your length
norm to make small deviations in short lengths significant enough to
register even with the encoding.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message