lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: product based term combination for BooleanQuery?
Date Wed, 04 Jul 2007 05:26:57 GMT

(side note: if you are going to try and obfuscate your field names when
sending explain output so we don't know you are using wikipedia data (not
that we care), please at least be consistent about it so the final
explanations actually make sense -- it will save everyone a lot of confusion
and help us help you)

the biggest factor in your scores seems to be the fieldNorms for your
name, title and alias fields ... they are so high, that tf and idf are
pretty much irrelevant.

By the looks of it, when you were indexing your docs, you used a
consistent field boost per field on every instance of that field for every
document ... this is really not a use case where index time field (or
document) boosts make sense.  In my opinion the number one thing you can
do to improve your relevancy right now is to stop using index time
boosts and use query boosts instead.
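
To make the reasoning concrete, here is a rough sketch in plain Java (no Lucene dependency) of why a constant index time boost buys you nothing over a query boost. It assumes DefaultSimilarity-style factors (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)) and deliberately ignores queryNorm, coord, and the lossy one-byte encoding of norms. The class name FieldNormSketch and all of the numbers are made up for illustration, they are not taken from the explain output in this thread.

```java
public class FieldNormSketch {

    // Term frequency factor, DefaultSimilarity style.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency factor, DefaultSimilarity style.
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    // Length normalization: shorter fields get a larger norm.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified per-term score. The stored fieldNorm is modeled as
    // indexBoost * lengthNorm; queryNorm, coord and norm byte encoding
    // are left out on purpose (assumption of this sketch).
    static double score(int freq, int docFreq, int numDocs, int numTerms,
                        double indexBoost, double queryBoost) {
        double i = idf(docFreq, numDocs);
        double fieldNorm = indexBoost * lengthNorm(numTerms);
        return tf(freq) * i * i * queryBoost * fieldNorm;
    }

    public static void main(String[] args) {
        // Same hypothetical term and document, boost of 10 applied two ways.
        double boostedAtIndexTime = score(2, 10, 1000, 4, 10.0, 1.0);
        double boostedAtQueryTime = score(2, 10, 1000, 4, 1.0, 10.0);
        System.out.println("index time boost: " + boostedAtIndexTime);
        System.out.println("query time boost: " + boostedAtQueryTime);

        // A very large index time boost swamps differences in tf and idf
        // when compared against an unboosted field.
        double hugeBoostWeakMatch = score(1, 500, 1000, 4, 1000.0, 1.0);
        double noBoostStrongMatch = score(9, 1, 1000, 4, 1.0, 1.0);
        System.out.println("huge boost, weak match:  " + hugeBoostWeakMatch);
        System.out.println("no boost, strong match:  " + noBoostStrongMatch);
    }
}
```

In this simplified model the two ways of applying the boost give the same number, but the query boost can be changed per query while the index time boost is baked into the norms until you reindex or rewrite them.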

If you don't want to reindex completely, the LengthNormModifier class (in
the misc contrib) can update all of your norms in place and throw away
any index time boosts you had.


-Hoss



