lucene-java-user mailing list archives

From Mek <meki...@gmail.com>
Subject Re: Very high fieldNorm for a field resulting in bad results
Date Wed, 27 Sep 2006 02:44:48 GMT
Thanks a lot, Chris, for the detailed and patient response.


>
> The value of the field norm for any field named "A" is typically the
> lengthNorm of the field, times the document boost, times the field boost
> for *each* Field instance added to the document with the name "A".
> (lengthNorm is by default 1/sqrt(num of terms))

That explains the very high value for the fieldNorm. The boost was
effectively raised to the power of the number of values in the field
(boost_value ^ number_of_values).
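
To make the arithmetic concrete, here is a minimal sketch of the product
described in the quoted explanation. It is a toy model and not Lucene's
own code: the class and method names are made up for illustration, and it
ignores the lossy single-byte encoding that the index applies when it
stores the norm.

```java
// Toy model of the fieldNorm product described above; NOT Lucene's actual code.
// Class and method names are hypothetical and exist only for illustration.
final class FieldNormSketch {

    // Default length normalization as quoted: 1 / sqrt(number of terms).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // fieldNorm = lengthNorm * docBoost * (boost of each Field instance, multiplied together).
    static double fieldNorm(int numTerms, double docBoost, double[] fieldBoosts) {
        double norm = lengthNorm(numTerms) * docBoost;
        for (double boost : fieldBoosts) {
            norm *= boost; // one factor per Field instance sharing the same name
        }
        return norm;
    }
}
```

With 4 terms in total and a document boost of 1.0, a single Field instance
boosted by 2.0 gives 0.5 * 2.0 = 1.0, while four instances each boosted by
2.0 give 0.5 * 2.0^4 = 8.0.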

A couple more questions:

1. Can I do away with index-time boosting for fields and tune
query-time boosts instead? I understand that document-level boosting is
very useful at index time.
But for fields, the index-time boost and the query-time boost are both
multipliers in the score, so would it be safe to say that I can replace the
index-time boost with a query-time boost? That would give me a lot of
freedom to test different values without re-indexing, which takes me
about 6 hours.

2. While searching the archive I read a post of yours saying
it is possible to give exact matches a much higher weight by indexing
START and END tokens,
from: http://www.nabble.com/What-are-norms--tf1919250.html#a5335856
"it is possible to score exact matches on (tokenized) fields very high
without using lengthNorm by indexing START and END tokens for the field as
well, and then including them in your sloppy phrase queries -- the
"tighter" match will score highest."

Can you please elaborate on this?
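
For reference, one possible reading of the quoted suggestion, sketched as
a toy model rather than real Lucene code. It assumes each query term
occurs at most once in the field, measures the phrase distance as the
spread of (document position - query position) over the query terms, and
uses 1 / (distance + 1) as the sloppy-phrase frequency, which is my
understanding of the default Similarity. All class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of sloppy phrase matching with START/END marker tokens.
// NOT Lucene's implementation; names and simplifications are assumptions.
final class SloppyPhraseSketch {

    // Lowercases and splits the text, then wraps it in START and END markers.
    // The markers stay uppercase so they cannot collide with lowercased text tokens.
    static List<String> withMarkers(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add("START");
        for (String t : text.trim().toLowerCase().split("\\s+")) {
            tokens.add(t);
        }
        tokens.add("END");
        return tokens;
    }

    // Distance between the query phrase and its occurrence in the document:
    // 0 means the terms are adjacent and in order. Returns -1 if a term is missing.
    // Simplification: only the first occurrence of each term is considered.
    static int distance(List<String> docTokens, List<String> queryTokens) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int q = 0; q < queryTokens.size(); q++) {
            int pos = docTokens.indexOf(queryTokens.get(q));
            if (pos < 0) {
                return -1;
            }
            int shifted = pos - q; // align the term with the start of the phrase
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // Assumed sloppy frequency: tighter matches contribute more.
    static double sloppyFreq(int distance) {
        return 1.0 / (distance + 1);
    }
}
```

With the markers, the query "START apple pie END" has distance 0 against
a field containing only "apple pie" (frequency 1.0) but distance 2 against
"hot apple pie recipe" (frequency 1/3). Without the markers, the plain
phrase "apple pie" has distance 0 in both fields, so the exact match gets
no advantage from the phrase query alone.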

Thanks a ton for the response,
mekin

