lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Index result percentile variation based on Index time Norm and No-Norm
Date Sun, 19 Jun 2011 17:39:28 GMT
On Sun, Jun 19, 2011 at 5:39 PM, Saurabh Gokhale
<saurabhgokhale@gmail.com> wrote:
> Yes, it makes sense. So in the case of NO_NORMS, I suppose all the fields,
> small or large, are treated the same instead of being weighted by their size
> and implicit boost factor.

Well, yes: they are scored with TF / IDF only. The additional length
normalization and the index-time boost are omitted.
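
To make the effect concrete, here is a minimal sketch in plain Java (no Lucene
dependency) of the length normalization factor 1.0 / Math.sqrt(numTerms). The
class and method names are hypothetical. The encode/decode pair is an
assumption: it approximates, from memory, how Lucene 3.x squeezes the norm into
a single byte, so treat it as an illustration rather than the library's code.
The term counts are assumptions too, since the attached program is not shown:
"My next book" is taken as 3 terms, and "this first text" as 2 terms on the
guess that the analyzer drops "this" as a stop word.

```java
// Sketch only: approximates the default length norm of Lucene 3.x.
class LengthNormSketch {

    // Raw norm: boost * 1.0 / Math.sqrt(numTerms), as described in the thread.
    static float rawNorm(int numTerms, float boost) {
        return boost * (float) (1.0 / Math.sqrt(numTerms));
    }

    // Assumed lossy one-byte encoding of the norm (precision is lost here).
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;          // keep sign, exponent, top mantissa bits
        int zero = (63 - 15) << 3;       // offset of the smallest encodable value
        if (small <= zero) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;  // zero/negative, or underflow
        }
        if (small >= zero + 0x100) {
            return (byte) -1;            // overflow: clamp to the largest value
        }
        return (byte) (small - zero);
    }

    // Inverse of encode: expand the byte back into a float.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Norm as it would be read back at search time, with a boost of 1.0.
    static float storedNorm(int numTerms) {
        return decode(encode(rawNorm(numTerms, 1.0f)));
    }

    public static void main(String[] args) {
        for (int numTerms = 2; numTerms <= 3; numTerms++) {
            System.out.println(numTerms + " terms: raw=" + rawNorm(numTerms, 1.0f)
                    + " stored=" + storedNorm(numTerms));
        }
        // Ratios of the scores reported below (with norms / without norms).
        System.out.println("author2 ratio: " + (19.738173 / 39.476345));
        System.out.println("author3 ratio: " + (6.168179 / 9.869086));
    }
}
```

Under these assumptions the stored norms come out as 0.625 for 2 terms and 0.5
for 3 terms, which matches the ratios between the two sets of scores in the
output quoted below. That would be consistent with the norm being the only
factor that differs between the two indexes.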

Simon
>
> Thanks for the explanation
>
> Saurabh
>
> On Sun, Jun 19, 2011 at 9:26 AM, Simon Willnauer <
> simon.willnauer@googlemail.com> wrote:
>
>> If you use norms, Lucene takes the boost of the field / document and
>> multiplies it by a length normalization factor, 1.0 /
>> Math.sqrt(numTerms), so your scores should be different. Does that
>> explain what you are seeing?
>>
>> Simon
>>
>> On Sun, Jun 19, 2011 at 7:06 AM, Saurabh Gokhale
>> <saurabhgokhale@gmail.com> wrote:
>> > Hi All,
>> > I have a question regarding the index-time parameters
>> > Index.ANALYZED_NO_NORMS and Index.ANALYZED.
>> > As per the definition, ANALYZED_NO_NORMS is no different from the ANALYZED
>> > option except that it does not store norm (boost) information in the
>> > index.
>> > So if I create 2 indexes, 1 with the ANALYZED option and another
>> > with the ANALYZED_NO_NORMS option, then for the same indexed document and
>> > the same search query both should bring back the same result with the same
>> > matching percentile.
>> > But the percentile for the index created using NO_NORMS is higher than the
>> > one created with norms. Why is that so?
>> > Attached is the Java example with 2 indexes created and the same document
>> > searched using BooleanQuery.
>> > The result indicates that though the matched documents are the same, their
>> > percentiles differ. Can someone please help me understand norms better? Is
>> > it because in the index with norms, Lucene adds some boost for index fields?
>> >
>> >
>> > Result of running the program is:
>> > --------------------------------------------------------------------------------------------
>> > Adding Doc to NORM Index :: author : author1
>> > ---------------------------------------
>> > Adding Doc to NO NORM Index :: author : author1
>> > =======================================
>> > Adding Doc to NORM Index :: author : author2
>> > ---------------------------------------
>> > Adding Doc to NO NORM Index :: author : author2
>> > =======================================
>> > Adding Doc to NORM Index :: author : author3
>> > ---------------------------------------
>> > Adding Doc to NO NORM Index :: author : author3
>> > =======================================
>> > Adding Doc to NORM Index :: author : author4
>> > ---------------------------------------
>> > Adding Doc to NO NORM Index :: author : author4
>> > =======================================
>> > Search for the book with Author: author1
>> > NORM:: WITH NORM  Query :: (author:author1) (subject:book subject:first subject:my) -isbn:123
>> >  Match: 19.738173%  || Doc Author: author2 || Doc subject: My next book || Doc ISBN: 333
>> >  Match: 6.168179%  || Doc Author: author3 || Doc subject: this first text || Doc ISBN: 444
>> > Search for the book with Author: author1
>> > NORM:: WITHOUT NORM  Query :: (author:author1) (subject:book subject:first subject:my) -isbn:123
>> >  Match: 39.476345%  || Doc Author: author2 || Doc subject: My next book || Doc ISBN: 333
>> >  Match: 9.869086%  || Doc Author: author3 || Doc subject: this first text || Doc ISBN: 444
>> >
>> >
>> > Thanks
>> > Saurabh
>> >
>>
>
