lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: NOT_ANALYSED_NO_NORMS should get max field length boost
Date Tue, 12 Jan 2010 13:25:14 GMT
Are you saying that you index the *same* field differently in different
documents? Or do you index the field in question in the same way in
all documents?

I ask because I'm having a hard time following the logic here. A
field that is NOT analyzed is indexed as a single term, so it is an
all-or-none match, i.e. looking for "paul" in an unanalyzed field
containing "paul taylor" will not match, so boosting is pretty
irrelevant on that field.
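
To make the all-or-none point concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names are hypothetical, and the "analyzed" path assumes a deliberately simplified analyzer that only lowercases and splits on whitespace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Lucene.
public class UnanalyzedMatchSketch {

    // Not analyzed: the entire field value is indexed as one single term.
    static Set<String> indexNotAnalyzed(String value) {
        return Collections.singleton(value);
    }

    // Analyzed (simplified assumption): lowercase, then split on whitespace.
    static Set<String> indexAnalyzed(String value) {
        String normalized = value.toLowerCase(Locale.ROOT).trim();
        return new HashSet<>(Arrays.asList(normalized.split("\\s+")));
    }

    // A term query matches only if the exact term is present in the index.
    static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        Set<String> raw = indexNotAnalyzed("paul taylor");
        Set<String> tokens = indexAnalyzed("paul taylor");

        System.out.println(matches(raw, "paul"));        // false: only the whole value is a term
        System.out.println(matches(raw, "paul taylor")); // true: exact whole-value match
        System.out.println(matches(tokens, "paul"));     // true: "paul" is one of the tokens
    }
}
```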

If you're analyzing the same field for some documents and not
analyzing it for other documents, I don't know what happens, but
it's probably bad.

Could you boost your *other* fields by less than one to achieve
the same end?
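
As a rough sketch of the arithmetic behind this, assuming the default length normalization of 1/sqrt(numTerms), no index-time document or field boosts, and a norm of 1.0 when norms are omitted: a single-token field comes out at 1.0 either way, while longer analyzed fields fall below 1.0. The class and method names are hypothetical and this is not Lucene code; real norms are also stored with reduced precision, so actual values are approximate.

```java
// Hypothetical illustration only; none of these names come from Lucene.
public class LengthNormSketch {

    // Assumed default length normalization: 1 / sqrt(number of terms in the field).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Effective norm with all boosts left at 1.0.
    // Assumption: a field indexed without norms is scored as if its norm were 1.0.
    static float effectiveNorm(int numTerms, boolean omitNorms) {
        return omitNorms ? 1.0f : lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // Single-token field: same value with or without norms.
        System.out.println(effectiveNorm(1, true));   // 1.0
        System.out.println(effectiveNorm(1, false));  // 1.0

        // Multi-token analyzed field with norms: below 1.0.
        System.out.println(effectiveNorm(4, false));  // 0.5
    }
}
```

Under those assumptions, omitting norms on a single-token field costs nothing relative to multi-token fields that keep their norms.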

If none of this is relevant, could you explain your use case a little
more?

HTH
Erick

On Tue, Jan 12, 2010 at 7:53 AM, Paul Taylor <paul_t100@fastmail.fm> wrote:

> Lucene in Action says you can possibly use NOT_ANALYZED_NO_NORMS when
> indexing fields that aren't tokenized, but later says norms are used to boost
> fields with fewer terms (or a single term), so matches on these single-term
> fields would miss out on this boost. Is there a way to use
> NOT_ANALYZED_NO_NORMS on these fields that will mean they end up with the best
> boost (1.0 as default), and then documents that are analysed with norms
> receive a lower boost (<1.0) if they contain more than one token?
>
> I'm not using Document or Field boosting, so it seems a bit silly for me to
> store all these norms just to say this field contains a single token and
> therefore should get an additional boost.
>
> Perhaps I'm misunderstanding this, and this would work as required.
>
>
> thanks Paul
