lucene-dev mailing list archives

From "Karl Wettin (JIRA)" <>
Subject [jira] Commented: (LUCENE-1260) Norm codec strategy in Similarity
Date Wed, 09 Apr 2008 18:25:24 GMT


Karl Wettin commented on LUCENE-1260:

As long as the norm remains a fixed size (1 byte) then it doesn't really matter whether it's
tied to the Similarity or to the store itself - it would be nice if the Index could tell you which
normDecoder to use, but it's not any more unreasonable to expect the application to keep track
of this (if it's not the default encoding) since applications already have to keep track of
things like which Analyzer is "compatible" with querying this index.

If we want norms to be more flexible, so that apps can pick not only the encoding but also
the size... then things get more interesting, but it's still feasible to say "if you customize
this, you have to make your reading apps and your writing apps smart enough to know about
your customization."

I like the idea of an index that is completely self-aware of norm encoding, what payloads
mean, etc.

I also want to move it to the instance scope so I can have multiple indices with unique norm
span/resolutions created from the same classloader.
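To illustrate what instance scope would buy, here is a minimal sketch of a codec whose span and resolution are constructor arguments rather than static state. The class name LinearNormCodec and its methods are hypothetical and not part of the Lucene API; the linear bucketing is just one possible strategy, and the 100f-250f in 60 bags figures are taken from the issue description.

```java
/** Hypothetical instance-scoped one-byte codec; NOT part of the Lucene API. */
public final class LinearNormCodec {
    private final float min;     // lower end of the span
    private final float max;     // upper end of the span
    private final int buckets;   // resolution: number of distinct values ("bags")

    public LinearNormCodec(float min, float max, int buckets) {
        if (!(max > min) || buckets < 2 || buckets > 256) {
            throw new IllegalArgumentException("need max > min and 2..256 buckets");
        }
        this.min = min;
        this.max = max;
        this.buckets = buckets;
    }

    /** Clamps the value to [min, max] and maps it linearly to a bucket index stored in one byte. */
    public byte encode(float value) {
        float clamped = Math.max(min, Math.min(max, value));
        int bucket = Math.round((clamped - min) / (max - min) * (buckets - 1));
        return (byte) bucket;
    }

    /** Returns the representative value of the bucket stored in the byte. */
    public float decode(byte b) {
        int bucket = Math.min(b & 0xFF, buckets - 1); // treat the byte as unsigned
        return min + (max - min) * bucket / (buckets - 1);
    }
}
```

With this shape, two indices loaded by the same classloader could each hold their own codec, for example new LinearNormCodec(100f, 250f, 60) for one and new LinearNormCodec(0f, 1f, 256) for the other, without interfering with each other.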

My use case is really about document boost and not normalization. 

So another solution is to introduce a (variable bit sized?) document boost file and
separate it completely from the norms, instead of the current design, where normalization and
document boost are baked together as the same thing. I think there would then be no need to touch
the norms encoding, since the default resolution is good enough for /normalization/. It would fix
several caveats with norms as I see it.

> Norm codec strategy in Similarity
> ---------------------------------
>                 Key: LUCENE-1260
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Karl Wettin
>         Attachments: LUCENE-1260.txt
> The static span and resolution of the 8 bit norms codec might not fit with all applications.

> My use case requires that 100f-250f be discretized into 60 bags instead of the default.
