lucene-dev mailing list archives

From "Karl Wettin (JIRA)" <>
Subject [jira] Commented: (LUCENE-1260) Norm codec strategy in Similarity
Date Mon, 07 Apr 2008 23:01:25 GMT


Karl Wettin commented on LUCENE-1260:

I suppose it would be possible to implement a NormCodec that listens to encodeNorm(float)
while one is creating a subset of the index, in order to find all norm resolution sweet spots
for that corpus using some appropriate algorithm. Mean shift, perhaps?

Perhaps it would even be possible to compress it down to n bags from the start and then allow
it to grow in case new documents with other norm requirements are added to the store.
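
As an illustration of the two paragraphs above, here is a minimal sketch of recording norm values during a sample indexing pass and deriving n bags from them. It is not taken from the LUCENE-1260 patch: NormRecorder and TableNormCodec are hypothetical names, and the bag selection uses plain equal-frequency quantiles as a simple stand-in for the mean shift idea floated above. Growing the table when new documents arrive is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene or of the LUCENE-1260 patch:
// records every norm value seen while indexing a sample of the corpus.
final class NormRecorder {
    private final List<Float> seen = new ArrayList<Float>();

    void observe(float norm) {
        seen.add(norm);
    }

    // Picks n representatives at the midpoints of n equal-frequency slices
    // of the sorted observations (quantile binning, not mean shift).
    float[] representatives(int n) {
        if (n < 1 || n > 256) throw new IllegalArgumentException("n must be in [1, 256]");
        if (seen.isEmpty()) throw new IllegalStateException("no norms observed");
        float[] sorted = new float[seen.size()];
        for (int i = 0; i < sorted.length; i++) sorted[i] = seen.get(i);
        Arrays.sort(sorted);
        float[] reps = new float[n];
        for (int i = 0; i < n; i++) {
            // Index of the middle element of the i-th slice.
            int idx = (int) ((2L * i + 1) * sorted.length / (2L * n));
            reps[i] = sorted[idx];
        }
        return reps;
    }
}

// Hypothetical table-driven codec: a byte is an index into an ascending table.
final class TableNormCodec {
    private final float[] table;

    TableNormCodec(float[] ascendingTable) {
        if (ascendingTable.length < 1 || ascendingTable.length > 256) {
            throw new IllegalArgumentException("table size must be in [1, 256]");
        }
        this.table = ascendingTable.clone();
    }

    // Encodes to the index of the nearest table entry.
    byte encode(float norm) {
        int pos = Arrays.binarySearch(table, norm);
        if (pos >= 0) return (byte) pos;
        int ins = -pos - 1; // insertion point
        if (ins == 0) return 0;
        if (ins == table.length) return (byte) (table.length - 1);
        boolean lowerIsCloser = norm - table[ins - 1] <= table[ins] - norm;
        return (byte) (lowerIsCloser ? ins - 1 : ins);
    }

    // Decodes an index back to its representative value (clamped to the table).
    float decode(byte b) {
        return table[Math.min(b & 0xFF, table.length - 1)];
    }
}
```

In this sketch one would call observe() for every value passed to encodeNorm(float) during the sample pass, then build a TableNormCodec from representatives(n). The table would have to be stored with the index so that decoding stays consistent.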

I haven't thought too much about it yet, but it seems to me that the norm codec has more to do
with the physical store (Directory) than with Similarity and should perhaps be moved there instead?
I have no idea how, but I also want to move it to the instance scope so I can have multiple
indices with unique norm spans/resolutions created from the same classloader.
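
To make the instance-scope idea concrete, here is a minimal sketch, assuming a codec interface with one encode and one decode method. The interface shown is hypothetical and is not the NormCodec from the attached patch; LinearNormCodec is likewise an illustrative name. It maps a configurable span linearly onto a configurable number of bags, so that each index could in principle carry its own codec object.

```java
// Hypothetical interface, assumed for illustration only;
// the NormCodec in the LUCENE-1260 attachment may look different.
interface NormCodec {
    byte encode(float norm);
    float decode(byte b);
}

// Maps the span [min, max] linearly onto `bags` byte values.
// Values outside the span are clamped to the nearest end.
final class LinearNormCodec implements NormCodec {
    private final float min;
    private final float max;
    private final int bags;

    LinearNormCodec(float min, float max, int bags) {
        if (!(min < max)) throw new IllegalArgumentException("min must be < max");
        if (bags < 2 || bags > 256) throw new IllegalArgumentException("bags must be in [2, 256]");
        this.min = min;
        this.max = max;
        this.bags = bags;
    }

    public byte encode(float norm) {
        float clamped = Math.max(min, Math.min(max, norm));
        // Scale to [0, bags - 1] and round to the nearest bag index.
        int bag = Math.round((clamped - min) / (max - min) * (bags - 1));
        return (byte) bag;
    }

    public float decode(byte b) {
        // Treat the byte as unsigned and clamp unknown indices to the last bag.
        int bag = Math.min(b & 0xFF, bags - 1);
        return min + (max - min) * bag / (bags - 1);
    }
}
```

Under these assumptions, new LinearNormCodec(100f, 250f, 60) would cover the use case from the issue description, while a second instance with a different span and bag count could serve another index in the same classloader, since no state is static.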

> Norm codec strategy in Similarity
> ---------------------------------
>                 Key: LUCENE-1260
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Karl Wettin
>         Attachments: LUCENE-1260.txt
> The static span and resolution of the 8 bit norms codec might not fit with all applications.
>
> My use case requires that 100f-250f is discretized in 60 bags instead of the default..

