lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Change norm encoding
Date Mon, 09 Nov 2009 17:03:08 GMT
On Mon, Nov 9, 2009 at 11:04 AM, Benjamin Heilbrunn <benhei@gmail.com> wrote:

> I've got a problem concerning the encoding of norms.
> I want to use int values (0-255) instead of float-interpreted bytes.
>
> In my own Similarity class, which I use for indexing and searching, I
> implemented the static methods encodeNorms, decodeNorms and
> getNormDecoder.
> But because they are static and the encoding of norms happens in
> NormsWriterPerField.finish() with the following lines of code:
>
>      final float norm =
>          docState.similarity.computeNorm(fieldInfo.name, fieldState);
>      norms[upto] = Similarity.encodeNorm(norm);
>      docIDs[upto] = docState.docID;
>
> my implementation is only used for the computation of norm values but
> not for the encoding.
> Is there a reason why norm encoding and decoding are hardwired to the
> implementation in Similarity?

I don't think there's a particular reason... this is just how it has
always been.  I think making it more extensible would be good!

> And is there any elegant way to bypass this behaviour instead of
> implementing a mapper, which maps every int between 0 and 255 to a
> float value out of Similarity.NORM_TABLE before encoding?
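
For reference, the mapper workaround described in the question could look
roughly like the sketch below. It is self-contained and does not call
Lucene: byteToFloat/floatToByte are local stand-ins that are assumed to
mirror the 3-bit-mantissa byte encoding behind Similarity.encodeNorm, and
the local NORM_TABLE stands in for Similarity.NORM_TABLE. The class and
method names are hypothetical.

```java
public class NormTableMapper {
    // Assumption: mirrors the byte -> float decoding used for norms
    // (3 mantissa bits, zero exponent of 15).
    static float byteToFloat(byte b) {
        if (b == 0) return 0.0f;
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Assumption: mirrors the float -> byte encoding used for norms.
    static byte floatToByte(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Zero and negatives map to 0, tiny positives to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow: clamp to the largest byte value
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Local stand-in for Similarity.NORM_TABLE.
    static final float[] NORM_TABLE = new float[256];
    static {
        for (int i = 0; i < 256; i++) {
            NORM_TABLE[i] = byteToFloat((byte) i);
        }
    }

    // Map an int in 0..255 to the float that the stock encoder turns
    // back into exactly that byte. A custom computeNorm would return this.
    static float intToNormFloat(int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        return NORM_TABLE[value];
    }

    public static void main(String[] args) {
        for (int i = 0; i < 256; i++) {
            int stored = floatToByte(intToNormFloat(i)) & 0xff;
            if (stored != i) {
                throw new AssertionError("round trip failed for " + i);
            }
        }
        System.out.println("all 256 values round-trip");
    }
}
```

On the search side the stored byte would still be decoded to the table
float by the hardwired decoder, so scoring code would need to map it back
to the int (byte & 0xff) itself.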

I think a patch is needed, to allow the Similarity instance (not the
static class) to provide the mapping, and decode table?  Various
queries call the decode, so you'd need to fix those too... wanna cough
up a patch?
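
A minimal sketch of the shape such a patch might take, assuming a
hypothetical instance-level hook (NormCodec below is not part of Lucene's
API, and none of these names exist in Lucene). The indexing code would call
encode() on the configured instance instead of the static
Similarity.encodeNorm, and queries would call decode() on the same
instance:

```java
public class InstanceNormCodec {
    /** Hypothetical per-instance hook replacing the static encode/decode. */
    public interface NormCodec {
        byte encode(float norm);
        float decode(byte b);
    }

    /** Stores the norm as a plain unsigned integer in 0..255. */
    public static final class IntNormCodec implements NormCodec {
        // Decode table owned by the instance rather than a static field.
        private final float[] decodeTable = new float[256];

        public IntNormCodec() {
            for (int i = 0; i < 256; i++) {
                decodeTable[i] = i;
            }
        }

        @Override
        public byte encode(float norm) {
            // Round to the nearest int and clamp into the byte range.
            int v = Math.round(norm);
            v = Math.max(0, Math.min(255, v));
            return (byte) v;
        }

        @Override
        public float decode(byte b) {
            return decodeTable[b & 0xff];
        }
    }

    /** Roughly what the indexing loop would do with an instance codec. */
    public static byte[] encodeAll(NormCodec codec, float[] computedNorms) {
        byte[] norms = new byte[computedNorms.length];
        for (int i = 0; i < computedNorms.length; i++) {
            norms[i] = codec.encode(computedNorms[i]);
        }
        return norms;
    }

    public static void main(String[] args) {
        NormCodec codec = new IntNormCodec();
        byte[] stored = encodeAll(codec, new float[] {0f, 42f, 200f, 999f});
        for (byte b : stored) {
            System.out.println(codec.decode(b));
        }
    }
}
```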

Mike


