lucene-dev mailing list archives

From "Dan Climan" <dcli...@keepmedia.com>
Subject RE: Explanations and overridden similarity
Date Fri, 17 Dec 2004 18:02:16 GMT
Thanks for pointing out the error in the solution I originally proposed for
overriding encodeNorm and decodeNorm. Am I correct, though, that there is
still an issue?

Is it possible to override encodeNorm and decodeNorm using the public
API?

My strategy has been as follows:
a) define a custom encodeNorm and decodeNorm (see below)
b) Every time I get a new IndexWriter or IndexSearcher, I immediately call
setSimilarity. For example:

IndexWriter iw = new IndexWriter(dir, aAnalyzer, overwriteIndex);
iw.setSimilarity(new MySimilarity());

OR

searcher = new IndexSearcher(reader);
searcher.setSimilarity(new MySimilarity());


I define MySimilarity as follows:

import java.util.Arrays;
import org.apache.lucene.search.DefaultSimilarity;

public class MySimilarity extends DefaultSimilarity {
    public  MySimilarity() {
        super();
    }

    public static byte encodeNorm(float f) {
        // binarySearch returns -(insertion point) - 1 when f is not in the table
        int insertion = Arrays.binarySearch(MYNORMS, f);
        if (insertion < 0)
            insertion = -insertion - 1;
        if (insertion >= 256)
            insertion = 255;
        return (byte) (insertion - 128);
    }

    public static float decodeNorm(byte b) {
        return MYNORMS[(int)b + 128];
    }
    
    public float lengthNorm(String fieldName, int numTerms) {
        return 1.0f;
    }


    //Note this is an array of 256 values in sorted order
    public static final float[] MYNORMS = {    
    0.0000E+00f,
    1.0000E-10f,
    1.0000E-09f,
    .
    .
    .
    1.0000E+04f,
    1.0000E+05f
    };
}
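
Arrays.binarySearch returns the matching index on an exact hit and
-(insertion point) - 1 on a miss, so a table lookup has to undo that encoding
before using the result as an index. The following is only a minimal sketch
of that step. The class name, the method name indexFor, and the four-entry
TABLE are hypothetical stand-ins for the 256-entry array, not part of Lucene.

```java
import java.util.Arrays;

// Hypothetical stand-in for the 256-entry table; only 4 sorted values here.
class NormLookupSketch {
    static final float[] TABLE = {0.0f, 0.5f, 1.0f, 2.0f};

    // Index of f if present, otherwise the index of the first larger entry,
    // clamped to the last slot of the table.
    static int indexFor(float f) {
        int pos = Arrays.binarySearch(TABLE, f);
        if (pos < 0) {
            pos = -pos - 1; // recover the insertion point from the miss encoding
        }
        return Math.min(pos, TABLE.length - 1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.binarySearch(TABLE, 1.0f));  // exact hit
        System.out.println(Arrays.binarySearch(TABLE, 0.75f)); // miss, negative
        System.out.println(indexFor(0.75f));                   // usable index
    }
}
```

Taking Math.abs of a miss result would give the insertion point plus one,
which is why the sketch decodes the value explicitly instead.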


I would have expected that once setSimilarity is called on an IndexWriter
or IndexSearcher, every use of a Similarity method would go through the
Similarity that was set. However, the classes I mentioned previously all
contain constructions such as "Similarity.decodeNorm(f)", so they do not
appear to behave this way.
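
The behavior described here follows from how Java treats static methods: a
static method redeclared in a subclass hides the parent's method rather than
overriding it, so a call written against the parent class name never reaches
the subclass. The sketch below illustrates this with hypothetical classes
(BaseSim, CustomSim, StaticHidingSketch). They only mimic the shape of the
problem and are not Lucene's actual code.

```java
// Hypothetical classes; they mimic the shape of the problem, not Lucene's code.
class BaseSim {
    static float decode(byte b) { return 1.0f; }
    float lengthNorm(int numTerms) { return 1.0f; }
}

class CustomSim extends BaseSim {
    // Hides BaseSim.decode; a static method cannot be overridden.
    static float decode(byte b) { return 42.0f; }

    // An instance method, by contrast, is overridden and dispatched dynamically.
    @Override
    float lengthNorm(int numTerms) { return 7.0f; }
}

class StaticHidingSketch {
    // Stands in for library code that hard-codes the class name in the call.
    static float libraryDecode(byte b) { return BaseSim.decode(b); }

    // Stands in for library code that goes through the configured instance.
    static float libraryLengthNorm(BaseSim sim) { return sim.lengthNorm(10); }

    public static void main(String[] args) {
        BaseSim configured = new CustomSim();
        System.out.println(libraryDecode((byte) 0));       // BaseSim's value
        System.out.println(libraryLengthNorm(configured)); // CustomSim's value
    }
}
```

Under this reading, setting a custom instance can only affect calls made
through that instance, not calls made through the class name.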

Do you agree that IndexWriter and IndexSearcher should behave as I suggest
with regard to setSimilarity?

Is there another way to accomplish a change in the encodeNorm/decodeNorm
using the existing API? If not, should I try to generate a patch to the
classes that use this construction to force them to use the similarity set
by setSimilarity?

Dan



-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Thursday, December 16, 2004 9:28 PM
To: Lucene Developers List
Subject: Re: Explanations and overridden similarity


Dan Climan wrote:
> Shouldn't the call to Similarity.decodeNorm be replaced with a call to 
> Similarity.getDefault().decodeNorm

decodeNorm is a static method.

Doug



