jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Scoring question
Date Mon, 08 Sep 2008 15:24:09 GMT
Hi,

Jackrabbit 1.5 will allow you to configure a custom Similarity implementation
through the similarityClass parameter of the search index.
See: http://wiki.apache.org/jackrabbit/Search (parameter: similarityClass)

For details on how to implement a Similarity class, see the Lucene documentation.
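
As a rough sketch of where that parameter would go, assuming the usual SearchIndex element in workspace.xml: the class name com.example.TermCountSimilarity is a hypothetical placeholder for your own Similarity subclass, not something that ships with Jackrabbit.

```xml
<!-- Sketch only: com.example.TermCountSimilarity is a hypothetical class name. -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- Jackrabbit 1.5+: Similarity implementation used for scoring -->
  <param name="similarityClass" value="com.example.TermCountSimilarity"/>
</SearchIndex>
```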

regards
 marcel

flopsi73 wrote:
> Hi everybody,
> 
> I have a question regarding custom scoring:
> I want to implement scoring so that the score of a document is simply equal
> to the number of occurrences of the query terms in the document. No special
> rules about term length, occurrences in other documents, etc.
> 
> Assuming that only jcr:content/@jcr:data is indexed, a document with the
> content
> 'This is a test document of jackrabbit scoring mechanism, just a test
> document'
> should always get a score of 3
> for the search
> 'test scoring'
> 
> Does anyone have an idea how to achieve this most easily? Does something
> like this already exist? If not, which classes would I need to subclass? Just
> Scorer and Weight? I think Similarity is not necessary (see MatchAllScorer)?
> Or maybe even Query?
> 
> I thought about something like this (in a new 'HitScorer' class):
> 
> 	public float score() throws IOException {
> 		TermFreqVector tfv = reader.getTermFreqVector(nextDoc, "jcr:content");
> 		int[] freqs = tfv.getTermFrequencies();
> 		int sum = 0;
> 		for (int i = 0; i < freqs.length; i++)
> 			sum += freqs[i];
> 		return sum;
> 	}
> 
> But what should Weight.getSumOfSquaredWeights and Weight.normalize do? Just
> return 1.0f? And is the property name correct? I admit I am a bit confused
> about the DefaultSimilarity formula(s)...
> 
> Thanks a lot, best regards
> Flo
> 
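
To make the scoring rule described in the quoted question concrete, here is a minimal plain-Java sketch with no Lucene or Jackrabbit dependency. It only illustrates the arithmetic (score = number of content tokens that match a query term). The class name TermCountScore and the simple lowercase, split-on-non-alphanumerics tokenizer are assumptions made for illustration and will not necessarily match how the Jackrabbit analyzer tokenizes jcr:data.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Jackrabbit.
public class TermCountScore {

    // Lowercase the text and split on anything that is not a letter or digit.
    static String[] tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .toArray(String[]::new);
    }

    // Score = how many tokens of the content are equal to one of the query terms.
    static int score(String content, String query) {
        Set<String> queryTerms = new HashSet<>(Arrays.asList(tokenize(query)));
        int hits = 0;
        for (String token : tokenize(content)) {
            if (queryTerms.contains(token)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String content = "This is a test document of jackrabbit scoring mechanism, "
                + "just a test document";
        // "test" occurs twice and "scoring" once, so this prints 3.
        System.out.println(score(content, "test scoring"));
    }
}
```

Note that this counts only the query terms; summing every entry of a document's term frequency vector would instead give the total number of tokens in the field.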

