lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ion Badita <ion.bad...@mcr.ro>
Subject Re: RE : Re: problem undestanding the hits.score
Date Fri, 02 Nov 2007 09:18:35 GMT
That is already in the similarity formula, in tf term, documents that 
have more occurrences of a given term receive a higher score.


Jamal H Tandina wrote:
> <<<<
>
> If you want to give priority to documents that are larger, like z1, you
>  
> should change the DefaultSimilarity (at index time), more exactly the 
> method:
>
>   public float lengthNorm(String fieldName, int numTerms) {
>     return (float)(1.0 / Math.sqrt(numTerms));
>   }
>
> to something like this
>
>   public float lengthNorm(String fieldName, int numTerms) {
>     return (float)(Math.sqrt(numTerms));
>
>
>   }
>   
>
> I want to give priority to documents that have the word we are searching more frequent
!
>
> Thank you
>
>
>
> Jamal H Tandina <jtandina@yahoo.fr> a écrit : Thank you  for your reply 
>
> How can i change the defaultSimilarity in the indexing and the searching, do you have
an example or an url how to set the Similarity  ?
> http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html
>
> Thanks again 
>
> Ion Badita  a écrit : Try too look at Similarity, there you will find thinks about the

> scoring. Your query is more "similar" with the shorter document.
> If you have 2 documents with a field body; first with words "red flower" 
> and the second with just one word "flower", and search for the word 
> "flower", the second document will score high because is very similar 
> with the query.
>
> If you want to give priority to documents that are larger, like z1, you 
> should change the DefaultSimilarity (at index time), more exactly the 
> method:
>
>   public float lengthNorm(String fieldName, int numTerms) {
>     return (float)(1.0 / Math.sqrt(numTerms));
>   }
>
> to something like this
>
>   public float lengthNorm(String fieldName, int numTerms) {
>     return (float)(Math.sqrt(numTerms));
>   }
>
>
> Reindex your documents with the Similarity modified and try to search 
> again. The IndexWriter has a method to set the similarity used for indexing.
>
>
> I hope this will help you...
>
>
> Ion
>
>
>
> Jamal jamalator wrote:
>   
>>  Hi 
>>
>> I have indexed this html document 
>> =============z1========================
>>
>>   
>> zo zo zo zo zo zo zo zo zo zo zo zo 
>>     
>
>   
>> zo zo zo zo zo zo zo zo zo zo zo zo 
>>     
>
>   
>> zo zo zo zo zo zo zo zo zo zo zo zo 
>>   
>>
>> =============z2=========================
>>  
>>    
>>  zo zo zo zo zo zo zo zo zo zo zo zo 
>>     
>
>   
>>  zo zo zo zo zo zo zo zo zo zo zo zo 
>>     
>
>   
>>    
>>  
>> =============z3==========================
>>  
>>    
>>  zo zo zo zo zo zo zo zo zo zo zo zo 
>>     
>
>   
>>    
>>  
>> =========================================
>> with this code
>>
>> Field contentK1 = new  Field("htmlcontent",httpd.getContentKeywords(),Field.Store.NO,Field.Index.TOKENIZED
);
>> contentK1.setBoost(1/10f);  //10%
>> doc.add(contentK1);
>>        
>> and when a search "zo" with luke i have (whitespaceanalyser):
>>
>> (score , id   )
>> (0,0957,z2 )
>> (0,0947,z3 )
>> (0,0938,z1)
>>
>> NORMALY the resut expected have to be z1 z2 z3
>>
>> Some One have an idea ??
>>
>> Thank you all
>>
>>                 
>>
>> ---------------------------------
>>   Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>>
>>              
>> ---------------------------------
>>  Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>>   
>>     
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>              
> ---------------------------------
>  Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>
>              
> ---------------------------------
>  Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message