lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jamal H Tandina <jtand...@yahoo.fr>
Subject RE : Re: problem undestanding the hits.score
Date Fri, 02 Nov 2007 08:51:15 GMT
Thank you  for your reply 

How can i change the defaultSimilarity in the indexing and the searching, do you have an example
or an url how to set the Similarity  ?
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html

Thanks again 

Ion Badita <ion.badita@mcr.ro> a écrit : Try too look at Similarity, there you will
find thinks about the 
scoring. Your query is more "similar" with the shorter document.
If you have 2 documents with a field body; first with words "red flower" 
and the second with just one word "flower", and search for the word 
"flower", the second document will score high because is very similar 
with the query.

If you want to give priority to documents that are larger, like z1, you 
should change the DefaultSimilarity (at index time), more exactly the 
method:

  public float lengthNorm(String fieldName, int numTerms) {
    return (float)(1.0 / Math.sqrt(numTerms));
  }

to something like this

  public float lengthNorm(String fieldName, int numTerms) {
    return (float)(Math.sqrt(numTerms));
  }


Reindex your documents with the Similarity modified and try to search 
again. The IndexWriter has a method to set the similarity used for indexing.


I hope this will help you...


Ion



Jamal jamalator wrote:
>  Hi 
>
> I have indexed this html document 
> =============z1========================
> 
>   
> zo zo zo zo zo zo zo zo zo zo zo zo 

> zo zo zo zo zo zo zo zo zo zo zo zo 

> zo zo zo zo zo zo zo zo zo zo zo zo 
>   
> 
> =============z2=========================
>  
>    
>  zo zo zo zo zo zo zo zo zo zo zo zo 

>  zo zo zo zo zo zo zo zo zo zo zo zo 

>    
>  
> =============z3==========================
>  
>    
>  zo zo zo zo zo zo zo zo zo zo zo zo 

>    
>  
> =========================================
> with this code
>
> Field contentK1 = new  Field("htmlcontent",httpd.getContentKeywords(),Field.Store.NO,Field.Index.TOKENIZED
);
> contentK1.setBoost(1/10f);  //10%
> doc.add(contentK1);
>        
> and when a search "zo" with luke i have (whitespaceanalyser):
>
> (score , id   )
> (0,0957,z2 )
> (0,0947,z3 )
> (0,0938,z1)
>
> NORMALY the resut expected have to be z1 z2 z3
>
> Some One have an idea ??
>
> Thank you all
>
>                 
>
> ---------------------------------
>   Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>
>              
> ---------------------------------
>  Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



             
---------------------------------
 Ne gardez plus qu'une seule adresse mail ! Copiez vos mails vers Yahoo! Mail 
Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message