lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: I got the score "0.3044460713863373" for the cosine similarity of two document with the same text content !!
Date Wed, 06 May 2009 01:21:10 GMT
What is SimilarityQueries?  I'd try the explain capabilities to see  
more.


On May 5, 2009, at 2:23 PM, Kamal Najib wrote:

> hi all,
> i got the similarity score 0.3044460713863373 between two docs which  
> have the same text content, is it correct? I expected 1.0, hier is  
> my result line:
>
> doc:"this expression of galectin-1 in blood vessel walls was  
> correlated with vascular"	
> doc2 :"this expression of galectin-1 in blood vessel walls was  
> correlated with vascular"	Score :"0.3044460713863373"
> is the score correct?
> my methode is :
> public double getSimilarity(String v1,String  v2) throws Exception
> {
> 	
> 	float result=0;
> 	directory = new RAMDirectory();
> 	Analyzer analyzer = new StandardAnalyzer();
> 	IndexWriter writer = new IndexWriter(directory, analyzer,
>      	true, IndexWriter.MaxFieldLength.LIMITED);
> 	
> 	
> 	Document doc1 = new Document();
> 	doc1.add(new Field("term",v1, Field.Store.YES,  
> Field.Index.TOKENIZED));
> 	writer.addDocument(doc1);
> 	writer.close();
> 	IndexReader ir=IndexReader.open(directory);
> 	IndexSearcher searcher = new IndexSearcher(directory);
> 	Query  
> query=SimilarityQueries.formSimilarQuery(v2,analyzer,"term",null);
> 	ScoreDoc[] scoreDocs = searcher.search(query,5).scoreDocs;
> 	int docNum = scoreDocs[0].doc;
>        result = scoreDocs[0].score;
>        Document hitDoc = searcher.doc(docNum);
> 	System.out.println("Term 1 :"+v2+"  Term2:"+hitDoc.get("term")+"   
> Score :"+result);
>        return result;
> }
> please help.
> thanks in advance.
> Kamal
> -- 
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message