lucene-java-user mailing list archives

From Davide <davidi...@libero.it>
Subject Re: Problem finding similar documents with MoreLikeThis method.
Date Wed, 19 Jul 2006 10:28:38 GMT

mark harwood wrote:
> Looks like the class defaults to only searching a field called "contents".
> 
> Either:
> a) call setFieldNames() with null to force the class to use a list of all
> indexed fields derived from your IndexReader
> or
> b) call setFieldNames() with the explicit shortlist of field names you want
> to match on
> 
> 
> Cheers
> Mark
> 

I've tried that, but it still doesn't work. I called

setFieldNames(new String[]{"Field1", "Field2", ...})

where "Field1", "Field2", ... are the fields I used when indexing the
files, but the *Query* is still empty and MoreLikeThis doesn't work. I
don't think this is the problem.


For simplicity, here is a generic test case that doesn't work. You can
try it and tell me whether it fails for you too.

I have also tried the *main* method of the MoreLikeThis class, and it
doesn't work either. (I changed the index directory and the document to
add to the index.)

------------------------------------------------------------------------
-------------------- MoreLikeThis Test ---------------------------------
------------------------------------------------------------------------
	
//Build an IndexWriter object to build an index
IndexWriter writer = new IndexWriter("C:\\Temp\\index",
        new StandardAnalyzer(), true);
	
//----- Adding a document to index ----	
Document doc = new Document();
File f = new File("C:\\Document.txt");
FileReader fileReader = new FileReader(f);
	        	
Field field = new Field("contents", fileReader, Field.TermVector.YES);
	
doc.add(field);	
writer.addDocument(doc);
//--------------------------------------

	
//Optimize index and close
writer.optimize();
System.out.println("The documents in the index are: "+writer.docCount());
writer.close();
		
		    	
//-------- Now try to find similar documents

Directory indexDir = FSDirectory.getDirectory("C:\\Temp\\index", false);

IndexReader ir = IndexReader.open(indexDir);

MoreLikeThis mlt = new MoreLikeThis(ir);

//mlt.setFieldNames(new String[] {"contents"});

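// Open a fresh reader on the same file; the reader passed to the Field
// above has already been consumed by indexing.
FileReader fr = new FileReader(f);

Query query = null;
if (fr != null){
	System.out.println("Parsing FileReader: " + fr);
	query = mlt.like(fr);
}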
		
System.out.println("The Query is: " + query);
		    	
IndexSearcher is = new IndexSearcher(indexDir);
		    		
Hits hits = is.search(query);
		    		
for (Iterator iterDoc = hits.iterator(); iterDoc.hasNext();) {

	Hit hit = (Hit)iterDoc.next();
	// Note: a Reader-valued field is indexed but not stored, so
	// get("contents") returns null here.
	System.out.println("\n\nSimilar file: "+hit.get("contents"));
}
------------------------------------------------------------------------------
		
NOTE:
1) Document.txt is a text file containing some text


I really don't understand why it doesn't work... I feel lost... :(
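
One possible explanation for the empty query, offered as an assumption and not confirmed anywhere in this thread: MoreLikeThis discards candidate terms that fall below a minimum term frequency or a minimum document frequency (adjustable with setMinTermFreq() and setMinDocFreq()). If the defaults are 2 and 5, as I believe they are in the contrib class (check your version), then every term from a one-document index has a document frequency of 1 and gets dropped. The sketch below is plain Java and does not use Lucene at all; the class and method names are hypothetical and only imitate that kind of threshold filtering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: no Lucene classes are used, and every name here is
// hypothetical. It imitates frequency-threshold filtering of query terms.
class TermFilterSketch {

    // Keeps a term only if it occurs at least minTermFreq times in the
    // input text AND appears in at least minDocFreq indexed documents.
    static List<String> selectTerms(String text, Map<String, Integer> docFreqs,
                                    int minTermFreq, int minDocFreq) {
        // Count how often each whitespace-separated token occurs in the text.
        Map<String, Integer> termFreqs = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                termFreqs.merge(token, 1, Integer::sum);
            }
        }
        // Apply both thresholds; terms missing from docFreqs count as 0.
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            int docFreq = docFreqs.getOrDefault(e.getKey(), 0);
            if (e.getValue() >= minTermFreq && docFreq >= minDocFreq) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "lucene index lucene search index lucene";

        // A one-document index: every term has a document frequency of 1.
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("lucene", 1);
        docFreqs.put("index", 1);
        docFreqs.put("search", 1);

        // Assumed default-like thresholds (2 and 5): nothing survives.
        System.out.println(selectTerms(text, docFreqs, 2, 5)); // prints []
        // Relaxed thresholds (1 and 1): every term survives.
        System.out.println(selectTerms(text, docFreqs, 1, 1)); // prints [lucene, index, search]
    }
}
```

If this is what is happening in the test above, calling mlt.setMinTermFreq(1) and mlt.setMinDocFreq(1) before mlt.like(...), or indexing more than a handful of documents, would be the things to try.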



