lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Lucene-MoreLikethis
Date Tue, 15 Jan 2013 23:41:49 GMT
There are lots of parameters you can adjust, but the defaults essentially 
assume that you have a fairly large corpus and aren't interested in 
low-frequency terms.

So, try lowering MoreLikeThis#setMinDocFreq. The default is 5, and you don't 
have any terms in your example with a doc freq over 2, so every term gets 
filtered out.

Also, try lowering setMinTermFreq. The default is 2, and you don't have any 
terms with a term frequency above 1.
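
To make the effect of the two thresholds concrete, here is a minimal sketch 
in plain Java. It does not use Lucene at all: MltThresholdSketch, 
survivingTerms and docFreq are hypothetical names made up for illustration, 
tokenization is a plain lowercase whitespace split rather than a real 
analyzer, and the "keep a term if its frequency is at least the minimum" rule 
is an assumption about how the thresholds behave. It only counts term and 
document frequencies over the four titles from the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: it only mimics the two frequency
// thresholds discussed above, using plain whitespace tokenization.
class MltThresholdSketch {

    // The four titles from the question, in index order (doc 0..3).
    static final List<String> TITLES = Arrays.asList(
        "Lucene in Action",
        "Lucene for Dummies",
        "Managing Gigabytes",
        "The Art of Computer Science");

    // Lowercase and split on whitespace; a stand-in for a real analyzer
    // (no stop-word removal here).
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    // Number of documents in the corpus that contain the term at least once.
    static int docFreq(String term, List<String> corpus) {
        int n = 0;
        for (String doc : corpus) {
            if (tokens(doc).contains(term)) {
                n++;
            }
        }
        return n;
    }

    // Terms of corpus.get(docId) whose term frequency and document frequency
    // both reach the given minimums (assumed rule: keep when freq >= minimum).
    static Set<String> survivingTerms(int docId, List<String> corpus,
                                      int minTermFreq, int minDocFreq) {
        Map<String, Integer> termFreq = new HashMap<>();
        for (String t : tokens(corpus.get(docId))) {
            termFreq.merge(t, 1, Integer::sum);
        }
        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> e : termFreq.entrySet()) {
            if (e.getValue() >= minTermFreq
                    && docFreq(e.getKey(), corpus) >= minDocFreq) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Defaults quoted above: minTermFreq = 2, minDocFreq = 5.
        System.out.println(survivingTerms(1, TITLES, 2, 5)); // prints []
        // Thresholds relaxed to 1 for a tiny corpus.
        System.out.println(survivingTerms(1, TITLES, 1, 1)); // prints [dummies, for, lucene]
    }
}
```

In the code from the question, the corresponding change would presumably be 
to call mlt.setMinTermFreq(1) and mlt.setMinDocFreq(1) before mlt.like(1). 
Depending on the Lucene version, it may also be necessary to point 
MoreLikeThis at the right field with setFieldNames and to give it an analyzer 
with setAnalyzer when no term vectors are stored. That part is not covered in 
this thread, so treat it as something to verify. A real StandardAnalyzer 
would probably also drop a stop word like "for", which this sketch keeps.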

-- Jack Krupansky

-----Original Message----- 
From: Thomas Keller
Sent: Tuesday, January 15, 2013 3:22 PM
To: java-user@lucene.apache.org
Subject: Lucene-MoreLikethis

Hey,

I have a question about "MoreLikeThis" in Lucene (Java). I built an index 
and want to find similar documents, but I never get any results for my 
query: the query returned by mlt.like(1) always matches nothing. Can anyone 
find my mistake? Here is an example. (I use Lucene 4.0)

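import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queries.mlt.MoreLikeThis;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class HelloLucene {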
  public static void main(String[] args) throws IOException, ParseException {

    StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
    Directory index = new RAMDirectory();
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_40, analyzer);

    IndexWriter w = new IndexWriter(index, config);
    addDoc(w, "Lucene in Action", "193398817");
    addDoc(w, "Lucene for Dummies", "55320055Z");
    addDoc(w, "Managing Gigabytes", "55063554A");
    addDoc(w, "The Art of Computer Science", "9900333X");
    w.close();

    // search
    IndexReader reader = DirectoryReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);

    MoreLikeThis mlt = new MoreLikeThis(reader);
    Query query = mlt.like(1);
    System.out.println(searcher.search(query, 5).totalHits);
  }

  private static void addDoc(IndexWriter w, String title, String isbn) throws IOException {
    Document doc = new Document();
    doc.add(new TextField("title", title, Field.Store.YES));

    // use a string field for isbn because we don't want it tokenized
    doc.add(new StringField("isbn", isbn, Field.Store.YES));
    w.addDocument(doc);
  }
}

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org 

