lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Problem finding similar documents with MoreLikeThis method.
Date Fri, 21 Jul 2006 11:17:14 GMT
Does your index use StandardAnalyzer? Are your fields stored (Field.Store.YES)?

MoreLikeThis uses StandardAnalyzer by default to re-analyze the stored content of the example
doc, which may produce tokens that do not match those of the indexed content. Use setAnalyzer()
to pass in the analyzer your index was built with, so the two stay in sync.
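
A minimal sketch of the mismatch described above, in plain Java with no Lucene dependency. The two tokenizers are toy stand-ins (hypothetical names, not Lucene's StandardAnalyzer or any other real analyzer); the point is only that terms produced by one analysis chain may not be found among terms produced by another.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of an analyzer mismatch. These are NOT Lucene's analyzers.
class AnalyzerMismatchSketch {

    // Hypothetical index-time analysis: split on whitespace only,
    // keeping case and punctuation.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Hypothetical query-time analysis: split on anything that is not
    // a letter or digit, then lowercase.
    static List<String> lowercasedWordTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Query-time terms that also exist among the indexed terms.
    static Set<String> matchingTerms(List<String> indexed, List<String> queried) {
        Set<String> hits = new HashSet<>(queried);
        hits.retainAll(new HashSet<>(indexed));
        return hits;
    }

    public static void main(String[] args) {
        String text = "Garland, New York";
        List<String> indexed = whitespaceTokens(text);

        // Different analysis on each side: no term lines up.
        System.out.println(matchingTerms(indexed, lowercasedWordTokens(text)).size()); // 0

        // Same analysis on both sides: every term lines up.
        System.out.println(matchingTerms(indexed, whitespaceTokens(text)).size()); // 3
    }
}
```

With a real index, the equivalent step is to hand MoreLikeThis the same analyzer instance that was used at indexing time via setAnalyzer().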




----- Original Message ----
From: Martin Braun <mbraun@uni-hd.de>
To: java-user@lucene.apache.org
Sent: Friday, 21 July, 2006 11:09:40 AM
Subject: Re: Problem finding similar documents with MoreLikeThis method.

Hello,

Inspired by this thread, I also tried to implement a MoreLikeThis
search, but I have the same problem of a null query.

I set the field name to a field that is stored in the index,
but "like" just returns null.

Here is my code:

    Hits hits = this.is.search(new TermQuery(new Term("katkey", Katkey)));
    System.out.println("DOCID:" + hits.id(0));
    System.out.println("hits:" + hits.doc(0).getField("kurz").stringValue());
    MoreLikeThis mlt = new MoreLikeThis(this.ir);
    mlt.setFieldNames(new String[] {"kurz"});
    mlt.setMinDocFreq(0);
    Query query = mlt.like(hits.id(0));
    System.out.println("QUERY:" + query);
    return this.query(query.toString(), 0, 10, 0);


The field "kurz" contains the following string:

003481627  M               <v>Swinton, Elizabeth DeSabato</v>: <a>¬The¬
graphic art of Onchi Koshiro</a> : innovation and tradition / Elizabeth
de Sab
ato Swinton
New York [u.a.]: Garland, <b>1986</b>. - XXVIII, 307, |&lt;180&gt;| S.
:
zahlr. Ill.
ISBN 0-8240-6868-8


Any ideas?

thanks in advance,
martin


Davide wrote:
> mark harwood wrote:
>> Does your index have only the one document?
>>
>> MoreLikeThis will only generate queries with terms that occur in at least "minDocFreq"
>> documents (default setting is 5).
>>
>> This is to avoid the large overheads associated with searching for very common words
>> in your example text.
>>
>>
>>
> 
> Thanks very much Mark.
> I have specified:
> 
> setMinDocFreq(0);
> 
> for testing and it works... the query is no longer empty and the search
> returns a document...
> So I think that was the "problem"...
> 
> Thank you
> Davide.
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
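
A minimal sketch of the minDocFreq behaviour discussed in the quoted messages, in plain Java with no Lucene dependency. It is a simplified model, not MoreLikeThis itself: the class and method names are hypothetical, and the inclusive threshold (a term is kept when its document frequency is at least minDocFreq) is an assumption of this sketch. It shows why a very small index can yield an empty query under the default of 5, and why setMinDocFreq(0) makes terms appear.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a minimum-document-frequency filter.
// This is NOT Lucene's implementation.
class MinDocFreqSketch {

    // Keep only terms whose document frequency reaches minDocFreq.
    // A minDocFreq of 0 or less disables the filter entirely.
    static List<String> selectTerms(Map<String, Integer> docFreqByTerm, int minDocFreq) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : docFreqByTerm.entrySet()) {
            if (minDocFreq > 0 && e.getValue() < minDocFreq) {
                continue; // term occurs in too few documents
            }
            kept.add(e.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up document frequencies for a tiny test index.
        Map<String, Integer> docFreq = new LinkedHashMap<>();
        docFreq.put("onchi", 1);
        docFreq.put("graphic", 2);
        docFreq.put("art", 3);

        // Threshold of 5: no term qualifies, so no query terms remain.
        System.out.println(selectTerms(docFreq, 5)); // []

        // Threshold of 0: the filter is off and every term is kept.
        System.out.println(selectTerms(docFreq, 0)); // [onchi, graphic, art]
    }
}
```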


