lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@lucene.com>
Subject Re: Problems with exact matces on non-tokenized fields...
Date Fri, 27 Sep 2002 18:24:08 GMT
lex Murzaku wrote:
> I was trying this as well but now I get something I can't understand:
> My query (Query: +element:POST +nr:3) is supposed to match only one
> record. Indeed Lucene returns that record with the highest score but it
> also returns others that shouldn't be there at all even if it was an OR
> query. Another observation: it returns all records where "nr" >= 3.
> Notice the last record returned doesn't contain neither "POST" nor "3".
> I am attaching a self contained running example with this problem and
> would appreciate any comment.
>  
> 0.6869936 Keyword<nr:3> Keyword<element:POST>
> 0.63916886 Keyword<nr:4> Keyword<element:POST>
> 0.6044586 Keyword<nr:6> Keyword<element:POST>
> 0.5773442 Keyword<nr:5> Keyword<element:POST>
> 0.56318253 Keyword<nr:9> Keyword<element:POST>
> 0.54449975 Keyword<nr:8> Keyword<element:POST>
> 0.5247468 Keyword<nr:7> Keyword<element:POST>
> 0.45054603 Keyword<nr:10> Keyword<element:GET>

Phew!  It took me a while to spot this one...

The bug is with your test program.  You keep adding fields to the same 
document instance.  If you change your program to print the entire 
document, you'll see:

Query: +element:POST +nr:3
0.6869936 Document<Keyword<element:POST> Keyword<nr:3> 
Keyword<element:POST> Keyword<nr:2> Keyword<element:POST> Keyword<nr:1>

Keyword<element:POST> Keyword<nr:0>>
0.63916886 Document<Keyword<element:POST> Keyword<nr:4> 
Keyword<element:POST> Keyword<nr:3> Keyword<element:POST> Keyword<nr:2>

Keyword<element:POST> Keyword<nr:1> Keyword<element:POST> Keyword<nr:0>>
0.6044586 Document<Keyword<element:POST> Keyword<nr:6> 
Keyword<element:POST> Keyword<nr:5> Keyword<element:POST> Keyword<nr:4>

Keyword<element:POST> Keyword<nr:3> Keyword<element:POST> Keyword<nr:2>

Keyword<element:POST> Keyword<nr:1> Keyword<element:POST> Keyword<nr:0>>
0.5773442 Document<Keyword<element:POST> Keyword<nr:5> 
Keyword<element:POST> Keyword<nr:4> Keyword<element:POST> Keyword<nr:3>

Keyword<element:POST> Keyword<nr:2> Keyword<element:POST> Keyword<nr:1>

Keyword<element:POST> Keyword<nr:0>>
0.56318253 Document<Keyword<element:POST> Keyword<nr:9> 
Keyword<element:POST> Keyword<nr:8> Keyword<element:POST> Keyword<nr:7>

Keyword<element:POST> Keyword<nr:6> Keyword<element:POST> Keyword<nr:5>

Keyword<element:POST> Keyword<nr:4> Keyword<element:POST> Keyword<nr:3>

Keyword<element:POST> Keyword<nr:2> Keyword<element:POST> Keyword<nr:1>

Keyword<element:POST> Keyword<nr:0>>
0.54449975 Document<Keyword<element:POST> Keyword<nr:8> 
Keyword<element:POST> Keyword<nr:7> Keyword<element:POST> Keyword<nr:6>

Keyword<element:POST> Keyword<nr:5> Keyword<element:POST> Keyword<nr:4>

Keyword<element:POST> Keyword<nr:3> Keyword<element:POST> Keyword<nr:2>

Keyword<element:POST> Keyword<nr:1> Keyword<element:POST> Keyword<nr:0>>
0.5247468 Document<Keyword<element:POST> Keyword<nr:7> 
Keyword<element:POST> Keyword<nr:6> Keyword<element:POST> Keyword<nr:5>

Keyword<element:POST> Keyword<nr:4> Keyword<element:POST> Keyword<nr:3>

Keyword<element:POST> Keyword<nr:2> Keyword<element:POST> Keyword<nr:1>

Keyword<element:POST> Keyword<nr:0>>
0.45054603 Document<Keyword<element:GET> Keyword<nr:10> 
Keyword<element:POST> Keyword<nr:9> Keyword<element:POST> Keyword<nr:8>

Keyword<element:POST> Keyword<nr:7> Keyword<element:POST> Keyword<nr:6>

Keyword<element:POST> Keyword<nr:5> Keyword<element:POST> Keyword<nr:4>

Keyword<element:POST> Keyword<nr:3> Keyword<element:POST> Keyword<nr:2>

Keyword<element:POST> Keyword<nr:1> Keyword<element:POST> Keyword<nr:0>>

So you need to create a new document instance each time.  I've attached 
a modified version of your test program that does this and gives the 
results you desire:

Query: +element:POST +nr:3
1.0 Document<Keyword<element:POST> Keyword<nr:3>>

Doug

Mime
View raw message