lucene-dev mailing list archives

From Christoph Goller <gol...@detego-software.de>
Subject Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/index MultiReader.java FilterIndexReader.java IndexReader.java SegmentReader.java
Date Tue, 20 Apr 2004 08:25:10 GMT
Incze Lajos wrote:

> I'm putting my findings here, as they seem related to me. In a mid-size
> corpus I've found the following mystery:
> 
> 1) +SZIDO:"jan 1"                                    -- 92 hits
> 2) +SZIDO:"jan 1" +TYPE:ER-CIKK                      -- 433 hits
> 3) +SZIDO:"jan 1" +TYPE:ER-CIKK NONSENSE:nonsense    -- 92 hits
> 
> 2) is obviously nonsense. The NONSENSE field in the 3rd query
> does not exist. Although I do not understand what's happening,
> and couldn't produce a revealing test case, I've found that if
> I switch off the ConjunctionScorer optimization (the same way
> as the 3rd query switched it off) by inserting
> 
> ///////////////////////////////////////////////////////////////////
>       allRequired = false;
> ///////////////////////////////////////////////////////////////////
>       if (allRequired && noneBoolean) {           // ConjunctionScorer is okay
> 
> this bug disappears. Also, I found that (at least for me) only the
> PhraseQuery produces this result. If I replace the 2nd query with
> 
> 2A) +SZIDO:(+jan +1) +TYPE:ER-CIKK
> 
> I get the (correct) 92 hits. I'm almost sure that there is something
> wrong with the document order and skipTo that is specific to the
> PhraseQuery.
> 
> incze

Hi Incze,

This looks like the bug in PhraseScorer that I fixed last week (discovered by
Daniel). Could you verify whether the strange behavior still shows up with the
current CVS-version of Lucene. You may use your old index. Reindexing is not
necessary.
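
As a rough illustration of why result 2) cannot be right: a conjunction of required clauses can never match more documents than any single one of them, so 433 hits for a query that contains a 92-hit clause points to a scorer breaking the iteration contract. The sketch below is not the actual Lucene code. DocIterator, countMatches and the doc IDs are hypothetical names and values made up for the example. It only shows the general technique, which assumes every sub-scorer returns doc IDs in increasing order and that skipTo(target) lands on the first doc >= target.

```java
public class ConjunctionSketch {

    /** Hypothetical stand-in for a scorer: walks a strictly increasing list of doc IDs. */
    static final class DocIterator {
        private final int[] docs;
        private int pos = -1;

        DocIterator(int... docs) { this.docs = docs; }

        /** Moves to the next doc; returns false when exhausted. */
        boolean next() { return ++pos < docs.length; }

        /** Moves to the first doc >= target; returns false when exhausted. */
        boolean skipTo(int target) {
            if (pos < 0) pos = 0;
            while (pos < docs.length && docs[pos] < target) pos++;
            return pos < docs.length;
        }

        int doc() { return docs[pos]; }
    }

    /** Counts docs present in both iterators by leapfrogging with skipTo. */
    static int countMatches(DocIterator a, DocIterator b) {
        int count = 0;
        if (!a.next() || !b.next()) return 0;
        while (true) {
            int da = a.doc();
            int db = b.doc();
            if (da == db) {
                count++;                       // both clauses match this doc
                if (!a.next() || !b.next()) return count;
            } else if (da < db) {
                if (!a.skipTo(db)) return count; // a is behind: catch up to b
            } else {
                if (!b.skipTo(da)) return count; // b is behind: catch up to a
            }
        }
    }

    public static void main(String[] args) {
        DocIterator phrase = new DocIterator(1, 4, 7, 9);
        DocIterator type = new DocIterator(4, 5, 9, 12);
        System.out.println(countMatches(phrase, type));
    }
}
```

If one iterator's skipTo stopped before the target or returned docs out of order, the loop above would compare the wrong positions and the count would no longer be bounded by the smaller clause.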

Thanks,
Christoph




