lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From eks dev <>
Subject Re: speed of BooleanQueries on 2.9
Date Thu, 16 Jul 2009 12:40:26 GMT

ok new facts, less chaos :) 

- LUCENE-1744 fixed it definitely; I have it confirmed 
Also, we found another example of the Query that was stuck (t1 t2 t3)~2 ... this is also fixed
with LUCENE-1744

Re:  "some queries are 4X slower  than before".  Was that a different issue?  (Because this
issue is "the query runs forever").

Maybe :) I do not know. 
When I wrote this email about "the query runs forever" I did not know if this slowdown is
the same or different issue... I have just reported some unusual observation (4 times slower)
and was later convinced that this stuck Query confirms the same problem ....

Now, I do not know  if that was the same effect, or wrong measurement, or something else lurking
... Good point, will try to repeat test on this slowdown...

Just a reminder This 4_times_slower Query is different:
+(a b c) +(x y z)

+((NAME:hans NAME:hahns^0.23232001 NAME:hams^0.27648002 NAME:hamz^0.25392 NAME:hanas^0.18722998
NAME:hanbs^0.18722998 NAME:hanfs^0.18722998 NAME:hangs^0.18722998 NAME:hanhs^0.24030754 NAME:hanis^0.18722998
NAME:hanjs^0.18722998 NAME:hanks^0.18722998 NAME:hanms^0.18722998 NAME:hanos^0.18722998 NAME:hanrs^0.18722998
NAME:hansb^0.20172001 NAME:hansd^0.20172001 NAME:hansf^0.20172001 NAME:hansg^0.20172001 NAME:hansi^0.20172001
NAME:hansj^0.20172001 NAME:hansk^0.20172001 NAME:hansl^0.20172001 NAME:hansn^0.20172001 NAME:hanso^0.20172001
NAME:hansp^0.20172001 NAME:hanst^0.20172001 NAME:hansu^0.20172001 NAME:hansw^0.20172001 NAME:hansy^0.20172001
NAME:hansz^0.20172001 NAME:hants^0.18722998 NAME:hanus^0.18722998 NAME:hanws^0.18722998 NAME:hehns^0.20172001
NAME:hens^0.2736075 NAME:hins^0.24843 NAME:hons^0.24843 NAME:huhns^0.1801875 NAME:huns^0.24843)^2.0)

+(((ZIPS:berlin ZIPS:barlin^0.28227 ZIPS:berien^0.25947002 ZIPS:berling^0.23232001 ZIPS:perlin^0.26133335))^1.2)


----- Original Message ----
> From: Michael McCandless <>
> To:
> Sent: Thursday, 16 July, 2009 13:52:06
> Subject: Re: speed of BooleanQueries on 2.9
> On Thu, Jul 16, 2009 at 6:38 AM, eks devwrote:
> > and this String has exactly that form
> > (x OR y OR z) OR (a OR b OR c),
> > That is exactly how I construct the Query, have a look at brackets on this 
> toString result .
> Duh!  OK, I had missed that your large query actually had 2 clauses at
> the top!  Sigh.
> OK, that part of the puzzle now at least makes sense.  The rewrite()
> of your query will not reduce to a single OR query (as I previously
> thought).
> So in fact you have a BS at the top (because you called
> setAllowDocsOutOfOrder(true)), with 2 clauses, and each of those
> clauses uses BS2 to score.
> I think advance() is not involved, but LUCENE-1744 could very well
> have fixed this, because BS calls sub.scorer.docID() when interacting
> with its sub-scorers, and due to LUCENE-1744, that would always return
> -1 from a BS2, so BS could enter an infinite loop.
> If you run w/o the fix for LUCENE-1744, with my instrumentation, I can
> confirm this.  But I think likely this is it.
> Also: you started this thread by saying "some queries are 4X slower
> than before".  Was that a different issue?  (Because this issue is
> "the query runs forever").
> Mike
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message