lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: performance boost through multithreaded query processing?
Date Thu, 30 Oct 2008 18:06:52 GMT

: We improved the performance through caching the bitsets of the single 
: fuzzy query/wildcard query.

: Within our logs we can see that combined queries within a BooleanQuery 
: are processed sequentially. So our question is: Does it make sense for 
: you to parallelize the processing of the queries within a boolean query 
: (with a limit on the number of queries processed in parallel)? With 

in spite of its name, a BooleanQuery doesn't just provide boolean logic on 
sets -- it's actually computing aggregate scoring information based on the 
subscores.  that's not something that i can imagine being easy to do if 
you try to parallelize the processing of the subclauses.

based on your first comment however, it sounds like you don't care about 
scoring -- if that's the case, then sure: instead of using a BooleanQuery, 
compute your separate sets (in parallel) and then intersect/union them.
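
a rough sketch of that idea in java, assuming each clause can be wrapped in a Callable that returns a java.util.BitSet of matching doc ids. ParallelClauseSets, computeAll and maxThreads are made-up names for illustration, not Lucene APIs; the fixed-size pool is just one way to cap how many clauses run at once.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Lucene: each Callable stands in for
// "run one sub-query and collect its matching doc ids into a BitSet".
final class ParallelClauseSets {

    // Evaluates every clause on a bounded thread pool and returns the
    // resulting BitSets in the same order as the input list.
    static List<BitSet> computeAll(List<Callable<BitSet>> clauses, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<BitSet> results = new ArrayList<>();
            // invokeAll blocks until every task has completed
            for (Future<BitSet> f : pool.invokeAll(clauses)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while evaluating clauses", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a clause failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Intersection: a doc survives only if every set contains it.
    static BitSet intersect(List<BitSet> sets) {
        if (sets.isEmpty()) return new BitSet();
        BitSet out = (BitSet) sets.get(0).clone(); // clone so inputs stay untouched
        for (BitSet s : sets.subList(1, sets.size())) out.and(s);
        return out;
    }

    // Union: a doc survives if any set contains it.
    static BitSet union(List<BitSet> sets) {
        BitSet out = new BitSet();
        for (BitSet s : sets) out.or(s);
        return out;
    }
}
```

since no scores are involved, the order in which the clause sets are combined doesn't matter, which is what makes the parallel step safe here.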

: Could there be drawbacks combining the results of the boolean clauses? 

one gotcha i can imagine is dealing with SHOULD clauses ... if your 
boolean queries are all MUST and MUST_NOT life is easy, but trying to 
apply set intersection logic with SHOULD clauses gets interesting (1 is 
fine, 2 get unioned, 2 with a MUST get unioned and then intersected with 
the MUST, etc...)
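
the combination rule in that parenthetical could look something like this, assuming the per-clause sets are already computed as java.util.BitSets. ClauseSetCombiner is a hypothetical helper, not a Lucene class; it follows the rule as described here rather than claiming to reproduce BooleanQuery's exact matching behavior, and treating a query with no MUST or SHOULD clause as matching nothing is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not a Lucene API: combines precomputed per-clause
// doc-id sets using plain set logic, with no scoring involved.
final class ClauseSetCombiner {

    static BitSet combine(List<BitSet> must, List<BitSet> should, List<BitSet> mustNot) {
        BitSet result = null;

        // SHOULD: a single clause is used as-is, several get unioned
        if (!should.isEmpty()) {
            result = new BitSet();
            for (BitSet s : should) result.or(s);
        }

        // MUST: intersect with whatever has been collected so far
        for (BitSet m : must) {
            if (result == null) {
                result = (BitSet) m.clone(); // clone so the input stays untouched
            } else {
                result.and(m);
            }
        }

        // Assumption: without any MUST or SHOULD clause, nothing matches
        if (result == null) return new BitSet();

        // MUST_NOT: remove excluded docs last
        for (BitSet n : mustNot) result.andNot(n);
        return result;
    }

    // Small convenience for building a set from doc ids.
    static BitSet bits(int... docIds) {
        BitSet b = new BitSet();
        for (int id : docIds) b.set(id);
        return b;
    }

    public static void main(String[] args) {
        BitSet hits = combine(
                Collections.singletonList(bits(2, 3, 5)),   // one MUST
                Arrays.asList(bits(1, 2), bits(3, 4)),      // two SHOULDs
                Collections.singletonList(bits(3)));        // one MUST_NOT
        System.out.println(hits); // prints {2}
    }
}
```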

-Hoss

