lucene-java-user mailing list archives

From Arun Kumar K <arunk...@gmail.com>
Subject DocIDBitSets & Grouping
Date Mon, 24 Jun 2013 10:54:57 GMT
Hi Guys,

I am using Lucene 4.2.

1> For my use case I run a search, say name:xyz*, and then need to do a
grouping (with the same query, name:xyz*, plus a Filter and a group Sort),
possibly in the same thread or in a different one.

From my understanding the second internal search will be faster, but I have
a good number of threads doing the same with different queries, which may
affect the IO cache.

Still, I don't want to perform the same search internally again for grouping.

Reusing the previous search results by keeping them in a bitset and using
BitsFilteredDocIdSet to filter may solve the filtering part, but is there any
way to wrap these result DocIdSets as input for grouping?
Or is there any smarter way?
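
To make the idea concrete, here is a rough sketch of grouping over already-collected doc IDs. It uses plain java.util.BitSet rather than Lucene's classes, and everything in it (the CachedHitsGrouping class, the group method, the groupValueOf array standing in for a per-document field lookup) is a hypothetical name chosen for illustration, not part of the Lucene API. It also leaves out the group sort.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Lucene class: groups cached hits
// without running the original query a second time.
final class CachedHitsGrouping {

    /**
     * @param cachedHits   doc IDs matched by the first search (e.g. name:xyz*)
     * @param acceptDocs   doc IDs allowed by the additional filter
     * @param groupValueOf group value of each document, indexed by doc ID
     *                     (assumed to be available in memory)
     * @return group value -> matching doc IDs, in increasing doc ID order
     */
    static Map<String, List<Integer>> group(BitSet cachedHits,
                                            BitSet acceptDocs,
                                            String[] groupValueOf) {
        // Work on a copy so the cached hits stay reusable by other threads.
        BitSet filtered = (BitSet) cachedHits.clone();
        filtered.and(acceptDocs);

        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int doc = filtered.nextSetBit(0); doc >= 0; doc = filtered.nextSetBit(doc + 1)) {
            groups.computeIfAbsent(groupValueOf[doc], k -> new ArrayList<>()).add(doc);
        }
        return groups;
    }
}
```

The sketch only covers the "reuse the hits, then filter, then group" flow. Scores are not kept in a bitset, so any group sort that depends on relevance would need them stored separately.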


2> For an AND query I have tried:
a) BooleanQuery, Query
b) FieldCacheTermsFilter
c) DocIdBitSet + BitsFilteredDocIdSet
on a 1 GB index, with 4 lakh (400,000) documents matching the first query and
2 lakh (200,000) documents matching the second query, but retrieving/collecting
only 10,000 documents.
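
As a minimal sketch of what approach (c) amounts to, the following intersects two sets of matching doc IDs and collects at most a fixed number of them. It uses java.util.BitSet only. BitSetAnd and collect are hypothetical names for illustration, not Lucene classes, and no scoring is involved.

```java
import java.util.BitSet;

// Hypothetical helper, not a Lucene class.
final class BitSetAnd {

    /** Returns up to maxDocs doc IDs present in both sets, in increasing doc ID order. */
    static int[] collect(BitSet first, BitSet second, int maxDocs) {
        // Intersect on a copy so neither input set is modified.
        BitSet both = (BitSet) first.clone();
        both.and(second);

        int n = Math.min(maxDocs, both.cardinality());
        int[] out = new int[n];
        int doc = both.nextSetBit(0);
        for (int i = 0; i < n; i++) {
            out[i] = doc;
            doc = both.nextSetBit(doc + 1);
        }
        return out;
    }
}
```

Because the documents come back in doc ID order rather than by relevance, timings from this style of collection are not directly comparable with a scored BooleanQuery search.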

With prior warming I find that (a) and (b) take almost the same time. I know
that a Filter only pays off when it is reused.
(c) takes around 30-40 ms less time.

Can we conclude from this that method (c) is better?
Is my choice of bitset implementation appropriate?

Did I get something wrong, and are there any smarter ways to do these?

Thanks,
Arun
