lucene-java-user mailing list archives

From Carsten Schnober
Subject SpanQuery and Bits
Date Thu, 06 Dec 2012 09:54:55 GMT

I have a problem understanding and applying the BitSets concept in
Lucene 4.0. Unfortunately, there does not seem to be a lot of
documentation about the topic.

The general task is to extract the Spans matching a SpanQuery, which
works with the following snippet:

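for (AtomicReaderContext atomic : reader.getContext().leaves()) {
  // Third argument was lost in the archive; an empty term-context map is assumed.
  Spans spans = query.getSpans(atomic, new Bits.MatchAllBits(0),
      new HashMap<Term, TermContext>());
  while (spans.next()) {
    // extract payloads etc.
  }
}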

I understand that the acceptDocs parameter in SpanQuery.getSpans()
restricts the search to a set of documents. In the example given above,
it searches all documents (Bits.MatchAllBits), right?

What I would like to do is generate a Bits object that is based on a
BooleanQuery beforehand in order to restrict the search through
getSpans() to a set of documents that contain certain terms.
I also have a MultiReader object that handles multiple indexes.
My intuitive approach would be to apply a QueryWrapperFilter like this:

MultiReader reader = ...
BooleanQuery bq = ...
DocIdSet bitset = ???;
Filter filter = new QueryWrapperFilter(bq);
for (AtomicReaderContext context : reader.getContext().leaves()) {
  filter.getDocIdSet(context, new Bits.MatchAllBits(0));
}

The obvious question is: how do I handle the context bitsets returned by
getDocIdSet() correctly so that I can pass the 'bitset' variable to the
getSpans() call?

Or am I on the wrong path for this kind of problem?
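
To make the per-context aspect of the question concrete, here is a
minimal sketch of the idea in plain Java, not actual Lucene code. It
assumes that each leaf numbers its documents locally from 0, so an accept
set has to be built and consulted once per leaf rather than once for the
whole MultiReader. The names AcceptDocs, Leaf, and restrict are
hypothetical stand-ins (for Bits, AtomicReaderContext, and the filtering
step) and are modeled with java.util.BitSet only.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class PerLeafAcceptDocs {
    /** Hypothetical stand-in for Lucene's Bits: is this leaf-local doc accepted? */
    interface AcceptDocs {
        boolean get(int localDocId);
    }

    /** Hypothetical stand-in for one leaf (segment) of a composite reader. */
    static final class Leaf {
        final int docBase;          // global ID of this leaf's first document
        final BitSet filterMatches; // leaf-local IDs matched by the restricting query
        final int[] spanDocs;       // leaf-local IDs in which the span query matches

        Leaf(int docBase, BitSet filterMatches, int[] spanDocs) {
            this.docBase = docBase;
            this.filterMatches = filterMatches;
            this.spanDocs = spanDocs;
        }
    }

    /** Returns the global IDs of span matches that pass their leaf's accept set. */
    static List<Integer> restrict(List<Leaf> leaves) {
        List<Integer> hits = new ArrayList<>();
        for (Leaf leaf : leaves) {
            // One accept set per leaf, indexed by leaf-local document IDs.
            AcceptDocs accept = leaf.filterMatches::get;
            for (int localDoc : leaf.spanDocs) {
                if (accept.get(localDoc)) {
                    // Convert to a global ID only when reporting the hit.
                    hits.add(leaf.docBase + localDoc);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BitSet first = new BitSet();
        first.set(0);
        first.set(2);
        BitSet second = new BitSet();
        second.set(1);

        List<Leaf> leaves = new ArrayList<>();
        leaves.add(new Leaf(0, first, new int[] {0, 1, 2}));
        leaves.add(new Leaf(3, second, new int[] {0, 1}));

        System.out.println(restrict(leaves)); // prints [0, 2, 4]
    }
}
```

In this model there is no single 'bitset' variable shared across leaves;
each iteration produces its own accept set, which would correspond to
passing a per-context Bits object to getSpans() inside the same loop.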

Institut für Deutsche Sprache
Projekt KorAP
Tel. +49-(0)621-43740789
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
