lucene-java-user mailing list archives

From Carsten Schnober
Subject SpanQuery, Filter, BooleanQuery
Date Mon, 29 Oct 2012 12:40:47 GMT
I have a setup in which I run an arbitrary query over one field
(typically realised through a WildcardQuery). The matches are retrieved
through a SpanQuery because the result payloads are further processed
using Spans.getPayload(). This works fine with the following code
(extract), using Lucene 4.0.0:

// these fields are initialized externally through public methods:
private final MultiReader reader;
private final String termString;
private final String fieldname;
private final int maxHits;

private Map<Term, TermContext> termContexts = new HashMap<>();
WildcardQuery wildcard;
Term term = new Term(fieldname, termString);
SpanQuery query;	// Lucene query
Spans luceneSpans;

wildcard = new WildcardQuery(term);
query = (SpanQuery) new

for (AtomicReaderContext atomic : reader.getContext().leaves()) {
  spans = query.getSpans(atomic, matchingTitleIDs.bits(), termContexts);
  while ( && total <= maxHits) {

Now, I'd like to add the option to filter the resulting Spans object by
another WildcardQuery on a different field that contains document
titles. My intuitive approach would have been to use a filter like this:

Filter filter = new QueryWrapperFilter(new WildcardQuery(new
Term(titlefield, titles)));

The filter is applied in a dedicated method with this line:

DocIdSet matchingTitleIDs = filter.getDocIdSet(context, new

Subsequently, the getSpans() call from above is replaced by:

spans = query.getSpans(atomic, matchingTitleIDs.bits(), termContexts);

However, this either throws a NullPointerException when there are no
hits, or has no effect on the results compared to running without a
filter.
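For reference, the Lucene 4.0 Javadoc notes that Filter.getDocIdSet() may
return null when no document is accepted, and that DocIdSet.bits() is
optional and may return null when random access is not supported. Either
case could plausibly be involved in the symptoms described above. The
sketch below illustrates a null-safe way to handle both cases. It
deliberately uses small stand-in interfaces so that it runs without Lucene
on the classpath: Bits and DocIdSet here are simplified, hypothetical
stand-ins rather than the Lucene classes, and AcceptDocs.from() is a
made-up helper name.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;

// Hypothetical stand-in for a random-access view over document IDs.
interface Bits {
    boolean get(int docId);
}

// Hypothetical stand-in for a filter result; both methods may return null.
interface DocIdSet {
    PrimitiveIterator.OfInt iterator();   // null means "no documents match"

    default Bits bits() {                 // null means "no random access"
        return null;
    }
}

final class AcceptDocs {
    private AcceptDocs() {}

    // Always returns a non-null Bits, whatever the filter produced.
    static Bits from(DocIdSet set, int maxDoc) {
        if (set == null) {
            return docId -> false;        // the filter matched nothing
        }
        Bits direct = set.bits();
        if (direct != null) {
            return direct;                // random access is available
        }
        // Fall back to collecting the iterator's doc IDs into a bit set.
        BitSet collected = new BitSet(maxDoc);
        PrimitiveIterator.OfInt it = set.iterator();
        while (it != null && it.hasNext()) {
            collected.set(it.nextInt());
        }
        return collected::get;
    }
}
```

With the real API, the analogous non-null Bits would be the value passed
as the acceptDocs argument of getSpans(). As far as I understand, passing
null there means "accept all documents", which would match the observation
that the filter has no effect.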

I've come across the thread "lucene-4.0: QueryWrapperFilter & docBase"
[1], in which Uwe suggests, if I understand correctly, not using
QueryWrapperFilter in such a scenario but expressing the restriction as
another Query and combining the two with a BooleanQuery. Does this
still hold for Lucene 4.0? However, I am not sure how to use a
BooleanQuery here because I need the SpanQuery result.

Any thoughts about what I'm doing wrong and how to fix this?
Thank you very much!


Institut für Deutsche Sprache
Projekt KorAP
Tel. +49-(0)621-43740789
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
