lucene-dev mailing list archives

From "Christoph Goller (JIRA)" <>
Subject [jira] [Commented] (LUCENE-7398) Nested Span Queries are buggy
Date Wed, 03 Aug 2016 15:26:20 GMT


Christoph Goller commented on LUCENE-7398:

The whole idea of the patch is to change the order of the matches returned by SpanOrQuery.

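// FIELD is the name of the indexed field (defined elsewhere in the test).
SpanTermQuery q2 = new SpanTermQuery(new Term(FIELD, "w2"));
SpanTermQuery q3 = new SpanTermQuery(new Term(FIELD, "w3"));
// Ordered near query with slop 0: matches "w2" immediately followed by "w3".
SpanNearQuery q23 = new SpanNearQuery(new SpanQuery[]{q2, q3}, 0, true);
// Matches either "w2" alone or the phrase "w2 w3".
SpanOrQuery q223 = new SpanOrQuery(q2, q23);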

For a document containing "w1 w2 w3 w4", query q223 now returns "w2 w3" (the longer match)
first and then "w2", whereas formerly it was the other way round. Both matches have the
same start position but different end positions, and the contract for spans says that if
start positions are equal, we first get the match with the lower end position (see the
Javadoc of Spans).
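
The ordering contract described above can be illustrated without Lucene. What follows is a
minimal sketch, not the patch and not Lucene's implementation: the class SpanOrderSketch,
its Match type and inContractOrder are hypothetical names, and it assumes that w2 sits at
position 1, w3 at position 2, and that end positions are exclusive (one past the last
matched term).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SpanOrderSketch {
    /** A match as a position range [start, end); end is assumed to be exclusive. */
    static final class Match {
        final int start;
        final int end;

        Match(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Order stated in the comment: by start position, ties broken by lower end position. */
    static final Comparator<Match> CONTRACT_ORDER =
        Comparator.<Match>comparingInt(m -> m.start).thenComparingInt(m -> m.end);

    /** Returns a sorted copy of the given matches; the input list is left untouched. */
    static List<Match> inContractOrder(List<Match> matches) {
        List<Match> sorted = new ArrayList<>(matches);
        sorted.sort(CONTRACT_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        // Order reported with the patch for "w1 w2 w3 w4": "w2 w3" first, then "w2".
        List<Match> patchedOrder = Arrays.asList(new Match(1, 3), new Match(1, 2));
        // prints [[1,2), [1,3)]
        System.out.println(inContractOrder(patchedOrder));
    }
}
```

Under these assumptions the contract puts "w2" ([1,2)) before "w2 w3" ([1,3)), which is the
reverse of the order the patched SpanOrQuery returns.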

> Nested Span Queries are buggy
> -----------------------------
>                 Key: LUCENE-7398
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 5.5, 6.x
>            Reporter: Christoph Goller
>            Assignee: Alan Woodward
>            Priority: Critical
>         Attachments: LUCENE-7398.patch, LUCENE-7398.patch,
>
> Example for a nested SpanQuery that is not working:
> Document: Human Genome Organization , HUGO , is trying to coordinate gene mapping research
> Query: spanNear([body:coordinate, spanOr([spanNear([body:gene, body:mapping], 0, true),
> body:gene]), body:research], 0, true)
> The query should match "coordinate gene mapping research" as well as "coordinate gene
> research". It does not match "coordinate gene mapping research" with Lucene 5.5 or 6.1,
> although it did match with Lucene 4.10.4. It probably stopped working with the changes to
> SpanQueries in 5.3. I will attach a unit test that shows the problem.
