lucene-java-user mailing list archives

From Kenny Wong <kw...@proofpoint.com>
Subject Nested SpanQuery issue
Date Thu, 24 Jan 2019 23:56:13 GMT
Hi,

"one two three four five six"

We are unable to match the text above using the following query (a small reproducer is at the bottom):

    spanNear([spanNear([f:one, spanOr([f:two, f:three])], 1, true), f:five], 1, true)

In human-readable form this is "one W~1 (two OR three) W~1 five": "one" followed within a
slop of 1 by "two" or "three", and that span in turn followed within a slop of 1 by "five".

We think it should match as "<b>one</b> two <b>three</b> four <b>five</b>",
but it seems the inner spanNear accepts "one two" as satisfying its criteria and never considers
"one ... three", which is the span required for an overall match. If we increase both slops to 2,
we do get a match. However, a slop of 1 should be sufficient here.
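
To spell out the arithmetic behind that expectation, here is a minimal sketch in plain Java. It is not Lucene code and does not claim to reproduce how SpanNearQuery is implemented. It assumes 0-based token positions, spans written as [start, end), and slop counted as the number of positions between the end of one clause and the start of the next. The class SlopSketch and its helper methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SlopSketch {

    // Whitespace-tokenized field value; the list index is the token position.
    static final List<String> TOKENS =
        Arrays.asList("one", "two", "three", "four", "five", "six");

    // A single term covers the span [position, position + 1).
    static int[] termSpan(String term) {
        int pos = TOKENS.indexOf(term);
        return new int[] {pos, pos + 1};
    }

    // Ordered two-clause "near": the gap between the end of the first span and
    // the start of the second must be between 0 and slop. Returns the combined
    // span, or null when the clauses are out of order or too far apart.
    static int[] orderedNear(int[] first, int[] second, int slop) {
        int gap = second[0] - first[1];
        if (gap < 0 || gap > slop) {
            return null;
        }
        return new int[] {first[0], second[1]};
    }

    // Evaluates the nested query when the inner spanOr contributes only orTerm.
    static boolean matchesWithInner(String orTerm, int slop) {
        int[] inner = orderedNear(termSpan("one"), termSpan(orTerm), slop);
        if (inner == null) {
            return false;
        }
        return orderedNear(inner, termSpan("five"), slop) != null;
    }

    public static void main(String[] args) {
        // Inner span "one two" is [0, 2); "five" starts at 4, so the gap is 2.
        System.out.println("slop 1, inner one..two:   " + matchesWithInner("two", 1));
        // Inner span "one two three" is [0, 3); the gap to "five" is 1.
        System.out.println("slop 1, inner one..three: " + matchesWithInner("three", 1));
        // With slop 2 the shorter inner span is already close enough.
        System.out.println("slop 2, inner one..two:   " + matchesWithInner("two", 2));
    }
}
```

Under these assumptions, the overall query can only match at slop 1 if the inner spanNear also reports the longer span ending at "three". If it reports only the span ending at "two", the outer gap is 2, which would explain both the miss at slop 1 and the hit at slop 2.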

Could this be a bug with SpanNearQuery?

Thank you,
Kenny Wong

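// Imports assume Lucene 7.x (lucene-core plus lucene-analyzers-common),
// where TopDocs.totalHits is a long and RAMDirectory is available.
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.RAMDirectory;

public class LuceneTest {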

    public static void main(String[] args) throws Exception {
        RAMDirectory mem = new RAMDirectory();
        IndexWriter writer = new IndexWriter(mem,
            new IndexWriterConfig(new WhitespaceAnalyzer()));
        try {
            Document doc = new Document();
            Field f = new TextField("f", "one two three four five six", Store.NO);
            doc.add(f);
            writer.addDocument(doc);
        }
        finally {
            writer.close();
        }

        SpanQuery q = newSpanNear(1,
            newSpanNear(1, newSpanTerm("one"), newSpanOr(newSpanTerm("two"), newSpanTerm("three"))),
            newSpanTerm("five"));

        try (DirectoryReader reader = DirectoryReader.open(mem)) {
            TopDocs topDocs = new IndexSearcher(reader).search(q, 1);
            System.out.println(1 == topDocs.totalHits);
        }
    }

    static SpanQuery newSpanTerm(String text) {
        return new SpanTermQuery(new Term("f", text));
    }

    static SpanQuery newSpanNear(int slop, SpanQuery... clauses) {
        return new SpanNearQuery(clauses, slop, true);
    }

    static SpanQuery newSpanOr(SpanQuery... clauses) {
        return new SpanOrQuery(clauses);
    }
}
