lucene-java-user mailing list archives

From Michael Peterson <quu...@gmail.com>
Subject Grouping in Lucene queries giving unexpected results
Date Thu, 16 Feb 2017 18:42:41 GMT
I have a question about the meaning and behavior of grouping in Lucene
queries.

In particular, here is the scenario I am testing. I have indexed 1,000
documents.

|---+-------------------------------------------+---------------|
| # | Query String                              | Result (Hits) |
|---+-------------------------------------------+---------------|
| 1 | *:*                                       |          1000 |
| 2 | host:host_1                               |            46 |
| 3 | location:location_5                       |           100 |
| 4 | host:host_1 AND NOT location:location_5   |            37 |
| 5 | host:host_1 AND (NOT location:location_5) |             0 |
|---+-------------------------------------------+---------------|

I don't understand why the last query returns 0. I would expect queries 4
and 5 to return the same result.

Here's the interpretation based on running it through the Lucene
classic.QueryParser:

|-------------------------------------------+--------------------------------------|
| Query String                              | QueryParser.parse(qry).toString()    |
|-------------------------------------------+--------------------------------------|
| host:host_1 AND NOT location:location_5   | +host:host_1 -location:location_5    |
| host:host_1 AND (NOT location:location_5) | +host:host_1 +(-location:location_5) |
|-------------------------------------------+--------------------------------------|

I'd like some help understanding why I'm getting this unintuitive behavior.
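
One possible reading of the two parsed forms above can be sketched as a toy
model in plain Java. This is not Lucene code: the ToyBooleanModel class, its
Bool type, and the synthetic document ids are all hypothetical, chosen only
so that the counts match the first table. The one rule it assumes is that a
boolean query made up solely of prohibited (-) clauses has nothing to
subtract from and therefore matches no documents.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of required (+) and prohibited (-) clauses. NOT Lucene code.
final class ToyBooleanModel {

    /** A query decides, per document id, whether that document matches. */
    interface Query {
        boolean matches(int doc);
    }

    /** Boolean query holding only required and prohibited clauses. */
    static final class Bool implements Query {
        private final List<Query> must = new ArrayList<>();
        private final List<Query> mustNot = new ArrayList<>();

        Bool must(Query q) { must.add(q); return this; }
        Bool mustNot(Query q) { mustNot.add(q); return this; }

        @Override
        public boolean matches(int doc) {
            // Assumption: prohibited clauses only filter what positive
            // clauses select, so a purely negative query matches nothing.
            if (must.isEmpty()) return false;
            for (Query q : must) if (!q.matches(doc)) return false;
            for (Query q : mustNot) if (q.matches(doc)) return false;
            return true;
        }
    }

    static final int NUM_DOCS = 1000;

    // Synthetic data: 46 docs on host_1, 100 docs at location_5, 9 in both.
    static final Query HOST_1 = doc -> doc < 46;
    static final Query LOCATION_5 = doc -> doc >= 37 && doc < 137;
    static final Query ALL_DOCS = doc -> true;

    static int count(Query q) {
        int hits = 0;
        for (int doc = 0; doc < NUM_DOCS; doc++) {
            if (q.matches(doc)) hits++;
        }
        return hits;
    }

    /** Models +host:host_1 -location:location_5 */
    static Query query4() {
        return new Bool().must(HOST_1).mustNot(LOCATION_5);
    }

    /** Models +host:host_1 +(-location:location_5) */
    static Query query5() {
        return new Bool().must(HOST_1).must(new Bool().mustNot(LOCATION_5));
    }

    /** Models +host:host_1 +(+*:* -location:location_5) */
    static Query query5Anchored() {
        Query inner = new Bool().must(ALL_DOCS).mustNot(LOCATION_5);
        return new Bool().must(HOST_1).must(inner);
    }

    public static void main(String[] args) {
        System.out.println("query 4          = " + count(query4()));
        System.out.println("query 5          = " + count(query5()));
        System.out.println("query 5 anchored = " + count(query5Anchored()));
    }
}
```

In this model the nested group in query 5 is a required clause that can
never match, so the outer query returns nothing. Giving the group a positive
clause to subtract from, as in host:host_1 AND (*:* AND NOT
location:location_5), brings the count back in line with query 4. Whether
the real parser and searcher behave exactly this way in 5.5.0 is something
to confirm against the Lucene documentation.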

Also, I see that the StandardSyntaxParser generates a different query
string:

|-------------------------------------------+-------------------------------------------------|
| Query String                              | StandardSyntaxParser.parse(qry).toQueryString() |
|-------------------------------------------+-------------------------------------------------|
| host:host_1 AND NOT location:location_5   | host:host_1 AND -location:location_5            |
| host:host_1 AND (NOT location:location_5) | host:host_1 AND ( -location:location_5 )        |
|-------------------------------------------+-------------------------------------------------|

Are these equivalent in Lucene? Should I stop using the classic.QueryParser?

*Details*

Using Lucene 5.5.0.
Using classic.QueryParser; the query code is:

    Directory directory = FSDirectory.open(getCurrentDirectory().toPath());
    StandardAnalyzer analyzer = new StandardAnalyzer();
    DirectoryReader reader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);
    QueryParser parser = new QueryParser("ts", analyzer);
    Query query = parser.parse("host:host_1 AND NOT location:location_5");

    int limit = 1000;
    TopDocs hits = searcher.search(query, limit);
    System.out.println("hits.totalHits = " + hits.totalHits);


Thanks very much for your insights here.

-Michael Peterson
