lucene-java-user mailing list archives

From Michael Peterson <quu...@gmail.com>
Subject Range queries get misinterpreted when parsed twice via the "Standard" parsers
Date Thu, 09 Mar 2017 13:58:09 GMT
Hello,

At Rocana we have a search system that builds a Lucene query on a front-end
(web) system and sends the query string to a backend system. The query typed
in by the user on the front end is first parsed (for rewriting and adding
additional hidden clauses), then turned back into a Lucene query string. That
query string is sent over the network to the backend, where it is parsed
again into a Query object for searching with the IndexSearcher.

We are using Lucene 5.5.0.

We've hit a problem with range queries under this model: a range query of
the form

ts:[1000 TO 2000]

when run through the StandardSyntaxParser and serialized back to a string,
gets changed to

[ts:1000 ts:2000]

That would be fine, except that when this alternative form of range syntax
is fed back into either the StandardSyntaxParser or the StandardQueryParser,
the parser misinterprets it: it attaches the default field to the range and
treats "ts:1000" and "ts:2000" as the literal bounds.

Here's code to illustrate:

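  // Imports needed (Lucene 5.5.0 queryparser module):
  //   org.apache.lucene.queryparser.flexible.core.nodes.QueryNode
  //   org.apache.lucene.queryparser.flexible.standard.StandardQueryParser
  //   org.apache.lucene.queryparser.flexible.standard.parser.EscapeQuerySyntaxImpl
  //   org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser

  String query = "ts:[1000 TO 2000] AND foo";
  String defaultField = "text";

  // Step 1: parse to a QueryNode tree, then serialize back to a query string.
  StandardSyntaxParser p = new StandardSyntaxParser();
  QueryNode queryTree = p.parse(query, defaultField);
  String queryStringFromTree =
      queryTree.toQueryString(new EscapeQuerySyntaxImpl()).toString();

  // Step 2: parse both the original and the round-tripped string.
  // IndexUtil.getAnalyzer() is our own helper (not shown).
  StandardQueryParser qp = new StandardQueryParser(IndexUtil.getAnalyzer());
  org.apache.lucene.search.Query queryFromOrig =
      qp.parse(query, defaultField);
  org.apache.lucene.search.Query queryFromTree =
      qp.parse(queryStringFromTree, defaultField);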

  System.out.println("queryStringFromTree    : " + queryStringFromTree);
  System.out.println("Orig query parsed      : " + queryFromOrig);
  System.out.println("From Tree query parsed : " + queryFromTree);

which prints:

  queryStringFromTree    : [ts:1000 ts:2000] AND text:foo
  Orig query parsed      : +ts:[1000 TO 2000] +text:foo
  From Tree query parsed : +text:[ts:1000 TO ts:2000] +text:foo
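
One stopgap we could apply on our side is to patch the serialized string
before it goes over the network. The following is only a minimal sketch, not
something Lucene provides: RangeSyntaxFix and restoreRangeSyntax are
hypothetical names, and it assumes the serialized range always looks like
"[field:lower field:upper]" with the same field on both bounds, inclusive
brackets, and no whitespace or escaped characters inside the bounds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene.
public class RangeSyntaxFix {

    // Matches "[field:lower field:upper]" where both bounds carry the same
    // field name. Assumption: bounds contain no whitespace and no ']'.
    private static final Pattern SERIALIZED_RANGE =
        Pattern.compile("\\[(\\w+):([^\\s\\]]+)\\s+\\1:([^\\s\\]]+)\\]");

    // Rewrites every "[f:lo f:hi]" occurrence to "f:[lo TO hi]" and leaves
    // the rest of the query string untouched.
    public static String restoreRangeSyntax(String queryString) {
        Matcher m = SERIALIZED_RANGE.matcher(queryString);
        return m.replaceAll("$1:[$2 TO $3]");
    }

    public static void main(String[] args) {
        // prints: ts:[1000 TO 2000] AND text:foo
        System.out.println(restoreRangeSyntax("[ts:1000 ts:2000] AND text:foo"));
    }
}
```

This is string surgery on the serialized query, so it would break on quoted
or escaped bounds; it only illustrates the rewrite we would need.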

What do you recommend to handle this issue?


Thank you,
Michael Peterson

http://www.rocana.com
