lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexei Dets (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1418) QueryParser can throw NullPointerException during parsing of some queries in case if default field passed to constructor is null
Date Thu, 13 Nov 2008 20:10:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647381#action_12647381
] 

Alexei Dets commented on LUCENE-1418:
-------------------------------------

> You shouldn't be passing null to the queryparser as a field right?
Wrong. It is not documented anywhere and in practice this _WORKS_ _PERFECTLY_ except for one
case of one particular syntax error in a query string that is not handled correctly in this
case. (All?) other syntax errors correctly cause ParseException and all valid queries are
parsed successfully. BTW, this is exactly what does MultiFieldQueryParser in its constructor
- it calls super(null, analyzer).

> its not part of the query syntax, so it doesn't make sense to feed a "" instead of the
null
And this is also wrong because Lucene query parser syntax allows for queries without explicit
field specification (i.e. "cat AND dog" is a valid query). From the current docs it is not
obvious how to properly construct a query parser that will correctly parse such queries, not
clear what should be passed as a field in a QueryParser constructor because there is no any
particular field, the same query parser instance is used to parse queries for different fields.
If I'll pass not null and not empty string then AFAIR instead of "+cat + dog" I'll get as
a result "+field:cat +field:doc" and this is not acceptable as this will cause a creation
of a separate query parser instance for each field - a real overkill I'd say.

Really there are two issues:
1) it should be documented how should be created "fieldless" query parser: that empty string
should be used as a field name, that null is unsupported and can give undefined results (alternative
is to fix this problem with NullPointerException and document that both ways are supported
- null and empty string);
2) currently there is no way to use the same query parser instance for parsing queries for
the different fields (except set default field to empty string and manually add "field:" prefix
to each query before parsing).

What I propose:
1) document that null field gives undefined behavior (and optionally add null check in constructor
 /IMHO documentation can be enough/ or document that null is officially supported);
2) add public setField method to be able to change the field (may be also setAnalyzer can
be usefull?);
3) add default constructor QueryParser() - and document that in this case field (and analyzer)
must be set later before parsing.

I guess currently the same thing can be achieved using static method MultiFieldQueryParser.parse(String[]
queries, String[] fields, BooleanClause.Occur[] flags, Analyzer analyzer) but having the way
to make it using regular QueryParser can be also usefull (i.e. significantly less overhead
- static methods in MultiFieldQueryParser are internally constructing a new parser instance
per field!). QueryParser can also benefit from a static parse method similar to MultiFieldQueryParser:
public static Query QueryParser.parse(String query, String field, Analyzer analyzer).

I think I can come up with the patch if we all agree on something :-)

> QueryParser can throw NullPointerException during parsing of some queries in case if
default field passed to constructor is null
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1418
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1418
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: QueryParser
>    Affects Versions: 2.4
>         Environment: CentOS 5.2 (probably any applies)
>            Reporter: Alexei Dets
>            Priority: Minor
>
> In case if QueryParser was constructed using "QueryParser(String f,  Analyzer a)" constructor
and f equals null then QueryParser can fail with NullPointerException during parsing of some
queries that _does_ contain field name but have unbalanced parenthesis.
> Example 1:
> Query:  field:(expr1) expr2)
> Result:
> java.lang.NullPointerException
> 	at org.apache.lucene.index.Term.<init>(Term.java:50)
> 	at org.apache.lucene.index.Term.<init>(Term.java:36)
> 	at org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:543)
> 	at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1324)
> 	at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1211)
> 	at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1168)
> 	at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1128)
> 	at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:170)
> Example2:
> Query:  field:(expr1) "expr2")
> Result:
> java.lang.NullPointerException
> 	at org.apache.lucene.index.Term.<init>(Term.java:50)
> 	at org.apache.lucene.index.Term.<init>(Term.java:36)
> 	at org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:543)
> 	at org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:612)
> 	at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1459)
> 	at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1211)
> 	at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1168)
> 	at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1128)
> 	at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:170)
> Workaround: pass in constructor empty string as a default field name - in this case QueryParser.parse
method will throw ParseException (expected result because query string is wrong) instead of
NullPointerException.
> It is not obvious to me how to fix this so I'll describe my usecase, may be I'm doing
something completely wrong.
> Basically I have a set of per-field queries entered by user and need to programmatically
construct (after some preprocessing) one real Lucene query combined from these user-entered
per-field subqueries.
> To achieve this I basically do the following (simplified a bit):
> QueryParser parser = new QueryParser(null, analyzer); // I'll always provide a field
name in a query string as it is different each time and I don't have any default
> BooleanQuery query = new BooleanQuery();
> Query subQuery1 = parser.parse(field1 + ":(" + queryString1 + ')');
> query.add(subQuery1, operator1); // operator = BooleanClause.Occur.MUST, BooleanClause.Occur.MUST_NOT
or BooleanClause.Occur.SHOULD
> Query subQuery2 = parser.parse(field2 + ":(" + queryString2 + ')');
> query.add(subQuery2, operator2); 
> Query subQuery3 = parser.parse(field3 + ":(" + queryString3 + ')');
> query.add(subQuery3, operator3); 
> ...
> IMHO either QueryParser constructor should be changed to throw NullPointerException/InvalidArgumentException
in case of null field passed (and API documentation updated) or QueryParser.parse behavior
should be fixed to correctly throw ParseException instead of NullPointerException. Also IMHO
of a great help can be _public_ setField/getField methods of QueryParser (that set/get field),
this can help in use cases like my:
> QueryParser parser = new QueryParser(null, analyzer); // or add constructor with analyzer
_only_ for such cases
> BooleanQuery query = new BooleanQuery();
> parser.setField(field1);
> Query subQuery1 = parser.parse(queryString1);
> query.add(subQuery1, operator1);
> parser.setField(field2);
> Query subQuery2 = parser.parse(queryString2);
> query.add(subQuery2, operator2); 
> ...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message