lucene-dev mailing list archives

From "Adriano Crestani (JIRA)" <>
Subject [jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries
Date Wed, 22 Jul 2009 02:38:15 GMT


Adriano Crestani commented on LUCENE-1486:

I see what Adriano was talking about now - technically the first 2 quotes would match, and
then the second two - I think Mark H was just demonstrating that you shouldn't try that query
though - a user might think they are quoting smith, but for the example, it doesn't matter.
I think he was just trying to show that you shouldn't try to "nest" phrases - even though they
wouldn't be interpreted that way anyway.

Well, if you guessed his intention correctly, the comment is misleading: "phrases inside phrases
is bad". But let's wait for his response.

Other things that are not supported might throw exceptions too

I think a user would expect a ParseException. Probably every query parser user catches ParseException
and shows a nice message to the end user. Now, if the query parser starts throwing random
exceptions to say the syntax is invalid, every piece of software that uses the Lucene query parser is gonna
start crashing. For me it's as if a compiler started throwing a segmentation fault every time
you forgot a } in the code.
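
To make the point concrete, here is a minimal sketch of the pattern I have in mind. It does not use the real Lucene classes: SyntaxException is a hypothetical stand-in for the parser's checked ParseException, and rawParse is a made-up parser that fails with an unchecked exception on unsupported syntax. The idea is only that the parser wraps unexpected failures into the one exception type that callers already catch.

```java
class SafeParseSketch {

    /** Hypothetical stand-in for the query parser's checked ParseException. */
    static class SyntaxException extends Exception {
        SyntaxException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Made-up parser that throws an unchecked exception on unsupported syntax. */
    static String rawParse(String query) {
        if (query.contains("[")) {
            throw new IllegalArgumentException("range queries inside phrases not supported");
        }
        return "parsed(" + query + ")";
    }

    /** Wraps any runtime failure so callers only need to catch one checked type. */
    static String parse(String query) throws SyntaxException {
        try {
            return rawParse(query);
        } catch (RuntimeException e) {
            throw new SyntaxException("Cannot parse '" + query + "': " + e.getMessage(), e);
        }
    }

    /** What application code typically does: catch the one exception, show a message. */
    static String search(String query) {
        try {
            return parse(query);
        } catch (SyntaxException e) {
            return "Invalid query: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(search("\"jo* smith\""));
        System.out.println(search("\"jo* [sma TO smZ]\""));
    }
}
```

With this shape, application code that already catches the parser's exception keeps working when a new kind of unsupported syntax is rejected.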

> Wildcards, ORs etc inside Phrase queries
> ----------------------------------------
>                 Key: LUCENE-1486
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: QueryParser
>    Affects Versions: 2.4
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>             Fix For: 2.9
>         Attachments:, junit_complex_phrase_qp_07_21_2009.patch,
LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch,
> An extension to the default QueryParser that overrides the parsing of PhraseQueries to
> allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in QueryParser
> itself. This works as a proof of concept for much of the query parser syntax. Examples from
> the JUnit test include:
> 		checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies are OK in phrases
> 		checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
> 		checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works.
> 		checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a phrase is bad
> 		checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases is bad
> 		checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries inside phrases not supported
> Code plus Junit test to follow...

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
