lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mark Harwood (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries
Date Sat, 13 Jun 2009 11:59:08 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719115#action_12719115
] 

Mark Harwood commented on LUCENE-1486:
--------------------------------------

The primary reason (and perhaps not a particularly good one) was I didn't want to wade around
in the Javacc syntax of the .jj file that generates the QueryParser and the required extensions
could be made in a subclass.

Also there is invariably a performance hit for supporting things like wildcards in phrase
queries so rather than adding another "off by default" flag in the main parser  and conditional
logic to test if "wildcards etc in phrases" are allowed, the subclass could be seen as a specialised
extension that is to be used by those that understand the trade-offs between functionality
and performance.  

I can sympathise with the purist approach of having all parser syntax defined in Javacc though.

> Wildcards, ORs etc inside Phrase queries
> ----------------------------------------
>
>                 Key: LUCENE-1486
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1486
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: QueryParser
>    Affects Versions: 2.4
>            Reporter: Mark Harwood
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: ComplexPhraseQueryParser.java, TestComplexPhraseQuery.java
>
>
> An extension to the default QueryParser that overrides the parsing of PhraseQueries to
allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in QueryParser
itself. This works as a proof of concept  for much of the query parser syntax. Examples from
the Junit test include:
> 		checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies are OK in phrases
> 		checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
> 		checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works.
> 		
> 		checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a phrase is bad
> 		checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases is bad
> 		checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries inside phrases not supported
> Code plus Junit test to follow...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message