lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased
Date Thu, 16 Jun 2011 20:32:48 GMT

    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050708#comment-13050708 ]

Robert Muir commented on SOLR-219:
----------------------------------

A lot of analysis components, like stemmers, are not prepared to deal with wildcard characters
in the term, and returning multiple tokens (because a tokenizer splits on a * or whatever)
makes no sense either.

In my opinion, a good solution here is to allow you to specify in your schema: this is the
analysis chain for these multi-term queries, so it would be a different chain rather than "query"
or "index" (similar to SOLR-2477, where I propose allowing you to specify one for "phrase").
The QP would use this chain for things like wildcards, and throw an exception if the analyzer
returns more than one token from a wildcard term.
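
To make the proposal concrete, here is a minimal sketch in Python rather than Lucene/Solr code. Every name in it (analyze_multiterm, MultiTermAnalysisError, the two toy analyzers) is hypothetical and only illustrates the "one token or fail" rule; rejecting an empty result as well is an extra assumption of this sketch, not part of the proposal.

```python
import re
from typing import Callable, List

# Hypothetical stand-in for an analysis chain: text in, list of tokens out.
Analyzer = Callable[[str], List[str]]


class MultiTermAnalysisError(ValueError):
    """Raised when a multi-term query term does not analyze to exactly one token."""


def keyword_lowercase(text: str) -> List[str]:
    # Mimics "keep the whole input as one token, then lowercase it".
    return [text.lower()]


def split_on_non_alnum(text: str) -> List[str]:
    # Mimics a tokenizer that splits on punctuation, including '*' and '?'.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def analyze_multiterm(term: str, analyzer: Analyzer) -> str:
    """Run a wildcard/prefix/fuzzy term through the dedicated chain.

    Fails loudly instead of guessing when the chain splits the term.
    Treating zero tokens as an error too is an assumption of this sketch.
    """
    tokens = analyzer(term)
    if len(tokens) != 1:
        raise MultiTermAnalysisError(
            f"expected exactly one token for {term!r}, got {tokens!r}"
        )
    return tokens[0]


if __name__ == "__main__":
    print(analyze_multiterm("Foo*Bar", keyword_lowercase))
    try:
        analyze_multiterm("Foo*Bar", split_on_non_alnum)
    except MultiTermAnalysisError as err:
        print("rejected:", err)
```

The point of raising rather than picking one of the tokens is that the user finds out at query time that the configured chain is unsuitable for wildcard terms.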

This way you can use KeywordTokenizer + lowercase/fold characters or whatever, but in general
doing things like WDF or synonyms makes no sense here.  If you want to do things like stemming,
that's fine, you can shoot yourself in the foot this way and we won't stop you.
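
As a rough illustration of why character-level folding is safe for wildcard terms while stemming is a foot-gun, consider the Python sketch below. The fold and toy_stem helpers are hypothetical, the stemmer is deliberately crude, and fnmatch stands in for wildcard matching against indexed terms; none of this is Lucene code.

```python
import fnmatch
import unicodedata
from typing import List


def fold(text: str) -> str:
    # Lowercase and strip combining accents; '*' and '?' pass through untouched.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def toy_stem(token: str) -> str:
    # Hypothetical, deliberately crude suffix stripper (not a real stemmer).
    for suffix in ("ning", "ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def matches(pattern: str, terms: List[str]) -> List[str]:
    # Stand-in for expanding a wildcard pattern against the indexed terms.
    return [t for t in terms if fnmatch.fnmatchcase(t, pattern)]


indexed_folded = [fold(w) for w in ["Caf\u00e9", "Running"]]
indexed_stemmed = [toy_stem(t) for t in indexed_folded]

if __name__ == "__main__":
    # Folding works per character, so a partial term folds the same way
    # as the full indexed term.
    print(matches(fold("CAF\u00c9*"), indexed_folded))
    print(matches(fold("Runni*"), indexed_folded))
    # Stemming depends on the whole word: the index holds "run", and the
    # partial term "runni*" can never be stemmed to line up with it.
    print(matches(fold("Runni*"), indexed_stemmed))
```

In this toy setup the folded field matches both wildcard queries, while the stemmed field returns nothing for "Runni*", which is the kind of surprise a stemming chain invites.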

But in no case should we try to magically apply the analysis chain... too ambiguous what would
happen.


> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>             Fix For: 3.3
>
>         Attachments: lowercase_prefix.patch, wildcardlowercase.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries
> on fields with respect to lowercasing or not.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

