Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@lucene.apache.org
Date: Thu, 16 Jun 2011 20:32:48 +0000 (UTC)
From: "Robert Muir (JIRA)" <jira@apache.org>
To: dev@lucene.apache.org
Message-ID: 
 <238504312.12738.1308256368743.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (SOLR-219) Determine if prefix, wildcard, fuzzy
 queries should be lowercased
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050708#comment-13050708 ] 

Robert Muir commented on SOLR-219:
----------------------------------

A lot of analysis components, such as stemmers, are not prepared to deal with wildcard characters in the term, and returning multiple tokens (because a tokenizer splits on a * or whatever) makes no sense either.

In my opinion, a good solution here is to let you specify in your schema a dedicated analysis chain for these multi-term queries, separate from the "query" and "index" chains (similar to SOLR-2477, where I propose allowing you to specify one for "phrase"). The query parser (QP) would use this chain for things like wildcards, and throw an exception if the analyzer returns more than one token from a wildcard term.

This way you can use KeywordTokenizer + lowercase/fold characters or whatever, but in general doing things like WordDelimiterFilter (WDF) or synonyms makes no sense here.  If you want to do things like stemming, that's fine: you can shoot yourself in the foot this way and we won't stop you.

But in no case should we try to magically apply the analysis chain: it is too ambiguous what would happen.
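
To make the proposal concrete, here is a rough sketch of what a dedicated multi-term chain could do, assuming a keyword-style tokenizer followed by lowercasing and accent folding. It is plain Python, not Solr or Lucene code, and every name in it (analyze_multiterm, keyword_tokenizer and so on) is hypothetical and only for illustration. Treating the zero-token case as an error is also an assumption; the comment above only talks about more than one token.

```python
import unicodedata
from typing import Callable, List

# Hypothetical types and names; none of this is Solr or Lucene API.
Tokenizer = Callable[[str], List[str]]
TokenFilter = Callable[[str], str]


def keyword_tokenizer(text: str) -> List[str]:
    """Emit the whole input as one token, wildcard characters included."""
    return [text]


def whitespace_tokenizer(text: str) -> List[str]:
    """Split on whitespace, so it may emit several tokens."""
    return text.split()


def lowercase(token: str) -> str:
    return token.lower()


def fold_accents(token: str) -> str:
    """Decompose characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze_multiterm(term: str, tokenizer: Tokenizer,
                      filters: List[TokenFilter]) -> str:
    """Run a wildcard/prefix/fuzzy term through a dedicated chain.

    Raises ValueError unless the tokenizer yields exactly one token,
    since a multi-token result has no sensible meaning for such a term.
    """
    tokens = tokenizer(term)
    if len(tokens) != 1:
        raise ValueError(
            f"multi-term analysis of {term!r} produced {len(tokens)} tokens"
        )
    token = tokens[0]
    for token_filter in filters:
        token = token_filter(token)
    return token
```

With keyword_tokenizer the wildcard stays attached to the term and only the case and accents are normalized. Swapping in whitespace_tokenizer on a term such as "foo bar*" raises instead of silently guessing which token the wildcard belongs to.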


> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>             Fix For: 3.3
>
>         Attachments: lowercase_prefix.patch, wildcardlowercase.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries on fields with respect to lowercasing or not.
