From: "Robert Muir (JIRA)"
To: dev@lucene.apache.org
Date: Thu, 16 Jun 2011 20:32:48 +0000 (UTC)
Message-ID: <238504312.12738.1308256368743.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050708#comment-13050708 ]

Robert Muir commented on SOLR-219:
----------------------------------

A lot of analysis things like stemming are not prepared to deal with wildcard characters in the term, and returning multiple tokens (because a tokenizer splits on a * or whatever) makes no sense either.

In my opinion, a good solution here is to allow you to specify in your schema: this is the analysis chain for these multi-term queries. It would be a separate chain, distinct from "query" and "index" (similar to SOLR-2477, where I propose allowing you to specify one for "phrase"). The QP would use this chain for things like wildcards, and throw an exception if the analyzer returns more than one token from a wildcard term.

This way you can use KeywordTokenizer + lowercase/fold characters or whatever, but in general doing things like WDF or synonyms makes no sense here. If you want to do things like stemming, that's fine: you can shoot yourself in the foot this way and we won't stop you.

But in no case should we try to magically apply the existing analysis chain; it is too ambiguous what would happen.

> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>             Fix For: 3.3
>
>         Attachments: lowercase_prefix.patch, wildcardlowercase.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries on fields with respect to lowercasing or not.
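
To make the proposal in the comment above concrete, here is a minimal sketch of the "one token or throw" rule. It is plain Java with no Lucene or Solr dependency, and every name in it (MultiTermAnalysisSketch, KEYWORD_LOWERCASE, WHITESPACE_LOWERCASE, analyzeMultiTerm) is hypothetical. The two chains are simple stand-ins for a KeywordTokenizer + lowercase chain and for a chain whose tokenizer splits the input; they are not the real analyzers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

/** Illustrative only: models an analysis chain as a function from text to tokens. */
final class MultiTermAnalysisSketch {

    /** Stand-in for KeywordTokenizer + lowercase: the whole input stays one token. */
    static final Function<String, List<String>> KEYWORD_LOWERCASE =
            text -> Collections.singletonList(text.toLowerCase(Locale.ROOT));

    /** Stand-in for a chain whose tokenizer splits the input (here: on whitespace). */
    static final Function<String, List<String>> WHITESPACE_LOWERCASE =
            text -> Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));

    /**
     * Runs a wildcard/prefix/fuzzy term through the dedicated multi-term chain.
     * Rejects the term if the chain does not return exactly one token, instead
     * of guessing how to recombine several tokens.
     */
    static String analyzeMultiTerm(String term, Function<String, List<String>> chain) {
        List<String> tokens = chain.apply(term);
        if (tokens.size() != 1) {
            throw new IllegalArgumentException(
                    "multi-term analysis of '" + term + "' produced "
                            + tokens.size() + " tokens; expected exactly 1");
        }
        return tokens.get(0);
    }

    public static void main(String[] args) {
        // The keyword-style chain lowercases the term and keeps the wildcard intact.
        System.out.println(analyzeMultiTerm("FooBar*", KEYWORD_LOWERCASE));

        // A splitting chain yields two tokens, so the term is rejected.
        try {
            analyzeMultiTerm("Foo Ba*", WHITESPACE_LOWERCASE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the caller chooses the chain explicitly, which mirrors the idea of declaring it in the schema rather than reusing the "query" or "index" chain implicitly.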