Date: Thu, 5 Aug 2010 08:50:16 -0500
Subject: word delimiter
From: j
To: solr-user@lucene.apache.org

I have the term UPPER12-lower and would like to be able to find it with the queries "UPPER" or "lower". What should break this up for the index: a tokenizer, or a filter such as WordDelimiterFilterFactory?

I have tried various combinations of parameters to WordDelimiterFilterFactory and can't get it to split properly. Here are the results from using the standard tokenizer followed directly by the WordDelimiterFilterFactory markup below (from analysis.jsp):

1             | 2
UPPER12-lower | lower
-----------------------
UPPER         |
-----------------------
12            |
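
To make the desired behaviour concrete, here is a rough sketch in Python of the kind of splitting I am after. It is only an illustration and not Solr's implementation: the function names split_word_parts and matches are made up for this example, and the rules (split on non-alphanumerics, on letter/digit boundaries, and on lower-to-upper case changes) are my assumption about what a word-delimiter step should do.

```python
import re

def split_word_parts(token):
    """Hypothetical helper: break a token into letter runs and digit runs.

    Splits on non-alphanumeric characters, on letter/digit boundaries,
    and on lower-to-upper case changes. ASCII only, for simplicity.
    """
    parts = []
    # First split on delimiters such as '-' or '_'.
    for chunk in re.split(r"[^A-Za-z0-9]+", token):
        # Then pull out all-caps runs, capitalised/lowercase words, and digit runs.
        parts.extend(
            re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", chunk)
        )
    return parts

def matches(query, token):
    """Hypothetical helper: case-insensitive match against any sub-part."""
    return query.lower() in [p.lower() for p in split_word_parts(token)]

print(split_word_parts("UPPER12-lower"))  # ['UPPER', '12', 'lower']
print(matches("UPPER", "UPPER12-lower"))  # True
print(matches("lower", "UPPER12-lower"))  # True
```

In other words, I would expect the index to end up with the parts UPPER, 12 and lower, so that a query for either "UPPER" or "lower" finds the document.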