From: "Benjamin Higgins"
Reply-To: solr-user@lucene.apache.org
Subject: Problem with camelCase but not casing in general
Date: Mon, 7 Jan 2008 14:15:10 -0800
Message-ID: <89ECDF0F8AE50C458A342C4C4F8CB842045A5467@PEXCHVD.seatimes.com>

Hi all,

I am using a mostly out-of-the-box install of Solr to search through our code repositories. I've run into a funny problem where searches for camelCased text return no results unless the casing is exactly the same.

For example, a query for "getElementById" returns 364 results, but "getelementbyid" returns 0.

There isn't a problem with all casings, though. For example, "function" and "Function" both return the same number of results, as do "FUNCTION" and "FUNCtion" (6,278 with my docs). However, "funcTION" returns only a few results, and only where the word is actually split up (e.g. "func tion")!

So it seems that something may be tokenizing words wherever the case changes in the middle of them. How can I get this to stop?

Thanks!

Ben

Here's the definition for the text field type in my schema.xml:
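
To make the symptom concrete, here is a minimal Python sketch of the behavior described above. It is only a simulation of the hypothesis, not Solr's actual code. It assumes the analyzer splits each word at lowercase-to-uppercase transitions and then lowercases the parts, which is the kind of thing a filter such as Solr's WordDelimiterFilterFactory can do. The `analyze` helper is hypothetical.

```python
import re

def analyze(text):
    """Hypothetical analyzer (an assumption, not Solr's implementation):
    split on whitespace, break each word at every lowercase-to-uppercase
    boundary, then lowercase the resulting parts."""
    tokens = []
    for word in text.split():
        # Insert a space at each zero-width lower->upper boundary, then split.
        parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", word).split()
        tokens.extend(part.lower() for part in parts)
    return tokens

for query in ["getElementById", "getelementbyid", "Function", "FUNCtion", "funcTION"]:
    print(query, "->", analyze(query))
```

Under these assumptions, "getElementById" is indexed as four tokens while the query "getelementbyid" stays a single token that was never indexed, which would explain the 0 results. Likewise "funcTION" becomes two tokens and would only match documents that contain "func" and "tion" separately, while "Function" and "FUNCtion" have no lowercase-to-uppercase boundary and stay whole.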