Date: Sat, 1 Dec 2012 16:32:27 +1100
Subject: Re: Difference in behaviour between LowerCaseFilter and String.toLowerCase()
From: Trejkaz
To: Lucene Users Mailing List

On Fri, Nov 30, 2012 at 8:22 PM, Ian Lea wrote:
> Sounds like a side effect of possibly different, locale-dependent,
> results of using String.toLowerCase() and/or Character.toLowerCase().
>
> http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
> specifically mentions Turkish.
>
> A Google search for "Character.toLowerCase() turkish" gets hits which
> sound relevant.

Certainly Turkish has special rules because of that uppercase I with
dot. I was more wondering whether LowerCaseFilter was intentionally
doing it differently to String.toLowerCase() or whether it was some
kind of unintentional side-effect of using Character.toLowerCase()
iteratively.

TX
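
For anyone wanting to reproduce the difference outside Lucene, here is a minimal sketch in plain Java. It assumes, as the question above supposes, that the filter lowercases one code point at a time with Character.toLowerCase(); it does not call Lucene at all, and the class and method names (LowerCaseComparison, perCodePoint, wholeString, codePoints) are hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class; it only compares the two JDK lowercasing paths.
class LowerCaseComparison {

    // Lowercase one code point at a time with Character.toLowerCase(int).
    // Assumption: this approximates what a per-character filter would do.
    static String perCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // Lowercase the whole string with String.toLowerCase(Locale), which can
    // apply one-to-many special-case mappings.
    static String wholeString(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Render a string as space-separated code points, e.g. "U+0069 U+0307".
    static String codePoints(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "\u0130"; // LATIN CAPITAL LETTER I WITH DOT ABOVE
        Locale turkish = Locale.forLanguageTag("tr");

        // prints: per code point:  U+0069
        System.out.println("per code point:  " + codePoints(perCodePoint(input)));
        // prints: String (ROOT):   U+0069 U+0307
        System.out.println("String (ROOT):   " + codePoints(wholeString(input, Locale.ROOT)));
        // prints: String (tr):     U+0069
        System.out.println("String (tr):     " + codePoints(wholeString(input, turkish)));
    }
}
```

With a non-Turkish locale, String.toLowerCase() maps the dotted capital I to two code points (i followed by a combining dot above), while Character.toLowerCase() can only return a single code point and so yields a plain i. That by itself would explain a mismatch without any intent on the filter's part, though it says nothing about what LowerCaseFilter was designed to do.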