Date: Tue, 30 Dec 2008 10:35:33 -0500
From: "Erick Erickson"
To: java-user@lucene.apache.org, thomaslegrand14@yahoo.fr
Subject: Re: Filtering accents

You might want to take a look at using the ISOLatin1AccentFilter or similar
at both index and query time. It basically folds accented characters into
their unaccented form.

Matthew:

You wrote:
<<>>

I also did this before realizing that the second field is unnecessary.
Storing is orthogonal to indexing. That is, the filters are NOT applied
to the stored data.

From the docs for Field.Store.YES (emphasis mine):
<<>>

I don't think it makes much/any real performance difference, but it does
make the code simpler to use just one field...

Best
Erick

On Tue, Dec 30, 2008 at 8:52 AM, legrand thomas wrote:

> Dear all,
>
> I'd like my lucene searches to be insensitive to (French) accents. For
> example, considering an indexed term "métal", I want to get it when searching
> for "metal" or "métal". I use lucene-2.3.2 and the searches are performed
> with: IndexSearcher.search(query,filter,sorter). Another filter is already
> used together with a "Sort" object.
> Furthermore, I cannot use the
> FrenchAnalyzer as my index does not only contain French words.
>
> Can anybody help?
> Thanks in advance,
> Tom
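
To make the two points in the reply concrete (fold accents at both index and
query time; stored data is not touched by the filters), here is a minimal
sketch in plain Java. It is not Lucene code and does not use the Lucene API:
java.text.Normalizer stands in for the accent filter, and the class and
method names (AccentFoldingSketch, fold, add, search) are made up for
illustration. It also does not cover ligatures such as "œ", which Unicode
decomposition leaves alone.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy index, only to illustrate the idea; not the Lucene API.
class AccentFoldingSketch {

    // Folded term -> original (stored) values that contain it.
    private final Map<String, List<String>> index = new HashMap<>();

    // Stand-in for an accent-folding token filter: decompose each character
    // (NFD), then drop the combining marks that carried the accents.
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // "Index time": the folded form is indexed, the original is stored as-is.
    void add(String storedValue) {
        for (String token : storedValue.split("\\s+")) {
            index.computeIfAbsent(fold(token), k -> new ArrayList<>())
                 .add(storedValue);
        }
    }

    // "Query time": the same folding is applied to the query term.
    List<String> search(String queryTerm) {
        List<String> hits = index.get(fold(queryTerm));
        return hits == null ? Collections.<String>emptyList() : hits;
    }

    public static void main(String[] args) {
        AccentFoldingSketch idx = new AccentFoldingSketch();
        idx.add("m\u00e9tal"); // "metal" with an acute accent on the e

        // Both spellings reach the same entry; the stored value keeps its accent.
        System.out.println(idx.search("metal"));
        System.out.println(idx.search("m\u00e9tal"));
    }
}
```

Because the same fold step runs on both sides, a single field is enough: the
accented original is what comes back from the store, while matching happens
on the folded form.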