From: "Erick Erickson"
To: java-user@lucene.apache.org
Date: Mon, 30 Jul 2007 14:18:06 -0400
Subject: Re: a question for french analyzer
Message-ID: <359a92830707301118k2abb67a3we47ea024b249ac80@mail.gmail.com>
In-Reply-To: <6e3ae6310707301106p175e85dmfb9d6724ac0f9851@mail.gmail.com>

Gosh, I sure hope not, because that would mean that we rolled our own for
no good reason. We wound up just collapsing the input stream by substituting
plain old 'e' for all the accented variants before indexing and before
searching. Be *really* careful what character set you're using.

Actually, we would have still had to roll our own because the character
mapping was...er...wonky ....

You have to store the data raw for display purposes if you want the accents
to show, though...

Best
Erick

On 7/30/07, Chris Lu wrote:
>
> Hi,
>
> I am not a French speaker, but here are some questions regarding
> French analyzer:
>
> Is there any analyzer that can do this? Analyze accented letters to
> non-accented corresponding letters (é,è,ê,ë -> e), so that
>
> search "fenêtre" (=window) finds all docs with "fenêtre" or "fenetre"
> and
> search "fenetre" finds the same result, all docs with "fenêtre" or
> "fenetre"
>
> Current analyzers, Snowball-French and FrenchAnalyzer, don't have this
> feature.
>
> --
> Chris Lu
> -------------------------
> Instant Scalable Full-Text Search On Any Database/Application
> site: http://www.dbsight.net
> demo: http://search.dbsight.com
> Lucene Database Search in 3 minutes:
> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
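
For anyone reading this in the archive, here is a rough sketch of the folding step described in the reply: collapse accented letters to their base letter, and apply the same function to text before indexing and to query strings before searching. This is not the code from the thread. The class name `AccentFolder` and the method `fold` are made up for illustration, and it swaps the hand-built character map for Unicode NFD decomposition via the JDK's `java.text.Normalizer`. The accented literals are written as `\u` escapes so the result does not depend on the source file's character set.

```java
import java.text.Normalizer;

public class AccentFolder {

    /**
     * Hypothetical helper: decomposes each character into base letter plus
     * combining marks (NFD), then drops the combining marks. Characters with
     * no canonical decomposition (for example the ligature U+0153) are left
     * unchanged, so a hand-built map may still be needed for those.
     */
    public static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches any Unicode combining mark (accents, cedilla, etc.)
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Fold at index time and at query time so both sides agree.
        String indexedTerm = fold("fen\u00EAtre"); // "fenetre" with e-circumflex
        String queryTerm = fold("fenetre");
        System.out.println(indexedTerm.equals(queryTerm));
    }
}
```

As the reply points out, the folded form is only for matching. The original text still has to be stored unmodified if the accents should appear when results are displayed.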