From: "Dyga, Adam"
To: "java-user@lucene.apache.org"
Subject: RE: German 'ue' -> 'u' conversion
Date: Mon, 19 Nov 2012 10:11:49 +0000

I did, but none of them can do it (at least in the default configuration).

Regards,
AD

-----Original Message-----
From: Igal @ getRailo.org [mailto:igal@getrailo.org]
Sent: 19 November 2012 11:10
To: java-user@lucene.apache.org
Subject: Re: German 'ue' -> 'u' conversion

look for filters that use the ICU4J library

On 11/19/2012 2:08 AM, Lutz Fechner wrote:
> Hi,
>
> we use a slightly modified ISOLatin1AccentFilter to replace German accents by ae, oe, ue and so on for that purpose.
>
> In the code you will see a switch for the characters.
>
> You need to change it from
>
>     case '\u00E4' : // small ä
>         output[outputPos++] = 'a';
>         output[outputPos++] = 'e';
>         break;
>
> to something like this
>
>     case '\u00E4' : // small ä
>         output[outputPos++] = 'a';
>         break;
>
> for the characters you want to replace.
>
> Best Regards
>
> Lutz Fechner
>
> -----Original Message-----
> From: Dyga, Adam [mailto:adam.dyga@beumergroup.com]
> Sent: Monday, 19 November 2012 10:47
> To: java-user@lucene.apache.org
> Subject: German 'ue' -> 'u' conversion
>
> Hello,
>
> I have two questions regarding the handling of German umlauts in Lucene:
>
> 1. I'm trying to find a way to convert German umlauts written as 'ue', 'ae', etc. to the folded form 'u', 'a' and so on.
> This is done by GermanAnalyzer (and the German2StemFilter used by it), but unfortunately it also does stemming, which is undesirable in my case.
> Is there any other filter that can do only the 'ue' -> 'u' conversion?
>
> 2. Is there any filter that does the 'ü' -> 'ue' (NOT 'u') conversion? What I'm trying to achieve is that the word "über" should be found in the index whenever the user searches for "über" or "ueber", but NOT "uber".
>
> Regards,
> AD
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

-- 
Igal Sapir
Railo - Open Source CFML Engine
http://getrailo.org
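
The two conversions asked about in the thread can be sketched in plain Java, independent of any Lucene class. This is only an illustration of the character mapping, not the code of ISOLatin1AccentFilter or GermanAnalyzer; the class name UmlautFolding and both method names are made up for this example. Inside Lucene, such a mapping would presumably live in a custom TokenFilter or be expressed as a character mapping (for instance via MappingCharFilter), but that wiring is an assumption and is not shown here.

```java
/**
 * Hypothetical helper (not part of Lucene) illustrating the two
 * conversions discussed in the thread.
 */
public final class UmlautFolding {

    private UmlautFolding() {
    }

    /**
     * Question 2: expand umlauts to their two-letter transcription,
     * e.g. 'ü' -> "ue" (and NOT 'u'). All other characters pass through.
     */
    public static String expandUmlauts(String input) {
        StringBuilder out = new StringBuilder(input.length() + 4);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '\u00E4': out.append("ae"); break; // small ä
                case '\u00F6': out.append("oe"); break; // small ö
                case '\u00FC': out.append("ue"); break; // small ü
                case '\u00C4': out.append("Ae"); break; // capital Ä
                case '\u00D6': out.append("Oe"); break; // capital Ö
                case '\u00DC': out.append("Ue"); break; // capital Ü
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /**
     * Question 1: fold both the umlaut and its transcription to the base
     * vowel, e.g. "über" and "ueber" both become "uber".
     * Assumption: a naive replacement is acceptable. It also rewrites
     * words where "ue" is not a transcribed umlaut (e.g. "aktuell").
     */
    public static String foldToBaseVowel(String input) {
        return expandUmlauts(input)
                .replace("ae", "a").replace("oe", "o").replace("ue", "u")
                .replace("Ae", "A").replace("Oe", "O").replace("Ue", "U");
    }
}
```

If expandUmlauts is applied both at index time and at query time, "über" and "ueber" end up as the same term "ueber", while "uber" stays "uber" and does not match, which is the behaviour asked for in question 2. foldToBaseVowel corresponds to question 1, with the over-matching caveat noted in its comment.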