From: "Dan Wiggin"
To: java-user@lucene.apache.org
Date: Thu, 25 May 2006 19:52:26 +0200
Subject: Re: Question about special characters

As a stopgap until I find something better, I use a FuzzyQuery for every term in the phrase. For example:

"My work is the worst" -> My~ work~ is~ the~ worst~

What do you think of this ugly solution? I don't have any other ideas.

2006/5/24, Dan Wiggin:
>
> I need some functionality and I don't know how to implement it.
> The problem is special characters such as à, ä, ç or ñ (Latin characters) in
> the text.
> I currently use the ISO Latin filter, but the problem comes when I want to
> obtain the most frequently used terms. These terms are stored without `, ´, ^
> or any other diacritical mark ("character attribute").
> For example, "plàntïuç" (it isn't a real word) is stored as the term
> "plantiuc".
> How can I get the word "plàntïuç" into the term vector?
>
> Thanks for all replies.
> PS: apologies if this question has already been answered somewhere, but I
> didn't see it.
>
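
The per-term fuzzy workaround described in the reply can be sketched as a plain string rewrite. This is only an illustration, not code from the thread: the class name `FuzzyPhraseSketch` and the method `toFuzzyQuery` are hypothetical, the sketch assumes the resulting string is later handed to Lucene's query parser (where a trailing `~` marks a fuzzy term), and it does not escape query-parser special characters that might appear inside a term.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Lucene.
final class FuzzyPhraseSketch {

    // Appends the query-parser fuzzy operator "~" to every
    // whitespace-separated term of the phrase.
    // Assumption: terms contain no characters that need escaping.
    static String toFuzzyQuery(String phrase) {
        List<String> terms = new ArrayList<>();
        for (String term : phrase.trim().split("\\s+")) {
            if (!term.isEmpty()) {   // skip the empty token produced by blank input
                terms.add(term + "~");
            }
        }
        return String.join(" ", terms);
    }

    public static void main(String[] args) {
        // prints: My~ work~ is~ the~ worst~
        System.out.println(toFuzzyQuery("My work is the worst"));
    }
}
```

Note that this only makes matching more tolerant on the query side; the indexed terms are still the accent-stripped forms, so it does not by itself put "plàntïuç" back into the term vector.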