From: Mark Miller
Date: Wed, 06 Aug 2008 08:27:18 -0400
To: java-user@lucene.apache.org
Subject: Re: search with accent not match

Check out org.apache.lucene.analysis.ISOLatin1AccentFilter. It will strip diacritics; just be sure to use it at both index time and query time to get what you want. Also, you will no longer be able to differentiate between the accented and unaccented forms in your searching (rarely that important in my opinion, but others certainly disagree).

- Mark

Christophe from paris wrote:
> Hello
>
> I use FrenchAnalyzer for indexing:
>
> IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(), true);
> Document doc = new Document();
> doc.add(new Field("TXT_CHARACT_VALUE", word.toLowerCase(), Field.Store.YES, Field.Index.TOKENIZED));
> writer.addDocument(doc);
>
> And for searching:
>
> IndexReader reader = IndexReader.open(pathOfIndex);
> Searcher searcher = new IndexSearcher(reader);
> Analyzer analyzer = new FrenchAnalyzer();
> QueryParser parser = new QueryParser(field, analyzer);
> Query query = parser.parse(motRecherche);
> Hits hits = searcher.search(query);
>
> In my documents I have the words "lumiere" and "lumière".
>
> When I search for "lumière", only documents containing "lumière" match; "lumiere" is not returned.
>
> If I search for "lumiere", the results are lumiere, lumieres, lumiére and lumiéres, but not lumière.
>
> To match everything I have to search for "lumiere OR lumière", but that is not the best solution.
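
The advice in the reply (strip diacritics at both index time and query time) can be illustrated without Lucene. The sketch below is not ISOLatin1AccentFilter itself: it swaps in the JDK's java.text.Normalizer to fold accents, and the class name AccentFoldingDemo, the fold helper, and the toy map-based index are all hypothetical names made up for illustration. It only shows why applying the same folding on both sides makes "lumiere" and "lumière" land on the same term.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical demo class; a stand-in for an analyzer chain, not Lucene code.
final class AccentFoldingDemo {

    // Lower-case the term, decompose accented letters (NFD), then drop the
    // combining marks so that "e with grave" becomes a plain "e".
    static String fold(String term) {
        String decomposed =
            Normalizer.normalize(term.toLowerCase(Locale.ROOT), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Toy index: folded term -> original words that produced it.
    private final Map<String, List<String>> index = new LinkedHashMap<>();

    // "Index time": store the word under its folded form.
    void add(String word) {
        index.computeIfAbsent(fold(word), k -> new ArrayList<>()).add(word);
    }

    // "Query time": fold the query the same way before looking it up.
    List<String> search(String query) {
        return index.getOrDefault(fold(query), Collections.emptyList());
    }

    public static void main(String[] args) {
        AccentFoldingDemo demo = new AccentFoldingDemo();
        demo.add("lumiere");
        demo.add("lumi\u00e8re"); // e with grave accent
        demo.add("lumi\u00e9re"); // e with acute accent

        // Both queries are folded to the same key, so both return all three words.
        System.out.println(demo.search("lumi\u00e8re"));
        System.out.println(demo.search("lumiere"));
    }
}
```

Because every spelling is stored under the same folded key, the accented and unaccented queries return the same three words. That is also the trade-off mentioned in the reply: once the terms are folded, a search can no longer tell the accented form from the unaccented one.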