Subject: Re: Question wrt Lucene analyzer for different language
From: Erick Erickson
To: java-user@lucene.apache.org
Date: Thu, 14 May 2009 10:37:17 -0400

No. What is "correctly"? Are you stemming? If so, using the same
analyzer on different languages will not work.

This topic has been discussed on the user list frequently, so if you
search that archive (see:
http://wiki.apache.org/lucene-java/MailingListArchives) you'll find a
wealth of information quickly.

Best
Erick

On Thu, May 14, 2009 at 10:11 AM, weidong sun wrote:

> Hello,
>
> I am a newbie in the Lucene world. I might ask some obvious questions
> to which, unfortunately, I don't know the answers. Please help me
> 'grow'.
>
> We have a project that intends to use the Lucene search engine to
> search some user info stored in our system. The user info might not
> be in English, even though it will be stored in UTF-8 encoding.
>
> My question is: if I use one particular Lucene analyzer for a language
> other than English (e.g. ChineseAnalyzer or ArabicAnalyzer), will it
> still be able to handle the text correctly if the user info is mixed
> with English characters/words?
>
> I would really appreciate any answers.
>
> :-)
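
To illustrate the stemming point in the reply, here is a minimal sketch of what can go wrong when rules written for one language are applied to another. The function `toy_english_stem` and its suffix list are hypothetical and exist only for this illustration; this is not Lucene's stemmer or any Lucene API, just a toy stand-in for "an analyzer that stems English". The German words are chosen by hand to show the two typical failure modes.

```python
def toy_english_stem(token: str) -> str:
    """Strip a few common English suffixes.

    A toy rule set for illustration only; it is NOT Lucene's stemmer.
    """
    word = token.lower()
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# English inflections collapse to one term, which is the point of stemming.
english = [toy_english_stem(w) for w in ("searching", "searches", "searched")]

# German "Kind" (child) and "Kinder" (children) stay apart, because the
# English rules know nothing about the German plural "-er".
german_missed = (toy_english_stem("Kind"), toy_english_stem("Kinder"))

# German "Hering" (herring) is wrongly cut to the same term as the
# unrelated word "her", because it happens to end in "ing".
german_false_match = (toy_english_stem("Hering"), toy_english_stem("her"))

print(english)
print(german_missed)
print(german_false_match)
```

The same reasoning applies in the other direction: an analyzer tuned for Chinese or Arabic has no reason to treat embedded English words the way an English-aware analyzer would, so what counts as "correct" depends on how you need the mixed text to match.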