From: "James Adams"
Reply-To: java-user@lucene.apache.org
Subject: RE: Multiple Language Indexing and Searching
Date: Tue, 6 Sep 2005 13:32:37 +0100
Message-ID: <22F9729A33EBA2458807A2EDE0EE3075E13930@UKEX01.framfab.com>

Surely it's best to have a specific Analyzer for each language?

Would supporting multiple Analyzers with a single index require a different IndexWriter for each Analyzer/language? Would you then need to manage their disk access (locking, etc.) so that two IndexWriters cannot write to the index at the same time?

-----Original Message-----
From: Olivier Jaquemet [mailto:olivier.jaquemet@jalios.com]
Sent: 06 September 2005 13:21
To: java-user@lucene.apache.org
Subject: Re: Multiple Language Indexing and Searching

Gusenbauer Stefan wrote:

>I think Nutch uses ngramj for language classification, but I don't know
>how they store the language information. In our application,
>for example, I save the language in an extra field in the document.
>Because Lucene supports multiple fields with the same name, we
>would be able to handle different languages, but for now we don't need it.
>

But then, if you do so, you do not benefit from any specialized Analyzer you could use for each language, do you?
Then again, maybe it's not that interesting to use specialized analyzers for each language?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
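
The idea discussed in this thread (a language stored with each document, a specialized analyzer per language, and a single index) can be sketched as below. This is only an illustration under stated assumptions, not Lucene's API: `PerLanguageIndexing`, `SingleIndexWriter`, `ANALYZERS`, and the toy stop-word lists are all hypothetical stand-ins, and an "analyzer" is reduced to a function from text to terms. It shows one writer choosing the analyzer per document, so a separate writer per language is not needed in this model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical stand-in types for illustration; none of this is Lucene's API.
class PerLanguageIndexing {

    // An "analyzer" here is just a function from raw text to index terms.
    static final Map<String, Function<String, List<String>>> ANALYZERS = new HashMap<>();
    static {
        // Toy analyzers: lowercase, split on non-letters, drop a few stop words.
        ANALYZERS.put("en", t -> tokenize(t, Locale.ENGLISH, Arrays.asList("the", "a", "of")));
        ANALYZERS.put("fr", t -> tokenize(t, Locale.FRENCH, Arrays.asList("le", "la", "de")));
    }

    static List<String> tokenize(String text, Locale locale, List<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(locale).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // One writer over one in-memory index; the analyzer is chosen per document.
    static final class SingleIndexWriter {
        // Key is "language:term", which plays the role of a language field.
        private final Map<String, Set<Integer>> postings = new HashMap<>();
        private int nextDocId = 0;

        // synchronized: concurrent callers are serialized on this one writer.
        synchronized int addDocument(String language, String text) {
            int docId = nextDocId++;
            for (String term : analyzerFor(language).apply(text)) {
                postings.computeIfAbsent(language + ":" + term, k -> new TreeSet<>()).add(docId);
            }
            return docId;
        }

        // The query is analyzed with the same analyzer as the documents;
        // a document matches only if it contains every query term.
        synchronized List<Integer> search(String language, String query) {
            Set<Integer> result = null;
            for (String term : analyzerFor(language).apply(query)) {
                Set<Integer> docs = postings.getOrDefault(language + ":" + term, new TreeSet<>());
                if (result == null) {
                    result = new TreeSet<>(docs);
                } else {
                    result.retainAll(docs);
                }
            }
            return result == null ? new ArrayList<>() : new ArrayList<>(result);
        }

        private static Function<String, List<String>> analyzerFor(String language) {
            Function<String, List<String>> analyzer = ANALYZERS.get(language);
            if (analyzer == null) {
                throw new IllegalArgumentException("No analyzer for language: " + language);
            }
            return analyzer;
        }
    }
}
```

The design choice worth noting is that the language travels with each document and each query, so the same analyzer is applied on both sides. Whether a real index needs extra locking is a separate question that this sketch does not answer; it only serializes calls on a single in-memory writer.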