From: Shai Erera
To: java-user@lucene.apache.org
Date: Tue, 18 Jan 2011 21:27:56 +0200
Subject: Re: Best practices for multiple languages?

Hi,

There are two types of multi-language docs:

1) Docs in different languages -- every document is in one language
2) Each document has fields in different languages

I've dealt with both, and there are different solutions to each. Which of
them is yours?

Shai

On Tue, Jan 18, 2011 at 7:53 PM, Clemens Wyss wrote:

> What is the "best practice" to support multiple languages, i.e.
> Lucene-Documents that have multiple language content/fields?
> Should
> a) each language be indexed in a separate index/directory or should
> b) the Documents (in a single directory) hold the diverse localized fields?
>
> We most often will be searching "language dependent" which (at least
> performance wise) mandates one-directory-per-language...
>
> Any (lucene specific) white papers on this topic?
>
> Thx in advance
> Clemens