From: Giulio Cesare Solaroli
Date: Thu, 4 Nov 2004 21:31:22 +0100
To: Lucene Users List, javier muguruza
Subject: Re: one huge index or many small ones?

Hi Javier,

On Thu, 4 Nov 2004 20:08:15 +0100, javier muguruza wrote:
> Justin,
>
> Yes, I wanted as little info as possible in the index. The body and
> attachments will be stored outside Lucene. As I mentioned, I only
> need to deal with the body/attachment contents with Lucene; from, to,
> subject, dates etc. are dealt with before.

You can probably get away with this solution as well, but I would suggest that you test Lucene's performance before you start optimizing.

As long as your queries on the text of the body/attachments are not huge (my users end up with rewritten queries up to 600 KBytes long!), Lucene will probably return the right results much faster than looking in different places for the same query.

Don't be afraid of the number of documents either, at least not before testing on some real data. You may well find that a simpler architecture performs fast enough and is much easier to set up and tune.

[...]

Giulio Cesare