Date: Thu, 1 Mar 2007 10:28:07 +0200
From: "Nadav Har'El"
To: java-user@lucene.apache.org
Subject: Re: indexing performance
Message-ID: <20070301082807.GA15826@fermat.math.technion.ac.il>

On Tue, Feb 27, 2007, Saravana wrote about "indexing performance":
> Hi,
>
> Is it possible to scale lucene indexing like 2000/3000 documents per
> second?

I don't know about the actual numbers, but one trick I've used in the past
to get really fast indexing was to create several independent indexes in
parallel. If you have, say, 4 CPUs and perhaps even several physical disks,
run 4 indexing processes, each indexing 1/4 of the files and creating a
separate index (on separate disks on separate IO channels, if possible).

At the end, you have 4 indexes which you can search together without any
real need to merge them, unless query performance is very important to you
as well.

> I need to index 10 fields each with 20 bytes long. I should be
> able to search by just giving any of the field values as criteria. I need to
> get the count that has same field values.

You need just the counts? And you want to do just whole-field matching, not
word matching? In that case, Lucene might be overkill for you. Or, if you
do use Lucene, make sure to use "keyword" (untokenized) fields, not
"tokenized" fields.
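
To make the two suggestions above concrete, here is a rough sketch in plain
Java. It is not Lucene code and uses no Lucene API: each "index" is only a
hash map from an exact field value to a count, standing in for a separate
index built by its own worker. The class name, method names, and the
"field=value" key format are made up for illustration. It shows records being
split into shards, each shard being indexed in parallel, and a count query
being answered by summing over all shards without merging them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in, NOT Lucene: every shard is a map "field=value" -> count.
class ShardedCountSketch {

    // Build one independent "index" from a list of records (field name -> value).
    static Map<String, Integer> indexShard(List<Map<String, String>> records) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> record : records) {
            for (Map.Entry<String, String> field : record.entrySet()) {
                // The whole value is the key: exact matching, no word splitting.
                // (Assumes field names do not contain '='.)
                counts.merge(field.getKey() + "=" + field.getValue(), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Split the records round-robin into shardCount parts and index them in parallel.
    static List<Map<String, Integer>> buildShards(List<Map<String, String>> records,
                                                  int shardCount) throws Exception {
        List<List<Map<String, String>>> parts = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            parts.get(i % shardCount).add(records.get(i));
        }

        ExecutorService pool = Executors.newFixedThreadPool(shardCount);
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (List<Map<String, String>> part : parts) {
                Callable<Map<String, Integer>> task = () -> indexShard(part);
                futures.add(pool.submit(task));
            }
            List<Map<String, Integer>> shards = new ArrayList<>();
            for (Future<Map<String, Integer>> future : futures) {
                shards.add(future.get()); // wait for each worker to finish
            }
            return shards;
        } finally {
            pool.shutdown();
        }
    }

    // "Search all indexes together": sum the per-shard counts, no merge step.
    static int count(List<Map<String, Integer>> shards, String field, String value) {
        int total = 0;
        for (Map<String, Integer> shard : shards) {
            total += shard.getOrDefault(field + "=" + value, 0);
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            Map<String, String> record = new HashMap<>();
            record.put("color", i % 2 == 0 ? "red" : "blue");
            record.put("size", "size " + (i % 5));
            records.add(record);
        }
        List<Map<String, Integer>> shards = buildShards(records, 4);
        System.out.println(count(shards, "color", "red"));   // prints 500
        System.out.println(count(shards, "size", "size 3")); // prints 200
    }
}
```

Note that a lookup for "size" = "3" returns 0 here, because the stored value
"size 3" is treated as one unit. That is the whole-field behaviour an
untokenized field is meant to give; a tokenized field would instead match on
the individual words.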
-- 
Nadav Har'El                 | Thursday, Mar 1 2007, 11 Adar 5767
IBM Haifa Research Lab       |-----------------------------------------
                             |Open your arms to change, but don't let
http://nadav.harel.org.il    |go of your values.