From: Charles Wardell
Subject: Question on Batch process
Date: Tue, 26 Apr 2011 14:32:29 -0400
Message-Id: <3CE71CB6-F237-44AA-9FE2-F04C78A6EDCA@bcsolution.com>
To: solr-user@lucene.apache.org

I am sure this question has been asked a few times, but I can't seem to find the sweet spot for indexing.

I have about 100,000 files, each containing 1,000 XML documents, ready to be posted to Solr. I want the initial index built as quickly as possible; once that is complete, the daily stream of ADDs will be small in comparison.

The individual documents are small: essentially web postings from the net, with the fields Title, postPostContent, date.

What would be the ideal configuration for ramBufferSizeMB, mergeFactor, maxBufferedDocs, etc.?

My machine is a quad-core with hyper-threading, so it shows up as 8 CPUs in top. I have 16GB of available RAM.

Thanks in advance.
Charlie