From: ohaya@cox.net
To: java-user@lucene.apache.org
Date: Fri, 31 Jul 2009 16:44:12 -0400 (EDT)
Subject: Re: ThreadedIndexWriter vs. IndexWriter
Message-ID: <12747033.846.1249073052399.JavaMail.ohaya@127.0.0.1>

Hi,

Sorry to jump in, but I've been following this thread with interest :)...

Am I misunderstanding your original observation, that ThreadedIndexWriter
produced a smaller index? Did the ThreadedIndexWriter also finish faster
(I'm assuming that it should)? If the index is smaller, and everything
else being good and equal, doesn't that mean that using
ThreadedIndexWriter is a good thing?

Anyway, aside from checking that the # of documents were the same, have
you looked at the index using something like Luke? Do the contents of
the index look the same in both cases, or are they different? If
different, how so (e.g., missing terms, etc.)?

Later,
Jim

On Fri, Jul 31, 2009 at 2:38 PM, Jibo John wrote:
> Number of docs are the same in the index for both the cases (200,000).
> I haven't altered the benchmark/ code, but, used a profiler to verify
> that Benchmark main thread is closed only after all other threads
> are closed.
>
> Thanks,
> -Jibo
>
>
> On Jul 31, 2009, at 2:34 AM, Michael McCandless wrote:
>
>> Hmm... this doesn't sound right.
>>
>> That example (ThreadedIndexWriter) is meant to be a drop-in
>> replacement, wherever you use an IndexWriter, that keeps an
>> under-the-hood thread pool (using java.util.concurrent.*) to
>> add/update documents with multiple threads.
>>
>> It should not result in a smaller index.
>>
>> Can you sanity check the index? Eg is numDocs() the same for both?
>> You definitely called close() on the writer, right? That method waits
>> for all threads to finish their work before actually closing.
>>
>> Mike
>>
>> On Thu, Jul 30, 2009 at 8:01 PM, Jibo John wrote:
>>> While trying out a few tuning options using contrib/benchmark as
>>> described in LIA (2nd edition) book, I had an interesting
>>> observation.
>>>
>>> If I use a ThreadedIndexWriter (picked the example from lia2e, page
>>> 356) instead of IndexWriter, the index size got reduced by 40%
>>> compared to using IndexWriter.
>>> Index related configuration were the same for both the tests in the
>>> alg file.
>>>
>>> I am curious how come using a threaded index writer will have an
>>> impact on the index size.
>>>
>>> Appreciate your input.
>>>
>>> Thanks,
>>> -Jibo
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
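
For anyone reading this thread later: the pattern Mike describes (a wrapper that hands each add to a java.util.concurrent thread pool, with close() waiting for the pool to drain) and the document-count sanity check can be sketched without Lucene. This is only an illustration under stated assumptions. CountingWriter, ThreadedWriter, indexDirect and indexThreaded are hypothetical names made up here; this is not the ThreadedIndexWriter code from the book and it does not use any Lucene API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadedWriterSketch {

    /** Hypothetical stand-in for an index writer: it only counts added documents. */
    static class CountingWriter {
        private final AtomicInteger docs = new AtomicInteger();

        void addDocument(String doc) {
            docs.incrementAndGet();
        }

        int numDocs() {
            return docs.get();
        }
    }

    /** Hypothetical wrapper that forwards each add to a fixed-size thread pool. */
    static class ThreadedWriter {
        private final CountingWriter delegate;
        private final ExecutorService pool;

        ThreadedWriter(CountingWriter delegate, int numThreads) {
            this.delegate = delegate;
            this.pool = Executors.newFixedThreadPool(numThreads);
        }

        /** Queues the add; it may run after this method has returned. */
        void addDocument(String doc) {
            pool.execute(() -> delegate.addDocument(doc));
        }

        /** Stops accepting work and blocks until every queued add has run. */
        void close() {
            pool.shutdown();
            try {
                while (!pool.awaitTermination(1, TimeUnit.SECONDS)) {
                    // keep waiting for the remaining queued adds
                }
            } catch (InterruptedException e) {
                // Give up waiting early, but preserve the interrupt flag.
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Adds n documents from the calling thread and returns the count. */
    public static int indexDirect(int n) {
        CountingWriter writer = new CountingWriter();
        for (int i = 0; i < n; i++) {
            writer.addDocument("doc-" + i);
        }
        return writer.numDocs();
    }

    /** Adds n documents through the pool; the count is read only after close(). */
    public static int indexThreaded(int n, int numThreads) {
        CountingWriter writer = new CountingWriter();
        ThreadedWriter threaded = new ThreadedWriter(writer, numThreads);
        for (int i = 0; i < n; i++) {
            threaded.addDocument("doc-" + i);
        }
        threaded.close();
        return writer.numDocs();
    }

    public static void main(String[] args) {
        int direct = indexDirect(200_000);
        int threaded = indexThreaded(200_000, 4);
        System.out.println("direct=" + direct + " threaded=" + threaded);
    }
}
```

If the count were read before close() in this sketch, it could come up short, because adds still sitting in the pool's queue have not run yet. That is why the close() question matters when comparing the two indexes.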