From: Erik Hatcher
Subject: Re: compound format as default in 1.4?
Date: Mon, 8 Mar 2004 16:24:16 -0500
To: "Lucene Developers List"

Just to weigh in with my opinion... the compound file format proves fine in my use of Lucene and I use it 'by default' already. So I'm +1 on making it the default behavior.

Erik

On Mar 8, 2004, at 3:25 PM, Doug Cutting wrote:

> [ I moved this discussion to the developer list.]
>
> My metric here is the rate of complaint.
>
> I'm tired of hearing about "too many file handles" problems. Usually
> it is caused by folks opening a new searcher for each query, and the
> garbage collector not collecting and closing the old ones fast enough,
> so it signals other problems with the application, but it is still
> annoying, and could be largely quashed.
>
> By some definition, anything which causes so many repeated complaints
> is a bug, and should be fixed. Even if it's really not a bug. It
> pains users of Lucene. It annoys developers of Lucene.
>
> Think of it like mergeFactor, etc.: the default setting may not be the
> absolute fastest, but it is one that is likely to run well in most
> configurations and cause the least confusion.
>
> Doug
>
> Terry Steichen wrote:
>> I tend to agree (but with the same uncertainty as to why I feel that
>> way).
>> Regards,
>> Terry
>> ----- Original Message -----
>> From: "Otis Gospodnetic"
>> To: "Lucene Users List"
>> Sent: Monday, March 08, 2004 2:34 PM
>> Subject: Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
>>> I can't explain why, but I feel like the old index format should stay
>>> by default. I feel like I'd rather a (slightly) faster index, and
>>> switch to the compound one when/IF I encounter problems, than have a
>>> safer, but slower index, and never realize that there is a faster
>>> option available.
>>>
>>> Weak argument, I know, but some instinct in me thinks that the
>>> current mode should remain.
>>>
>>> Otis
>>>
>>> --- Doug Cutting wrote:
>>>
>>>> hui wrote:
>>>>
>>>>> Index time: compound format is 89 seconds slower.
>>>>>
>>>>> compound format:
>>>>> 1389507 total milliseconds
>>>>> non-compound format:
>>>>> 1300534 total milliseconds
>>>>>
>>>>> The index size is 85m with 4 fields only. The files are stored in
>>>>> the index.
>>>>>
>>>>> The compound format has only 3 files and the other has 13 files.
>>>>
>>>> Thanks for performing this benchmark!
>>>>
>>>> It looks like the compound format is around 7% slower when
>>>> indexing. To my thinking that's acceptable, given the dramatic
>>>> reduction in file handles. If folks really need maximal indexing
>>>> performance, then they can explicitly disable the compound format.
>>>>
>>>> Would anyone object to making compound format the default for
>>>> Lucene 1.4? This is an incompatible change, but I don't think it
>>>> should break applications.
>>>>
>>>> Doug
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org