From: Michael McCandless
To: "Burton-West, Tom"
Cc: "dev@lucene.apache.org"
Date: Sat, 21 May 2011 06:46:38 -0400
Subject: Re: MergePolicy Thresholds

Thanks Tom!

Sounds like great fun working with such massive data sets :)

Mike

http://blog.mikemccandless.com

On Fri, May 20, 2011 at 7:03 PM, Burton-West, Tom wrote:
> Hi Mike and Shai,
>
> I was able to index a few documents with the tieredMergePolicy but I was
> hoping to build a large test index of about 700,000 documents to compare the
> performance against our previous runs. I was hoping I would be able to
> report on my results in time for the Lucene Revolution conference.
> Unfortunately there was a power outage at our data center last week which
> resulted in a node failure in one of our storage nodes and node rebalancing
> for a cluster of 500 terabytes takes quite a while and totally messes up
> performance measurements. (Our 6-8 terabytes of large scale search indexes
> shares storage with the repository that holds the 480+ terabytes of page
> images and metadata for the 8 million+ books). Hopefully I will be able to
> run the tests when I get back.
>
> Tom
>
> From: Burton-West, Tom [mailto:tburtonw@umich.edu]
> Sent: Monday, May 09, 2011 4:10 PM
> To: dev@lucene.apache.org
> Subject: RE: MergePolicy Thresholds
>
> Thanks again Shai and Mike.
>
> Am in the process of downloading and building r1099998. Should be able to
> build a test index sometime this week. I'll make some guesses on what
> parameters to use based on our previous tests.
>
> Tom
>
> From: Shai Erera [mailto:serera@gmail.com]
> Sent: Saturday, May 07, 2011 11:33 PM
> To: dev@lucene.apache.org
> Subject: Re: MergePolicy Thresholds
>
> Hey Tom,
>
> Mike back-ported the changes to 3x, so you can try it out.
>
> FYI,
> Shai
>
> On Tue, May 3, 2011 at 9:33 PM, Burton-West, Tom wrote:
>
> Thanks Shai and Mike!
>
> I'll keep an eye on LUCENE-1076.
>
> Tom
>
> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: Tuesday, May 03, 2011 11:15 AM
> To: dev@lucene.apache.org
> Subject: Re: MergePolicy Thresholds
>
> Thanks Shai!
>
> I'm way behind on my 3.x backports -- I'll try to do this soon.
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Tue, May 3, 2011 at 8:10 AM, Shai Erera wrote:
>> I uploaded a patch to LUCENE-1076.
>>
>> Tom, apparently the patch I've attached before cannot be used, because there
>> are dependencies (in earlier commits on LUCENE-1076) that need to be
>> back-ported as well. So stay tuned on LUCENE-1076 for when it is safe to use
>> this new MP.
>>
>> Shai
>>
>> On Tue, May 3, 2011 at 1:00 PM, Michael McCandless wrote:
>>>
>>> That'd be great, thanks :)
>>>
>>> Yes, let's iterate on the issue! But: it should still be open, I hope
>>> (I didn't mean to close it yet, since it's not back ported)...
>>>
>>> Mike
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Tue, May 3, 2011 at 5:51 AM, Shai Erera wrote:
>>> > Mike, if you want, I can back-port it, as I've already started this when
>>> > preparing the patch.
>>> >
>>> > I noticed that you added a "throws IOE" to IW.setInfoStream -- is it ok on
>>> > 3x too? It'll be a backwards change.
>>> >
>>> > Maybe we should iterate on the issue? I can reopen.
>>> >
>>> > Shai
>>> >
>>> > On Tue, May 3, 2011 at 12:36 PM, Michael McCandless wrote:
>>> >>
>>> >> Looks good Shai!
>>> >>
>>> >> Comments below too:
>>> >>
>>> >> On Tue, May 3, 2011 at 5:29 AM, Shai Erera wrote:
>>> >> > Hi
>>> >> >
>>> >> > I looked into porting it to 3x, and prepared the attached patch. It only
>>> >> > contains the new TieredMP and Test, as well as the necessary changes to
>>> >> > LuceneTestCase and IndexWriter.
>>> >> > I guess you can start with it (even just the MP and IW changes) to test it
>>> >> > on your indexes.
>>> >> >
>>> >> > Mike, I saw that there were many more changes, as part of LUCENE-1076, done
>>> >> > to the code. In particular, this MP is now the default (on trunk), so I
>>> >> > guess many changes (to tests) were needed because of that. Do you remember,
>>> >> > if apart from the changes I've included in the patch, other important
>>> >> > changes w.r.t. this code?
>>> >>
>>> >> The only other changes I can think of were some verbosity improvements
>>> >> to IndexWriter, to support the python script that can make a merge
>>> >> movie from an infoStream output; but that can wait for when I
>>> >> back-port to 3.x...
>>> >>
>>> >> > As we won't change the default MP on 3x, I'm guessing I don't need to port
>>> >> > all the changes to 3x.
>>> >>
>>> >> Right, I think.
>>> >>
>>> >> Mike
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> >> For additional commands, e-mail: dev-help@lucene.apache.org