From: Jason Rutherglen
To: java-user@lucene.apache.org
Subject: Re: Max Segmentation Size when Optimizing Index
Date: Wed, 13 Jan 2010 14:35:28 -0800
Message-ID: <85d3c3b61001131435j92d28f1j9130abadcd72df3@mail.gmail.com>

There's a different method in LogMergePolicy that performs the
optimize... Right, so normal merging uses the findMerges method, then
there's a findMergeOptimize (method names could be inaccurate).

On Wed, Jan 13, 2010 at 2:29 PM, Trin Chavalittumrong wrote:
> Do you mean MergePolicy is only used during index time and will be ignored
> by the Optimize() process?
>
>
> On Wed, Jan 13, 2010 at 1:57 PM, Jason Rutherglen <
> jason.rutherglen@gmail.com> wrote:
>
>> Oh ok, you're asking about optimizing... I think that's a different
>> algorithm inside LogMergePolicy. I think it ignores the maxMergeMB
>> param.
>>
>> On Wed, Jan 13, 2010 at 1:49 PM, Trin Chavalittumrong
>> wrote:
>> > Thanks, Jason.
>> >
>> > Is my understanding correct that LogByteSizeMergePolicy.setMaxMergeMB(100)
>> > will prevent merging of two segments that are larger than 100 MB each at
>> > optimizing time?
>> >
>> > If so, why do you think I would still see segments that are larger than
>> > 200 MB?
>> >
>> >
>> > On Wed, Jan 13, 2010 at 1:43 PM, Jason Rutherglen <
>> > jason.rutherglen@gmail.com> wrote:
>> >
>> >> Hi Trin,
>> >>
>> >> There was recently a discussion about this, the max size is
>> >> for the before merge segments, rather than the resultant merged
>> >> segment (if that makes sense). It'd be great if we had a merge
>> >> policy that limited the resultant merged segment, though that'd
>> >> be a rough approximation at best.
>> >>
>> >> Jason
>> >>
>> >> On Wed, Jan 13, 2010 at 1:36 PM, Trin Chavalittumrong
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > I am trying to optimize the index, which would merge different
>> >> > segments together. Let's say the index folder is 1 GB in total; I
>> >> > need each segment to be no larger than 200 MB. I tried to use
>> >> > LogByteSizeMergePolicy and setMaxMergeMB(100) to ensure no segment
>> >> > after merging would exceed 200 MB. However, I still see segments
>> >> > that are larger than 200 MB. I did call IndexWriter.optimize(20) to
>> >> > make sure there are enough segments to allow each segment to be
>> >> > under 200 MB.
>> >> >
>> >> > Can someone let me know if I am using this right? Any suggestion on
>> >> > how to tackle this would be helpful.
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Trin
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
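
The point made in the thread, that the max size applies to the segments going into a merge rather than to the merged result, can be illustrated with a little arithmetic. What follows is only a toy model in plain Java, not Lucene's actual merge selection code: the class MergeCapSketch, the method mergedSizeMb, and the sample segment sizes are all hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Toy model (NOT Lucene code): shows that a cap on the size of each INPUT
// segment does not bound the size of the merged OUTPUT segment.
public class MergeCapSketch {

    // Hypothetical helper: sums the sizes of up to mergeFactor segments whose
    // individual size is below maxMergeMb. Larger segments are left alone.
    static long mergedSizeMb(List<Long> segmentSizesMb, long maxMergeMb, int mergeFactor) {
        long total = 0;
        int picked = 0;
        for (long size : segmentSizesMb) {
            if (size < maxMergeMb && picked < mergeFactor) {
                total += size;   // eligible input segment joins the merge
                picked++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed sample sizes in MB; the 150 MB segment is over the cap.
        List<Long> sizes = Arrays.asList(90L, 95L, 80L, 150L);
        long merged = mergedSizeMb(sizes, 100, 10);
        System.out.println("merged segment size (MB): " + merged);
    }
}
```

With these assumed sizes, the three inputs of 90, 95 and 80 MB each pass the 100 MB cap, yet they combine into a 265 MB segment, which is over the 200 MB target from the original question.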