lucene-java-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: Max Segmentation Size when Optimizing Index
Date Wed, 13 Jan 2010 22:54:38 GMT
Yes... You could hack LogMergePolicy to do something else.

I use optimize(numsegments:5) regularly on 80GB indexes that, if
optimized to 1 segment, would thrash the IO excessively.  This works
fine because 15-20GB segments are plenty large and fast.
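
As a rough sanity check on the numbers above, here is a minimal Java
sketch of the arithmetic. It assumes segments come out evenly sized,
which optimize(maxNumSegments) does not guarantee, since it only bounds
the segment count. The names SegmentCountSketch, averageSegmentMB and
minSegments are made up for illustration and are not part of Lucene.

```java
class SegmentCountSketch {
    // Average segment size if the index were split evenly.
    // Assumption: even split; real merged segments will vary in size.
    static double averageSegmentMB(long totalMB, int numSegments) {
        if (numSegments <= 0) {
            throw new IllegalArgumentException("numSegments must be positive");
        }
        return (double) totalMB / numSegments;
    }

    // Smallest segment count for which an even split would keep every
    // segment at or below targetMB (ceiling division).
    static int minSegments(long totalMB, long targetMB) {
        if (totalMB <= 0 || targetMB <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (int) ((totalMB + targetMB - 1) / targetMB);
    }

    public static void main(String[] args) {
        // An 80 GB index optimized down to 5 segments.
        System.out.println(averageSegmentMB(80L * 1024, 5)); // prints 16384.0
        // A 1 GB index with a 200 MB per-segment target.
        System.out.println(minSegments(1024, 200));          // prints 6
    }
}
```

So 5 segments on 80GB averages about 16GB each, and a 1GB index needs
at least 6 segments before a 200MB ceiling is even possible.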

On Wed, Jan 13, 2010 at 2:44 PM, Trin Chavalittumrong <mrtrin@gmail.com> wrote:
> Seems like optimize() only cares about final number of segments rather than
> the size of the segment. Is it so?
>
> On Wed, Jan 13, 2010 at 2:35 PM, Jason Rutherglen <
> jason.rutherglen@gmail.com> wrote:
>
>> There's a different method in LogMergePolicy that performs the
>> optimize... Right, so normal merging uses the findMerges method, then
>> there's a findMergeOptimize (method names could be inaccurate).
>>
>> On Wed, Jan 13, 2010 at 2:29 PM, Trin Chavalittumrong <mrtrin@gmail.com>
>> wrote:
>> > Do you mean MergePolicy is only used during index time and will be
>> > ignored by the optimize() process?
>> >
>> >
>> > On Wed, Jan 13, 2010 at 1:57 PM, Jason Rutherglen <
>> > jason.rutherglen@gmail.com> wrote:
>> >
>> >> Oh ok, you're asking about optimizing... I think that's a different
>> >> algorithm inside LogMergePolicy.  I think it ignores the maxMergeMB
>> >> param.
>> >>
>> >> On Wed, Jan 13, 2010 at 1:49 PM, Trin Chavalittumrong
>> >> <mrtrin@gmail.com> wrote:
>> >> > Thanks, Jason.
>> >> >
>> >> > Is my understanding correct that
>> >> > LogByteSizeMergePolicy.setMaxMergeMB(100) will prevent merging of
>> >> > two segments that are each larger than 100 MB at optimize time?
>> >> >
>> >> > If so, why do you think I would still see segments larger than
>> >> > 200 MB?
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Jan 13, 2010 at 1:43 PM, Jason Rutherglen <
>> >> > jason.rutherglen@gmail.com> wrote:
>> >> >
>> >> >> Hi Trin,
>> >> >>
>> >> >> There was recently a discussion about this: the max size applies
>> >> >> to the segments going into a merge, rather than the resulting
>> >> >> merged segment (if that makes sense). It'd be great if we had a
>> >> >> merge policy that limited the resulting merged segment, though
>> >> >> that'd be a rough approximation at best.
>> >> >>
>> >> >> Jason
>> >> >>
>> >> >> On Wed, Jan 13, 2010 at 1:36 PM, Trin Chavalittumrong
>> >> >> <mrtrin@gmail.com> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > I am trying to optimize the index, which would merge different
>> >> >> > segments together. Let's say the index folder is 1 GB in total;
>> >> >> > I need each segment to be no larger than 200 MB. I tried to use
>> >> >> > LogByteSizeMergePolicy and setMaxMergeMB(100) to ensure no
>> >> >> > segment after merging would exceed 200 MB. However, I still see
>> >> >> > segments that are larger than 200 MB. I did call
>> >> >> > IndexWriter.optimize(20) to make sure there are enough segments
>> >> >> > to allow each one to be under 200 MB.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Can someone let me know if I am using this right? Any
>> >> >> > suggestion on how to tackle this would be helpful.
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Thanks,
>> >> >> >
>> >> >> > Trin
>> >> >> >
>> >> >>
>> >> >> ---------------------------------------------------------------------
>> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >> >>
>> >> >>
>
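
To illustrate the point made in the thread, that the max size applies
to the segments going into a merge rather than to the merged result,
here is a toy Java sketch. It is not Lucene's LogMergePolicy algorithm:
the class MergeCapSketch, the method mergeOnce, and the greedy grouping
of adjacent segments are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MergeCapSketch {
    // One merge pass over adjacent segments. A segment is eligible as
    // merge input only if its own size is at most maxMergeMB; the size
    // of the merged output is never checked against the cap.
    // Assumption: eligible neighbours are grouped mergeFactor at a time.
    static List<Double> mergeOnce(List<Double> sizesMB, double maxMergeMB, int mergeFactor) {
        List<Double> result = new ArrayList<>();
        double pending = 0.0; // combined size of the group being built
        int count = 0;        // number of segments in that group
        for (double size : sizesMB) {
            if (size > maxMergeMB) {
                // Too large to be merged: close any open group, keep as-is.
                if (count > 0) {
                    result.add(pending);
                    pending = 0.0;
                    count = 0;
                }
                result.add(size);
                continue;
            }
            pending += size;
            count++;
            if (count == mergeFactor) {
                result.add(pending);
                pending = 0.0;
                count = 0;
            }
        }
        if (count > 0) {
            result.add(pending);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> sizes = List.of(90.0, 95.0, 80.0, 150.0, 60.0);
        // Cap of 100 MB on inputs, merging up to 3 segments at a time.
        System.out.println(mergeOnce(sizes, 100.0, 3)); // prints [265.0, 150.0, 60.0]
    }
}
```

With a 100 MB cap, the inputs of 90, 95 and 80 MB are each eligible,
yet they merge into a single 265 MB segment, which is consistent with
seeing segments above 200 MB despite setMaxMergeMB(100).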
