lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: is there a way to control when merges happen?
Date Fri, 15 May 2009 21:00:27 GMT
You're welcome!

SegmentInfo exposes a sizeInBytes() method, so you can sum up that
result for all segments in the merge.

But NOTE: your merge scheduler must be located in the
org.apache.lucene.index package (this API is currently package
private).

Mike

On Fri, May 15, 2009 at 4:56 PM, Dan OConnor <doconnor@acquiremedia.com> wrote:
> Mike,
> Thank for the reply.
>
> A follow up question.
> How can I tell the big merges from the small ones?
>
> Regards,
> Dan
>
> ----- Original Message -----
> From: Michael McCandless <lucene@mikemccandless.com>
> To: java-user@lucene.apache.org <java-user@lucene.apache.org>
> Sent: Fri May 15 16:50:27 2009
> Subject: Re: is there a way to control when merges happen?
>
> I think you could subclass ConcurrentMergeScheduler, overriding
> merge() to only call super.merge() if the time is right?  (And just
> return right away if it's not the right time).
>
> Though you might want to allow small merges to run in real-time, and
> big merges to wait until after hours.
>
> Mike
>
> On Fri, May 15, 2009 at 4:41 PM, Dan OConnor <doconnor@acquiremedia.com> wrote:
>> All:
>>
>> I would like to be able to control when an index merge happens (by wall clock time)
so that merges do not occur in the middle of the business day.
>>
>> I have a lucene system based on v2.3.2 and we add a couple hundred thousand documents
per day - and we allow searching while documents are being added - we reopen an IndexReader
periodically to expose newly arrived contents.
>>
>> There are times when merging causes significant performance impacts on search results
- I've seen cases where merging will cause 200% load on a system (dual quad core x86_64 running
Centos) with a raid-5 disk subsystem of 15k drives.
>>
>> I've seen some info on the MergeScheduler and ConcurrentMergeScheduler but not necessarily
enough to attempt a coding effort.
>>
>> Looking through the code for ConcurrentMergeScheduler.java, is it as straightforward
as over-riding the mergeScheduler.merge() method with a method that checks to see if a merge
is allowed (by wall clock time)?  If a merge is not allowed at that time, can I just return();?
Or do I have to sleep the thread until the merge is allowed?
>>
>> Thanks,
>> Dan
>>
>>
>> Dan O'Connor
>> SVP, Engineering
>> Acquire Media<http://www.acquiremedia.com/>
>> 77 South Bedford Street, Suite 350<http://maps.google.com/maps?f=q&hl=en&geocode=&q=77+S+Bedford+St,+Burlington,+MA+01803&sll=37.0625,-95.677068&sspn=32.472848,80.859375&ie=UTF8&ll=42.485517,-71.197935&spn=0.002287,0.005193&t=h&z=18>
>> Burlington, MA 01803<http://maps.google.com/maps?f=q&hl=en&geocode=&q=77+S+Bedford+St,+Burlington,+MA+01803&sll=37.0625,-95.677068&sspn=32.472848,80.859375&ie=UTF8&ll=42.485517,-71.197935&spn=0.002287,0.005193&t=h&z=18>
>> e: doconnor@acquiremedia.com<mailto:doconnor@acquiremedia.com>
>> o: 781-250-0565
>> f: 877-861-7724
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message