lucene-dev mailing list archives

From Earwin Burrfoot <>
Subject Re: ConcurrentMergeScheduler and MergePolicy question
Date Sat, 08 Aug 2009 23:26:39 GMT
> Perhaps the ideal search system architecture that requires
> optimizing is to dedicate a server to it, copy the index to the
> optimize server, do the optimize, copy the index off (to a
> search server) and start again for the next optimize task.
> I wonder how/if this would work with Hadoop/HDFS as copying
> 100GB around would presumably tie up the network? Also, I've
> found rsyncing large optimized indexes to be time consuming and
> wreaks havoc on the searcher server's IO subsystem. Usually this
> is unacceptable for the user as the queries will suddenly
> degrade.
You don't have to copy. You can have one machine optimize your indexes
while the other serves user requests; then they switch roles, rinse,
repeat. This approach also works with sharding, and with more than
2-way rotation.
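The role-switching scheme above can be sketched roughly as follows. This is only an illustrative sketch, not anything from Lucene itself; the machine names and the `rotate_roles` helper are hypothetical, and in practice the switch would also involve redirecting query traffic and replicating new documents to the freshly optimized node.

```python
# Hypothetical sketch of the rotation described above: at any moment
# one node is the "optimizer" and the rest serve queries; each round
# the optimizer role moves to the next node. Works for 2-way or N-way.

def rotate_roles(machines, round_no):
    """Return (optimizer, searchers) for a given rotation round."""
    optimizer = machines[round_no % len(machines)]
    searchers = [m for m in machines if m != optimizer]
    return optimizer, searchers

machines = ["node-a", "node-b", "node-c"]

# Round 0: node-a optimizes while node-b and node-c serve queries.
optimizer, searchers = rotate_roles(machines, 0)

# Round 1: the role has moved on; node-b optimizes, the others serve.
optimizer, searchers = rotate_roles(machines, 1)
```

Because only roles move and the index files stay where they are, no bulk copy hits the network or the searchers' IO subsystem.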

Kirill Zakharenko/Кирилл Захаренко (
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
