hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: Lucene-based Distributed Index Leveraging Hadoop
Date Thu, 07 Feb 2008 00:15:04 GMT

We have quite a few serving the load, but if we are trying to update
relatively often (say every 30 minutes), then having a server out of action
for several minutes really hurts.  The outage is that long because you have to

A) turn off traffic
B) wait for traffic to actually stop
C) move the multi-gigabyte index to the machine
D) warm up the new index
E) start traffic
F) wait for traffic to actually fully start
G) declare switch-over complete

Depending on your update interval, this can easily reach 30-40% of your
capacity, which seems absurd since a hot search engine rarely tries to read
from disk at all.
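
As a rough illustration of how steps A-G add up to a figure in that range,
here is a back-of-envelope sketch in Python.  The per-step durations are
invented for illustration (the message gives none), and capacity_lost is a
hypothetical helper, not part of Hadoop or Lucene.  It only divides the total
outage by the update interval.

```python
# Hypothetical per-step durations in seconds. These are assumptions chosen
# for illustration; none of them come from the message above.
STEP_SECONDS = {
    "A: turn off traffic": 5,
    "B: wait for traffic to stop": 60,
    "C: move index to the machine": 300,
    "D: warm up the new index": 180,
    "E: start traffic": 5,
    "F: wait for traffic to fully start": 60,
    "G: declare switch-over complete": 0,
}


def capacity_lost(step_seconds, update_interval_seconds):
    """Fraction of each update interval a server spends out of service."""
    outage = sum(step_seconds.values())
    return outage / update_interval_seconds


outage_seconds = sum(STEP_SECONDS.values())
fraction = capacity_lost(STEP_SECONDS, 30 * 60)  # update every 30 minutes
print(f"outage: {outage_seconds} s, capacity lost: {fraction:.0%}")
```

With these assumed durations the outage is 610 seconds out of every 1800,
or roughly a third of the server's time.  Lengthening the update interval or
shortening the copy and warm-up steps shrinks the fraction proportionally.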

On 2/6/08 3:56 PM, "Ning Li" <ning.li.00@gmail.com> wrote:

> How many shard servers are serving each shard? If it's more than one,
> you can have the rest of the shard servers sharing the query workload
> while one shard server loads a new version of a shard.
