From: Lance Norskog <goksron@gmail.com>
To: solr-user@lucene.apache.org
Date: Thu, 2 Sep 2010 18:11:42 -0700
Subject: Re: Solr crawls during replication

Yes, the rsync scripts are still there, and they still work fine. It
definitely helps to be a Unix shell wiz. You would add an option to the
rsync call in the scripts that does the throttling. Rsync is a standard
copying tool, usually run over SSH; it's 12 years old and works quite
well.
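For example, here is an untested sketch. The exact rsync line in your
copy of bin/snappuller will differ, and the variables below are
illustrative; the only real change is rsync's --bwlimit option, which
caps transfer speed in KB/sec:

    # In bin/snappuller, where the snapshot is copied from the master,
    # add --bwlimit so a pull cannot saturate the network or disks.
    # 5120 KB/sec (~5 MB/sec) is an arbitrary starting value:
    rsync -Wa --delete --bwlimit=5120 \
        rsync://${master_host}:${rsyncd_port}/solr/${snap_name}/ \
        ${data_dir}/${snap_name}-wip

Start with a generous limit and lower it until query latency on the
slaves stays flat during a pull.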
Lance

On Thu, Sep 2, 2010 at 8:31 AM, Mark wrote:
> On 8/6/10 5:03 PM, Chris Hostetter wrote:
>>
>> : We have an index around 25-30G w/ 1 master and 5 slaves. We perform
>> : replication every 30 mins. During replication the disk I/O obviously
>> : shoots up on the slaves to the point where all requests routed to
>> : that slave take a really long time... sometimes to the point of
>> : timing out.
>> :
>> : Are there any logical or physical changes we could make to our
>> : architecture to overcome this problem?
>>
>> If the problem really is disk I/O, then perhaps you don't have enough
>> RAM set aside for the filesystem cache to keep the "current" index in
>> memory?
>>
>> I've seen people have this type of problem before, but usually it's
>> network I/O that is the bottleneck, in which case using multiple NICs
>> on your slaves (one for client requests, one for replication) can help.
>>
>> I think at one point there was also talk about leveraging an rsync
>> option to force snappuller to throttle itself and only use a maximum
>> amount of bandwidth -- but then we moved away from script-based
>> replication to Java-based replication, and I don't think the Java
>> network/IO system supports that type of throttling. However: you might
>> be able to configure it in your switches/routers (i.e. only let the
>> slaves use X% of their total bandwidth to talk to the master).
>>
>> -Hoss
>
> Thanks for the suggestions. Our slaves have 12G with 10G dedicated to
> the JVM... too much?
>
> Are the rsync snappuller features still available in 1.4.1? I may try
> that to see if it helps. Configuration of the switches may also be
> possible.
>
> Also, would you mind explaining your second point about using dual NIC
> cards? How can this be accomplished/configured? Thanks for your help.

-- 
Lance Norskog
goksron@gmail.com
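P.S. On the dual-NIC question: the usual approach is to give the master
a second IP address on a dedicated interface/subnet and point only the
replication traffic at it, so snappuller (or the masterUrl of the Java
replication handler) pulls over the spare link while queries arrive on
the primary NIC. A hypothetical sketch -- the hostname, address, and
file locations are illustrative, so check them against your own setup:

    # /etc/hosts on each slave: a name that resolves to the master's
    # replication-only interface
    10.0.1.10   solr-master-repl

    # conf/scripts.conf on each slave: pull snapshots via that interface
    master_host=solr-master-repl

And on the heap question: with 10G of a 12G box given to the JVM, only
about 2G is left for the OS to cache a 25-30G index, so a smaller heap
may well help more than anything else.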