lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: ReplicationHandler - SnapPull failed to download a file completely.
Date Wed, 30 Oct 2013 20:00:27 GMT
On 10/30/2013 1:49 PM, Shalom Ben-Zvi Kazaz wrote:
> we are continuously getting this exception during replication from
> master to slave. our index size is 9.27 G and we are trying to replicate
> a slave from scratch.
> Its a different file each time , sometimes we get to 60% replication
> before it fails and sometimes only 10%, we never managed a successful
> replication.

<snip>

> this is the master setup:
>
> <requestHandler name="/replication" class="solr.ReplicationHandler" >
>     <lst name="master">
>       <str name="replicateAfter">commit</str>
>       <str name="replicateAfter">startup</str>
>       <str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
>       <str name="commitReserveDuration">00:00:50</str>
>     </lst>
> </requestHandler>

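I assume that you're probably doing commits fairly often, resulting in a 
lot of merge activity that frequently deletes segments.  That 
"commitReserveDuration" parameter needs to be made larger.  I would 
imagine that it takes a lot more than 50 seconds to do the replication - 
even if you've got an extremely fast network, replicating 9.27GB probably 
takes several minutes.  Once the reserve expires, the master is free to 
delete files belonging to that commit point, and the slave fails on 
whichever file it happens to be fetching at the time, which would explain 
why it is a different file each time.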

From the wiki page on replication:  "If your commits are very frequent 
and network is particularly slow, you can tweak an extra attribute 
<str name="commitReserveDuration">00:00:10</str>. This is roughly the 
time taken to download 5MB from master to slave. Default is 10 secs."

http://wiki.apache.org/solr/SolrReplication#Master

You've said that your network is not slow, but with that much data, all 
networks are slow.
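
As a rough sketch of what I mean, the master section could look something 
like the following.  The 00:30:00 value is only an assumption on my part, 
not a measured number; time a full replication on your network and pick 
something comfortably longer than that.  Everything else is copied from 
your config.

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">startup</str>
      <str name="confFiles">stopwords.txt,spellings.txt,synonyms.txt,protwords.txt,elevate.xml,currency.xml</str>
      <!-- Assumed value (30 minutes), format HH:MM:SS. Back-of-envelope:
           9.27 GB is roughly 74 gigabits, so even at a sustained 1 Gb/s
           the copy needs more than 70 seconds, already past the original
           50-second reserve. Tune this to your measured transfer time. -->
      <str name="commitReserveDuration">00:30:00</str>
    </lst>
</requestHandler>
```

A longer reserve means the master may hold on to old segment files for 
longer, so expect somewhat higher disk usage there while it is in effect.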

Thanks,
Shawn

