lucene-solr-user mailing list archives

From Jamie Johnson <jej2...@gmail.com>
Subject Re: Replication issues after machine failure
Date Sun, 13 May 2012 02:35:23 GMT
I have not tried to reproduce it yet, but I hope to do so on Monday. The
machine that had the issue was a VM outside my control, so I'm not certain
how it was restored. I am using a fairly recent nightly build, from within
the last few weeks.

On Friday, May 11, 2012, Mark Miller <markrmiller@gmail.com> wrote:
> So it's easy to reproduce? What do you mean by restored from a prior state?
>
> What snapshot are you on these days for future ref?
>
> You have double-checked to make sure that shard is listed as ACTIVE, right?
>
> On May 11, 2012, at 4:55 PM, Jamie Johnson wrote:
>
>> I've had a few instances where a machine has needed to be restored
>> from a prior state.  After doing so and firing up Solr again, I've had
>> instances where replication doesn't seem to be working properly.  I
>> have not seen any failures in the logs (I will have to keep a closer
>> eye on this), but when this happens and I execute a query against each
>> replica with distrib=false, I am seeing the following counts:
>>
>> Shard @ host1(shard1) returned 95150
>> Shard @ host2(shard1) returned 95150
>> Shard @ host2(shard4) returned 94311
>> Shard @ host3(shard4) returned 8468
>> Shard @ host3(shard5) returned 8303
>> Shard @ host1(shard5) returned 96054
>> Shard @ host1(shard2) returned 95620
>> Shard @ host2(shard2) returned 95620
>> Shard @ host2(shard3) returned 93195
>> Shard @ host3(shard3) returned 8336
>> Shard @ host3(shard6) returned 8309
>> Shard @ host1(shard6) returned 96036
>>
>>
>> In this case host3 is what failed, and as you can see every count on
>> host3 is significantly lower than what the leader has.  Has anyone else
>> experienced this?
>
> - Mark Miller
> lucidimagination.com
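
The per-replica check described in the quoted message (query each replica with distrib=false and compare the counts) could be scripted roughly as below. This is only a sketch under assumptions: the core URL layout passed to count_url is hypothetical and depends on how the cores are named in a given setup, and the helper names are made up for illustration. The q, rows, distrib and wt parameters and the response.numFound field are standard Solr. The counts in "reported" are copied from the message above, so the comparison step runs without a live cluster.

```python
import json
from collections import defaultdict
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url):
    """Build a select URL that counts documents on one core only.

    core_url is a hypothetical base URL such as
    http://host1:8983/solr/<core>; adjust it to the real core names.
    """
    params = {"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def fetch_count(core_url, timeout=10):
    """Return numFound for a single core (needs a reachable Solr node)."""
    with urlopen(count_url(core_url), timeout=timeout) as resp:
        return json.load(resp)["response"]["numFound"]


def find_divergent_shards(counts):
    """Group {(host, shard): count} by shard and keep the shards
    whose replicas report different document counts."""
    by_shard = defaultdict(dict)
    for (host, shard), n in counts.items():
        by_shard[shard][host] = n
    return {
        shard: hosts
        for shard, hosts in by_shard.items()
        if len(set(hosts.values())) > 1
    }


# Counts copied from the quoted message.
reported = {
    ("host1", "shard1"): 95150, ("host2", "shard1"): 95150,
    ("host2", "shard4"): 94311, ("host3", "shard4"): 8468,
    ("host3", "shard5"): 8303,  ("host1", "shard5"): 96054,
    ("host1", "shard2"): 95620, ("host2", "shard2"): 95620,
    ("host2", "shard3"): 93195, ("host3", "shard3"): 8336,
    ("host3", "shard6"): 8309,  ("host1", "shard6"): 96036,
}

divergent = find_divergent_shards(reported)
for shard in sorted(divergent):
    print(shard, divergent[shard])
```

With the reported numbers, only the shards that have a replica on host3 show a mismatch; shard1 and shard2 agree across host1 and host2. Against a live cluster, the "reported" dictionary would instead be filled by calling fetch_count once per replica.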