incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: repair hangs
Date Thu, 14 Mar 2013 13:34:19 GMT
> 1. is this a nodetool bug?  is there any way to propagate the
> java.io.IOException back to nodetool?
The repair continues to run even if nodetool fails; it is a server-side operation.

> 2. network problems on EC2, I'm shocked!  are there recommended
> network settings for EC2?
Streaming does not put a timeout on the socket, so a broken connection can leave the
repair waiting indefinitely. In this case, check the 10.82.233.59 node to see why the
pipe broke.
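
As a general illustration of why a socket timeout matters (this is not Cassandra's streaming code), here is a minimal Python sketch. The names `read_with_timeout` and `demo` are hypothetical, and the silent local listener is only an assumed stand-in for a peer that stops responding.

```python
import socket

def read_with_timeout(host, port, timeout_seconds):
    """Connect and try to read; return the data, or None if the peer stays silent."""
    # The timeout passed here also applies to later recv() calls on this socket.
    with socket.create_connection((host, port), timeout=timeout_seconds) as conn:
        try:
            return conn.recv(1024)
        except socket.timeout:
            # With no timeout set, recv() would block here indefinitely.
            return None

def demo():
    # A local listener that queues connections but never sends anything,
    # standing in (as an assumption) for a peer that has gone quiet.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    try:
        return read_with_timeout("127.0.0.1", port, 0.5)
    finally:
        server.close()
```

Calling `demo()` gives up after about half a second instead of waiting forever, which is the behaviour a read on a socket with no timeout cannot provide.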

Cheers
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/03/2013, at 4:28 PM, Dane Miller <dane@optimalsocial.com> wrote:

> On Wed, Mar 13, 2013 at 12:39 PM, Wei Zhu <wz1975@yahoo.com> wrote:
>> My guess would be there is some exception during the repair and your session is aborted.
>> Here is the code of doing repair:
>> 
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java
>> 
>> looking for
>> 
>> logger.info
>> 
>> Compare that with your log file; it should give you a rough idea of the stage at
>> which the repair died.
> 
> Thanks for the link to the source.  That's a little hard to grok, but
> your suggestion to examine the logs more thoroughly was helpful.  I
> was able to determine that repair hung due to connection errors during
> streaming.  I'll include log snippets below, but this leads me to
> other more important questions...
> 
> 1. is this a nodetool bug?  is there any way to propagate the
> java.io.IOException back to nodetool?
> 2. network problems on EC2, I'm shocked!  are there recommended
> network settings for EC2?
> 
> Dane
> 
> Here are the relevant logs showing (A) repair progress, and (B)
> java.io.IOExceptions
> 
> (A) repair progress
> INFO [Thread-5314] 2013-03-11 23:29:28,866 StorageService.java (line
> 2364) Starting repair command #9, repairing 1 ranges for keyspace
> OpsCenter
> INFO [AntiEntropySessions:13] 2013-03-11 23:29:28,867
> AntiEntropyService.java (line 652) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] new session: will sync
> /10.34.37.195, /10.82.233.59 on range
> (0,28356863910078205288614550619314017621] for OpsCenter.[events,
> rollups60, settings, pdps, rollups86400, events_timeline, rollups300,
> rollups7200]
> INFO [Thread-5320] 2013-03-11 23:29:29,198 AntiEntropyService.java
> (line 765) [repair #84e86020-8aa3-11e2-abb2-17112e360b9a] events is
> fully synced (7 remaining column family to sync for this session)
> INFO [AntiEntropyStage:1] 2013-03-11 23:38:02,198
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] settings is fully synced (6
> remaining column family to sync for this session)
> INFO [AntiEntropyStage:1] 2013-03-11 23:38:02,617
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] pdps is fully synced (5
> remaining column family to sync for this session)
> INFO [Streaming to /10.82.233.59:34] 2013-03-11 23:38:12,491
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] rollups86400 is fully synced (4
> remaining column family to sync for this session)
> INFO [Streaming to /10.82.233.59:36] 2013-03-11 23:39:55,886
> AntiEntropyService.java (line 765) [repair
> #84e86020-8aa3-11e2-abb2-17112e360b9a] rollups7200 is fully synced (3
> remaining column family to sync for this session)
> 
> 
> (B) java.io.IOException
> # grep -A1 ERROR /var/log/cassandra/system.log.2
> ERROR [Streaming to /10.82.233.59:34] 2013-03-11 23:38:12,654
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming
> to /10.82.233.59:34,5,main]
> java.lang.RuntimeException: java.io.IOException: Connection reset by peer
> --
> ERROR [Streaming to /10.82.233.59:35] 2013-03-11 23:38:12,692
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming
> to /10.82.233.59:35,5,main]
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> --
> ERROR [Streaming to /10.82.233.59:36] 2013-03-11 23:39:55,932
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming
> to /10.82.233.59:36,5,main]
> java.lang.RuntimeException: java.io.IOException: Broken pipe
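
The log comparison described in this thread can be sketched as a small script. This is a rough sketch, not part of Cassandra or nodetool: the function name `summarize` and the regular expressions are assumptions based only on the log lines quoted above, and it expects each log entry on a single line (as in the log file itself, not as wrapped by email).

```python
import re

# Patterns inferred from the log excerpts in this thread (an assumption,
# not an official description of the log format).
NEW_SESSION = re.compile(
    r"\[repair #(?P<session>[0-9a-f-]+)\] new session: .* for \S+\.\[(?P<cfs>[^\]]+)\]"
)
SYNCED = re.compile(
    r"\[repair #(?P<session>[0-9a-f-]+)\] (?P<cf>\S+) is fully synced"
)
STREAM_ERROR = re.compile(r"^\s*ERROR \[Streaming to (?P<peer>/[\d.]+):\d+\]")

def summarize(lines):
    """Return (pending column families per repair session, peers with streaming errors)."""
    expected, synced, error_peers = {}, {}, []
    for line in lines:
        m = NEW_SESSION.search(line)
        if m:
            # Column families the session announced it would sync.
            expected[m["session"]] = [c.strip() for c in m["cfs"].split(",")]
            continue
        m = SYNCED.search(line)
        if m:
            synced.setdefault(m["session"], []).append(m["cf"])
            continue
        m = STREAM_ERROR.search(line)
        if m:
            error_peers.append(m["peer"])
    # Anything announced but never reported as synced is still pending.
    pending = {
        session: [cf for cf in cfs if cf not in synced.get(session, [])]
        for session, cfs in expected.items()
    }
    return pending, error_peers
```

Run over the unwrapped versions of the excerpts quoted above, it would list rollups60, events_timeline and rollups300 as never reported synced for session 84e86020-8aa3-11e2-abb2-17112e360b9a, and /10.82.233.59 as the peer for each streaming error.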

